Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bug: bv_decide regression; proof that worked before now times out, and LRAT file sizes are much larger #5664

Closed
3 tasks done
alexkeizer opened this issue Oct 9, 2024 · 3 comments
Labels
bug Something isn't working

Comments

@alexkeizer
Copy link
Contributor

Prerequisites

Please put an X between the brackets as you perform the following steps:

Description

Consider the following MWE:

import Std.Tactic.BVDecide

example
  (a k n : BitVec 64) :
  n < 18446744073709551615#64 - k →
    ((¬a + k + 1#64 - a ≤ a + k - a ∧ ¬a + k + 1#64 + n - a ≤ a + k - a) ∧
        ¬a - (a + k + 1#64) ≤ a + k + 1#64 + n - (a + k + 1#64)) ∧
      ¬a + k - (a + k + 1#64) ≤ a + k + 1#64 + n - (a + k + 1#64) := by
  bv_decide

On nightly-2024-09-10 this proof worked. On nightly-2024-10-08 it times out instead, even if we double the time limit to 120 seconds.

Context

Example extracted from LNSym, specifically, from Arm/Memory/SeparateProofs.lean.

Steps to Reproduce

  1. Open the MWE in nightly-2024-09-10, observe that the proof goes through
  2. Now change the toolchain to nightly-2024-10-08, and observe that it times out

Expected behavior: I expect the proof to go through

Actual behavior: It times out

Versions

nightly-2024-10-08 is the version with the bad behaviour.

Additional Information

If we use bv_decide? to generate an LRAT in the working version, we get a roughly 12MB file.
Using bv_check with that LRAT in the new version gives a type mismatch, but more surprisingly, if we then use bv_decide? in the new Lean version, it manages to spit out 300+ MB of LRAT before it gets killed (with the 120 second timeout).

We see a similar thing happen in other examples from that same PR, which do manage to go through in the new toolchain, but where the LRAT had to be regenerated and the new file is 10x larger than the old LRAT (in particular, now exceeding GitHub's limit for filesize).

Impact

Add 👍 to issues you consider important. If others are impacted by this issue, please ask them to add 👍 to it.

@alexkeizer alexkeizer added the bug Something isn't working label Oct 9, 2024
@alexkeizer alexkeizer changed the title bug: bv_decide regression bug: bv_decide regression; proof that worked before now times out, and LRAT file sizes are much larger Oct 9, 2024
@hargoniX
Copy link
Contributor

hargoniX commented Oct 10, 2024

The hard sub problem here is:

import Std.Tactic.BVDecide

set_option bv.ac_nf false in
set_option trace.Meta.Tactic.bv true in
set_option trace.profiler true in
example (a k n : BitVec 64) (h : n < 18446744073709551615#64 - k) :
    ¬a - (a + k + 1#64) ≤ a + k + 1#64 + n - (a + k + 1#64) := by
  bv_decide

This regression is because of the introduction of the ac_nf pass, which is rather unfortunate as it helps a lot in other situations :( You can disable it with set_option bv.ac_nf false but I hope I'll be able to modify it such that this passes.

@hargoniX
Copy link
Contributor

I tried running this with Bitwuzla and it takes a total of 48 seconds on this problem, it appears if you have normalization up to commutativity enabled this is just really hard unless you happen to hit the form in which you have already found the problem :( So the best solution here is most likely to just accept that we have to disable ac_nf here.

@alexkeizer
Copy link
Contributor Author

It's a shame we'll have to get rid of the ac_nf pass, but thanks for investigating!

alexkeizer added a commit to leanprover/LNSym that referenced this issue Oct 10, 2024
This is needed to work around leanprover/lean4#5664
TL;DR: the normalization pass actively made it harder for the SAT solver to prove our goals, causing much larger LRAT files and even timeouts for some pathological goals
alexkeizer added a commit to leanprover/LNSym that referenced this issue Oct 10, 2024
This is needed to work around leanprover/lean4#5664
TL;DR: the normalization pass actively made it harder for the SAT solver to prove our goals, causing much larger LRAT files and even timeouts for some pathological goals
github-merge-queue bot pushed a commit that referenced this issue Oct 15, 2024
This takes a few standalone bitvector problems, about inequalties, from
LNSym, and adds them as a benchmark to prevent further regressions with
bv_decide.

These problems are particularly interesting, because they've previously
had a bad interaction with bv_decides normalization pass, see
#5664.

---------

Co-authored-by: Henrik Böving <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants