Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: asynchronous kernel checking #6368

Draft
wants to merge 195 commits into
base: kernel-env-base
Choose a base branch
from
Draft

Conversation

Kha
Copy link
Member

@Kha Kha commented Dec 11, 2024

This PR implements executing kernel checking in parallel to elaboration, which is a prerequisite for parallelizing elaboration itself.

Stacked on #6214

mhuisi and others added 30 commits November 19, 2024 09:26
Co-Authored-By: Sebastian Ullrich <[email protected]>
leanprover#5835 contains a brittle test that uses an FVar ID, which caused a
failure on master. This PR changes that test to use a declaration
instead.
This is syntactically more general than before, though up to eta
expansion it make no difference.
…eanprover#5837)

This PR fixes an old auto-completion bug where `x.` would issue
nonsensical completions when `x.` could not be elaborated as a dot
completion.
…5840)

This PR changes `lean_sharecommon_{eq,hash}` to only consider the
salient bytes of an object, and not any bytes of any
unspecified/uninitialized unused capacity.

Accessing uninitialized storage results in undefined behaviour.

This does not seem to have any semantics disadvantages: If objects
compare equal after this change, their salient bytes are still equal. By
contrast, if the actual identity of allocations needs to be
distinguished, that can be done by just comparing pointers to the
storage.

If we wanted to retain the current logic, we would need initialize the
otherwise unused parts to some specific value to avoid the undefined
behaviour.

Closes leanprover#5831
This PR fixes a bug at `isDefEq` when `zetaDelta := false`. See new test
for a small example that exposes the issue.
This PR adds a case to `Level.geq` that is present in the kernel's level
`is_geq` procedure, making them consistent with one another.

This came up during testing of `lean4lean`. Currently `Level.geq`
differs from `level::is_geq` in the case of `max u v >= imax u v`. The
elaborator function is overly pessimistic and yields `false` on this
while the kernel function yields true. This comes up concretely in the
`Trans` class:
```lean
class Trans (r : α → β → Sort u) (s : β → γ → Sort v) (t : outParam (α → γ → Sort w)) where
  trans : r a b → s b c → t a c
```
The type of this class is `Sort (max (max (max (max (max (max 1 u) u_1)
u_2) u_3) v) w)` (where `u_1 u_2 u_3` are the levels of `α β γ`), but if
you try writing that type explicitly then the `class` command fails.
Omitting the type leaves the `class` to infer the universe level (the
command assumes the level is correct, and the kernel agrees it is), but
including the type then the elaborator checks the level inequality with
`Level.geq` and fails.

---------

Co-authored-by: Kyle Miller <[email protected]>
…prover#6131)

This PR fixes a bug at the definitional equality test (`isDefEq`). At
unification constraints of the form `c.{u} =?= c.{v}`, it was not trying
to unfold `c`. This bug did not affect the kernel.

closes leanprover#6117
This PR improves the `#print` command for structures to show all fields
and which parents the fields were inherited from, hiding internal
details such as which parents are represented as subobjects. This
information is still present in the constructor if needed. The pretty
printer for private constants is also improved, and it now handles
private names from the current module like any other name; private names
from other modules are made hygienic.

Example output for `#print Monad`:
```
class Monad.{u, v} (m : Type u → Type v) : Type (max (u + 1) v)
number of parameters: 1
parents:
  Monad.toApplicative : Applicative m
  Monad.toBind : Bind m
fields:
  Functor.map : {α β : Type u} → (α → β) → m α → m β
  Functor.mapConst : {α β : Type u} → α → m β → m α
  Pure.pure : {α : Type u} → α → m α
  Seq.seq : {α β : Type u} → m (α → β) → (Unit → m α) → m β
  SeqLeft.seqLeft : {α β : Type u} → m α → (Unit → m β) → m α
  SeqRight.seqRight : {α β : Type u} → m α → (Unit → m β) → m β
  Bind.bind : {α β : Type u} → m α → (α → m β) → m β
constructor:
  Monad.mk.{u, v} {m : Type u → Type v} [toApplicative : Applicative m] [toBind : Bind m] : Monad m
resolution order:
  Monad, Applicative, Bind, Functor, Pure, Seq, SeqLeft, SeqRight
```

Suggested by Floris van Doorn [on
Zulip](https://leanprover.zulipchat.com/#narrow/channel/270676-lean4/topic/.23print.20command.20for.20structures/near/482503637).
…er#6120)

This PR adds theorems `BitVec.(getMsbD, msb)_(rotateLeft, rotateRight)`.

We follow the same strategy taken for `getLsbD`, constructing the
necessary auxilliary theorems first (relying on different hypotheses)
and then generalizing.

---------

Co-authored-by: Siddharth <[email protected]>
Co-authored-by: Tobias Grosser <[email protected]>
…#6132)

This PR duplicates the verification API for
`List.attach`/`attachWith`/`pmap` over to `Array`.
…er#6133)

This PR replaces `Array.feraseIdx` and `Array.insertAt` with
`Array.eraseIdx` and `Array.insertIdx`, both of which take a `Nat`
argument and a tactic-provided proof that it is in bounds. We also have
`eraseIdxIfInBounds` and `insertIdxIfInBounds` which are noops if the
index is out of bounds. We also provide a `Fin` valued version of
`Array.findIdx?`. Together, these quite ergonomically improve the array
indexing safety at a number of places in the compiler/elaborator.
This PR adds support for displaying multiple threads in the trace
profiler output.

`TraceState.tid` needs to be adjusted for this purpose, which is not
done yet by the Lean elaborator as it is still single-threaded.
This PR fixes `trace.profiler.pp` not using the term pretty printer.

Fixes leanprover#5872
This PR fixes the run-time evaluation of `(default : Float)`.
This PR does the same fix as leanprover#6104, but such that it doesn't break the
test/the file in `Plausible`. This is done by not creating unused let
binders in metavariable types that are made by `elimMVar`. (This is also
a positive thing for users looking at metavariable types, for example in
error messages)

We get rid of `skipAtMostNumBinders`. This function was originally
defined for the purpose of making this test work, but it is a hack
because it allows cycles in the metavariable context.

It would make sense to split these changes into 2 PRs, but I combined
them here to show that the combination of them closes leanprover#6013 without
breaking anything

Closes leanprover#6013
This PR converts 3 doc-string to module docs since it seems that this is
what they were intended to be!
This PR avoids runtime array bounds checks in places where it can
trivially be done at compile time.

None of these changes are of particular consequence: I mostly wanted to
learn how much we do this, and what the obstacles are to doing it less.
This PR adds lemmas for extracting a given bit of a `BitVec` obtained
via `sub`/`neg`/`sshiftRight'`/`abs`.

---------

Co-authored-by: Kim Morrison <[email protected]>
This PR enables contributors to modify `changelog-*` labels simply by
writing a comment with the desired label.
…orem (leanprover#6146)

This PR fixes a non-termination bug that occurred when generating the
match-expression splitter theorem. The bug was triggered when the proof
automation for the splitter theorem repeatedly applied `injection` to
the same local declaration, as it could not be removed due to forward
dependencies. See issue leanprover#6065 for an example that reproduces this issue.

closes leanprover#6065
…nprover#6161)

This PR ensures whitespace is printed before `+opt` and `-opt`
configuration options when pretty printing, improving the experience of
tactics such as `simp?`.

Reported [on
Zulip](https://leanprover.zulipchat.com/#narrow/channel/270676-lean4/topic/Minor.20simp.3F.20annoyances/near/483736310)
This PR adds a slight performance improvement to reflection of `if`
statements that I noticed by profiling Leanwuzla against SMTCOMP's
`non-incremental/QF_BV/fft/Sz256_6616.smt2`.

In particular:
1. The profile showed about 4 percent of the total run time were spent
constructing these decidable instances in reflection of `if` statements.
We can construct them much quicker by hand as they always have the same
structure
2. This delays construction of these statements until we actually
generate the reflection proof that we wish to submit to the kernel. Thus
if we encounter a SAT instad of an UNSAT problem we will not spend time
generating these expressions anymore.

```
baseline
  Time (mean ± σ):     31.236 s ±  0.258 s
  Range (min … max):   30.899 s … 31.661 s    10 runs

after
  Time (mean ± σ):     30.671 s ±  0.288 s
  Range (min … max):   30.350 s … 31.156 s    10 runs
```
tydeu and others added 6 commits December 11, 2024 04:10
…over#3696)

This PR makes all message constructors handle pretty printer errors.

Prior to this change, pretty printer errors in messages were not
uniformly handled. In core, some printers capture their errors (e.g.,
`ppExprWithInfos` and `ppTerm` ) and some do not (e.g., `ppGoal` and
`ppSignature`) propagate them to whatever serializes the message (e.g.,
the frontend).

To resolve this inconsistency and uniformly handle errors, the signature
for `ofLazy` now uses `BaseIO`. As such, all printers been adapted to
capture any errors within them and print similar messages to
`ppExprWithInfos` and `ppTerm` on such errors.
This PR generalizes `DecidableRel` to allow a heterogeneous relation.
Fixes leanprover#4460 (and similar future changes) making prior messages
inaccessible to metaprograms such as linters
@Kha Kha added the changelog-language Language features, tactics, and metaprograms label Dec 11, 2024
@Kha Kha force-pushed the async-tc branch 3 times, most recently from 68aba49 to 08859a4 Compare December 11, 2024 23:09
@github-actions github-actions bot added the toolchain-available A toolchain is available for this PR, at leanprover/lean4-pr-releases:pr-release-NNNN label Dec 13, 2024
@leanprover-community-bot
Copy link
Collaborator

leanprover-community-bot commented Dec 13, 2024

Mathlib CI status (docs):

  • ❗ Batteries/Mathlib CI will not be attempted unless your PR branches off the nightly-with-mathlib branch. Try git rebase b862e2d25163584322bcca54e5e56d671da1f258 --onto 48be424eaa2ae06972e9cfec4d355906b532204d. (2024-12-13 09:54:31)
  • ❗ Batteries/Mathlib CI will not be attempted unless your PR branches off the nightly-with-mathlib branch. Try git rebase b862e2d25163584322bcca54e5e56d671da1f258 --onto fe45ddd6105078a0a3bd855e5d94673e794f6b88. (2024-12-28 17:23:57)
  • ❗ Batteries/Mathlib CI will not be attempted unless your PR branches off the nightly-with-mathlib branch. Try git rebase b862e2d25163584322bcca54e5e56d671da1f258 --onto 24a8561ec4e302f4e0cba07632fddacd6f6e0323. (2024-12-30 11:43:03)

@Kha Kha changed the base branch from master to kernel-env-base December 28, 2024 16:59
@Kha
Copy link
Member Author

Kha commented Dec 28, 2024

!bench

@leanprover-bot
Copy link
Collaborator

Here are the benchmark results for commit 9b36754.
The entire run failed.
Found no significant differences.

@Kha
Copy link
Member Author

Kha commented Dec 30, 2024

!bench

@leanprover-bot
Copy link
Collaborator

Here are the benchmark results for commit ed17ab3.
There were significant changes against commit 4600bb1:

  Benchmark                  Metric                    Change
  ========================================================================
- big_do                     branches                    4.0%    (329.8 σ)
- big_do                     instructions                1.1%     (63.6 σ)
- big_do                     maxrss                      5.8%    (124.1 σ)
+ big_omega.lean             branch-misses             -64.9%   (-143.9 σ)
+ big_omega.lean             branches                  -70.2%  (-4059.2 σ)
+ big_omega.lean             instructions              -71.1%  (-3756.4 σ)
+ big_omega.lean             maxrss                    -54.7%   (-105.7 σ)
+ big_omega.lean             task-clock                -74.6%   (-347.4 σ)
+ big_omega.lean             wall-clock                -74.8%   (-327.5 σ)
+ big_omega.lean MT          branch-misses             -69.1%   (-546.0 σ)
+ big_omega.lean MT          branches                  -72.4%  (-9861.4 σ)
+ big_omega.lean MT          instructions              -72.9% (-13535.0 σ)
+ big_omega.lean MT          maxrss                    -66.4%    (-47.5 σ)
+ big_omega.lean MT          task-clock                -80.9%   (-135.4 σ)
+ big_omega.lean MT          wall-clock                -80.9%   (-135.2 σ)
+ binarytrees                instructions               -8.4%  (-3211.4 σ)
+ binarytrees.st             instructions               -8.4%  (-5177.1 σ)
- bv_decide_mul              branches                    1.5%    (382.5 σ)
+ bv_decide_mul              instructions               -1.2%   (-215.5 σ)
- bv_decide_mul              maxrss                      1.6%     (42.3 σ)
+ bv_decide_realworld        branches                   -1.5%   (-178.8 σ)
+ bv_decide_realworld        instructions               -3.9%   (-529.8 σ)
- bv_decide_realworld        maxrss                      4.1%     (15.4 σ)
+ bv_decide_realworld        task-clock                 -1.8%    (-11.5 σ)
+ bv_decide_realworld        wall-clock                 -2.1%    (-15.1 σ)
+ const_fold                 instructions               -3.5%    (-82.7 σ)
+ deriv                      instructions               -1.9%    (-45.7 σ)
+ ilean roundtrip            instructions               -3.3%   (-500.2 σ)
- import Lean                branches                  340.3%   (6011.9 σ)
- import Lean                instructions              238.0%   (3468.3 σ)
- import Lean                maxrss                      2.9%     (30.2 σ)
- import Lean                task-clock                 84.4%     (17.1 σ)
- import Lean                wall-clock                 84.1%     (17.1 σ)
- language server startup    branches                   50.5%     (88.1 σ)
- language server startup    instructions               38.5%     (61.2 σ)
- language server startup    wall-clock                 19.1%     (14.3 σ)
- liasolver                  instructions                3.5%   (1661.0 σ)
+ liasolver                  maxrss                     -1.4%    (-37.9 σ)
- libleanshared.so           binary size                 3.9%
+ nat_repr                   instructions               -5.0%  (-1756.3 σ)
+ parser                     instructions               -4.4%  (-1586.4 σ)
+ parser                     maxrss                     -1.4%    (-37.9 σ)
+ qsort                      instructions               -4.0%  (-1866.2 σ)
+ qsort                      maxrss                     -1.4%    (-16.6 σ)
- rbmap                      instructions                1.5%    (184.0 σ)
+ rbmap_1                    instructions               -1.8%   (-102.6 σ)
+ rbmap_fbip                 instructions              -10.3%   (-708.9 σ)
+ rbmap_library              instructions               -1.9%   (-171.1 σ)
- reduceMatch                instructions               17.1%    (675.1 σ)
- reduceMatch                maxrss                     19.8%    (820.9 σ)
- reduceMatch                task-clock                 12.8%     (14.3 σ)
- reduceMatch                wall-clock                 12.4%     (12.6 σ)
- simp_arith1                branch-misses               3.6%     (12.7 σ)
- simp_arith1                branches                  129.5%   (4425.6 σ)
- simp_arith1                instructions               84.0%   (2749.9 σ)
- simp_arith1                maxrss                      3.6%     (64.0 σ)
- simp_arith1                task-clock                 43.7%     (66.8 σ)
- simp_arith1                wall-clock                 43.3%     (55.9 σ)
- stdlib                     attribute application     199.1%     (90.3 σ)
- stdlib                     dsimp                      13.6%     (20.7 σ)
- stdlib                     fix level params           10.8%     (21.2 σ)
- stdlib                     instructions               13.0%   (4005.3 σ)
- stdlib                     maxrss                     39.4%    (163.9 σ)
- stdlib                     process pre-definitions   162.7%    (292.1 σ)
- stdlib                     share common exprs         12.5%    (131.1 σ)
- stdlib                     tactic execution           25.2%    (338.6 σ)
- stdlib                     task-clock                 16.7%   (1478.4 σ)
- stdlib                     type checking              38.8%     (56.0 σ)
- stdlib                     wall-clock                 14.9%     (77.2 σ)
- stdlib size                bytes .olean                2.7%
- stdlib size                lines                       2.3%
- stdlib size                lines C                     1.4%
- stdlib size                max dynamic symbols         1.4%
- tests/bench/ interpreted   instructions                3.1%    (515.5 σ)
- tests/bench/ interpreted   maxrss                      3.6%     (11.5 σ)
- tests/bench/ interpreted   task-clock                 32.8%     (35.8 σ)
- tests/bench/ interpreted   wall-clock                 12.2%     (21.4 σ)
- tests/compiler             sum binary sizes            4.7%
+ unionfind                  instructions               -8.2%  (-2666.8 σ)
+ unionfind                  task-clock                -11.6%    (-20.6 σ)
+ unionfind                  wall-clock                -11.6%    (-20.6 σ)
- workspaceSymbols           instructions                1.7%   (2541.3 σ)
- workspaceSymbols           maxrss                      3.5%     (73.0 σ)

@Kha
Copy link
Member Author

Kha commented Jan 4, 2025

!bench

@leanprover-bot
Copy link
Collaborator

Here are the benchmark results for commit ed17ab3.
There were significant changes against commit 4600bb1:

  Benchmark                  Metric                    Change
  ========================================================================
- big_do                     branches                    3.5%    (293.7 σ)
- big_do                     instructions                2.7%    (159.7 σ)
- big_do                     maxrss                      5.6%    (119.1 σ)
+ big_omega.lean             branch-misses             -65.2%   (-145.9 σ)
+ big_omega.lean             branches                  -70.4%  (-4095.9 σ)
+ big_omega.lean             instructions              -70.7%  (-3688.5 σ)
+ big_omega.lean             maxrss                    -54.7%   (-105.8 σ)
+ big_omega.lean             task-clock                -73.7%   (-331.5 σ)
+ big_omega.lean             wall-clock                -73.9%   (-312.6 σ)
+ big_omega.lean MT          branch-misses             -69.8%   (-564.6 σ)
+ big_omega.lean MT          branches                  -72.5%  (-9938.0 σ)
+ big_omega.lean MT          instructions              -72.8% (-13467.7 σ)
+ big_omega.lean MT          maxrss                    -66.4%    (-47.4 σ)
+ big_omega.lean MT          task-clock                -80.1%   (-129.2 σ)
+ big_omega.lean MT          wall-clock                -80.2%   (-129.0 σ)
+ binarytrees                task-clock                 -5.3%    (-15.3 σ)
- bv_decide_mul              branches                    1.2%    (317.9 σ)
- bv_decide_mul              maxrss                      1.4%     (38.1 σ)
+ bv_decide_realworld        branches                   -1.5%   (-178.5 σ)
+ bv_decide_realworld        instructions               -2.6%   (-350.6 σ)
- bv_decide_realworld        maxrss                      4.1%     (15.6 σ)
- import Lean                branch-misses               8.4%    (125.8 σ)
- import Lean                branches                  349.4%   (6047.4 σ)
- import Lean                instructions              236.1%   (3460.0 σ)
- import Lean                maxrss                      2.9%     (30.4 σ)
- import Lean                task-clock                 74.1%     (15.9 σ)
- import Lean                wall-clock                 74.0%     (15.9 σ)
- language server startup    branches                   50.2%     (87.7 σ)
- language server startup    instructions               37.2%     (59.7 σ)
- language server startup    wall-clock                 18.6%     (14.0 σ)
- libleanshared.so           binary size                 1.9%
- parser                     instructions                1.0%    (347.7 σ)
- reduceMatch                instructions               17.6%    (691.7 σ)
- reduceMatch                maxrss                     19.8%    (821.3 σ)
- reduceMatch                task-clock                 16.1%     (17.4 σ)
- reduceMatch                wall-clock                 15.7%     (15.5 σ)
- simp_arith1                branch-misses               8.3%     (28.3 σ)
- simp_arith1                branches                  129.8%   (4430.4 σ)
- simp_arith1                instructions               83.1%   (2733.9 σ)
- simp_arith1                maxrss                      3.5%     (62.6 σ)
- simp_arith1                task-clock                 40.1%     (62.9 σ)
- simp_arith1                wall-clock                 39.8%     (52.7 σ)
- stdlib                     attribute application     204.4%     (91.1 σ)
- stdlib                     dsimp                      13.9%     (21.0 σ)
- stdlib                     fix level params           10.3%     (20.4 σ)
- stdlib                     instructions               13.8%   (4219.1 σ)
- stdlib                     maxrss                     39.4%    (163.8 σ)
- stdlib                     process pre-definitions   164.7%    (293.5 σ)
- stdlib                     share common exprs         12.7%    (132.2 σ)
- stdlib                     tactic execution           26.8%    (355.2 σ)
- stdlib                     task-clock                 17.6%   (1544.3 σ)
- stdlib                     type checking              39.0%     (56.2 σ)
- stdlib                     wall-clock                 15.7%     (80.6 σ)
- stdlib size                bytes .olean                2.7%
- stdlib size                lines                       2.3%
- stdlib size                lines C                     1.4%
- stdlib size                max dynamic symbols         1.4%
- tests/bench/ interpreted   instructions                5.5%    (893.1 σ)
- tests/bench/ interpreted   maxrss                      3.5%     (11.2 σ)
- tests/bench/ interpreted   task-clock                 36.9%     (39.0 σ)
- tests/bench/ interpreted   wall-clock                 15.6%     (26.5 σ)
- tests/compiler             sum binary sizes            2.8%
- workspaceSymbols           instructions                5.3%   (7607.4 σ)
- workspaceSymbols           maxrss                      3.5%     (72.3 σ)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
changelog-language Language features, tactics, and metaprograms toolchain-available A toolchain is available for this PR, at leanprover/lean4-pr-releases:pr-release-NNNN
Projects
None yet
Development

Successfully merging this pull request may close these issues.