Split `function` into `constructor`/`relation`/(custom)`function`; Remove `default`; Disallow `function` lookup in the RHS of a rule #461

FTRobbin · 2024-11-06T23:35:56Z

This PR fixes Issue #420. Lookup actions in rules will now cause a type error, `LookupInRuleDisallowed`.

More specifically, this PR:

- Removes the `-naive` flag and the related desugaring code, which this change replaces.
- Fixes `fail` failing because it was not identified as global in the `remove_global` rewrite pass.
- Adds new positive and negative tests for this type error.
- Rewrites the existing tests for compatibility with the new type error.

codspeed-hq · 2024-11-06T23:38:41Z

CodSpeed Performance Report

Merging #461 will not alter performance

Comparing haobinni-0904 (9163ac3) with haobinni-0904 (8a75e7e)

Summary

✅ 10 untouched benchmarks
🆕 2 new benchmarks

Benchmarks breakdown

| | Benchmark | `haobinni-0904` | `haobinni-0904` | Change |
|---|---|---|---|---|
| 🆕 | `merge_read` | N/A | 286.8 µs | N/A |
| 🆕 | `set_sort_function` | N/A | 369.6 µs | N/A |

yihozhang · 2024-11-06T23:52:32Z

src/gj.rs

            // for the later atoms, we consider everything
            let mut timestamp_ranges =
                vec![0..u32::MAX; cq.query.funcs().collect::<Vec<_>>().len()];
-            if do_seminaive {


I believe we still want to keep the `-naive` flag as well as the code here, so the user can still do naive evaluation (it is useful for debugging, and it also has different semantics than semi-naive for "unsafe" egglog).

Yeah, I agree that we should probably keep naive evaluation

I don't think we should keep it. It is unhelpful and adds complexity to the later passes as they need to support the naive semantics correctly. I am also against keeping it as a use-at-your-own-risk feature.

For Egglog users, if you don't use delete, semi-naive and naive are indistinguishable, so it is unhelpful for debugging. If you use delete, then you care much about performance, and there's no point in using naive. Even when you debug with unsafe features, you should probably debug the semi-naive case instead because that's what you want.

For Egglog developers, I see some value in being a sanity check for ensuring semi-naive is implemented correctly. But we are not doing this now, and it can also be done through stronger end-to-end test cases.

That was convincing to me. I've never personally used it before
@yihozhang what do you think?

What complexity does this add to later passes? I thought the only difference between seminaive and naive is that in the seminaive case, we split the original query into many small queries depending on the timestamps (i.e., what this code snippet does).
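The query splitting mentioned above can be sketched roughly as follows. This is a minimal illustration and not egglog's actual code: `seminaive_ranges`, `naive_ranges`, and the `last_run` parameter are hypothetical names, and the sketch assumes the usual semi-naive scheme in which each sub-query picks one atom to range over new tuples only, earlier atoms over old tuples, and later atoms over everything.

```rust
use std::ops::Range;

/// Build one set of per-atom timestamp ranges for each sub-query.
/// `last_run` is the timestamp at which the rule last ran (hypothetical parameter).
fn seminaive_ranges(n_atoms: usize, last_run: u32) -> Vec<Vec<Range<u32>>> {
    (0..n_atoms)
        .map(|delta| {
            (0..n_atoms)
                .map(|i| {
                    if i < delta {
                        0..last_run // earlier atoms: old tuples only
                    } else if i == delta {
                        last_run..u32::MAX // the delta atom: new tuples only
                    } else {
                        0..u32::MAX // later atoms: everything
                    }
                })
                .collect()
        })
        .collect()
}

/// Naive evaluation runs a single query in which every atom sees everything.
fn naive_ranges(n_atoms: usize) -> Vec<Vec<Range<u32>>> {
    vec![vec![0..u32::MAX; n_atoms]]
}

fn main() {
    // Three atoms, rule last ran at timestamp 5: three sub-queries.
    for ranges in seminaive_ranges(3, 5) {
        println!("{:?}", ranges);
    }
    println!("{:?}", naive_ranges(3));
}
```

Under this reading, the naive case is a single query with the widest range on every atom, which is why the two modes can share the rest of the query code.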

I strongly recommend that we keep the naive evaluation. We can view semi-naive as an optimization of the naive evaluation, and this optimization is not always semantics-preserving when given bizarre programs that violate certain assumptions. Examples include:

- rules that use extract / user-defined primitives
- rules where the merge function is not associative or idempotent

I'm also not confident that our semi-naive is implemented correctly. Do we really update the timestamp every time we update the table? I just looked at table.rs and it seems we don't update the timestamp for at least get_mut. The naive evaluation serves as a ground truth for this purpose. Personally, when I am debugging a primitive I wrote, the first thing I do is to disable semi-naive evaluation.
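To make the non-idempotent merge case concrete, here is a toy model in plain Rust. It does not use egglog and makes several assumptions: a single rule that sets a counter to 1 for every tuple in a source table, a merge function `old + new`, and every `set` on an existing key invoking the merge. The function `run` and its parameters are hypothetical.

```rust
/// Run the rule "for every tuple in `src`, set the counter to 1" for `iters`
/// iterations, where the counter merges with `old + new` (not idempotent).
/// Each entry of `src` is the timestamp at which that tuple was inserted.
fn run(src: &[u32], iters: u32, seminaive: bool) -> i64 {
    let mut counter: i64 = 0;
    let mut last_run: u32 = 0;
    for now in 1..=iters {
        for &ts in src {
            // Semi-naive only matches tuples added since the rule last ran;
            // naive re-matches every tuple on every iteration.
            if !seminaive || ts >= last_run {
                counter += 1; // merge: old + new
            }
        }
        last_run = now;
    }
    counter
}

fn main() {
    let src = [0]; // one tuple, inserted at timestamp 0
    println!("naive: {}", run(&src, 3, false));
    println!("semi-naive: {}", run(&src, 3, true));
}
```

In this model the naive run re-applies the merge on every iteration while the semi-naive run applies it once, so the two modes end with different counter values.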

If we keep the naive flag, either we need to split the later passes into two, which is unlikely, or each piece of downstream code must support both naive and semi-naive. I am skeptical about the claim that semi-naive code would just work for naive. For one thing, I don't see how semi-naive can be implemented as pure syntactic rewrites. As you pointed out, something more needs to happen to the timestamps in the semi-naive case. And yet, the naive flag is not used anywhere else in the codebase.

There are cases where the two give different semantics. However, the naive semantics is not more helpful to the users in those cases because they still need semi-naive to work in the end.

For your last point: Firstly, you still need to debug your new primitive for semi-naive. Secondly, I will only trust naive evaluation as a ground truth if it is well supported with a clear separation between the two semantics. Relying on your program to be tested to produce the test output is a terrible idea to me.

However, I do think this discussion raised a significant concern about the correctness of Egglog. We should investigate the issue.

Conclusion: Keep

Not compromising the comfort of -naive for a smaller core

Too much effort to actually implement -naive, we settle for the timestamp hack

Reconsider when merging the new backend

saulshanabrook · 2024-11-07T15:26:01Z

I'm a little worried about the time improvements, especially for lambda... That one is so dramatic I worry that maybe the semantics of the example changed?

Seeing all the changes, I also worry about the degradation for UX, it seems just more unwieldy with this change.

I know you said that automated desugaring had some issues, but I am wondering if that could be used to at least address most of these cases? Were there particular issues with it for some cases or just in general?

oflatt · 2024-11-19T21:49:52Z

src/typechecking.rs

+        //Disallowing Let/Set actions to look up non-constructor functions in rules
+        for action in head.iter() {
+            match action {
+                GenericAction::Let(_, _, Expr::Call(_, symbol, _)) => {


Don't you need to check if this is a function vs a constructor call here?

oflatt · 2024-11-19T21:50:25Z

tests/eggcc-extraction.egg

-      ((set (ival lhs) (IntI n n))))
+(rule ((= lhs (Node (PureOp (Const (IntT) (const) (Num n)))))
+       (= nval (IntI n n)))
+      ((set (ival lhs) nval)))


IntI is a constructor, not a function
So this isn't a lookup and doesn't need to be changed

FTRobbin · 2024-12-03T02:15:24Z

Bumping up this PR again for review:

- Removed the `default` keyword, resolving #421 (Removing `:default` keyword).
- Reverted the `-naive` flag change as discussed.
- Implemented splitting `function` into three subtypes, `constructor`/`relation`/(custom)`function`, resolving #420 (Disallowing looking up non-constructor functions) and #422 (Renaming function whose output is an E-class to constructor).
  - A `function` is not allowed in the RHS of a rule (merge functions are unchecked).
    - A `function` can have an EqSort as its output.
      - It does not have union as the default merge function.
  - `constructor` and `relation` are allowed.
    - A `constructor` expression inserts a new enode.
      - It has union as the default merge function.
    - A `relation` expression inserts a new edge.
  - A `constructor`'s output type must be a sort.
- Reverted the previous changes to tests and then fixed all the tests again.
- Added new negative and positive tests.
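The rules listed above can be sketched as a small standalone checker. This is an illustration only and not the code in src/typechecking.rs: the types `Subtype`, `Output`, and `Decl`, the functions `check_decl` and `check_rhs_call`, and the error name `ConstructorOutputNotSort` are all hypothetical. Only `LookupInRuleDisallowed` is a name taken from this PR, and the sketch assumes that "sort" in the last rule means an EqSort.

```rust
/// The three kinds of function discussed in this PR.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Subtype {
    Constructor,
    Relation,
    Custom,
}

/// Coarse output kinds; a simplification of real sorts (assumption).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Output {
    EqSort,
    Primitive,
    Unit,
}

#[derive(Debug, PartialEq)]
enum TypeError {
    LookupInRuleDisallowed(String),
    // Hypothetical error name, not taken from the PR.
    ConstructorOutputNotSort(String),
}

struct Decl {
    name: &'static str,
    subtype: Subtype,
    output: Output,
}

/// Declaration-time check: a constructor must output an EqSort.
fn check_decl(d: &Decl) -> Result<(), TypeError> {
    if d.subtype == Subtype::Constructor && d.output != Output::EqSort {
        return Err(TypeError::ConstructorOutputNotSort(d.name.to_string()));
    }
    Ok(())
}

/// Check for a call that appears as an expression in the RHS of a rule:
/// constructors and relations are allowed, custom functions are lookups.
fn check_rhs_call(d: &Decl) -> Result<(), TypeError> {
    match d.subtype {
        Subtype::Constructor | Subtype::Relation => Ok(()),
        Subtype::Custom => Err(TypeError::LookupInRuleDisallowed(d.name.to_string())),
    }
}

fn main() {
    let evals_to = Decl { name: "evals-to", subtype: Subtype::Custom, output: Output::EqSort };
    let int_i = Decl { name: "IntI", subtype: Subtype::Constructor, output: Output::EqSort };
    let edge = Decl { name: "edge", subtype: Subtype::Relation, output: Output::Unit };
    let bad = Decl { name: "bad", subtype: Subtype::Constructor, output: Output::Primitive };
    for d in [&evals_to, &int_i, &edge, &bad] {
        println!("{}: decl {:?}, rhs call {:?}", d.name, check_decl(d), check_rhs_call(d));
    }
}
```

Note that the sketch does not model merge functions, which the list above describes as unchecked.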

yihozhang

Nice job! Code is clean, with detailed documentation and good tests.

Let us make a release after this PR is merged.

yihozhang · 2024-12-03T09:12:41Z

src/actions.rs

+                            function.insert(values, value, ts);
+                            value
+                        } else {
+                            return Err(Error::NotFoundError(NotFoundError(format!(


Nit: this should probably provide a different error message given this PR, since the only case this is possible is when there is a bug in our checker.

I think it can still be triggered by merge functions reading a table, e.g.:

(function foo () i64) (function bar () i64 :merge (foo)) (set (bar) 0) (fail (set (bar) 1))

yihozhang · 2024-12-03T09:17:02Z

src/ast/mod.rs

+    /// Now `MathVec` can be used as an input or output sort.
+    Sort(Span, Symbol, Option<(Symbol, Vec<Expr>)>),
+
+    /// Egglog supports three types of functions


nice documentation!

yihozhang · 2024-12-03T09:18:55Z

src/ast/mod.rs

+    /// A relation models a datalog-style mathematical relation
+    /// It can only be defined through the `relation` command
+    ///
+    /// A custom function is a map


The map part of the definition is a bit weird to me, but it's fine for now since we will need a big documentation refactor anyway.

yihozhang · 2024-12-03T09:19:37Z

src/ast/mod.rs

    /// ```text
-    /// (sort MathVec (Vec Math))
+    /// (Constructor Add (i64 i64) Math)


nit Constructor -> constructor

yihozhang · 2024-12-03T09:20:34Z

src/ast/mod.rs

-    /// ```
-    ///
-    /// However, this function is not:
+    /// Specifically, a custom function can also have an EqSort output type:


nit: this is not an example where a custom function has an EqSort output

yihozhang · 2024-12-03T09:27:50Z

tests/cykjson.egg

@@ -13,7 +13,7 @@
 (rule ((End a s)
       (= s (getString pos)))
      ((P 1 pos a)
-       (union (B 1 pos a) (T a s)))) 


Should B be a constructor so that union would still work?

union still works for functions whose output is an EqSort. I have added this case to the documentation.

yihozhang · 2024-12-03T09:32:43Z

tests/intersection.egg

@@ -27,8 +27,8 @@
 (let t2p (f (f b2)))
 (union t2 t2p)

-(union (intersect a1 a2) a3)
-(union (intersect b1 b2) b3)
+(set (intersect a1 a2) a3)


this could just be union?

yihozhang · 2024-12-03T09:35:16Z

tests/lambda.egg

@@ -62,20 +62,20 @@
 (function evals-to (Term) Value)

 (rule ((= e (Val val)))
-      ((union (evals-to e) val)))


Same here. It's an interesting choice to make evals-to a custom function instead of a constructor. This is a new pattern to me, but it seems to work.

This is indeed a new pattern. I'll explain more during the meeting.

FTRobbin added 7 commits October 23, 2024 12:48

Get rid of semi-naive flag

74999fb

Global lookup tests

3e55b0d

Merge branch 'main' of github.com:egraphs-good/egglog into haobinni-0904

82994eb

Add fail corner case to remove_global

dc69b30

Starting to rewrite tests

dee21e8

Merge branch 'main' of github.com:egraphs-good/egglog into haobinni-0904

9e10fac

Rewrote all failed tests

35e8532

FTRobbin requested a review from a team as a code owner November 6, 2024 23:35

FTRobbin requested review from mwillsey and removed request for a team November 6, 2024 23:35

FTRobbin added 2 commits November 6, 2024 15:42

Minor

a593155

Minor

dc42cd3

yihozhang reviewed Nov 6, 2024

FTRobbin added 2 commits November 15, 2024 13:09

Revert "Rewrote all failed tests"

2573722

This reverts commit 35e8532.

Merge branch 'main' of github.com:egraphs-good/egglog into haobinni-0904

c484c92

oflatt reviewed Nov 19, 2024

FTRobbin added 11 commits November 22, 2024 10:24

New typechecking pass forbidding lookups

204dd5c

Merge branch 'main' of github.com:egraphs-good/egglog into haobinni-0904

6c8cfbc

Fix array.egg

3ff74c7

Fix combined_nested.egg

3efe333

Fix cykjson.egg

94a64c0

Fix cyk.egg

a18063e

Revert previous fixes to tests

2abdd10

Fixing eggcc-extraction.egg in progress

a040ce9

Fix eggcc-extraction.egg

63fcbff

Fix fusion.egg

4cdeebe

FIx herbie.egg

59334fb

FTRobbin added 19 commits December 2, 2024 16:00

Fix unstable-fn.egg

f021915

Fix until.egg

76b6b75

Fix repro-duplicated-var.egg

cdfa007

Fix integration_test.rs

76c2b7f

Enforcing the output type of constructors to be sort

539b62c

Add negative test constructor_non_sort.egg

3d4e3b5

Fix python_array_optimize.egg

0a4cf65

Fix stresstest_large_expr.egg

5e544b0

Fix integration_test.rs

5c65ba6

Fix python_array_optimize.egg

9437825

Fix stresstest_large_expr.egg

3af6e07

Disable union merge for functions

8c76699

Add set_sort_function.egg

2cdfd40

Delete combinators_function.egg

27d646d

Fix intersection.egg

31d45b3

Fix unification-points-to.egg

76025d9

Fix cyk.egg

bee3b4c

Add negate test union_non_sort.egg

2c5f58f

Minor

768d943

FTRobbin changed the title ~~Delete -naive flag and disallow lookup actions in rules~~ Split function into constructor/relation/(custom)function; Remove default; Disallow function lookup in the RHS of a rule Dec 3, 2024

FTRobbin changed the title ~~Split function into constructor/relation/(custom)function; Remove default; Disallow function lookup in the RHS of a rule~~ Split function into constructor/relation/(custom)function; Remove default; Disallow function lookup in the RHS of a rule Dec 3, 2024

Fix eggcc-extraction.egg

aa35478

yihozhang approved these changes Dec 3, 2024

yihozhang reviewed Dec 3, 2024

FTRobbin added 3 commits December 3, 2024 10:02

Minor

6842b88

Fix intersection.egg

b8a1b2f

Add merge_read.egg

9163ac3
