-
-
Notifications
You must be signed in to change notification settings - Fork 19
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove semi colon on all transforms #1022
base: main
Are you sure you want to change the base?
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1022 +/- ##
==========================================
- Coverage 57.70% 57.69% -0.01%
==========================================
Files 109 109
Lines 14197 14199 +2
==========================================
Hits 8192 8192
- Misses 6005 6007 +2 ☔ View full report in Codecov by Sentry. |
@@ -628,16 +628,17 @@ async def _create_valid_sql( | |||
else: | |||
source = next(iter(sources)) | |||
|
|||
sql_transforms = [SQLLimit(limit=1_000_000)] | |||
for transform in sql_transforms: | |||
sql_query = transform.apply(sql_query) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Confused, why iterate over a loop when there is only a single item? Also why bake the transform into the query?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was just keeping the original structure, in case there were more transforms. I'm fine with dropping the loop.
Applying it on the original query so that it shows up in the SQL code editor, in case there's a bug in the SQLTransform output.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Baking it into the query means it falls outside of the programmable interface, e.g. if we want to toggle the limit on and off. So I'm -1 on that part.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the flip side, if the transform (limit) isn't visible to the user, the user might think it's an issue with the data (unaware there's a limit to 1 million rows; the "Full data" checkbox is pretty tiny and I didn't notice it initially because I was looking at the query):

if we want to toggle the limit on and off
I don't follow; can't it still be toggled on and off (e.g. SELECT * FROM read_parquet('num.parquet') AS num_table LIMIT 1000000
, changes into SELECT * FROM read_parquet('num.parquet') AS num_table
?
By the way, clicking "Full data" checkbox doesn't work as expected:
I think at some point, I encountered some Lumen AI syntax error related to
;
or commentsNot sure if this will fix, but I think it should be applied to all transforms anyways.
Ideally, these transforms would be applied to the generated SQL for visibility, and not after under the hood in pipeline (okay applied)