Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove semi colon on all transforms #1022

Open
wants to merge 2 commits into
base: main
Choose a base branch
from
Open

Conversation

ahuang11
Copy link
Contributor

@ahuang11 ahuang11 commented Jan 31, 2025

I think at some point, I encountered some Lumen AI syntax error related to ; or comments

WITH gold_medals AS (
    SELECT
        "athlete_full_name",
        COUNT(*) AS gold_count
    FROM read_csv('olympic_medals.csv')
    WHERE "medal_type" = 'GOLD'
      AND "slug_game" LIKE '%-2022'  -- Assuming Winter Olympics are identified by the year
    GROUP BY "athlete_full_name"
)
SELECT
    "athlete_full_name",
    gold_count
FROM gold_medals
ORDER BY gold_count DESC
LIMIT 1

Not sure if this will fix, but I think it should be applied to all transforms anyways.

Ideally, these transforms would be applied to the generated SQL for visibility, and not after under the hood in pipeline (okay applied)

@ahuang11 ahuang11 requested a review from philippjfr January 31, 2025 20:02
Copy link

codecov bot commented Jan 31, 2025

Codecov Report

Attention: Patch coverage is 33.33333% with 4 lines in your changes missing coverage. Please review.

Project coverage is 57.69%. Comparing base (60ea02f) to head (1b746ee).

Files with missing lines Patch % Lines
lumen/ai/agents.py 0.00% 4 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1022      +/-   ##
==========================================
- Coverage   57.70%   57.69%   -0.01%     
==========================================
  Files         109      109              
  Lines       14197    14199       +2     
==========================================
  Hits         8192     8192              
- Misses       6005     6007       +2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@@ -628,16 +628,17 @@ async def _create_valid_sql(
else:
source = next(iter(sources))

sql_transforms = [SQLLimit(limit=1_000_000)]
for transform in sql_transforms:
sql_query = transform.apply(sql_query)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confused, why iterate over a loop when there is only a single item? Also why bake the transform into the query?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was just keeping the original structure, in case there were more transforms. I'm fine with dropping the loop.

Applying it on the original query so that it shows up in the SQL code editor, in case there's a bug in the SQLTransform output.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Baking it into the query means it falls outside of the programmable interface, e.g. if we want to toggle the limit on and off. So I'm -1 on that part.

Copy link
Contributor Author

@ahuang11 ahuang11 Feb 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the flip side, if the transform (limit) isn't visible to the user, the user might think it's an issue with the data (unaware there's a limit to 1 million rows; the "Full data" checkbox is pretty tiny and I didn't notice it initially because I was looking at the query):

image

if we want to toggle the limit on and off

I don't follow; can't it still be toggled on and off (e.g. SELECT * FROM read_parquet('num.parquet') AS num_table LIMIT 1000000, changes into SELECT * FROM read_parquet('num.parquet') AS num_table?

By the way, clicking "Full data" checkbox doesn't work as expected:
image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants