SQLAlchemy source for Lumen AI #1051

Open
amaloney opened this issue Feb 6, 2025 · 3 comments · May be fixed by #1062

@amaloney
Collaborator

amaloney commented Feb 6, 2025

Include BigQuery as a source for Lumen AI.

@amaloney amaloney added the enhancement New feature or request label Feb 6, 2025
@droumis droumis added this to NIH-NCI Feb 6, 2025
@droumis droumis moved this to Todo in NIH-NCI Feb 6, 2025
@droumis droumis changed the title BigQuery source BigQuery source for Lumen AI Feb 6, 2025
@ahuang11
Contributor

ahuang11 commented Feb 7, 2025

Ideally, we would simply have a SQLAlchemySource so we don't have to reinvent the wheel by wrapping every SQL dialect ourselves.

@philippjfr
Member

Can you expand on how you see SQLAlchemySource working in a way that avoids implementing different dialects? I can see having a source that you give a SQLAlchemy Session; however, there are two major problems with that:

  • It wouldn't be serializable
  • Arbitrary SQL execution will require .execute(text(sql_expr)) which afaik does not have a translation layer for different dialects
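
One way to picture both concerns is the minimal sketch below. It assumes SQLAlchemy is installed and uses a hypothetical `SQLAlchemySourceSketch` class that is not part of Lumen's API: the object stores only the connection URL (a plain string, so it can be serialized) and builds the engine lazily, while arbitrary SQL goes through `text()` with no dialect translation. Note that a URL containing credentials would be serialized along with everything else, which may or may not be acceptable.

```python
from sqlalchemy import create_engine, text


class SQLAlchemySourceSketch:
    """Hypothetical source for illustration only; not Lumen's actual API."""

    def __init__(self, url):
        # Only the URL string is stored, so the spec stays serializable.
        self.url = url
        self._engine = None  # created on first use, never serialized

    def to_spec(self):
        # Assumed spec layout; the real format would be up to Lumen.
        return {"type": "sqlalchemy", "url": self.url}

    @classmethod
    def from_spec(cls, spec):
        return cls(spec["url"])

    @property
    def engine(self):
        if self._engine is None:
            self._engine = create_engine(self.url)
        return self._engine

    def execute(self, sql_expr):
        # text() passes the SQL through as written; nothing is translated
        # between dialects, so the query must already match the backend.
        with self.engine.connect() as conn:
            result = conn.execute(text(sql_expr))
            return [dict(row) for row in result.mappings()]
```

For example, `SQLAlchemySourceSketch("sqlite:///:memory:").execute("SELECT 1 AS x")` runs the query against an in-memory SQLite database and returns the rows as dictionaries.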

@ahuang11
Contributor

ahuang11 commented Feb 7, 2025

> Arbitrary SQL execution will require .execute(text(sql_expr))

Yes, I was thinking this route.

> which afaik does not have a translation layer for different dialects

I don't think this matters since the LLM is the translation layer, which is why we include this line in the prompt: "- Use only {{ dialect }} SQL syntax." If a translation/transpiling layer is important, we can use sqlglot: https://github.com/tobymao/sqlglot/tree/main?tab=readme-ov-file#examples
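
As a rough sketch of how the dialect could reach that prompt line, the snippet below derives a dialect name from the scheme of a SQLAlchemy-style URL and substitutes it into the template. The helper names `dialect_from_url` and `render_prompt_line` are hypothetical, plain string replacement stands in for whatever templating the prompt actually uses, and the scheme name may need mapping to a name the LLM recognizes.

```python
from urllib.parse import urlsplit

# Prompt line quoted in the comment above.
PROMPT_LINE = "- Use only {{ dialect }} SQL syntax."


def dialect_from_url(url):
    """Return the dialect part of a URL scheme such as 'postgresql+asyncpg'."""
    scheme = urlsplit(url).scheme
    # Drop the optional '+driver' suffix and keep only the dialect name.
    return scheme.split("+", 1)[0]


def render_prompt_line(url):
    """Fill the {{ dialect }} placeholder using the URL's dialect."""
    return PROMPT_LINE.replace("{{ dialect }}", dialect_from_url(url))
```

With this approach the source only needs to know its URL; the LLM is still responsible for writing SQL that is valid in that dialect.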

The reason I suggested SQLAlchemy is because you can use URLs for credentials:

url = (
    "snowflake://<user_login_name>:<password>"
    "@<account_identifier>/<database_name>"
    "?warehouse=<warehouse_name>"
)

Or a pydantic/parameterized model for common keywords, like here:

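from prefect_sqlalchemy import AsyncDriver, ConnectionComponents

# Connection details given as structured fields instead of a single URL string
ConnectionComponents(
    driver=AsyncDriver.POSTGRESQL_ASYNCPG,
    username="prefect",
    password="prefect_password",
    database="postgres"
)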

https://github.com/PrefectHQ/prefect/blob/main/src/integrations/prefect-sqlalchemy/prefect_sqlalchemy/credentials.py#L88
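
The two options above can be combined: a small model holds the common keywords and renders them into a URL. The sketch below uses only the standard library and a hypothetical `ConnectionSpec` dataclass (not Prefect's or Lumen's class); the field names are assumptions chosen to mirror the URL example. Credentials are percent-encoded so characters such as `@` or `/` in a password do not break the URL.

```python
from dataclasses import dataclass, field
from urllib.parse import quote_plus, urlencode


@dataclass
class ConnectionSpec:
    """Hypothetical connection model; field names are illustrative."""

    driver: str
    username: str
    password: str
    host: str
    database: str
    query: dict = field(default_factory=dict)

    def to_url(self):
        # Percent-encode credentials so special characters stay unambiguous.
        auth = f"{quote_plus(self.username)}:{quote_plus(self.password)}"
        url = f"{self.driver}://{auth}@{self.host}/{self.database}"
        if self.query:
            # Extra options such as a warehouse become query parameters.
            url += "?" + urlencode(self.query)
        return url
```

Keeping the fields separate makes it possible to validate them or hide the password in a UI, while `to_url()` still produces the single string that a SQLAlchemy engine expects.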

@amaloney amaloney changed the title BigQuery source for Lumen AI SQLAlchemy source for Lumen AI Feb 11, 2025
@amaloney amaloney linked a pull request Feb 18, 2025 that will close this issue
@droumis droumis moved this from Todo to In Progress in NIH-NCI Feb 24, 2025