Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unable to parse SQL (Dialects.TSQL) #31183

Open
3 tasks done
Habeeb556 opened this issue Nov 27, 2024 · 7 comments
Open
3 tasks done

Unable to parse SQL (Dialects.TSQL) #31183

Habeeb556 opened this issue Nov 27, 2024 · 7 comments
Assignees
Labels
data:connect:mssql Related to MSSSQL sqllab Namespace | Anything related to the SQL Lab

Comments

@Habeeb556
Copy link
Contributor

Habeeb556 commented Nov 27, 2024

Bug description

After upgrade from 4.0.1 to 4.1.1 I encountered an issue with permission

DB engine Error
You may have an error in your SQL statement. Unable to parse script

This error occurred for users to whom I assigned specific permissions on the dbo schema only. When I changed the permissions to full access for the entire database id:, the issue was resolved. I noticed that this problem only happens with the mssql.py engine, while the others (MySQL, PostgreSQL, and Oracle) are working fine.

Log:

WARNING:superset.sql_parse:Unable to parse SQL (Dialects.TSQL): Template(body=[Output(nodes=[TemplateData(data='SELECT ...')])])

Screenshots/recordings

image

Superset version

master / latest-dev

Python version

3.11

Node version

I don't know

Browser

Not applicable

Additional context

No response

Checklist

  • I have searched Superset docs and Slack and didn't find a solution to my problem.
  • I have searched the GitHub issue tracker and didn't find a similar bug report.
  • I have checked Superset's logs for errors and if I found a relevant Python stacktrace, I included it here as text in the "additional context" section.
@dosubot dosubot bot added data:connect:mssql Related to MSSSQL sqllab Namespace | Anything related to the SQL Lab labels Nov 27, 2024
Copy link

dosubot bot commented Nov 27, 2024

The issue you're encountering seems to be related to SQL parsing for the TSQL dialect in the mssql.py engine after upgrading from Superset version 4.0.1 to 4.1.1. Superset uses the sqlglot library to handle SQL parsing for different database dialects, including TSQL, by mapping the "mssql" engine to the sqlglot dialect Dialects.TSQL [1].

The error message suggests that there might be a problem with how SQL statements are being parsed for users with specific permissions on the dbo schema. When you grant full access to the entire database, the issue is resolved, indicating that it might be related to permission handling or SQL parsing specific to the dbo schema.

To address this issue, you might want to:

  1. Verify that the SQL statements being executed are compatible with the TSQL dialect as interpreted by sqlglot.
  2. Check if there are any changes in the permission handling logic for the mssql.py engine between versions 4.0.1 and 4.1.1 that could affect schema-specific permissions.
  3. Review the Superset logs for any additional error messages or warnings that might provide more context on the parsing failure.

If the problem persists, consider testing with a simplified SQL query to isolate whether the issue is with specific SQL syntax or permission settings.

To continue talking to Dosu, mention @dosu.

@Habeeb556
Copy link
Contributor Author

@dosu How Can I handle this as before upgrade?

Copy link

dosubot bot commented Nov 27, 2024

To handle the issue as before the upgrade, you can try the following steps:

  1. Review SQL Syntax: Ensure that the SQL queries being executed are fully compatible with the TSQL dialect as interpreted by the sqlglot library. Simplify the queries if possible to identify any specific syntax causing the issue.

  2. Check Permissions: Since the issue seems to be related to permissions on the dbo schema, review the permission settings. You might need to adjust the permissions to match the previous configuration before the upgrade. If full access resolves the issue, consider incrementally adjusting permissions to identify the minimum required settings.

  3. Superset Logs: Examine the Superset logs for any additional error messages or warnings that could provide more context on the parsing failure. This might help identify specific SQL statements or permission settings causing the issue.

  4. Rollback: If the above steps do not resolve the issue, consider rolling back to version 4.0.1 temporarily while further investigating the changes between versions 4.0.1 and 4.1.1 that might have affected SQL parsing or permission handling.

These steps should help you manage the issue while maintaining the necessary permissions and SQL functionality.

To continue talking to Dosu, mention @dosu.

@Habeeb556
Copy link
Contributor Author

I have workaround this, using the old sql_parse.py script.

@geido
Copy link
Member

geido commented Dec 2, 2024

@betodealmeida this might also use your help! 🙏

@betodealmeida
Copy link
Member

This error occurred for users to whom I assigned specific permissions on the dbo schema only. When I changed the permissions to full access for the entire database id:, the issue was resolved. I noticed that this problem only happens with the mssql.py engine, while the others (MySQL, PostgreSQL, and Oracle) are working fine.

Right, we need to parse the query in order to determine which catalogs/schemas/tables are being accessed when the user doesn't have full access to the database. If the user has full access we just don't parse the query.

In the past this was done using sqlparse, which is very lenient, so it was much harder to get an error when parsing SQL. The problem was that it was much easier to trick the parser, and the parsing was not dialect-specific, so we had many security holes.

We're currently in the process of switching the parsing to sqlglot, which has dialect-specific parsing. Unfortunately some engines don't have corresponding sqlglot dialects, and we have to fallback to a generic parser.

For MSSQL we use the T-SQL dialect, which AFAIK is the correct one. If you can share the query that's causing problems (anonymized is fine) we can take a look, and file a ticket upstream with sqlglot.

@betodealmeida
Copy link
Member

Assigned it to you temporarily, @Habeeb556, since we need the query to investigate this more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
data:connect:mssql Related to MSSSQL sqllab Namespace | Anything related to the SQL Lab
Projects
None yet
Development

No branches or pull requests

3 participants