-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
AST-based SQLite driver #1
Conversation
Additionally, move the current SQLite driver prototype to a temporary class WP_SQLite_Driver_Prototype to be gradually moved to the new driver.
1dca9b6
to
e4df984
Compare
e4df984
to
beaa51d
Compare
@adamziel Thanks for the review! I think I addressed most of the issues and created tickets for the remaining ones. Please, let me know if I missed anything. |
|
||
/* | ||
* @TODO: Let's implement a more powerful AST-querying API. | ||
* See: https://github.com/WordPress/sqlite-database-integration/pull/164#discussion_r1855230501 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're refactoring the API, let's annotate each method we expect to change as @private
or @internal
and add a comment for the API consumers explaining the reasons not to use them yet. Otherwise, we'll introduce breaking changes without a prior notice. Maybe we could @deprecate
them right away – that way the IDEs will provide visual hints.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@adamziel Maybe, while we're behind the feature flag and have a bunch of TODO
s, we could still consider all the new APIs to be experimental and subject to change? Given we're under the Automattic org repo, maybe that would be enough? Or do you think it's safer to annotate all public methods for now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It could be a separate ticket, we's just have to resolve it before handing the api over for consumption by another team
} | ||
|
||
if ( | ||
WP_MySQL_Lexer::ROLLBACK_SYMBOL === $token1->id |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we bale out if we see ROLLBACK TO SAVEPOINT
until we can support that? Same for creating savepoints.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually we do support them internally so maybe baling out would be as easy as preserving the savepoint semantics
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ticket: #14
$alias_map[ $alias ] = $ref; | ||
} | ||
|
||
// 3. Compose the SELECT query to fetch ROWIDs to delete. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any chance we'll also see HAVING, GROUP BY, LEFT JOIN etc.? It's fine if we do, I'd just bale out if we see any unsupported criterion.
More broadly, it would be fantastic to have a catalog of things we do support (a whitelist/allowlist) and compare the incoming query against it rather than assume we know all the relevant parts of the incoming query and we can always cherry-pick the what's relevant. It could be as simple as a foreach loop that must make a bale/ignore/preserve/transform decision about every single node it sees. Thinking out loud, it could have a switch
statement with a catch-all default
case that bales out on anything we don't expect to see.
The mental model would be a version or "innocent until proven guilty" – "all queries error out unless they're explicitly supported".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any chance we'll also see HAVING, GROUP BY, LEFT JOIN etc.? It's fine if we do, I'd just bale out if we see any unsupported criterion.
Luckily, none of these is possible in a MySQL DELETE statement. That said, we don't have any specific handling of DELETE
with LIMIT
and SQLite seems to support it only with a specific compile flag.
More broadly, it would be fantastic to have a catalog of things we do support (a whitelist/allowlist)...
Do you mean the DELETE statement in particular, or more like a general rule? I'd like to be more restrictive in general and steer the driver more in the allowlist direction overall, but it can get pretty complicated — a foreach loop may not work very well, since nodes can be deeply nested, have different meanings in different context, order may change the meaning, etc.
That said, I do think we could steer it a bit more towards the "allowlist" approach overall, but it may be a different approach on a case-by-case basis, and maybe it's not absolutely necessary for all cases (e.g., SELECT and other read statements). Most importantly, all DDL statements should be allowlisted, so that we know what we support on the information schema side and so that the information schema data cannot become incorrect.
Considering how large this PR is and that at this stage it's still behind a feature flag, what do you think if I create tickets for these points and describe all of this? I'd like to get this initial PR done so that we can then work in smaller, more focused chunks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, separate issues seem fine for this. And sure, if a strict allowlist isn't an option then let's examine realistic options that move us in the right direction. Thank you for all the great work here!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I created a ticket: #15
return; | ||
} | ||
|
||
// @TODO: Translate DELETE with JOIN to use a subquery. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Making an explicit decision about every AST node would a) make it apparent to the code reader b) prevent running a DELETE query with a condition different from what the developer supplied (and potentially leading to data loss)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this is discussed in the comment above — I agree that we should go more in the "allowlist" direction, at least for DDL and write statements, but the particular implementation may differ on a case-by-case basis, and I'd create a separate ticket for that, if that makes sense to you.
Great work @JanJakes! There are specific directions to go from here, but regardless - this is such a huge improvement ❤️ |
Continuing the work from WordPress/sqlite-database-integration#164.