AST-based SQLite driver #1

JanJakes · 2025-01-13T15:21:05Z

Continuing the work from WordPress/sqlite-database-integration#164.

Additionally, move the current SQLite driver prototype to a temporary class WP_SQLite_Driver_Prototype to be gradually moved to the new driver.

JanJakes · 2025-02-07T15:54:10Z

@adamziel Thanks for the review! I think I addressed most of the issues and created tickets for the remaining ones. Please, let me know if I missed anything.

composer.json

grammar-tools/MySQLParser.g4

health-check.php

adamziel · 2025-02-11T14:27:27Z

wp-includes/parser/class-wp-parser-node.php

+
+	/*
+	 * @TODO: Let's implement a more powerful AST-querying API.
+	 *        See: https://github.com/WordPress/sqlite-database-integration/pull/164#discussion_r1855230501


If we're refactoring the API, let's annotate each method we expect to change as @private or @internal and add a comment for the API consumers explaining why they shouldn't use it yet. Otherwise, we'll introduce breaking changes without prior notice. Maybe we could mark them as @deprecated right away; that way the IDEs will provide visual hints.

@adamziel Maybe, while we're behind the feature flag and have a bunch of TODOs, we could still consider all the new APIs to be experimental and subject to change? Given we're under the Automattic org repo, maybe that would be enough? Or do you think it's safer to annotate all public methods for now?

It could be a separate ticket, we'd just have to resolve it before handing the API over for consumption by another team.
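As a rough illustration of the annotation idea discussed in this thread, a method could carry an `@internal` tag plus a short note for consumers. This is only a sketch: `Example_Parser_Node` and its methods are hypothetical stand-ins, not the actual `WP_Parser_Node` API.

```php
<?php
/**
 * Hypothetical node class used only to illustrate the docblock annotations.
 * It is NOT the real WP_Parser_Node.
 */
class Example_Parser_Node {
	/** @var array Child nodes or tokens. */
	private $children = array();

	/**
	 * Appends a child to the node.
	 *
	 * @param mixed $child Child node or token.
	 */
	public function append_child( $child ): void {
		$this->children[] = $child;
	}

	/**
	 * Returns the first child, or null when the node is empty.
	 *
	 * @internal Experimental. The AST-querying API is expected to change;
	 *           do not rely on this method outside of the driver yet.
	 *
	 * @return mixed|null
	 */
	public function get_first_child() {
		return $this->children[0] ?? null;
	}
}
```

Tools that understand phpDocumentor tags can then flag or hide such methods, while the method itself keeps working for code inside the driver.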

wp-includes/sqlite-ast/class-wp-sqlite-driver.php

adamziel · 2025-02-11T14:34:22Z

wp-includes/sqlite-ast/class-wp-sqlite-driver.php

+					}
+
+					if (
+						WP_MySQL_Lexer::ROLLBACK_SYMBOL === $token1->id


Should we bail out if we see ROLLBACK TO SAVEPOINT until we can support that? Same for creating savepoints.

Actually, we do support them internally, so maybe instead of bailing out it would be just as easy to preserve the savepoint semantics.

Ticket: #14
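One way to picture the "bail out until supported" option from this thread is a strict lookup of the transactional statements the driver knows how to run, with everything else rejected. The sketch below is an assumption-laden illustration: it works on plain keyword strings rather than `WP_MySQL_Lexer` tokens, and `translate_transaction_statement()` is a hypothetical helper, not part of the driver.

```php
<?php
/**
 * Translates a transactional MySQL statement, given as a list of keyword
 * strings, into its SQLite form. Anything outside the explicit list,
 * including all savepoint variants, throws instead of being silently
 * reduced to a plain ROLLBACK or COMMIT.
 *
 * @param string[] $tokens Statement keywords, e.g. array( 'ROLLBACK' ).
 * @return string SQLite statement.
 * @throws RuntimeException When the statement is not explicitly supported.
 */
function translate_transaction_statement( array $tokens ): string {
	// Explicitly supported statements (hypothetical list for illustration).
	$supported = array(
		'BEGIN'             => 'BEGIN',
		'START TRANSACTION' => 'BEGIN',
		'COMMIT'            => 'COMMIT',
		'ROLLBACK'          => 'ROLLBACK',
	);

	$statement = strtoupper( implode( ' ', $tokens ) );
	if ( ! isset( $supported[ $statement ] ) ) {
		throw new RuntimeException( 'Unsupported transactional statement: ' . $statement );
	}
	return $supported[ $statement ];
}
```

With this shape, `ROLLBACK TO SAVEPOINT sp1` fails loudly rather than rolling back the whole transaction, which is the behavior the question above is concerned about.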

adamziel · 2025-02-11T15:25:00Z

wp-includes/sqlite-ast/class-wp-sqlite-driver.php

+				$alias_map[ $alias ] = $ref;
+			}
+
+			// 3. Compose the SELECT query to fetch ROWIDs to delete.


Any chance we'll also see HAVING, GROUP BY, LEFT JOIN etc.? It's fine if we do, I'd just bail out if we see any unsupported criterion.

More broadly, it would be fantastic to have a catalog of things we do support (an allowlist) and compare the incoming query against it, rather than assume we know all the relevant parts of the incoming query and can always cherry-pick what's relevant. It could be as simple as a foreach loop that must make a bail/ignore/preserve/transform decision about every single node it sees. Thinking out loud, it could have a switch statement with a catch-all default case that bails out on anything we don't expect to see.

The mental model would be a version of "innocent until proven guilty": "all queries error out unless they're explicitly supported".
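A minimal sketch of the switch-with-catch-all idea described above, assuming a simplified array-based AST. The node shape, the rule names (`keyword`, `lowPriorityOption`, and so on), and the `node()` and `translate_node()` helpers are all hypothetical and do not reflect the real parser or driver.

```php
<?php
class Unsupported_Node_Exception extends Exception {}

/** Builds a hypothetical AST node as a plain array. */
function node( string $rule, ?string $value = null, array $children = array() ): array {
	return array(
		'rule'     => $rule,
		'value'    => $value,
		'children' => $children,
	);
}

/**
 * Translates a node, making an explicit decision for every rule name.
 *
 * @throws Unsupported_Node_Exception For any rule that is not listed.
 */
function translate_node( array $node ): string {
	switch ( $node['rule'] ) {
		// Preserve: emitted unchanged.
		case 'keyword':
		case 'identifier':
		case 'literal':
			return (string) $node['value'];

		// Ignore: has no effect in SQLite and is dropped on purpose.
		case 'lowPriorityOption':
			return '';

		// Transform: translate the children and join the non-empty parts.
		case 'deleteStatement':
		case 'whereClause':
			$parts = array();
			foreach ( $node['children'] as $child ) {
				$sql = translate_node( $child );
				if ( '' !== $sql ) {
					$parts[] = $sql;
				}
			}
			return implode( ' ', $parts );

		// Bail: anything we did not explicitly decide about is an error.
		default:
			throw new Unsupported_Node_Exception( 'Unsupported node: ' . $node['rule'] );
	}
}

// Usage: DELETE LOW_PRIORITY FROM t WHERE id = 1
$ast = node( 'deleteStatement', null, array(
	node( 'keyword', 'DELETE' ),
	node( 'lowPriorityOption', 'LOW_PRIORITY' ),
	node( 'keyword', 'FROM' ),
	node( 'identifier', 't' ),
	node( 'whereClause', null, array(
		node( 'keyword', 'WHERE' ),
		node( 'identifier', 'id' ),
		node( 'keyword', '=' ),
		node( 'literal', '1' ),
	) ),
) );
echo translate_node( $ast ), "\n";
```

Because the recursion passes through the same switch at every level, a nested node that nobody has decided about surfaces as an exception instead of being silently dropped from the translated query.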

Any chance we'll also see HAVING, GROUP BY, LEFT JOIN etc.? It's fine if we do, I'd just bail out if we see any unsupported criterion.

Luckily, none of these is possible in a MySQL DELETE statement. That said, we don't have any specific handling of DELETE with LIMIT and SQLite seems to support it only with a specific compile flag.
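For context, the compile-time option in question is `SQLITE_ENABLE_UPDATE_DELETE_LIMIT`. A common workaround that does not depend on it is to select the matching ROWIDs in a subquery and delete by ROWID. The sketch below only builds the SQL string; `build_limited_delete()` is a hypothetical helper, the `$where` fragment is assumed to be already translated and safe, and the table is assumed to be an ordinary rowid table (not `WITHOUT ROWID`).

```php
<?php
/**
 * Builds a DELETE that emulates "DELETE ... WHERE ... LIMIT n" on SQLite
 * builds without SQLITE_ENABLE_UPDATE_DELETE_LIMIT.
 *
 * @param string $table Table name (quoted here as an identifier).
 * @param string $where Already-translated, trusted WHERE condition.
 * @param int    $limit Maximum number of rows to delete.
 * @return string SQLite statement.
 */
function build_limited_delete( string $table, string $where, int $limit ): string {
	// Quote the identifier and escape embedded double quotes.
	$quoted = '"' . str_replace( '"', '""', $table ) . '"';

	return sprintf(
		'DELETE FROM %1$s WHERE rowid IN ( SELECT rowid FROM %1$s WHERE %2$s LIMIT %3$d )',
		$quoted,
		$where,
		$limit
	);
}

echo build_limited_delete( 'wp_posts', "post_status = 'trash'", 10 ), "\n";
```

Without an ORDER BY in the subquery, which rows are deleted is not deterministic, which matches MySQL's behavior for `DELETE ... LIMIT` without `ORDER BY`.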

More broadly, it would be fantastic to have a catalog of things we do support (a whitelist/allowlist)...

Do you mean the DELETE statement in particular, or more like a general rule? I'd like to be more restrictive in general and steer the driver more in the allowlist direction overall, but it can get pretty complicated. A foreach loop may not work very well, since nodes can be deeply nested, have different meanings in different contexts, order may change the meaning, etc.

That said, I do think we could steer it a bit more towards the "allowlist" approach overall, but it may be a different approach on a case-by-case basis, and maybe it's not absolutely necessary for all cases (e.g., SELECT and other read statements). Most importantly, all DDL statements should be allowlisted, so that we know what we support on the information schema side and so that the information schema data cannot become incorrect.

Considering how large this PR is and that at this stage it's still behind a feature flag, what do you think if I create tickets for these points and describe all of this? I'd like to get this initial PR done so that we can then work in smaller, more focused chunks.

Sure, separate issues seem fine for this. And sure, if a strict allowlist isn't an option then let's examine realistic options that move us in the right direction. Thank you for all the great work here!

I created a ticket: #15

adamziel · 2025-02-11T15:30:44Z

wp-includes/sqlite-ast/class-wp-sqlite-driver.php

+			return;
+		}
+
+		// @TODO: Translate DELETE with JOIN to use a subquery.


Making an explicit decision about every AST node would a) make the handling apparent to the code reader, and b) prevent running a DELETE query with a condition different from what the developer supplied (potentially leading to data loss).

I guess this is discussed in the comment above — I agree that we should go more in the "allowlist" direction, at least for DDL and write statements, but the particular implementation may differ on a case-by-case basis, and I'd create a separate ticket for that, if that makes sense to you.

adamziel · 2025-02-12T09:35:56Z

Great work @JanJakes! There are specific directions to go from here, but regardless - this is such a huge improvement ❤️

JanJakes added 30 commits November 19, 2024 16:05

Split, cleanup, and reorganize the SQLite driver prototype

aac00ce

Namespace-prefix the WIP SQLite driver for now to avoid naming conflicts

44641e1

Copy WP_SQLite_Translator_Tests and run them against the SQLite driver

be35587

Copy the core of WP_SQLite_Translator to WP_SQLite_Driver

c8ea855

Additionally, move the current SQLite driver prototype to a temporary class WP_SQLite_Driver_Prototype to be gradually moved to the new driver.

Add a base generic WP_Parser_Token class, add docs

62943d6

Complete WP_Parser_Node helper methods, add tests

3449b0b

Add basic support for SELECT statements, add unit tests

65afd67

Add basic support for INSERT, UPDATE, REPLACE, DELETE

00ec46d

Add basic support for CREATE TABLE, implement data types

b8b4500

Handle system variables

6be220d

Add support for UPDATE with ORDER BY and LIMIT

70744b7

Handle specifics of the CREATE TABLE statement

797c3b7

Add basic ALTER TABLE support

647eaae

Fix MySQL syntax errors in tests

c515605

Introduce information schema builder & create information schema tables

1e5c311

Record CREATE TABLE table info in information schema

b617ef9

Record CREATE TABLE column info in information schema

dbd50e6

Record CREATE TABLE constraint info in information schema

6bdb7aa

Record CREATE TABLE inline constraint info in information schema

afeeb18

Sync constraint info to columns table when constraints are modified

2eaf077

Record ALTER TABLE ADD COLUMN(s) in information schema

d03892d

Record ALTER TABLE ADD CONSTRAINT in information schema

93f536e

Record ALTER TABLE CHANGE/MODIFY COLUMN in information schema

6940a5b

Record ALTER TABLE DROP COLUMN in information schema

127efc2

Record ALTER TABLE DROP INDEX in information schema

71265ed

Execute CREATE TABLE using information schema

71e8d08

Execute ALTER TABLE using information schema

4b4fb8f

Implement SHOW CREATE TABLE using information schema

a0a67c0

Implement SHOW INDEX using information schema

837655c

Implement SHOW GRANTS

f11fb09

JanJakes added 6 commits February 7, 2025 11:30

Use custom exception class for driver errors

e961bf5

Ignore transaction rollback errors when an exception occurs

fda096c

Return true for transactional commands

e4df6b9

Move database directory creation outside of the driver, revamp it

96e0208

Extract debug mode setting to an option

cb4d6a3

Improve driver docs and property naming

54c27a2

JanJakes force-pushed the ast-sqlite-driver branch from 1dca9b6 to e4df984 Compare February 7, 2025 10:31

Revamp error handling, extract error notice creation out from the driver

beaa51d

JanJakes force-pushed the ast-sqlite-driver branch from e4df984 to beaa51d Compare February 7, 2025 10:35

JanJakes added 3 commits February 7, 2025 16:40

Improve docs and naming

09fbcd9

Check for minimum SQLite version

4d9d432

Remove the AST driver files from .gitattributes

078e2ce

JanJakes mentioned this pull request Feb 7, 2025

Review query escaping and add test coverage #8

Open

JanJakes requested a review from adamziel February 7, 2025 15:54