Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

For PostgreSQL, explicitly create index on SequencedLeafData(TreeId, LeafIdentityHash) #3695

Merged
merged 3 commits into from
Jan 6, 2025

Conversation

robstradling
Copy link
Contributor

@robstradling robstradling commented Dec 9, 2024

In the PostgreSQL storage backend, the selectLeavesByLeafIdentityHashSQL query in log_storage.go will currently do a full table scan on the SequencedLeafData table, because there's no index for joining on its LeafIdentifyHash and TreeId fields. This query is used by the QueueLeaves operation when any of the to-be-queued leaves already exist.

This PR adds an appropriate CREATE INDEX statement in each of the PostgreSQL, MySQL, and CockroachDB storage backends.

Since there's already a corresponding foreign key constraint on the SequencedLeafData table...

FOREIGN KEY(TreeId, LeafIdentityHash) REFERENCES LeafData(TreeId, LeafIdentityHash) ON DELETE CASCADE

...and since MySQL requires that foreign key columns be indexed; if you create a table with a foreign key constraint but no index on a given column, an index is created, it appears that this PR is effectively a no-op for MySQL/MariaDB.

PostgreSQL doesn't automatically create indexes for foreign keys though, so this PR is needed in that context. I'm not sure what the behaviour is for CockroachDB.

I think it makes sense to keep the PostgreSQL, MySQL, and CockroachDB schema scripts as similar as possible, but if it's preferred to reduce the scope of this PR to only update the PostgreSQL backend then I can do that.

Checklist

@robstradling robstradling requested review from mhutchinson, AlCutter and a team as code owners December 9, 2024 16:26
@roger2hk
Copy link
Contributor

roger2hk commented Dec 9, 2024

/gcbrun

@mhutchinson
Copy link
Contributor

Huh, we had discussion internally on this issue but none of it seems to have made its way to this thread. @AlCutter can you remember what we thought the preferred approach was? My recollection is:

The changes to the other schemas are essentially no-ops (they rename the index, but don't otherwise change functionality). That said, introducing this change to the other schemas is unnecessary and increases the maintenance complexity because some databases may be created with the named index, and some with the default index. Should anyone decide to create a query that relies on this named index in the future, it would break for users that created their DB before this change. Given that this change is functionally a no-op today, we don't want to enforce that users run any DDL to effectively rename this index.

Thus, the request for this PR is that the changes to the other schemas are either a) rolled back, or b) turned into comments with a comment saying that these are implicitly created and are included in comment form to maintain similarity between the sets of similar schemas. On balance, I'd probably go with (a) because (b) sets up some precedent that we'll keep doing stuff like this in the future, which I'm sure we'll fail on eventually.

How does that sound @robstradling ? Happy New Year, BTW!

@robstradling
Copy link
Contributor Author

Thanks @mhutchinson. (a) works for me. Happy New Year!

@robstradling robstradling changed the title Create index on SequencedLeafData(TreeId, LeafIdentityHash) For PostgreSQL, explicitly create index on SequencedLeafData(TreeId, LeafIdentityHash) Jan 6, 2025
@mhutchinson
Copy link
Contributor

/gcbrun

@mhutchinson mhutchinson merged commit 3c98e0d into google:master Jan 6, 2025
12 checks passed
@robstradling robstradling deleted the add_missing_index branch January 6, 2025 13:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants