Path-based file storage #3
Merged
This converts the (large) Session storage server to use filesystem path-based storage for uploaded files instead of storing binary data in the database.
In the early days of the file server, file contents were stored in the database because total upload sizes were within a couple of TB, and this (in theory) let us replicate all storage between two servers. In practice, however, that replication was too bandwidth-heavy to use, so it has never been properly supported or used.
Additionally, storing tens of TB in a very busy PostgreSQL database has not worked particularly well: it required us to implement file rotation of tables and to disable vacuuming because of the absurdly long times required to vacuum. (It also potentially bottlenecks at 32TB stored in a single table, though with file rotation we're not close to hitting that.)
This commit converts the storage to store all files on disk as either:
where the former (last three digits) is used with backwards-compat numeric IDs, and the latter (first two b64 chars) is used when backwards-compat IDs are disabled. (Note that generated IDs already deliberately use the so-called "URL-safe" base64 encoding, with _ and - instead of / and +, so that the plain ID is an acceptable filename.)
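As a rough illustration of the sharding rule described above, here is a minimal sketch. It is not the code from this PR: the function name `storage_path`, the `numeric_ids` flag, the zero-padding of short numeric IDs, and the `base_dir/shard/id` layout are all assumptions made for the example.

```python
import re
from pathlib import PurePosixPath

# Hypothetical validators: numeric IDs are plain ASCII digits; other IDs
# use the URL-safe base64 alphabet (A-Z, a-z, 0-9, '_' and '-').
_NUMERIC = re.compile(r"[0-9]+")
_URLSAFE_B64 = re.compile(r"[A-Za-z0-9_-]{2,}")


def storage_path(base_dir: str, file_id: str, numeric_ids: bool) -> PurePosixPath:
    """Return an assumed on-disk path for file_id, sharded into a subdirectory.

    numeric_ids=True  -> shard by the last three digits of the ID
    numeric_ids=False -> shard by the first two base64 characters of the ID
    """
    if numeric_ids:
        if not _NUMERIC.fullmatch(file_id):
            raise ValueError(f"not a numeric id: {file_id!r}")
        # Zero-padding IDs shorter than three digits is an assumption.
        shard = file_id[-3:].zfill(3)
    else:
        if not _URLSAFE_B64.fullmatch(file_id):
            raise ValueError(f"not a url-safe base64 id: {file_id!r}")
        shard = file_id[:2]
    # Layout base_dir/shard/file_id is assumed for illustration only.
    return PurePosixPath(base_dir) / shard / file_id
```

Validating the ID against a strict alphabet before building the path also rules out `/` and `..` in filenames, which is the property the URL-safe encoding is meant to provide.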
This interim commit contains code to load both the existing in-database values and the new on-disk values; a future commit will remove support for in-database values.
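The interim dual-read behaviour could look roughly like the sketch below. The lookup order (disk first, then database), the function name `load_file`, and the use of a plain mapping as a stand-in for the legacy database table are all assumptions, not details taken from this PR.

```python
from pathlib import Path
from typing import Mapping, Optional


def load_file(path: Path, file_id: str,
              legacy_rows: Mapping[str, bytes]) -> Optional[bytes]:
    """Return the file's bytes, preferring the on-disk copy.

    legacy_rows is a hypothetical stand-in for the old in-database
    storage, mapping file id -> stored binary data.
    """
    try:
        # New-style storage: the file contents live on disk.
        return path.read_bytes()
    except FileNotFoundError:
        # Old-style storage: fall back to the in-database value, if any.
        return legacy_rows.get(file_id)
```

Once in-database support is removed, the fallback branch would simply go away and a missing file would mean the upload does not exist.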
Related changes included here: