You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Discussing with @applio, we thought this PR should be changed so that the default was based on the store rather than the executor. Backup tasks would be off by default, except if the store was a well-known cloud store like S3 or GCS.
The text was updated successfully, but these errors were encountered:
TomNicholas
changed the title
Write to Icechunk as intermediate store?
Maybe useful? Could simply roll back to state before failed stage. Also then it's Icechunk 's problem to worry about atomic writes... Idea from:
A reason to not run backup tasks is if the filesystem does not support atomic writes. Cloud object stores generally are atomic (see https://cubed-dev.github.io/cubed/user-guide/reliability.html#stragglers), but local filesystems are not.
Write to Icechunk as intermediate store?
Oct 28, 2024
The interesting case is HPC, where there are multiple nodes (and hence the possibility of stragglers), but an intermediate store that uses a shared filesystem that does not support atomic writes. Icechunk might be useful to provide atomicity - but perhaps there are other ways too?
Also, I'd like to move to general blob stores (not just Zarr stores) for the intermediate store, so we have more control over the chunk access pattern (to control memory usage). Work like #464 will enable this.
Maybe useful? Could simply roll back to state before failed stage. Also then it's Icechunk 's problem to worry about atomic writes... Idea from:
A reason to not run backup tasks is if the filesystem does not support atomic writes. Cloud object stores generally are atomic (see https://cubed-dev.github.io/cubed/user-guide/reliability.html#stragglers), but local filesystems are not.
Discussing with @applio, we thought this PR should be changed so that the default was based on the store rather than the executor. Backup tasks would be off by default, except if the store was a well-known cloud store like S3 or GCS.
Originally posted by @tomwhite in #600 (comment)
The text was updated successfully, but these errors were encountered: