Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Key-Value Swallows Write Errors When Backing Impl Fails #2952

Open
rylev opened this issue Dec 13, 2024 · 8 comments
Open

Key-Value Swallows Write Errors When Backing Impl Fails #2952

rylev opened this issue Dec 13, 2024 · 8 comments

Comments

@rylev
Copy link
Collaborator

rylev commented Dec 13, 2024

The CachingStore completely swallows errors when the inner implementation fails on writes.

state.spawn(async move { inner.set(&key, &value).await });

This can lead to writes appearing like they succeed (even being reflected by reads within the same process) but not actually working.

@lann
Copy link
Collaborator

lann commented Dec 13, 2024

FWIW, it looks like this was an intentional tradeoff:

/// Note that, because writes are asynchronous and return immediately,
/// durability is _not_ guaranteed. I/O errors may occur asynchronously after
/// the write operation has returned control to the guest, which may result in
/// the write being lost without the guest knowing. In the future, a separate
/// `write-durable` function could be added to key-value.wit to provide either
/// synchronous or asynchronous feedback on durability for guests which need it.

@lann
Copy link
Collaborator

lann commented Dec 13, 2024

After some further investigation, it looks like some of the newer KV implementations (Azure, AWS) are using lazy connections that cannot fail on open, which makes this problem much worse.

@dicej
Copy link
Contributor

dicej commented Dec 13, 2024

I'm the one who implemented CachingStore and wrote the comment @lann referenced above. My intention was that the Spin implementation would support the weakest consistency and durability model allowed by the interface in key-value.wit so as to help the application developer incorporate that model into their design and implementation early on and give them confidence that their app would be portable to any KV implementation.

Unfortunately, the consistency and durability model was apparently never documented in key-value.wit, and that's definitely a problem. I'll open a PR to address that.

FWIW, the wasi-keyvalue docs do address this, although those docs could also be improved by adding an explicit discussion of durability as well as consistency.

@rylev
Copy link
Collaborator Author

rylev commented Dec 16, 2024

@dicej thanks for the context!

The linked wasi-keyvalue docs seem to me to indicate that this is a bug, but perhaps I'm miss understanding the meaning of the docs:

/// Data consistency in a key value store refers to the guarantee that once a write operation
/// completes, all subsequent read operations will return the value that was written.

I assume that these docs are not only talking about subsequent reads in the same component instance but also across component instances. Let me know if this assumption is incorrect. With that assumption, in the current implementation, this guarantee is not upheld. I can do a write operation and then do a later read operation (in another instance) and get an old value.

My understanding of the docs is that all writes must be guaranteed to be reflected eventually. So if I do a write of a value (and no other writes ever happen) all clients will eventually see that write. That's not the case here. I'd love your clarification on this so we can either update the wasi docs to reflect this or update the Spin implementation.

@dicej
Copy link
Contributor

dicej commented Dec 16, 2024

I assume that these docs are not only talking about subsequent reads in the same component instance but also across component instances. Let me know if this assumption is incorrect.

My reading is different, but I think that's because the docs aren't clear enough about the terms "client", "guest", and "caller" in the paragraph following the one you quoted. I've put up a PR that reflects my interpretation; I'm happy to refine and/or rewrite it based on feedback.

I think the key question we need to answer here is whether a host implementation of KV backed by an eventually-consistent distributed system spread across multiple failure domains with asynchronous, "best-effort" durability is (A) valid and (B) normal. For example, if we think it's valid but not normal, we can make the consistency and durability guarantees for both Spin KV and wasi-keyvalue much stronger, and create a new, niche interface which has weaker guarantees in exchange for wider applicability. And regardless of how we answer that question, we should make sure the docs are crystal clear so there are no surprises.

@dicej
Copy link
Contributor

dicej commented Dec 16, 2024

Another way to address the original issue: define set to have blocking durability (i.e. slow and safe) and add a set-async method which returns a resource which may be polled via wasi:io/poll for a result, which the caller can either discard or await as desired.

@ogghead
Copy link
Contributor

ogghead commented Dec 24, 2024

Another way to address the original issue: define set to have blocking durability (i.e. slow and safe) and add a set-async method which returns a resource which may be polled via wasi:io/poll for a result, which the caller can either discard or await as desired.

Would it be desirable to implement this as a runtime config (i.e. in spin.toml)? That would allow users to flexibly set strong/weak durability without changing the KV interface. Though perhaps it is better to split the interface as you suggest and encourage users to be explicit (and allow choice of durability for each database call in a component)

@kate-goldenring kate-goldenring moved this to 🆕 Triage Needed in Spin Triage Jan 13, 2025
dicej added a commit to dicej/spin that referenced this issue Jan 27, 2025
The semantic (non-)guarantees for wasi-keyvalue are still [under
discussion](WebAssembly/wasi-keyvalue#56), but meanwhile
the behavior of Spin's write-behind cache has caused [some
headaches](fermyon#2952), so I'm removing it
until we have more clarity on what's allowed and what's disallowed by the
proposed standard.

The original motivation behind `CachingStoreManager` was to reflect the
anticipated behavior of an eventually-consistent, low-latency, cloud-based
distributed store and, per [Hyrum's Law](https://www.hyrumslaw.com/) help app
developers avoid depending on the behavior of a local, centralized store which
would not match that of a distributed store.  However, the write-behind caching
approach interacts poorly with the lazy connection establishment which some
`StoreManager` implementations use, leading writes to apparently succeed even
when the connection fails.

Subsequent discussion regarding the above issue arrived at a consensus that we
should not consider a write to have succeeded until and unless we've
successfully connected to and received a write confirmation from at least one
replica in a distributed system.  I.e. rather than the replication factor (RF) =
0 we've been effectively providing up to this point, we should provide RF=1.
The latter still provides low-latency performance when the nearest replica is
reasonably close, but improves upon RF=0 in that it shifts responsibility for
the write from Spin to the backing store prior to returning "success" to the
application.

Note that RF=1 (and indeed anything less than RF=ALL) cannot guarantee that the
write will be seen immediately (or, in the extreme case of an unrecoverable
failure, at all) by readers connected to other replicas.  Applications requiring
a stronger consistency model should use an ACID-style backing store rather than
an eventually consistent one.

Signed-off-by: Joel Dice <[email protected]>
dicej added a commit to dicej/spin that referenced this issue Jan 27, 2025
The semantic (non-)guarantees for wasi-keyvalue are still [under
discussion](WebAssembly/wasi-keyvalue#56), but meanwhile
the behavior of Spin's write-behind cache has caused [some
headaches](fermyon#2952), so I'm removing it
until we have more clarity on what's allowed and what's disallowed by the
proposed standard.

The original motivation behind `CachingStoreManager` was to reflect the
anticipated behavior of an eventually-consistent, low-latency, cloud-based
distributed store and, per [Hyrum's Law](https://www.hyrumslaw.com/) help app
developers avoid depending on the behavior of a local, centralized store which
would not match that of a distributed store.  However, the write-behind caching
approach interacts poorly with the lazy connection establishment which some
`StoreManager` implementations use, leading writes to apparently succeed even
when the connection fails.

Subsequent discussion regarding the above issue arrived at a consensus that we
should not consider a write to have succeeded until and unless we've
successfully connected to and received a write confirmation from at least one
replica in a distributed system.  I.e. rather than the replication factor (RF) =
0 we've been effectively providing up to this point, we should provide RF=1.
The latter still provides low-latency performance when the nearest replica is
reasonably close, but improves upon RF=0 in that it shifts responsibility for
the write from Spin to the backing store prior to returning "success" to the
application.

Note that RF=1 (and indeed anything less than RF=ALL) cannot guarantee that the
write will be seen immediately (or, in the extreme case of an unrecoverable
failure, at all) by readers connected to other replicas.  Applications requiring
a stronger consistency model should use an ACID-style backing store rather than
an eventually consistent one.

Signed-off-by: Joel Dice <[email protected]>
@dicej
Copy link
Contributor

dicej commented Jan 27, 2025

I've opened #2995 to address this.

dicej added a commit that referenced this issue Jan 27, 2025
The semantic (non-)guarantees for wasi-keyvalue are still [under
discussion](WebAssembly/wasi-keyvalue#56), but meanwhile
the behavior of Spin's write-behind cache has caused [some
headaches](#2952), so I'm removing it
until we have more clarity on what's allowed and what's disallowed by the
proposed standard.

The original motivation behind `CachingStoreManager` was to reflect the
anticipated behavior of an eventually-consistent, low-latency, cloud-based
distributed store and, per [Hyrum's Law](https://www.hyrumslaw.com/) help app
developers avoid depending on the behavior of a local, centralized store which
would not match that of a distributed store.  However, the write-behind caching
approach interacts poorly with the lazy connection establishment which some
`StoreManager` implementations use, leading writes to apparently succeed even
when the connection fails.

Subsequent discussion regarding the above issue arrived at a consensus that we
should not consider a write to have succeeded until and unless we've
successfully connected to and received a write confirmation from at least one
replica in a distributed system.  I.e. rather than the replication factor (RF) =
0 we've been effectively providing up to this point, we should provide RF=1.
The latter still provides low-latency performance when the nearest replica is
reasonably close, but improves upon RF=0 in that it shifts responsibility for
the write from Spin to the backing store prior to returning "success" to the
application.

Note that RF=1 (and indeed anything less than RF=ALL) cannot guarantee that the
write will be seen immediately (or, in the extreme case of an unrecoverable
failure, at all) by readers connected to other replicas.  Applications requiring
a stronger consistency model should use an ACID-style backing store rather than
an eventually consistent one.

Signed-off-by: Joel Dice <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: 🆕 Triage Needed
Development

No branches or pull requests

4 participants