onSnapshot getting out of sync when useFetchStreams enabled and brittle network #8903

Open
neelance opened this issue Apr 8, 2025 · 7 comments

@neelance
Contributor

neelance commented Apr 8, 2025

Operating System

primarily Android, but not always

Environment (if applicable)

primarily on mobile devices, but not always

Firebase SDK Version

11.6.0

Firebase SDK Product(s)

Firestore

Project Tooling

Web app with Webpack

Detailed Problem Description

We're seeing that for some users, onSnapshot for a query does not keep up with changes on the server. It seems to miss a change and it does not catch up later.

Unfortunately we can not reproduce the issue at will. It seems to be some race condition that only happens on brittle internet connections and we do not have a setup to hammer a test with simulated connection issues until we could see it happening.

What we can see in our error tracking is that in many occurrences the browser has the error @firebase/firestore: Firestore (11.6.0): WebChannelConnection RPC 'Listen' stream 0x24eae879 transport errored: [object Object] earlier in its JS console logs.

We also found out that if we pass useFetchStreams: false to initializeFirestore, then the issue goes away completely.
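As a minimal sketch of that workaround, the settings object could be built in one place so the flag is easy to flip. The helper name `firestoreSettings` and the local `StreamSettings` type are assumptions for illustration; only `useFetchStreams` and `initializeFirestore` come from this report, and the option may not appear in the SDK's public typings.

```typescript
// Local model of the one setting discussed in this thread. This type is an
// assumption of the sketch, not a type exported by the Firebase SDK.
interface StreamSettings {
  useFetchStreams: boolean;
}

// Hypothetical helper: returns the settings object with fetch streams
// switched off when `disableFetchStreams` is true.
function firestoreSettings(disableFetchStreams: boolean): StreamSettings {
  return { useFetchStreams: !disableFetchStreams };
}

// In the app, the result would be passed as the second argument:
//   initializeFirestore(app, firestoreSettings(true));
const workaroundSettings = firestoreSettings(true);
console.log(JSON.stringify(workaroundSettings));
```

Keeping the flag behind one function also makes it simple to turn fetch streams back on temporarily when collecting diagnostics.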

Steps and code to reproduce issue

@neelance neelance added new A new issue that hasn't be categoirzed as question, bug or feature request question labels Apr 8, 2025
@google-oss-bot
Contributor

I couldn't figure out how to label this issue, so I've labeled it for a human to triage. Hang tight.

@jbalidiong jbalidiong added api: firestore needs-attention Repro Needed and removed needs-triage new A new issue that hasn't be categoirzed as question, bug or feature request labels Apr 8, 2025
@milaGGL milaGGL self-assigned this Apr 8, 2025
@milaGGL
Contributor

milaGGL commented Apr 8, 2025

Hi @neelance, thank you for reporting this issue. Could you please provide more context on "It seems to miss a change and it does not catch up later."? Does the listener miss one snapshot while the next ones are still in sync with the server, or are some changes on the server missed completely?

Since the bug is not reproducible, it is hard to debug. Maybe we can extract some more context out of the @firebase/firestore: Firestore (11.6.0): WebChannelConnection RPC 'Listen' stream 0x24eae879 transport errored: [object Object] error message. Could you please try using a custom build from this branch?

@neelance
Contributor Author

neelance commented Apr 9, 2025

> Does the listener miss one snapshot while the next ones are still in sync with the server, or are some changes on the server missed completely?

I can't really say. What we are seeing is that at a certain point the frontend knows that the backend just wrote to a certain document (after processing a purchase). We added some code that logs an error if this change has not become visible in the frontend after 30 seconds. This is the clearest indication of the bug that we have (before this additional error logging, we only saw strange business logic states that "should not happen").
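The 30-second check described above could look roughly like the following sketch. It is not the reporter's actual code: `watchForExpectedChange`, `Subscribe`, and `Timers` are hypothetical names, and the subscribe function only mirrors the general shape of a listener API that takes a callback and returns an unsubscribe function.

```typescript
// Reduced shape of a listener API: takes a callback, returns an unsubscribe function.
type Subscribe<T> = (onNext: (value: T) => void) => () => void;

// Minimal timer interface so the watchdog can be driven by fake timers in tests.
interface Timers {
  set(fn: () => void, ms: number): unknown;
  clear(handle: unknown): void;
}

const realTimers: Timers = {
  set: (fn, ms) => setTimeout(fn, ms),
  clear: (handle) => clearTimeout(handle as ReturnType<typeof setTimeout>),
};

// Calls `onMissing` if no value satisfying `isExpected` is delivered within
// `timeoutMs`. Returns a function that cancels the watchdog early.
function watchForExpectedChange<T>(
  subscribe: Subscribe<T>,
  isExpected: (value: T) => boolean,
  onMissing: () => void,
  timeoutMs: number = 30000,
  timers: Timers = realTimers,
): () => void {
  let done = false;
  let unsubscribe: () => void = () => {};

  const handle = timers.set(() => {
    if (done) return;
    stop();
    onMissing();
  }, timeoutMs);

  function stop(): void {
    if (done) return;
    done = true;
    timers.clear(handle);
    unsubscribe();
  }

  unsubscribe = subscribe((value) => {
    if (isExpected(value)) stop();
  });
  // If the expected value arrived synchronously during subscribe, the real
  // unsubscribe function was not yet known, so release the listener now.
  if (done) unsubscribe();

  return stop;
}

// Hypothetical wiring (not executed here): `subscribe` could wrap onSnapshot,
// e.g. (onNext) => onSnapshot(purchaseQuery, onNext), where `purchaseQuery`
// is a placeholder, and `onMissing` could report to the error tracker.
```

Injecting the timers keeps the check testable without waiting 30 real seconds.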

Additionally, when using getDocsFromServer in such a situation, we still get the old documents even though we are sure that the document was written. This is because getDocsFromServer does not really fetch again from the server if there is an active onSnapshot listener on the same query. Firestore then seems to assume that it already has the latest data, so it does not query again (we were able to confirm this behavior via local testing). But with this bug, it is not really the latest data, even after doing some other Firestore actions in the meantime.

Just to mention it again: Setting useFetchStreams: false resolves our issue, so it is unlikely that it is a bug in our own code.

> Could you please try using a custom build from this #8907?

As I can only reproduce this in production, it is not easy to push a custom build into our CI pipeline. What I could do instead is wait until #8907 has landed in a proper release and then temporarily set useFetchStreams: true to capture a new error message from production.

@milaGGL
Contributor

milaGGL commented Apr 9, 2025

getDocsFromServer sharing the existing stream is intended behaviour. The underlying bug is still the real-time listener missing changes from the backend.

#8907 was merged today. I will update the thread once it is released.

Would it be possible to set the log level to "debug" and collect the logs for the same process with useFetchStreams set to true and to false? With debug-level logging, we should be able to check what we are receiving from the server and compare the differences.
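One way to compare two such log captures is sketched below. It assumes the debug output has already been collected as arrays of lines (for example after enabling debug logging with the SDK's `setLogLevel`); `normalizeLogLine` and `diffLogRuns` are hypothetical helpers, and a real comparison would probably also need to strip timestamps.

```typescript
// Replace per-connection stream ids such as 0x24eae879 with a placeholder so
// that otherwise identical lines from different runs compare as equal.
function normalizeLogLine(line: string): string {
  return line.replace(/0x[0-9a-f]+/gi, "0x<id>");
}

// Returns the normalized lines that appear in only one of the two runs.
function diffLogRuns(
  a: string[],
  b: string[],
): { onlyInA: string[]; onlyInB: string[] } {
  const na = new Set(a.map(normalizeLogLine));
  const nb = new Set(b.map(normalizeLogLine));
  return {
    onlyInA: Array.from(na).filter((line) => !nb.has(line)),
    onlyInB: Array.from(nb).filter((line) => !na.has(line)),
  };
}
```

Here `a` would hold the lines captured with useFetchStreams: true and `b` the lines captured with useFetchStreams: false.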

It would be appreciated if you could provide a minimal repro app, so that we can debug it on our side.

@neelance
Contributor Author

> It would be appreciated if you could provide a minimal repro app, so that we can debug it on our side.

I'd love to, but as mentioned earlier it is not easy to come up with a test setup:

> Unfortunately we can not reproduce the issue at will. It seems to be some race condition that only happens on brittle internet connections and we do not have a setup to hammer a test with simulated connection issues until we could see it happening.

Any ideas?

@wu-hui
Contributor

wu-hui commented Apr 14, 2025

@neelance

Do you turn on multitab support in your app?

@neelance
Contributor Author

> Do you turn on multitab support in your app?

No. We are not using any persistence feature.

5 participants