Refactor retrieveParticipants() API #1176

Open
anthonypetersen opened this issue Dec 18, 2024 · 4 comments

Comments

@anthonypetersen
Contributor

The getParticipants API relies on Firestore's offset for pagination, which makes our read usage much higher than necessary, since Firestore bills every skipped document as a read.
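To make the cost concrete, here is a rough arithmetic sketch. It assumes the billing rule that each document skipped by an offset counts as one read, and it ignores minimum charges for empty queries. The function names and the collection size are hypothetical and are not taken from the codebase.

```python
def offset_reads(total_docs: int, page_size: int) -> int:
    """Reads billed when paging through `total_docs` with offset,
    assuming every skipped document is billed as one read."""
    reads = 0
    for offset in range(0, total_docs, page_size):
        returned = min(page_size, total_docs - offset)
        reads += offset + returned  # skipped documents + returned documents
    return reads


def cursor_reads(total_docs: int) -> int:
    """Reads billed when paging with a cursor: only returned documents."""
    return total_docs


# Hypothetical example: 10,000 documents fetched in pages of 1,000.
print(offset_reads(10_000, 1_000))  # 55000
print(cursor_reads(10_000))         # 10000
```

The gap widens with collection size, because each later page pays again for every document before it.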

Firestore offers a startAfter function for implementing pagination, though incoming calls would need to pass in a document ID so our backend knows where they left off. We could easily return that ID to the client in the response object whenever they aren't on the last page.

Potential drawbacks of this:

  1. No way to jump to a specific page, as we would be using document IDs rather than page numbers to tell the query where to start.
  2. Calls will need to be made sequentially, since each one relies on the previous call for its startAfter document ID. This also means that if one call fails, the remaining calls can't proceed. Clients could resolve this by retrying a failed call.
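
The flow described above could look roughly like the sketch below. It is an in-memory model rather than Firestore code: `DOCS`, `get_page`, `fetch_all`, the `next_cursor` response field, and the retry count are all hypothetical names chosen for illustration. It shows the server handing back a cursor unless the client is on the last page, and the client fetching pages one after another with a simple retry.

```python
from typing import Callable, Optional

# Hypothetical stand-in for a collection ordered by document ID.
DOCS = [f"doc{i:03d}" for i in range(25)]


def get_page(start_after: Optional[str], limit: int) -> dict:
    """Return up to `limit` IDs after the cursor, plus the cursor for the
    next call (None when this is the last page)."""
    start = 0 if start_after is None else DOCS.index(start_after) + 1
    page = DOCS[start:start + limit]
    has_more = start + limit < len(DOCS)
    return {"data": page, "next_cursor": page[-1] if has_more else None}


def fetch_all(fetch: Callable[[Optional[str], int], dict],
              limit: int, max_retries: int = 3) -> list:
    """Fetch every page in sequence, retrying a failed call so that one
    error does not abandon the remaining pages."""
    results: list = []
    cursor: Optional[str] = None
    while True:
        for attempt in range(max_retries):
            try:
                response = fetch(cursor, limit)
                break
            except ConnectionError:
                if attempt == max_retries - 1:
                    raise  # give up after the last allowed attempt
        results.extend(response["data"])
        cursor = response["next_cursor"]
        if cursor is None:
            return results


all_ids = fetch_all(get_page, limit=10)
print(len(all_ids))  # 25
```

Because the cursor for each call comes from the previous response, a client that stores the last cursor can also resume from where it stopped instead of starting over.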
@we-ai
Collaborator

we-ai commented Dec 18, 2024

Not sure exactly how this API is used.
If a slight data lag doesn't matter much, BQ can be used for quick and inexpensive pagination.

@JoeArmani
Collaborator

Sounds like a great approach to me.
Firestore Cursors: https://firebase.google.com/docs/firestore/query-data/query-cursors

Pagination and SMDB:
This is also used in SMDB: Main menu -> Participants -> dropdown options such as 'verified', 'not yet verified', 'all', etc.

I think we can use Firestore's .count() feature, which costs 1 read per batch of up to 1000 documents counted, to keep the pagination options in SMDB if we want them.
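
As a rough sketch of that arithmetic: the helper names and the participant count below are hypothetical, and the minimum charge of one read for a count that matches nothing is my understanding of the pricing, so treat it as an assumption.

```python
def count_query_reads(matched_docs: int, batch_size: int = 1000) -> int:
    """Reads billed for one count query: one per batch of up to
    `batch_size` matches, with an assumed minimum of one read."""
    batches = -(-matched_docs // batch_size)  # ceiling division
    return max(1, batches)


def page_count(total_docs: int, page_size: int) -> int:
    """Number of pages a dashboard would show for `total_docs` records."""
    return -(-total_docs // page_size)


# Hypothetical example: 250,000 participants, 1,000 records per page.
print(count_query_reads(250_000))  # 250
print(page_count(250_000, 1_000))  # 250
```

Under these assumptions, a count gives the dashboard a total and a page count for a few hundred reads, compared with one read per participant for fetching the documents themselves.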

I don't know how those are used in SMDB, so it would be good to discuss those use cases. Those calls may be doing more than necessary in addition to the costly offset operation.

Potentially costly offset operations (Warren flagged these previously in the codebase, I'm just adding them here for visibility):
retrieveParticipants - primary
retrieveRefusalWithdrawalParticipants
retrieveParticipantsEligibleForIncentives
getBoxesPagination

@anthonypetersen
Contributor Author

retrieveParticipants offset reads account for about 97% of our reads. The others don't seem to be much of an issue at the moment.

For viewing on the SMDB, I don't think full pagination is necessary; returning a maximum of 1000 records would be fine. No one using that dashboard will realistically page through thousands of documents. They would use a more precise search instead.

@brotzmanmj
Collaborator

Added Amelia for when you are ready for ops input on use cases, thanks
