Refactor retrieveParticipants() API #1176

anthonypetersen · 2024-12-18T13:43:52Z

The getParticipants API relies on Firestore's offset functionality for pagination, which is causing our read usage to be much higher than necessary: Firestore bills a read for every document an offset skips, not just for the documents it returns.
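To make the cost difference concrete, here is a rough back-of-the-envelope sketch. It assumes a simplified billing model (one read per document returned, plus one read per document skipped by an offset) and ignores per-query minimums. The function names and the collection size are illustrative assumptions, not figures from our project.

```python
def offset_reads(total_docs: int, page_size: int) -> int:
    """Billed reads to walk every page using offset + limit (simplified model)."""
    reads = 0
    for offset in range(0, total_docs, page_size):
        returned = min(page_size, total_docs - offset)
        reads += offset + returned  # skipped documents are billed as well
    return reads


def cursor_reads(total_docs: int, page_size: int) -> int:
    """Billed reads to walk every page using a startAfter cursor (simplified model)."""
    # page_size is kept only for symmetry; each document is read exactly once.
    return total_docs


# Hypothetical collection of 10,000 documents, fetched 1,000 at a time.
print(offset_reads(10_000, 1_000))  # 55000
print(cursor_reads(10_000, 1_000))  # 10000
```

Under this model the offset cost grows roughly with the square of the number of pages, while the cursor cost stays linear in the number of documents.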

Firestore offers a startAfter function as a way of implementing pagination, though this will require incoming calls to pass in a document ID so our backend knows where the previous page left off. We could easily return that ID to the client in the response object whenever they aren't on the last page.

Potential drawbacks of this:

No way to jump to a specific page, since document IDs rather than page numbers would tell our query where to look.
Calls will need to be made sequentially, since each one relies on the previous call to get the startAfter document ID. This also means that if one call fails, the rest of the calls can't happen. Clients could resolve this by retrying the failed call.
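A minimal sketch of what the sequential cursor flow with client-side retries might look like. Everything here is hypothetical: `fetch_page` is an in-memory stand-in for the backend (where the real query would be an ordered Firestore query using startAfter and a limit), and `nextCursor`, `fetch_all`, and `max_attempts` are names invented for illustration, not the existing API.

```python
from typing import Callable, Optional

# Hypothetical stand-in for the participants collection, ordered by document ID.
PARTICIPANTS = [{"id": f"doc{i:03d}"} for i in range(1, 8)]


def fetch_page(start_after: Optional[str], limit: int) -> dict:
    """Simulates the backend: return up to `limit` docs after `start_after`."""
    docs = [d for d in PARTICIPANTS if start_after is None or d["id"] > start_after]
    page = docs[:limit]
    # Only hand back a cursor when more documents remain after this page.
    has_more = len(docs) > limit
    return {"data": page, "nextCursor": page[-1]["id"] if has_more else None}


def fetch_all(
    fetch: Callable[[Optional[str], int], dict],
    limit: int,
    max_attempts: int = 3,
) -> list:
    """Walks pages sequentially, retrying a failed page from the same cursor."""
    results, cursor = [], None
    while True:
        for attempt in range(1, max_attempts + 1):
            try:
                response = fetch(cursor, limit)
                break
            except ConnectionError:
                if attempt == max_attempts:
                    raise  # give up after the final attempt
        results.extend(response["data"])
        cursor = response["nextCursor"]
        if cursor is None:  # last page reached
            return results
```

Because the client holds the last cursor it received, a failed call can be retried from the same position without restarting from the first page.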

we-ai · 2024-12-18T14:25:52Z

Not sure exactly how this API is used.
If a slight data lag doesn't matter much, BigQuery (BQ) could be used for quick and inexpensive pagination.

JoeArmani · 2024-12-18T15:26:14Z

Sounds like a great approach to me.
Firestore Cursors: https://firebase.google.com/docs/firestore/query-data/query-cursors

Pagination and SMDB:
This is also used in SMDB: Main menu -> Participants -> dropdown options such as 'verified', 'not yet verified', 'all', etc.

I think we can use Firestore's .count() feature, which costs 1 read per 1000 documents counted, to maintain pagination options in SMDB if we want to keep that.
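As a rough sketch of the arithmetic behind that idea, assuming one read per batch of up to 1000 counted documents with a minimum of one read per query: the helper names and the participant count below are illustrative assumptions, not our real numbers.

```python
import math


def count_query_reads(matched: int) -> int:
    """Approximate billed reads for one count() over `matched` documents."""
    return max(1, math.ceil(matched / 1000))


def total_pages(matched: int, page_size: int) -> int:
    """Number of pages the SMDB could display for a given page size."""
    return math.ceil(matched / page_size)


# Hypothetical filter matching 250,000 participants, shown 50 per page.
print(count_query_reads(250_000))  # 250
print(total_pages(250_000, 50))    # 5000
```

The count only supplies the total for the page controls; moving between pages would still rely on cursors.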

I don't know how those are used in SMDB, so it would be good to discuss those use cases. Those calls may be doing more than necessary in addition to the costly offset operation.

Potentially costly offset operations (Warren flagged these previously in the codebase, I'm just adding them here for visibility):
retrieveParticipants - primary
retrieveRefusalWithdrawalParticipants
retrieveParticipantsEligibleForIncentives
getBoxesPagination

anthonypetersen · 2024-12-18T15:39:30Z

retrieveParticipants offset reads account for about 97% of our reads; the others don't really seem to be much of an issue at the moment.

For the purpose of viewing on the SMDB, I feel like full pagination isn't totally necessary, and returning 1000 records max would be fine. No one using that dashboard will truly page through thousands of documents; they would use a more precise search function instead.

brotzmanmj · 2024-12-18T16:58:36Z

Added Amelia for when you are ready for ops input on use cases, thanks

anthonypetersen assigned anthonypetersen, we-ai, Davinkjohnson and JoeArmani Dec 18, 2024

anthonypetersen added the API label Dec 18, 2024

brotzmanmj assigned robertsamm Dec 18, 2024

sonyekere added the CCC Backlog label Jan 2, 2025
