Perf: optimize fetching last file commit #58055

camdencheek · 2023-11-01T19:10:44Z

When fetching the information about the commit that modified a file, we use an ancestors query that is filtered to the file name. This is a paginated interface, so we always request n+1 items so we know whether there is a next page.

However, the extra requested commit can actually be quite expensive to fetch, particularly in the relatively common case that a file has only been modified in one commit: the one that created it. In that case, we have to iterate over the entire history of the repository to look for a commit that doesn't exist to produce information about a next page that will never be fetched.

The ideal solution here would be to only fetch n+1 if the hasNextPage field of the resolver is requested. Unfortunately, our GraphQL package still doesn't support querying for which fields are requested, which is a huge bummer.

Instead, this takes a slightly hackier approach and assumes that if the number requested is 1, it's better to serve that quickly and claim we have a next page without knowing whether that page will contain anything.

Test plan

Manually tested the the trace of the FetchOwnersAndHistory GraphQL request. For a file with only one commit, the span representing the git command shrunk from 300ms to 70ms, and there no longer a visible loading indicator.

eseliger

drive-by: wouldn't this affect the correctness of the pageInfo?

camdencheek · 2023-11-01T21:14:03Z

wouldn't this affect the correctness of the pageInfo?

Yes, which is why I added the bonus logic in PageInfo(). Basically, if n == 1, we just always claim there is a next page. The next page might be empty, but I don't think that counts a correctness issue, just an unnecessary followup request.

In practice, if you actually care about paginating, I don't think you'd ever have a page size of 1 anyways, so I don't think this will ever be problematic. However, the real solution is to fix our graphql library so that we query the requested fields, making this whole dance unnecessary.

eseliger · 2023-11-01T21:34:41Z

Yeah, I didn't mean to block this PR, but returning hasNextPage: true when there's no next page actually feels like incorrect API data to me :) Maybe we can add a comment about that somewhere?

camdencheek · 2023-11-01T22:49:25Z

Yeah you're right. I added some more comments. I hope we can solve the "we don't know what fields were requested" issue for real...but until then I think this is still worth the hack and slightly incorrect behavior.

When fetching the information about the commit that modified a file, we use an ancestors query that is filtered to the file name. This is a paginated interface, so we always request n+1 items so we know whether there is a next page. However, the extra requested commit can actually be quite expensive to fetch, particularly in the relatively common case that a file has only been modified in one commit: the one that created it. In that case, we have to iterate over the entire history of the repository to look for a commit that doesn't exist to produce information about a next page that will never be fetched. The ideal solution here would be to only fetch n+1 if the hasNextPage field of the resolver is requested. Unfortunately, our GraphQL package still doesn't support querying for which fields are requested, which is a huge bummer. Instead, this takes a slightly hackier approach and assumes that if the number requested is 1, it's better to serve that quickly and claim we have a next page without knowing whether that page will contain anything.

optimize fetching last file commit

2fdcc65

cla-bot bot added the cla-signed label Nov 1, 2023

camdencheek requested a review from a team November 1, 2023 19:11

fkling approved these changes Nov 1, 2023

View reviewed changes

eseliger reviewed Nov 1, 2023

View reviewed changes

add more comments

55d7865

camdencheek merged commit d069362 into main Nov 1, 2023

camdencheek deleted the cc/optimize-ancestors branch November 1, 2023 23:00

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Perf: optimize fetching last file commit #58055

Perf: optimize fetching last file commit #58055

camdencheek commented Nov 1, 2023 •

edited

Loading

eseliger left a comment

camdencheek commented Nov 1, 2023

eseliger commented Nov 1, 2023

camdencheek commented Nov 1, 2023

Perf: optimize fetching last file commit #58055

Perf: optimize fetching last file commit #58055

Conversation

camdencheek commented Nov 1, 2023 • edited Loading

Test plan

eseliger left a comment

Choose a reason for hiding this comment

camdencheek commented Nov 1, 2023

eseliger commented Nov 1, 2023

camdencheek commented Nov 1, 2023

camdencheek commented Nov 1, 2023 •

edited

Loading