[WIP] Improve performance of buildIndex #748
Description
A big chunk of the time in a clone is spent building the pack index. The current code uses the decoder to create the index, but a specific method for building indexes could be used instead.
I am testing a two-pass system. In the first pass, the hashes of non-delta objects are computed and the hierarchy of each object's children is built. In the second pass, the hashes of delta objects are computed and the index is generated. Here are the flame graphs of the index building process with the previous code and with the new test code.
The test was done by cloning the git://github.com/numpy/numpy.git code from a local server.
- Previous code (11.90s):
- New code (9.52s):
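The two-pass approach described above can be sketched roughly as follows. This is only an illustration, not the code in this PR: the `entry` type, `hashBlob` and `buildIndexTwoPass` are hypothetical names, every object is treated as a blob, and a "delta" here simply appends bytes to its parent instead of applying real copy/insert instructions.

```go
package main

import (
	"crypto/sha1"
	"fmt"
)

// entry is a hypothetical, simplified pack entry. Real pack deltas are
// copy/insert instruction streams; in this sketch a delta only appends
// Data to the content of its parent.
type entry struct {
	Parent int    // index of the base entry, or -1 for a non-delta object
	Data   []byte // full content (non-delta) or bytes to append (delta)
}

// hashBlob returns the git object ID of content stored as a blob.
func hashBlob(content []byte) [sha1.Size]byte {
	h := sha1.New()
	fmt.Fprintf(h, "blob %d\x00", len(content))
	h.Write(content)
	var sum [sha1.Size]byte
	copy(sum[:], h.Sum(nil))
	return sum
}

// buildIndexTwoPass returns one hash per entry, in entry order.
func buildIndexTwoPass(entries []entry) [][sha1.Size]byte {
	hashes := make([][sha1.Size]byte, len(entries))
	children := make(map[int][]int)
	var roots []int

	// Pass 1: hash non-delta objects and record which deltas hang off
	// which parent.
	for i, e := range entries {
		if e.Parent < 0 {
			hashes[i] = hashBlob(e.Data)
			roots = append(roots, i)
		} else {
			children[e.Parent] = append(children[e.Parent], i)
		}
	}

	// Pass 2: walk each hierarchy from its root, rebuilding the content
	// of every delta from its already resolved parent and hashing it.
	var resolve func(parent int, base []byte)
	resolve = func(parent int, base []byte) {
		for _, c := range children[parent] {
			content := make([]byte, 0, len(base)+len(entries[c].Data))
			content = append(content, base...)
			content = append(content, entries[c].Data...)
			hashes[c] = hashBlob(content)
			resolve(c, content)
		}
	}
	for _, r := range roots {
		resolve(r, entries[r].Data)
	}
	return hashes
}
```

Walking the hierarchy from the roots means each parent's content is available at the moment its children are resolved, so a delta chain never has to be re-read from the bottom up.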
A lot of time is still spent reading, decompressing and computing the CRC of pack file data, because some objects are read twice. A significant amount of time is also spent in PatchDelta growing its slice.
I am changing the code to add a cache, which may reduce the number of objects that are read twice, and to make the PatchDelta buffers bigger up front so they don't need to be grown. This issue will be updated with new findings.
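One possible way to avoid growing the slice, shown here only as a sketch and not as what this PR does: a git delta starts with the source size and the target size, each encoded as a little-endian base-128 varint, so the destination buffer can be allocated at its final capacity before any instruction is applied. The helper names `deltaHeaderSize` and `newTargetBuffer` are hypothetical.

```go
package main

// deltaHeaderSize decodes one size field from the start of a git delta:
// 7 bits per byte, least significant group first, high bit set on every
// byte except the last. It returns the size and the remaining bytes.
func deltaHeaderSize(delta []byte) (size uint, rest []byte) {
	var shift uint
	for i, b := range delta {
		size |= uint(b&0x7f) << shift
		shift += 7
		if b&0x80 == 0 {
			return size, delta[i+1:]
		}
	}
	// Truncated header: return what was decoded and no remaining bytes.
	return size, nil
}

// newTargetBuffer skips the source size, reads the target size and
// returns an empty buffer whose capacity already fits the whole result,
// so appending the patched content never reallocates.
func newTargetBuffer(delta []byte) []byte {
	_, rest := deltaHeaderSize(delta) // source size, not needed here
	targetSize, _ := deltaHeaderSize(rest)
	return make([]byte, 0, targetSize)
}
```

Appending up to the announced target size into the returned buffer stays within its capacity, which removes the repeated reallocation and copying that shows up in the profile.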