core/filtermaps: two dimensional log filter data structure #30370
Do you have any numbers on the performance of the filtermaps (size, lookup speed, generation speed, etc.)?
I measured indexing and unindexing time for the entire chain history. I also saved the log of a run where the indexed history was 2,350,000 blocks, which is the currently proposed default setting:
Database size growth is hard to measure exactly because of compaction (or the lack of it). After a full unindexing followed by a full indexing, my db grew by 57 GB, but the growth would probably be larger on a freshly synced database. A starting point for estimates is that each map consists of 4096 rows which are 64 bytes long on average, stored under consecutive keys, so the db overhead per entry is probably low. The log index for the entire history should therefore be about 58.6 GB plus db overhead, while the recommended 2,350,000 blocks (one year) of history should be about 12.7 GB plus db overhead. Also note that this PR removes the old bloombits db, which is about 5-6 GB.
The log search performance depends on what we are searching for. I chose a more difficult but pretty common scenario where some of the search values appear very frequently while the overall pattern matches only 40 times throughout the chain history. It's a WETH transaction; the filter pattern is for one address and 3 topics.
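The size estimate above can be reproduced with a few lines of arithmetic. The following is only a rough sketch: it takes the 4096 rows and 64-byte average row length from the comment, assumes decimal gigabytes, and ignores db overhead entirely. The function names are made up for illustration and are not part of the PR.

```go
package main

import "fmt"

// Figures quoted in the comment above. The row length is an average, so
// everything derived from it is an estimate, not an exact size.
const (
	rowsPerMap     = 4096
	avgBytesPerRow = 64
)

// bytesPerMap returns the approximate raw size of a single filter map.
func bytesPerMap() int {
	return rowsPerMap * avgBytesPerRow
}

// impliedMaps returns roughly how many whole maps fit into totalBytes of
// raw row data (db overhead is not accounted for).
func impliedMaps(totalBytes float64) int {
	return int(totalBytes / float64(bytesPerMap()))
}

func main() {
	const gb = 1e9 // assumption: the quoted sizes are decimal gigabytes

	fmt.Println("bytes per map:", bytesPerMap())
	fmt.Println("maps implied by 58.6 GB (full history):", impliedMaps(58.6*gb))
	fmt.Println("maps implied by 12.7 GB (one year):", impliedMaps(12.7*gb))
}
```

Each map comes out at 256 KiB of raw row data, so the quoted totals correspond to a few hundred thousand maps for the full history. The real on-disk footprint depends on key overhead and compaction, as noted above.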
I did the test for 1M blocks, 10M blocks and the entire chain history, both with and without indexing:
This PR implements a new log filter data structure that is intended to replace core/bloombits. It can also be considered a pilot project for my EIP-7745 proposal:
https://github.com/zsfelfoldi/EIPs/blob/new-log-filter/EIPS/eip-7745.md
Note that this PR implements the filter structure proposed in the EIP but does not touch consensus. It implements the filter maps but not the tree hash structure. It also does not add pointers to headers and receipts; instead, it stores block to log value pointers separately.
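To illustrate what a separately stored block to log value pointer could be used for, here is a minimal sketch. It assumes that each block's pointer is the index of the first log value emitted in that block, and that pointers are kept in block order. The slice layout and the function name blockOfLogValue are hypothetical and do not reflect the PR's actual storage schema or API.

```go
package main

import (
	"fmt"
	"sort"
)

// blockOfLogValue returns the position of the block that contains log value
// index lv, or -1 if lv precedes the first pointer.
//
// firstLV[i] is the hypothetical pointer for block i: the index of the first
// log value emitted in that block. The slice must be non-decreasing; a block
// without logs simply repeats the pointer of the next block.
func blockOfLogValue(firstLV []uint64, lv uint64) int {
	// Find the first block whose pointer is greater than lv, then step back.
	i := sort.Search(len(firstLV), func(i int) bool { return firstLV[i] > lv })
	return i - 1
}

func main() {
	// Block 0 holds log values 0..4, block 1 is empty, block 2 holds 5..8,
	// block 3 starts at 9.
	pointers := []uint64{0, 5, 5, 9}
	for _, lv := range []uint64{0, 4, 5, 8, 9} {
		fmt.Printf("log value %d -> block %d\n", lv, blockOfLogValue(pointers, lv))
	}
}
```

With pointers kept in block order, a match found at some log value index can be traced back to its block with a binary search, without adding any fields to headers or receipts.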
Regardless of whether and when EIP-7745 might get accepted, this PR provides immediate value to Geth users interested in logs, as it should drastically speed up log search compared to bloombits, which is now practically useless because of the overpopulated bloom filters. The EIP is mostly interesting for light client friendliness.