core, triedb/pathdb: final integration (snapshot integration pt 5) #30661

rjl493456442 · 2024-10-23T07:15:49Z

No description provided.

core/blockchain.go

MariusVanDerWijden · 2024-12-30T10:42:46Z

core/blockchain_repair_test.go

@@ -1997,15 +1999,21 @@ func testIssue23496(t *testing.T, scheme string) {
 	}
 	expHead := uint64(1)
 	if scheme == rawdb.PathScheme {
-		expHead = uint64(2)
+		expHead = uint64(3)


Maybe we could add a comment here, something like: "The pathdb database makes sure that snapshot and trie are consistent, so only the last block is reverted in case of a crash."

MariusVanDerWijden · 2024-12-30T10:43:28Z

core/blockchain_repair_test.go

+		// Reinsert B3-B4
+		if _, err := chain.InsertChain(blocks[2:]); err != nil {
+			t.Fatalf("Failed to import canonical chain tail: %v", err)


Suggested change

-		// Reinsert B3-B4
-		if _, err := chain.InsertChain(blocks[2:]); err != nil {
-			t.Fatalf("Failed to import canonical chain tail: %v", err)
+		// Reinsert B4
+		if _, err := chain.InsertChain(blocks[3:]); err != nil {
+			t.Fatalf("Failed to import canonical chain tail: %v", err)

In path mode only the last block is missing, so reinsert just that one.
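The index arithmetic behind this suggestion can be sketched as follows. This is only an illustration under stated assumptions: `missingTail` is a hypothetical helper (not part of the test or of go-ethereum), and it assumes `blocks[i]` holds block number `i+1`, as the B1..B4 naming in the test suggests.

```go
package main

import "fmt"

// missingTail is a hypothetical helper: given the blocks B1..Bn stored at
// indices 0..n-1 and the block number of the current head, it returns the
// blocks that still have to be reinserted.
func missingTail(blocks []string, head uint64) []string {
	if head >= uint64(len(blocks)) {
		return nil // nothing is missing
	}
	return blocks[head:]
}

func main() {
	blocks := []string{"B1", "B2", "B3", "B4"}
	fmt.Println(missingTail(blocks, 2)) // head at B2: B3 and B4 are missing
	fmt.Println(missingTail(blocks, 3)) // head at B3: only B4 is missing
}
```

With the expected head moving from 2 to 3 in path mode, the slice to reinsert moves from `blocks[2:]` to `blocks[3:]` accordingly.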

MariusVanDerWijden · 2024-12-30T10:48:33Z

core/blockchain_snapshot_test.go

@@ -570,7 +574,7 @@ func TestHighCommitCrashWithNewSnapshot(t *testing.T) {
 	for _, scheme := range []string{rawdb.HashScheme, rawdb.PathScheme} {
 		expHead := uint64(0)
 		if scheme == rawdb.PathScheme {
-			expHead = uint64(4)
+			expHead = uint64(6)


Same here: it would be good to have a comment that in path mode we expect the database to be consistent up to block 6, since that was the last committed block.

MariusVanDerWijden · 2024-12-30T11:01:03Z

core/state/database.go

-		// the state reader with database.
+		// If standalone state snapshot is not available (path scheme
+		// or the state snapshot is explicitly disabled in hash mode),
+		// try to construct the state reader with database.
 		reader, err := db.triedb.StateReader(stateRoot)
 		if err == nil {


As far as I can tell, this only works because hashdb.StateReader returns an error; otherwise we would create two readers for a hashdb. Right now the logic (afaict) is:

	if snapshot {
		// create snapshot reader
	} else {
		if hashdb {
			// do nothing
		} else {
			// create pathdb snapshot reader
		}
	}

This seems a bit convoluted. Why not make it like this:

	if triedb.Scheme() == rawdb.HashScheme && db.snap != nil {
		// construct legacy snap reader
	}
	if triedb.Scheme() == rawdb.PathScheme {
		// construct triedb.StateReader
	}
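A minimal, self-contained sketch of the scheme-driven selection proposed above. All names here (`hashScheme`, `pathScheme`, `readerKind`, `chooseFlatReader`) are hypothetical stand-ins for illustration, not the actual go-ethereum types; the point is only that the choice depends on the scheme and the snapshot flag rather than on one backend returning an error.

```go
package main

import "fmt"

// Hypothetical stand-ins for rawdb.HashScheme and rawdb.PathScheme.
const (
	hashScheme = "hash"
	pathScheme = "path"
)

// readerKind names which flat-state reader would be constructed.
type readerKind string

const (
	noFlatReader     readerKind = "none"
	legacySnapReader readerKind = "legacy-snapshot"
	pathStateReader  readerKind = "triedb-state-reader"
)

// chooseFlatReader picks the reader explicitly from the scheme, so a hash
// database without a snapshot yields no flat reader by construction.
func chooseFlatReader(scheme string, hasSnapshot bool) readerKind {
	switch {
	case scheme == hashScheme && hasSnapshot:
		return legacySnapReader
	case scheme == pathScheme:
		return pathStateReader
	default:
		return noFlatReader
	}
}

func main() {
	fmt.Println(chooseFlatReader(hashScheme, true))
	fmt.Println(chooseFlatReader(hashScheme, false))
	fmt.Println(chooseFlatReader(pathScheme, false))
}
```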

MariusVanDerWijden · 2024-12-30T11:06:21Z

core/state/statedb_test.go

@@ -979,7 +979,8 @@ func testMissingTrieNodes(t *testing.T, scheme string) {
 	)
 	if scheme == rawdb.PathScheme {
 		tdb = triedb.NewDatabase(memDb, &triedb.Config{PathDB: &pathdb.Config{
-			CleanCacheSize:  0,
+			TrieCleanSize:   0,
+			StateCleanSize:  0,
 			WriteBufferSize: 0,


Should we also disable the background generation (SnapshotNoBuild: true) here? Disabling it does not seem to affect the outcome of the test, and we would save some resources.

The same applies to a few other places where we use pathdb.Defaults; disabling the background generation there saved around 200ms in my (very rough) tests.

Okay, never mind: the 200ms was just a fluke on my machine, and the timings seem to be almost equal. So probably not worth it.

MariusVanDerWijden · 2024-12-30T11:20:54Z

tests/block_test_util.go

-		if err := chain.Snapshots().Verify(chain.CurrentBlock().Root); err != nil {
-			return err
+		if chain.Snapshots() != nil {
+			if err := chain.Snapshots().Verify(chain.CurrentBlock().Root); err != nil {


We could, maybe in a follow-up PR, add something to TrieDB to also let us verify its integrity. This way we wouldn't lose this sanity check when running the block tests.

MariusVanDerWijden · 2024-12-30T12:04:30Z

triedb/pathdb/database.go

@@ -362,6 +430,7 @@ func (db *Database) Enable(root common.Hash) error {
 	// reset the persistent state id back to zero.
 	batch := db.diskdb.NewBatch()
 	rawdb.DeleteTrieJournal(batch)
+	rawdb.DeleteSnapshotRoot(batch)


Could you please explain why we need to delete the snapshot root here? I don't really understand.

MariusVanDerWijden · 2024-12-30T12:20:17Z

triedb/pathdb/flush.go

+			}
+		}
+	}
+	for addrHash, storages := range storageData {


I'm wondering whether it makes a difference in the speed of flushing if we sort the keys before trying to insert them. IIRC some databases handle ordered key insertions better than random ones. This way we could also break on the first key that exceeds the genMarker.
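A rough sketch of the sorted-flush idea, under stated assumptions: `flushSorted` and the `put` callback are hypothetical, keys are modelled as strings in a map, and the marker is assumed to be non-nil (a nil marker would need separate handling). It is not the code in flush.go.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// flushSorted is a hypothetical helper: it writes entries in ascending key
// order and stops at the first key greater than marker, since every later
// key in sorted order is greater as well. It returns the number of writes.
func flushSorted(data map[string][]byte, marker []byte, put func(key, val []byte)) int {
	keys := make([]string, 0, len(data))
	for k := range data {
		keys = append(keys, k)
	}
	sort.Strings(keys) // Go compares strings byte-wise

	written := 0
	for _, k := range keys {
		if bytes.Compare([]byte(k), marker) > 0 {
			break // all remaining keys are beyond the marker
		}
		put([]byte(k), data[k])
		written++
	}
	return written
}

func main() {
	data := map[string][]byte{"d": {4}, "b": {2}, "a": {1}, "c": {3}}
	n := flushSorted(data, []byte("b"), func(key, val []byte) {
		fmt.Printf("put %s=%v\n", key, val)
	})
	fmt.Println("written:", n)
}
```

Whether the ordered writes are actually faster depends on the backing database and would need to be benchmarked; the sketch only shows the early break.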

MariusVanDerWijden · 2024-12-30T12:31:17Z

triedb/pathdb/context.go

+
+// removeStorageLeft deletes all storage entries which are located after
+// the current iterator position.
+func (ctx *generatorContext) removeStorageLeft() uint64 {


This naming is a bit misleading: it deletes all the storage that is left, but that storage is to the "right" of the marker in my mental model. Maybe removeRemainingStorage or something?

MariusVanDerWijden · 2024-12-30T12:48:20Z

triedb/pathdb/generate.go

+// into two parts.
+func splitMarker(marker []byte) ([]byte, []byte) {
+	var accMarker []byte
+	if len(marker) > 0 { // []byte{} is the start, use nil for that


Do we need to check here that len(marker) >= HashLength?
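One way the length check could look, as a hedged sketch only. `splitMarkerChecked`, `hashLength` and `errShortMarker` are hypothetical names, and the sketch assumes the account part of the marker is its first hash-length bytes; the body of the real splitMarker is not shown in this hunk.

```go
package main

import (
	"errors"
	"fmt"
)

// hashLength is a stand-in for common.HashLength (32 bytes).
const hashLength = 32

var errShortMarker = errors.New("marker shorter than hash length")

// splitMarkerChecked is a hypothetical variant that rejects non-empty
// markers too short to contain an account hash instead of slicing past
// the end of the marker.
func splitMarkerChecked(marker []byte) (accMarker, full []byte, err error) {
	if len(marker) == 0 {
		return nil, marker, nil // start of generation, no account part
	}
	if len(marker) < hashLength {
		return nil, nil, errShortMarker
	}
	return marker[:hashLength], marker, nil
}

func main() {
	_, _, err := splitMarkerChecked(make([]byte, 10))
	fmt.Println(err)

	acc, full, err := splitMarkerChecked(make([]byte, 2*hashLength))
	fmt.Println(len(acc), len(full), err)
}
```

If short markers can never occur in practice, a panic or no check at all may be equally acceptable; returning an error is just one option.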

MariusVanDerWijden · 2024-12-30T13:37:38Z

Took a first look now and it looks pretty good so far. I added a few questions/things I didn't understand while reviewing. The biggest changes (generator, generator_test, context) are modified copies from the snapshot package

rjl493456442 requested review from karalabe and holiman as code owners October 23, 2024 07:15

rjl493456442 force-pushed the snapshot-integration-p5 branch from d249035 to 133900e Compare October 23, 2024 07:24

holiman changed the title ~~Snapshot integration p5~~ core, triedb/pathdb: final integration (snapshot integration pt 5) Oct 23, 2024

MariusVanDerWijden reviewed Oct 28, 2024


holiman mentioned this pull request Nov 8, 2024

all: unify the trie database and snapshot in path mode #30159

Closed

fjl added the pbss-archive label Nov 28, 2024

rjl493456442 force-pushed the snapshot-integration-p5 branch 4 times, most recently from 22dcce6 to b879386 Compare December 18, 2024 08:45

core, triedb/pathdb: integrate state snapshot inth pathdb

985a84e

rjl493456442 force-pushed the snapshot-integration-p5 branch from b879386 to 985a84e Compare December 23, 2024 03:47

rjl493456442 mentioned this pull request Dec 30, 2024

triedb/pathdb: introduce lookup structure to optimize state access #30971

Open

MariusVanDerWijden reviewed Dec 30, 2024
