core, triedb/pathdb: final integration (snapshot integration pt 5) #30661

Open
wants to merge 1 commit into
base: master

rjl493456442

No description provided.

@rjl493456442 force-pushed the snapshot-integration-p5 branch from d249035 to 133900e on October 23, 2024 07:24
@holiman changed the title from "Snapshot integration p5" to "core, triedb/pathdb: final integration (snapshot integration pt 5)" on Oct 23, 2024
@rjl493456442 force-pushed the snapshot-integration-p5 branch 4 times, most recently from 22dcce6 to b879386, on December 18, 2024 08:45
@@ -1997,15 +1999,21 @@ func testIssue23496(t *testing.T, scheme string) {
 	}
 	expHead := uint64(1)
 	if scheme == rawdb.PathScheme {
-		expHead = uint64(2)
+		expHead = uint64(3)

Maybe we could add a comment here, something like: "The pathdb database makes sure that snapshot and trie are consistent, so only the last block is reverted in case of a crash."

Comment on lines +2008 to +2010
// Reinsert B3-B4
if _, err := chain.InsertChain(blocks[2:]); err != nil {
t.Fatalf("Failed to import canonical chain tail: %v", err)

Suggested change
-	// Reinsert B3-B4
-	if _, err := chain.InsertChain(blocks[2:]); err != nil {
-		t.Fatalf("Failed to import canonical chain tail: %v", err)
+	// Reinsert B4
+	if _, err := chain.InsertChain(blocks[3:]); err != nil {
+		t.Fatalf("Failed to import canonical chain tail: %v", err)

In path mode only the last block is missing, so reinsert only that one.

@@ -570,7 +574,7 @@ func TestHighCommitCrashWithNewSnapshot(t *testing.T) {
 	for _, scheme := range []string{rawdb.HashScheme, rawdb.PathScheme} {
 		expHead := uint64(0)
 		if scheme == rawdb.PathScheme {
-			expHead = uint64(4)
+			expHead = uint64(6)

Same here, it would be good to have a comment saying that in path mode we expect the database to be consistent up to block 6, since that was the last one committed.

// the state reader with database.
// If standalone state snapshot is not available (path scheme
// or the state snapshot is explicitly disabled in hash mode),
// try to construct the state reader with database.
reader, err := db.triedb.StateReader(stateRoot)
if err == nil {

Afaict this only works because hashdb.StateReader returns an error; otherwise we would create two readers for a hashdb. Right now the logic (afaict) is:

if snapshot {
    // create snapshot reader
} else {
    if hashdb {
        // do nothing
    } else {
        // create pathdb snapshot reader
    }
}

This seems a bit convoluted. Why not structure it like this:

if triedb.Scheme() == rawdb.HashScheme && db.snap != nil {
    // construct legacy snap reader
}
if triedb.Scheme() == rawdb.PathScheme {
    // construct triedb.StateReader
}
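
To make the proposed branching concrete, here is a minimal, self-contained sketch. It does not use the real geth types: `chooseReader`, `readerKind`, and the scheme constants are hypothetical stand-ins, and the sketch only illustrates picking the reader from the scheme explicitly rather than relying on an error from the hash backend.

```go
package main

import "fmt"

// Stand-ins for rawdb.HashScheme and rawdb.PathScheme (values assumed
// for illustration only).
const (
	hashScheme = "hash"
	pathScheme = "path"
)

// readerKind is a hypothetical label for the state reader that would be built.
type readerKind string

const (
	noReader       readerKind = "none"
	legacySnapshot readerKind = "legacy-snapshot"
	triedbReader   readerKind = "triedb-state-reader"
)

// chooseReader is a hypothetical helper: it decides which reader to
// construct purely from the scheme and snapshot availability.
func chooseReader(scheme string, hasSnapshot bool) readerKind {
	if scheme == hashScheme && hasSnapshot {
		// Hash mode with a standalone snapshot: legacy snapshot reader.
		return legacySnapshot
	}
	if scheme == pathScheme {
		// Path mode: the reader comes from the trie database itself.
		return triedbReader
	}
	// Hash mode without a snapshot: no flat-state reader at all.
	return noReader
}

func main() {
	fmt.Println(chooseReader(hashScheme, true))
	fmt.Println(chooseReader(hashScheme, false))
	fmt.Println(chooseReader(pathScheme, false))
}
```

With this shape, each combination maps to exactly one reader, so a hash-mode database can never end up with two.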

@@ -979,7 +979,8 @@ func testMissingTrieNodes(t *testing.T, scheme string) {
 	)
 	if scheme == rawdb.PathScheme {
 		tdb = triedb.NewDatabase(memDb, &triedb.Config{PathDB: &pathdb.Config{
-			CleanCacheSize:  0,
+			TrieCleanSize:   0,
+			StateCleanSize:  0,
 			WriteBufferSize: 0,

Should we also disable the background generation (SnapshotNoBuild: true) here? Seems like disabling it does not impact the outcome of the test and we would save some resources

The same applies to a few other places where we use pathdb.Defaults. Disabling the background generation there saved around 200ms in my (very rough) tests.

Okay nvm, the 200ms was just a fluke on my machine; the timings seem to be almost equal. So probably not worth it.

-	if err := chain.Snapshots().Verify(chain.CurrentBlock().Root); err != nil {
-		return err
+	if chain.Snapshots() != nil {
+		if err := chain.Snapshots().Verify(chain.CurrentBlock().Root); err != nil {

We could, maybe in a follow-up PR, add something to TrieDB that also lets us verify its integrity. This way we wouldn't lose this sanity check when running the block tests.

@@ -362,6 +430,7 @@ func (db *Database) Enable(root common.Hash) error {
 	// reset the persistent state id back to zero.
 	batch := db.diskdb.NewBatch()
 	rawdb.DeleteTrieJournal(batch)
+	rawdb.DeleteSnapshotRoot(batch)

Could you please explain why we need to delete the snapshot root here? I don't really understand.

}
}
}
for addrHash, storages := range storageData {

I'm wondering whether sorting the keys before inserting them would speed up flushing; iirc some databases handle ordered key insertions better than random ones. It would also let us break on the first key that exceeds the genMarker.
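
A rough sketch of that idea, not taken from the PR: `keysUpToMarker` is a hypothetical helper that sorts the keys and stops at the first one past the marker. It assumes a single flat key per entry (the real storage marker combines account and storage hashes) and that a nil marker means there is no limit.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// keysUpToMarker returns the keys of data in ascending byte order,
// stopping at the first key that sorts after marker. A nil marker is
// assumed to mean "no limit", so every key is returned.
func keysUpToMarker(data map[string][]byte, marker []byte) [][]byte {
	keys := make([][]byte, 0, len(data))
	for k := range data {
		keys = append(keys, []byte(k))
	}
	// Sort so that writes reach the database in key order.
	sort.Slice(keys, func(i, j int) bool {
		return bytes.Compare(keys[i], keys[j]) < 0
	})

	out := make([][]byte, 0, len(keys))
	for _, k := range keys {
		// Because the keys are sorted, every later key is past the
		// marker as well, so the loop can break instead of continue.
		if marker != nil && bytes.Compare(k, marker) > 0 {
			break
		}
		out = append(out, k)
	}
	return out
}

func main() {
	data := map[string][]byte{"c": nil, "a": nil, "b": nil, "d": nil}
	for _, k := range keysUpToMarker(data, []byte("b")) {
		fmt.Printf("%s\n", k)
	}
}
```

Whether the sort pays for itself depends on the backing store and batch size, so it would need a benchmark before adopting.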


// removeStorageLeft deletes all storage entries which are located after
// the current iterator position.
func (ctx *generatorContext) removeStorageLeft() uint64 {

This naming is a bit misleading: it deletes all the storage that is left, but in my mental model that storage is to the "right" of the marker. Maybe removeRemainingStorage or something?

// into two parts.
func splitMarker(marker []byte) ([]byte, []byte) {
var accMarker []byte
if len(marker) > 0 { // []byte{} is the start, use nil for that

Do we need to check here that len(marker) >= HashLength?
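
For illustration, a defensive variant could look like the sketch below. This is not the PR's code: `splitMarkerChecked`, `hashLength`, and `errShortMarker` are hypothetical names, and the sketch assumes the account part of the marker is its first hash-length bytes, which the visible hunk does not show.

```go
package main

import (
	"errors"
	"fmt"
)

// hashLength is a stand-in for common.HashLength (32 bytes).
const hashLength = 32

// errShortMarker is a hypothetical error for malformed markers.
var errShortMarker = errors.New("marker shorter than hash length")

// splitMarkerChecked splits a generation marker into its account part
// and the full marker, rejecting non-empty markers that are too short
// to slice safely.
func splitMarkerChecked(marker []byte) (accMarker, full []byte, err error) {
	if len(marker) == 0 {
		// An empty marker is the start; nil is used for the account part.
		return nil, marker, nil
	}
	if len(marker) < hashLength {
		// Slicing marker[:hashLength] would panic here.
		return nil, nil, errShortMarker
	}
	return marker[:hashLength], marker, nil
}

func main() {
	_, _, err := splitMarkerChecked(make([]byte, 10))
	fmt.Println(err)

	acc, full, err := splitMarkerChecked(make([]byte, 2*hashLength))
	fmt.Println(len(acc), len(full), err)
}
```

If markers are only ever written by the generator itself, the check may be redundant, but it turns a potential slice panic on corrupted data into an error.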

@MariusVanDerWijden

Took a first look now and it looks pretty good so far. I added a few questions/things I didn't understand while reviewing. The biggest changes (generator, generator_test, context) are modified copies from the snapshot package
