
Redundant write on EigenDA failure #242

Open · Inkvi wants to merge 1 commit into main
Conversation

@Inkvi (Contributor) commented Jan 14, 2025

Changes proposed

The current proxy implementation lacks protection against write failures. If EigenDA fails for any reason, we still want to write our commitment to the DA layer if caches or fallbacks are enabled. Otherwise our DA batches will be postponed until EigenDA recovers, which might overwhelm the batcher if that period is prolonged.

Since a certificate is not available at this point, a keccak hash of the payload has to be used as the commitment.
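For illustration, a minimal sketch of the intended behavior, assuming hypothetical interface and function names (`Primary`, `Secondary`, `putWithFallback`); the actual change lives in store/manager.go and is excerpted further down:

```go
package store

import (
	"context"

	"github.com/ethereum/go-ethereum/crypto"
)

// Primary and Secondary are hypothetical stand-ins for the proxy's EigenDA
// backend and its secondary (cache/fallback) backends.
type Primary interface {
	Put(ctx context.Context, value []byte) (cert []byte, err error)
}

type Secondary interface {
	Enabled() bool
	HandleRedundantWrites(ctx context.Context, key, value []byte) error
}

// putWithFallback sketches the proposed behavior: disperse to EigenDA first;
// if that fails and secondary backends are enabled, store the payload there
// and return its keccak256 hash as the commitment, since no EigenDA
// certificate exists at this point.
func putWithFallback(ctx context.Context, eigenda Primary, secondary Secondary, value []byte) ([]byte, error) {
	cert, err := eigenda.Put(ctx, value)
	if err == nil {
		return cert, nil
	}
	if !secondary.Enabled() {
		return nil, err // no fallback configured, so surface the dispersal error
	}
	// use the payload itself as the key so the data can still be retrieved later
	if redundantErr := secondary.HandleRedundantWrites(ctx, value, value); redundantErr != nil {
		return nil, redundantErr
	}
	return crypto.Keccak256(value), nil
}
```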

Note to reviewers

I am aware of your PR that achieves a similar result with Eth DA as the backend instead of S3, but that PR is a few months old and there has been no traction from the OP team.

@samlaf (Collaborator) commented Jan 16, 2025

@Inkvi thanks for this. Bit overwhelmed at the moment but will review asap. Ping me if I forget.

store/manager.go (Outdated) · comment on lines 128 to 147

```go
log.Error("Failed to write to EigenDA backend", "err", err)
// write to EigenDA failed, which shouldn't happen if the backend is functioning properly
// use the payload as the key to avoid data loss
if m.secondary.Enabled() && !m.secondary.AsyncWriteEntry() {
	redundantErr := m.secondary.HandleRedundantWrites(ctx, value, value)
	if redundantErr != nil {
		log.Error("Failed to write to redundant backends", "err", redundantErr)
		return nil, redundantErr
	}

	return crypto.Keccak256(value), nil
}
```
@epociask (Collaborator) commented:
IIUC the idea here is to use secondary backends as the primary in the event of a failed dispersal to EigenDA. This will create complications for some of our integrations (e.g., in Arbitrum x EigenDA the commitment is posted/verified against the inbox directly, which would now fail with failover). Would prefer if this was opt-in and feature-guarded by some dangerous config flag...

cc @samlaf
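Something along these lines, with an intentionally alarming name: a sketch of an opt-in guard, assuming a urfave/cli v2 flag definition; the flag name and env var are placeholders, not what this PR ends up using.

```go
package flags

import "github.com/urfave/cli/v2"

// Hypothetical opt-in guard for the failover path; disabled by default so
// existing deployments keep the current fail-fast behavior.
var FallbackOnDispersalFailureFlag = &cli.BoolFlag{
	Name:    "routing.fallback-on-dispersal-failure",
	Usage:   "DANGEROUS: on EigenDA dispersal failure, write the payload to secondary backends keyed by its keccak256 hash instead of returning an error",
	Value:   false,
	EnvVars: []string{"EIGENDA_PROXY_FALLBACK_ON_DISPERSAL_FAILURE"},
}
```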

@samlaf (Collaborator) commented:

Overall this seems like an approach that could work, but the PR in its current form would require a lot more work:

  1. we would want to revert to a keccak commitment mode when this kind of failover happens, so that the derivation pipeline knows the failover happened (this would also be a more robust approach than the ad-hoc reading you currently have)
  2. as @epociask said, this failover behavior should be feature-guarded
  3. this failover should only happen if the blob's size is < 128 KiB, so we don't hit that size limit in the derivation pipeline (a sketch of how 2 and 3 might compose follows this list)
  4. we would need to add some tests
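To make items 2 and 3 concrete, here is a rough sketch; the `fallbackConfig` struct, the `maxKeccakBlobSize` constant, and the `writeToSecondary` callback are assumptions for illustration, not this PR's actual code:

```go
package store

import (
	"context"
	"fmt"
)

// maxKeccakBlobSize mirrors the ~128 KiB calldata limit in the derivation
// pipeline mentioned above; the exact constant used by the proxy may differ.
const maxKeccakBlobSize = 128 * 1024

// fallbackConfig carries the hypothetical opt-in switch from item 2.
type fallbackConfig struct {
	FallbackEnabled bool
}

// maybeFailover decides whether a failed dispersal may fall back to the
// secondary backends; writeToSecondary is assumed to store the payload and
// return its keccak256 hash as the commitment.
func maybeFailover(ctx context.Context, cfg fallbackConfig, value []byte, dispersalErr error,
	writeToSecondary func(context.Context, []byte) ([]byte, error)) ([]byte, error) {
	if !cfg.FallbackEnabled { // item 2: opt-in feature guard
		return nil, dispersalErr
	}
	if len(value) > maxKeccakBlobSize { // item 3: stay under the derivation pipeline limit
		return nil, fmt.Errorf("blob too large (%d bytes) for keccak failover: %w", len(value), dispersalErr)
	}
	// item 1 (switching the commitment mode so the derivation pipeline knows a
	// failover happened) would be handled by the caller and is out of scope here.
	return writeToSecondary(ctx, value)
}
```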

@Inkvi (Contributor, Author) commented:

@epociask thanks for highlighting the integration problems with Arbitrum. I am not familiar with the Arbitrum stack and how alt-DA is handled there. Could you shed more light on the failure mechanism? Are you saying that Arbitrum's inbox contract verifies the EigenDA certificate directly on-chain? That would imply a direct integration between the Arbitrum stack and EigenDA, which seems unlikely to me.

@Inkvi (Contributor, Author) commented:

@samlaf the last time I looked at the op-batcher (v1.9.5), it couldn't revert to a different commitment mode: the generic commitment mode is used by the batcher whenever a DA service is specified. If that still holds true, then the 128 KiB blob size limitation no longer applies.

@ethenotethan (Collaborator) commented Jan 23, 2025:

> That would imply a direct integration between the Arbitrum stack and EigenDA, which seems unlikely to me.

This is what we've done in our Arbitrum fork; feel free to look at the code here where the on-chain cert verification is performed.

> If that still holds true, then the 128 KiB blob size limitation no longer applies.

In a world with fraud proofs enabled, wouldn't you need to one-step prove the reading of blob contents from a preimage oracle? Curious how that would work with a hash-based commitment scheme, where opcode resolution would require uploading the entire preimage or blob contents. Agree the size limitation is irrelevant for insecure integrations, but it presents dramatic security implications for stage 1-2 rollups.

@Inkvi (Contributor, Author) commented:

I wasn't aware of your Arbitrum fork; now it makes total sense why it won't work for a Nitro stack.

@ian-shim commented:

Oh I see, the limit is from the stock DAChallenge contract, not the preimage oracle? It should be possible to implement a DAChallenge contract that uses a mechanism similar to the large preimage proposal to circumvent this limit there too; of course that doesn't exist at the moment and would be a significant amount of work.

@ethenotethan (Collaborator) commented:

Hey @ian-shim! It should theoretically work, but I do see some complications if the preimage is very large. E.g., EigenDA supports up to 16 MiB/s on mainnet. In the worst case this limit could be hit by a single blob dispersal:

16 MiB blob / 128 KB per calldata tx ≈ 131 txs

In this worst case, during a dispute game you'd have to stream 131 txs to prove the execution of a READPREIMAGE operation. IIUC this could have implications for core game mechanics/constraints (e.g., challenger bond sizes, resource fractionalization between challenge states). I can see this opening up DoS vectors, which could be mitigated by either finding some max blob size threshold that ensures dispute safety or reworking core challenge params to support this 16-32 MiB case.

One thing we could also explore is mitigating this within the proxy itself; e.g., we could update the service to partition blobs > 128 KiB into "sub-blobs", each committed to individually using keccak256 hashing, and form some multi-commitment DA cert. This would require changes in the node software, though, to process this one-to-many commitment.
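For illustration, a rough sketch of that partitioning idea; the function, the chunk size constant, and the cert shape are assumptions, not an existing proxy API:

```go
package store

import "github.com/ethereum/go-ethereum/crypto"

// subBlobSize is the per-chunk limit discussed above (~128 KiB per keccak
// commitment in the derivation pipeline).
const subBlobSize = 128 * 1024

// partitionAndCommit splits a large blob into <=128 KiB sub-blobs, commits to
// each with keccak256, and returns the sub-blobs together with the
// concatenated hashes as a naive multi-commitment "cert". How the node
// software would consume such a one-to-many commitment is the open question
// raised above.
func partitionAndCommit(blob []byte) (subBlobs [][]byte, multiCommitment []byte) {
	for start := 0; start < len(blob); start += subBlobSize {
		end := start + subBlobSize
		if end > len(blob) {
			end = len(blob)
		}
		chunk := blob[start:end]
		subBlobs = append(subBlobs, chunk)
		multiCommitment = append(multiCommitment, crypto.Keccak256(chunk)...)
	}
	return subBlobs, multiCommitment
}
```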

@ian-shim commented:

Thanks @ethenotethan, that makes sense. I like the idea in the last paragraph; it is similar to something I asked about in our shared Slack channel: instead of committing a single cert per tx, pack a bunch into a single tx's calldata. Originally I was thinking about this as a way to further reduce gas costs, but if it could also be used to make a fallback like this work in the fault proof system, that would be awesome.

@Inkvi reopened this Jan 29, 2025

@Inkvi (Contributor, Author) commented Jan 29, 2025:

Added unit tests and hid the feature behind a flag.

@Inkvi requested a review from samlaf on January 29, 2025.