
cip: Header Pruning for LNs #279

Open
wants to merge 20 commits into main
Conversation

@Wondertan Wondertan (Member) commented Mar 21, 2025

Overview

CIP for Header Pruning. As Header Pruning breaks users, particularly by preventing historical queries (to be fixed in a subsequent CIP), it was decided to promote this change to a CIP.

This CIP starts a series of DA-targeted CIPs that optimize bandwidth and storage usage for LNs, particularly targeting the overhead brought by headers.

@jcstein jcstein (Member) left a comment

nits mostly

@rootulp rootulp (Collaborator) left a comment

LGTM after resolving the parameter question. Given the DA network parameters aren't already in CIPs (or specs) somewhere, I suggest keeping the source of truth in code and removing the parameter from this CIP.

The rationale is that it's way easier to update the parameter in code, and there is no risk of the parameter in docs becoming stale relative to the implementation in code.

Wondertan and others added 2 commits March 23, 2025 16:50
@Wondertan (Member Author)

Given the DA network parameters aren't already in CIPs (or specs) somewhere, I suggest keeping the source of truth in code and removing the parameter from this CIP.

@rootulp, there is a SamplingWindow param that is already in CIP-4. We can tie HeaderPruningWindow to it, which is 30 days, for simplicity. It should, though, be defined based on TrustingPeriod, which is 14 days (based on @nashqueue's findings), but that's gonna be a different CIP by someone else (like @cmwaters) that will define the TrustingPeriod and resolve the issues between all the params by adjusting them.

So given we have a linkable parameter in the old CIP, I believe it's fine to define the new parameter in this CIP.

@cmwaters (Contributor)

Yeah, there are already node parameters defined in CIP 4 (and we may look to adjust those soon). These are important parameters, so I would also prefer to have them documented here (possibly in the same way we do core/app parameters).

@jcstein jcstein (Member) left a comment

LGTM

@ebuchman ebuchman (Collaborator) left a comment

Nice. Some nits and a few stupid questions

cips/cip-x.md Outdated
The estimation of Tail height is done as follows:

```go
func estimateTail(head, tail Header, blockTime, pruningWindow time.Duration) (height uint64) {
	// ...
}
```
Collaborator:

this is confusing, why is "now" involved? I would have expected it to be:

numBlocksToKeep = pruningWindow / blockTime
newTail = head - numBlocksToKeep
return max(currentTail, newTail) // in case newTail is somehow older than currentTail
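A minimal runnable Go rendering of this suggestion (the function name and the use of plain uint64 heights are illustrative assumptions, not the CIP's actual code):

```go
package sketch

import "time"

// suggestedTail sketches the estimation proposed above: retain roughly
// pruningWindow/blockTime headers behind the head, never moving the tail
// below the current tail.
func suggestedTail(headHeight, currentTail uint64, blockTime, pruningWindow time.Duration) uint64 {
	numBlocksToKeep := uint64(pruningWindow / blockTime)
	if headHeight <= numBlocksToKeep {
		return currentTail // not enough history above the tail yet; keep the existing tail
	}
	newTail := headHeight - numBlocksToKeep
	return max(newTail, currentTail) // in case newTail is somehow older than currentTail
}
```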

@Wondertan (Member Author):

This snippet is actually the wrong one. This one is for finding the range that needs to be trimmed off. Thanks for catching that!

The now is used there as the starting point, as opposed to the time of the last header. This is inherited from pruning for samples, as that's how it works currently. However, thinking about this now in the case of header pruning, I think it's reasonable to just cut off based on the most recent header.


Collaborator:

fixed a typo but otherwise lgtm

@Wondertan (Member Author):

Something I realized coming from the actual implementation:

  • It should take trustingPeriod instead of headerPruningWindow here. As HPW can be arbitrarily increased, it can become bigger than TP and, more importantly, the unbonding window, which then becomes a security issue due to LRA. See fa9243f (a rough sketch of this clamp follows after this list).
  • Using head in tail estimation actually makes tail requesting dependent on head. This is not ideal, because it adds a round trip of delay for the subjective init: you first request the Head and then the Tail. This is kinda unavoidable, because we need some header as a time reference to convert time to heights. We could avoid this by hardcoding the time of the genesis header or some other header, but that might be a bit too much for now.
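A rough sketch of the clamp described in the first point, with hypothetical names (this is not the code from fa9243f):

```go
package sketch

import "time"

// estimateTailClamped caps the estimation window by the trusting period, so
// the resulting Tail always stays within TP even when the pruning window is
// configured to be larger.
func estimateTailClamped(headHeight uint64, blockTime, pruningWindow, trustingPeriod time.Duration) uint64 {
	window := min(pruningWindow, trustingPeriod)
	headersToRetain := uint64(window / blockTime)
	if headHeight <= headersToRetain {
		return 1 // not enough history; keep everything from the first height
	}
	return headHeight - headersToRetain
}
```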

Collaborator:

  1. not sure I follow here. Isn't SamplingWindow (~1month) already bigger than TrustingPeriod (~two weeks)? And we need pruningWindow >= SamplingWindow. I would think the light client syncing should be independent of what's happening here. If the LC hasn't been online in at least TP, it will need to resubjectively initialize, but once it does, it can safely go backwards via hash links to fill in older blocks.

  2. per above, my sense is a light node should first sync, and then decide if it needs to backfill. how is subjective init happening anyways?

@Wondertan Wondertan (Member Author) commented Apr 2, 2025

Isn't SamplingWindow (~1month) already bigger than TrustingPeriod (~two weeks)?

Yes, but that's exactly where the problem is. The fact that SW, and subsequently HPW, are bigger than TP imposes risks on newly initializing nodes. If we start syncing from a Tail beyond TP, we are in trouble due to LRA, and the change here is to prevent that. It ensures the Tail is always within TP, no matter what HPW is set to.

@Wondertan (Member Author):

how is subjective init happening anyways?

So the missing context here might be that we do subj init a bit differently compared to other PoS projects. We automate it in a way that I believe makes the process even less error-prone. Usually, on subj init, users are asked to enter a trusted hash of a header with a validator set they believe to be correct. Instead, we introduce the notion of a trusted peer that we request such a header with valset from. This in turn removes the hassle for the user of finding the correct hash, improving UX and reducing the risk of phishing and other rookie mistakes from users pasting wrong hashes. The trusted peers are, by default, hardcoded infrastructure nodes that can be altered in configs.

So my point there is that now, instead of a single RT, there are gonna be two, for two headers. This is okayish, but not ideal.
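To illustrate the two round trips, here is a rough sketch under assumed types; Header, TrustedPeer, Head, and GetByHeight are hypothetical stand-ins, not the actual celestia-node API:

```go
package sketch

import (
	"context"
	"time"
)

// Header and TrustedPeer are hypothetical stand-ins for the node's header type
// and its trusted-peer client.
type Header struct {
	Height uint64
	Time   time.Time
}

type TrustedPeer interface {
	Head(ctx context.Context) (Header, error)                  // round trip 1
	GetByHeight(ctx context.Context, h uint64) (Header, error) // round trip 2
}

// subjectiveInit fetches the subjective head from a trusted peer, estimates
// the tail height from it, then fetches the tail header: two round trips.
func subjectiveInit(ctx context.Context, tp TrustedPeer, blockTime, window time.Duration) (head, tail Header, err error) {
	head, err = tp.Head(ctx)
	if err != nil {
		return head, tail, err
	}
	headersToRetain := uint64(window / blockTime)
	tailHeight := uint64(1)
	if head.Height > headersToRetain {
		tailHeight = head.Height - headersToRetain
	}
	tail, err = tp.GetByHeight(ctx, tailHeight)
	return head, tail, err
}
```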

Collaborator:

I see. In that case, shouldn't we just get a very recent header from the trusted peer for subjective init, possibly even the latest head, and then calculate what tail we need from that, and sync backwards? Syncing backwards doesn't suffer from LR attacks, only syncing forward from something outside the TP.

@Wondertan (Member Author):

Backwards sync is the next stage and something that has been on our radar for quite a while. It's gonna help us avoid syncing validator sets and commits, which are necessary for forward sync. We're also gonna do it with MMR so we can prove inclusion in the past without needing the whole header chain up to that point.

Wondertan and others added 2 commits March 26, 2025 18:08
Co-authored-by: Ethan Buchman <[email protected]>
Co-authored-by: Ethan Buchman <[email protected]>
Wondertan and others added 2 commits March 26, 2025 18:09
Co-authored-by: Ethan Buchman <[email protected]>
Co-authored-by: Ethan Buchman <[email protected]>
@Wondertan Wondertan (Member Author) commented Mar 26, 2025

@jcstein, when should we assign the number?

Co-authored-by: Ethan Buchman <[email protected]>
@Wondertan (Member Author)

GH applied the same suggestion from Ethan 5 times as separate commits lol

@jcstein jcstein (Member) commented Mar 27, 2025

@Wondertan, after comments are resolved. Can you please close the ones above that you've resolved, with a link to the commit, before resolving? Thank you!

Then we'll assign a number and get this merged. Just want to make sure @ebuchman's comments are addressed!

@Wondertan (Member Author)

can you please close the ones above that you've resolved with a link to the commit before resolving?

You mean like I did with your review? Ofc

numBlocksToKeep -> headersToRetain
@jcstein jcstein self-requested a review March 31, 2025 14:30
@jcstein jcstein (Member) commented Mar 31, 2025

this should be good after comments are resolved @Wondertan

@jcstein jcstein (Member) commented Apr 1, 2025

I just realized maybe @Wondertan doesn't have permissions to resolve the comments. Can you confirm?

@jcstein jcstein self-requested a review April 1, 2025 18:46
Comment on lines +14 to +16
Currently, every data availability (DA) node type synchronizes all historical headers starting from genesis (or other statically configured
historical header) until the subjectively initialized head of the chain. We change that by adding a way to sync a
constant size range of headers instead of the whole history.
Member:

Suggested change
- constant size range of headers instead of the whole history.
+ constant time range of headers instead of the whole history.

Was it meant to be a fixed time range instead of size?

@Wondertan (Member Author):

It's more length/size than time. Constant time usually refers to O(1) algorithms, which is not the case here.

Comment on lines +39 to +47
The estimation of Tail is done as follows:

```go
func estimateTail(head Header, blockTime, window time.Duration) (height uint64) {
	headersToRetain := uint64(window / blockTime)
	tail := head.Height() - headersToRetain
	return tail
}
```
Member:

Re-posting a thought I shared with you in a 1-1:
The current estimation approach relies on a constant blockTime parameter, but in reality, block times can vary. This means the estimated Tail could drift from the intended SamplingWindow timeframe.
Given that HeaderPruningWindow must be ≥ SamplingWindow for proper sampling, this can lead to sampling failures if too few headers are retained.
If the estimation is too conservative (retains more headers than needed), it reduces the storage efficiency benefits.

Alternative approach:
Instead of local estimation, consider having nodes request the exact Tail header from trusted peers. The trusted peer could find the header closest to time.Now() - SamplingWindow, providing precise alignment with sampling requirements and eliminating estimation errors.

This approach seems more reliable while still achieving the storage optimization goals. Thoughts?
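A rough sketch of this alternative, assuming a hypothetical peer API that can look up a header by time (Header, TailProvider, and GetByTime are assumptions, not an existing endpoint):

```go
package sketch

import (
	"context"
	"time"
)

// Header is a hypothetical stand-in for the node's header type.
type Header struct {
	Height uint64
	Time   time.Time
}

// TailProvider is a hypothetical trusted-peer API that can resolve the header
// whose timestamp is closest to a given time.
type TailProvider interface {
	GetByTime(ctx context.Context, t time.Time) (Header, error)
}

// requestTail asks a trusted peer for the exact Tail instead of estimating it
// locally from a constant block time, aligning the Tail with the sampling window.
func requestTail(ctx context.Context, p TailProvider, samplingWindow time.Duration) (Header, error) {
	target := time.Now().Add(-samplingWindow)
	return p.GetByTime(ctx, target)
}
```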
