Description
Most people are OK with whatever chunker and hash function are the current defaults in commands that import data to IPFS.
In the case of go-ipfs, these are `ipfs add`, `ipfs dag put`, and `ipfs block put`.
However, one can not only use a custom `--chunker` and `--hash` function when running `ipfs add`, but also choose to produce a TrickleDAG instead of a MerkleDAG by passing `--trickle`, enable or disable `--raw-leaves`, or even write one's own software that chunks, hashes, and assembles a UnixFS DAG in novel ways.
One can go beyond that and import JSON data as dag-json or dag-cbor, creating data structures beyond regular files and directories.
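To make the impact of the chunker and hash knobs concrete, here is a minimal Python sketch. It is a toy model, not how go-ipfs builds UnixFS: `toy_root` is a hypothetical helper that hashes each chunk and then hashes the concatenated leaf digests, and its output is a plain hex digest rather than a real CID. The 256 KiB chunk size is chosen to mirror the `size-262144` default chunker. The point is only that the same bytes yield a different root identifier as soon as the chunk size or hash function changes.

```python
import hashlib


def chunk_fixed(data: bytes, size: int) -> list:
    """Split data into consecutive chunks of at most `size` bytes."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def toy_root(data: bytes, chunk_size: int, hash_name: str = "sha256") -> str:
    """Hash every chunk, then hash the concatenated leaf digests.

    Toy stand-in for a DAG root; it is NOT a real CID or UnixFS node.
    """
    leaves = [hashlib.new(hash_name, c).digest()
              for c in chunk_fixed(data, chunk_size)]
    return hashlib.new(hash_name, b"".join(leaves)).hexdigest()


data = bytes(range(256)) * 4096  # 1 MiB of sample bytes

root_256k = toy_root(data, 256 * 1024)              # four leaves
root_1m = toy_root(data, 1024 * 1024)               # one leaf
root_blake = toy_root(data, 256 * 1024, "blake2b")  # different hash function

print(root_256k == root_1m)     # same bytes, different chunk size
print(root_256k == root_blake)  # same bytes, different hash function
```

The import is deterministic for a fixed set of parameters, so two people only arrive at the same identifier for the same file if they agree on every knob.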
We need an article that explains:
- what is the current default when importing files and why
- chunker (why we use size-based, when to use rabin or buzhash)
- hash (why we use sha2-256)
- raw leaves (possible and default when cidv1 is used, but legacy implementations used cidv0 without this)
- cid version
- we should document cid v1 as the default, but note that legacy implementations may use v0
- dag type (`--trickle` better suited for append-only data such as logs?)
- what are the knobs one can change during import, and what is their impact/tradeoffs
- things to hint at, but no need to go too deep
- note dag-pb alternatives exist, mention dag-json and dag-cbor, and hint when using non-UnixFS DAGs makes sense
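As a possible illustration for the chunker item above, the following Python sketch contrasts size-based chunking with a simplified content-defined chunker. It is an assumption-laden toy: the boundary rule uses `zlib.crc32` over a small window instead of a real Rabin or buzhash rolling hash, and `WINDOW` and `AVG` are hypothetical values. It counts how many chunks survive unchanged after one byte is inserted at the start of the data.

```python
import hashlib
import random
import zlib

WINDOW = 16  # bytes of context that decide a boundary (hypothetical value)
AVG = 1024   # target average chunk size in bytes (hypothetical value)


def chunk_fixed(data, size=AVG):
    """Size-based chunking: cut every `size` bytes regardless of content."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def chunk_content_defined(data):
    """Cut wherever a checksum of the preceding WINDOW bytes is divisible by AVG."""
    chunks, start = [], 0
    for end in range(WINDOW, len(data)):
        if zlib.crc32(data[end - WINDOW:end]) % AVG == 0:
            chunks.append(data[start:end])
            start = end
    chunks.append(data[start:])  # trailing chunk after the last boundary
    return chunks


def shared(a, b):
    """Number of distinct chunks (by sha2-256 digest) present in both lists."""
    def digests(chunks):
        return {hashlib.sha256(c).digest() for c in chunks}
    return len(digests(a) & digests(b))


rng = random.Random(0)
original = bytes(rng.getrandbits(8) for _ in range(64 * 1024))
edited = b"!" + original  # insert a single byte at the front

shared_fixed = shared(chunk_fixed(original), chunk_fixed(edited))
shared_cdc = shared(chunk_content_defined(original), chunk_content_defined(edited))
print("fixed-size chunks reused:", shared_fixed)
print("content-defined chunks reused:", shared_cdc)
```

With size-based chunking the insertion shifts every boundary, so no chunk can be deduplicated; with content-defined boundaries only the first chunk changes. The tradeoff is extra CPU work per byte and variable chunk sizes.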
Prior art:
- `--help`
- explainer around different chunkers: Expand rolling chunker documentation (kubo#8952)
- DAG metadata impacting final CID: Hash changes if we change our metadata (#1152)