Implement The Update Framework for Project Signing #3724

Open: walterhpearce wants to merge 6 commits into master

@walterhpearce walterhpearce commented Oct 31, 2024

This RFC was co-authored by Walter Pearce (@walterhpearce) and Josh Triplett (@joshtriplett).

Here, we propose the alternative adoption and implementation of The Update Framework for providing the chain of trust and implementing signatures for crates and releases. This provides us with the same mitigations and protections as in the previous RFC, using the standard TUF framework to achieve them with current industry-standard techniques, tailored for the Rust ecosystem.

Big thanks to @epage @Eh2406 @mdtro @woodruffw for the insights and discussion around this. Also heartfelt thanks to anyone else I missed who participated in the RustConf 2024 Cargo Vault discussions around this topic.

We're going to have follow-up discussions with the infrastructure team on deploying and documenting the infrastructure for this, and on using this infrastructure to set up mirrors (which was one of the primary motivations for creating this infrastructure). Depending on the complexity of setting up mirroring, we may follow up with a subsequent RFC on mirroring.

(This RFC supersedes and closes #3579, the previous draft Public Key Infrastructure RFC, which did not use TUF.)

Rendered

@joshtriplett joshtriplett added T-infra Relevant to the infrastructure team, which will review and decide on the RFC. T-crates-io Relevant to the crates.io team, which will review and decide on the RFC. T-cargo Relevant to the Cargo team, which will review and decide on the RFC. labels Oct 31, 2024
Comment on lines 173 to 179
## (cargo-tuf-lib) Standard TUF Implementation

We propose creating a new crate, `cargo-tuf-lib`, which shall be used by both Cargo and Rustup for doing TUF synchronization and update procedures. This library shall be a shim wrapper around the `rust-tuf` crate (https://github.com/rustfoundation/rust-tuf), providing a simplified and shared interface for doing synchronization and verification of the TUF repositories and their files.

The API surface of this crate is to be determined upon implementation in Cargo and Rustup. However, because both tools will need to perform synchronization and validation against the tuf-root repository, they shall use this shared interface to guarantee compatibility.

This API will include operations to sync the TUF repositories efficiently, and to perform a verified download of an object.
Contributor

Unsure if we should get this specific about the implementation in this RFC. Also, if this is a Rust Project crate, it is then subject to https://rust-lang.github.io/rfcs/3119-rust-crate-ownership.html

Comment on lines +187 to +191
Creation of a new `~/.cargo/tuf` directory. (If Cargo stores its registry information in another directory, the `tuf` directory should be stored alongside the `registry` directory.) This directory shall be used for all TUF operations by project tools (both Rustup and Cargo). The cargo folder was chosen as the main location of residence for these files given that although Rustup will be performing the initialization of these folders, there is already a precedent set for shared files living within the cargo folder.

- `~/.cargo/tuf` The top-level directory of local copies of TUF repositories
- `~/.cargo/tuf/root` a copy of the `tuf-root` repository locally synchronized
- `~/.cargo/tuf/crates` a copy of the `tuf-crates` repository locally synchronized
Contributor

For crates, I would assume this is per-registry and needs to be stored in a registry-specific location, maybe as a sibling to the index?


## TAP-16 Implementation

We're proposing to use [TAP-16](https://github.com/theupdateframework/taps/blob/master/tap16.md) to provide efficient update checking and download sizes. TAP-16 uses Merkle trees rather than full lists for the download of a snapshot of the inventory of a repository (`snapshot.json`). We want to ensure that, as crates.io grows, the total size clients have to download when checking for updates remains small.
Contributor

In practice, what is the expected impact for this for crates.io's current size and if it grew to the size of some of the larger registries for other languages?

Contributor

Yeah it would be good to understand the time and space complexities of the operations involved.

Does TAP-16 prevent a full snapshot being necessary at all, or does it just reduce the download size? I think it should be possible to validate a set of dependencies without having to have a full snapshot of the repository...
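To make the question above concrete, here is a minimal sketch of a generic Merkle inclusion proof. It is not TAP-16's actual metadata format (that depends on details this thread says are still to be specified); the entry strings and function names are hypothetical, and the leaf/node prefix bytes simply follow the convention used by RFC 6962. It shows that a client holding only a trusted root hash can check one entry with a proof whose length grows logarithmically with the number of entries, rather than downloading the full list.

```python
import hashlib


def leaf_hash(data: bytes) -> bytes:
    # Prefix leaves with 0x00 so a leaf can never be confused with an interior node.
    return hashlib.sha256(b"\x00" + data).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    # Prefix interior nodes with 0x01.
    return hashlib.sha256(b"\x01" + left + right).digest()


def build_levels(leaves):
    """Return every level of the tree, leaves first. An unpaired node is promoted unchanged."""
    level = [leaf_hash(x) for x in leaves]
    levels = [level]
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                nxt.append(node_hash(level[i], level[i + 1]))
            else:
                nxt.append(level[i])
        level = nxt
        levels.append(level)
    return levels


def prove(levels, index):
    """Collect (sibling_hash, sibling_is_left) pairs on the path from a leaf to the root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling < len(level):
            proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify(leaf_data: bytes, proof, root: bytes) -> bool:
    """Recompute the root from one entry plus its proof and compare with the trusted root."""
    h = leaf_hash(leaf_data)
    for sibling, sibling_is_left in proof:
        h = node_hash(sibling, h) if sibling_is_left else node_hash(h, sibling)
    return h == root


# Hypothetical snapshot entries; a real snapshot would list metadata files and versions.
entries = [b"crate-a 1.0.0", b"crate-b 0.2.1", b"crate-c 3.1.4", b"crate-d 0.1.0", b"crate-e 2.0.0"]
levels = build_levels(entries)
root = levels[-1][0]
```

In this sketch the server would run `build_levels` and `prove`, while the client only needs `verify`, the entry it cares about, and a root hash it already trusts through signed metadata.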

Contributor

I completely agree that these are the critical questions. Based on my reading of TAP-16, the answers entirely depend on the implementation details that are specified in the POUF. I think we need these answers in order to stabilize this functionality.

Comment on lines +199 to +200
- `cargo-tuf-lib::sync` attempted prior to an index update
- `cargo-tuf-lib::verify_snapshot` called on an index update on the entire index
Contributor

With the sparse registry, we don't have a "index update" phase but we update the parts of the registry if-needed as we perform a registry operation. We also don't have the entire index.

Do we need to do this sync even if we won't download anything new from the registry? Could we instead only check if there was a change? Can we only check what changed or what is downloaded?

@joshtriplett joshtriplett added the I-council-nominated Indicates that an issue has been nominated for prioritizing at the next council meeting. label Oct 31, 2024
Member

joshtriplett commented Oct 31, 2024

Nominating this for the Leadership Council, to approve the one line-item about appointing the root quorum. See https://rust-lang.zulipchat.com/#narrow/channel/392734-council/topic/Approval.20of.20Council-related.20components.20of.20signing.20RFC for details.

(ed: Inlining that:)

@walterhpearce and I just published a new version of the signing infrastructure RFC: #3724

Most of this RFC is going to be the domain of the infrastructure, crates.io, and Cargo teams. However, the RFC specifies that it's the responsibility of the Leadership Council to appoint the members of the root signing quorum, as trusted individuals in a 5-of-9 quorum whose signatures jointly constitute the root signing key. (They don't decide what to sign, though they do flag "hey, something seems wrong here".)

I wanted to call the Council's attention to this, so that the Council can sign off on that one particular line item of the RFC. I don't think it makes sense to add the Council as another team on the RFC as a whole, just for that one line item. I think it makes sense for the Council to make a decision about accepting the responsibility to appoint the quorum members, and make a post to the RFC saying "the Council confirms it is taking on this responsibility".
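As a rough illustration of the 5-of-9 threshold mentioned above, the following sketch counts distinct authorized keys that produced a valid signature. It assumes signature verification has already happened elsewhere; the key IDs and the function name are hypothetical and do not come from the RFC.

```python
def quorum_met(valid_signers, authorized_keys, threshold):
    """True if at least `threshold` distinct authorized keys produced a valid signature.

    `valid_signers` is assumed to contain only key IDs whose signatures were
    already verified cryptographically; this function only does the counting.
    """
    distinct = set(valid_signers) & set(authorized_keys)
    return len(distinct) >= threshold


# Nine hypothetical root key IDs and the 5-of-9 threshold from the discussion.
ROOT_KEYS = {f"root-key-{i}" for i in range(1, 10)}
THRESHOLD = 5
```

Using a set matters here: a single key signing several times, or a key outside the authorized set, must not count toward the threshold.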


This proposal delegates the authority of selecting and managing the quorum membership to the Rust Project's Leadership Council. We recommend that the quorum be selected from trusted individuals within the Project and Foundation. This is a position of trust, not one of authority over anything other than safe handling of key material and flagging of unusual activity; the 9 members are responsible for executing quorum operations, and providing transparency and trust (through their refusal to participate in any malicious operations), but not for deciding independently what key operations should happen.

These individuals should be available in the event quorum is needed for root key operations. These roles can and should be reappointed as needed, but to simplify logistics, these roles should not require rotation more often than 2-3 years. (Operations requiring quorum are expected to be rare, and an annual rotation would make rotation of the quorum group the vast majority of quorum operations.)
Contributor

Maybe there should be some kind of signing ceremony at least annually to check that the quorum can still be met?

I could imagine nothing needing doing for 5 years, and then finding out that three people are un-contactable and two people can't find their keys...

It might be better to find out sooner so that the lost keys can be rotated out and new keys rotated in as soon as possible.

TUF metadata files have an expiration date which forces that they're periodically re-signed, and also ensures clients aren't using outdated files. So this can be set to e.g. 1 year in the future, and at least a quorum will need to come together at that interval to sign a new root metadata file with a new expiration date.
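A minimal sketch of the expiration check described above, assuming the metadata stores its expiry as a UTC timestamp string under `signed.expires`. The sample document and the helper name are hypothetical.

```python
from datetime import datetime, timezone


def is_expired(metadata, now=None):
    """Return True if the metadata's expiry time is not strictly in the future."""
    expires = datetime.strptime(
        metadata["signed"]["expires"], "%Y-%m-%dT%H:%M:%SZ"
    ).replace(tzinfo=timezone.utc)
    if now is None:
        now = datetime.now(timezone.utc)
    return now >= expires


# Hypothetical root metadata that the quorum would need to re-sign before 2026-01-01.
sample_root = {
    "signed": {"_type": "root", "version": 3, "expires": "2026-01-01T00:00:00Z"},
    "signatures": [],
}
```

A client that rejects expired metadata effectively forces the quorum to convene at least once per expiry interval, which is the property the reply relies on.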


Member

@pietroalbini pietroalbini left a comment

The RFC looks like a great improvement over the previous iteration, thank you so much to everyone who contributed to it! ❤️

The only piece of feedback I have (spread across multiple comments) is how the TUF roles are distributed, but otherwise this looks great to me.


## Summary & Motivations

We propose the creation of two distinct TUF repositories for signing of Rust Project content and crates, respectively. Two main motivations exist for separating these concerns: The cadence of content published within each, and the trust of each. Rustup and Rust releases (both nightly and stable) are conducted under a controlled and predictable manner which is managed by the Project. However, crates are published by the community, and as such we see a larger and much more varied volume of content which may exist within this repository. We have additionally modeled signing the root of one repository by the other - this implicitly grants us a chain of trust from the "Project" (tuf-root) to the separate crates.io repository (tuf-crates). The sections below go into more detail on each repository and its configuration.
Member

I think we should have a unified TUF repository for both crates.io and releases.

On the technical side, there is no difference in trust between having separate repositories or a single one, as with partial delegation we can prevent one from changing the other. The frequency of changes also shouldn't impact TUF (to my understanding), as both rustup and Cargo will have to still download the snapshot JSON to see if updates are present.

On the social side, TUF is going to be mostly an implementation detail, and users should not be expected to manually verify the TUF repositories or even know they exist. If someone cares enough to actually check them manually, they will understand that the crates content signed with TUF is not endorsed like the releases are.

The disadvantage of the two repositories is that we have to maintain two different quorums, which adds additional overhead (especially for the crates.io one).


##### Release (Stable/Beta) Role

The Release role shall have the authority to only sign stable rust releases. We propose this role also consist of a quorum model, consisting of all members of the release team. This role should have a 3 member threshold, and always consist of all members of the release team. At the time a new stable release is being compiled and shipped, a signing quorum must be conducted for this release.
Member

The release team has been moving to get the release process as automated and hands-off as possible, and we finally achieved it a few months ago. It's now possible for members of the release team to start a whole release with a single command, and publishing releases doesn't require infra-admins privileges anymore.

This new release process doesn't let the person publishing the release control the contents of the release in any way, or have access to any signing key, and it only allows publishing the latest commit in the stable branch (which went through CI).

Requiring three quarters of the release team to sign the release would feel like a regression to me, as it would add more overhead to the volunteers running the release. With the new release process, the risk of a release team member releasing a rogue release has been mostly mitigated.

It would also mean that release team members have to start carrying persistent private keys with privileged access, instead of the signing key living locked down in AWS KMS with full audit logs.


##### Release (Stable/Beta) Role

The Release role shall have the authority to only sign stable rust releases. We propose this role also consist of a quorum model, consisting of all members of the release team. This role should have a 3 member threshold, and always consist of all members of the release team. At the time a new stable release is being compiled and shipped, a signing quorum must be conducted for this release.
Member

Beta releases are fully automated to the same level as nightly releases. They happen automatically, without any team member input, as soon as a new commit lands in the beta branch. They must continue not to require any user interaction.


###### Rustup Role

This shall be a quorum based role, consisting of all members of the Rustup & Infrastructure teams. We recommend having at least a 3-member threshold. We have decided to have this role's quorum be broader to allow for emergency updates and releases of Rustup; we may want to increase the threshold when these teams have more members.
Member

On one hand, the team here should be the release team, not the infrastructure team. In the past most rustup releases have been done by me and Mark with our release team hat.

On the other hand, JD has been working to migrate rustup to the same release process used by Rust releases, so my comment about stable releases applies here as well (they should not require individuals signing, the signing key should live in AWS KMS).


##### Root Role

The root role of the tuf-crates repository shall consist of all members of the crates.io rust team with a threshold of 3. As a special case, updating this role shall also require a re-signing by the root role of the tuf-root repository (sign a metadata entry existing within tuf-root). This means any changes to the membership of the crates.io team will also require a signing ceremony via GitHub by the root quorum.
Member

Why is the quorum set here to the crates.io team, rather than the same quorum as the other repository (either by reusing the same repository or having the same keys as the quorum)? The crates.io team is not involved in operating the infrastructure (they wouldn't have the access to manage the target role).

Having every member of the crates.io team being part of the quorum would mean onboarding new contributors would both be harder (as it would require a quorum event) and would imply a lot more trust given to the new member (compared to just approval rights on the repository).

burdges commented Nov 1, 2024

cc @tarcieri


## Summary & Motivations

We propose the creation of two distinct TUF repositories for signing of Rust Project content and crates, respectively. Two main motivations exist for separating these concerns: The cadence of content published within each, and the trust of each. Rustup and Rust releases (both nightly and stable) are conducted under a controlled and predictable manner which is managed by the Project. However, crates are published by the community, and as such we see a larger and much more varied volume of content which may exist within this repository. We have additionally modeled signing the root of one repository by the other - this implicitly grants us a chain of trust from the "Project" (tuf-root) to the separate crates.io repository (tuf-crates). The sections below go into more detail on each repository and its configuration.

I'm curious why you're modeling this as two repositories with two separate roots rather than two delegated targets which effectively namespace resources and are managed under a single TUF repo.

The idea of a TUF root is it's self-signing, so if the keys listed in that file aren't intended to be able to sign future updates to the root, perhaps it isn't a root you want.

It sounds like the idea is you want to use tuf-root to delegate authority to tuf-crates, which sounds like a delegated target to me, e.g. you could have separate targets for crates versus releases or what have you, each with authority delegated to their own independent set of keys, but with a common root able to update the keys used to manage either.
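The single-repository alternative suggested above can be sketched as one top-level targets role that delegates disjoint path namespaces to separate roles, each with its own keys. The role names, path patterns, and thresholds below are hypothetical, and `fnmatchcase` is only a stand-in: its wildcard semantics may differ from what the TUF specification prescribes for path patterns.

```python
from fnmatch import fnmatchcase

# Hypothetical delegations listed by a single top-level targets role.
DELEGATIONS = [
    {"name": "releases", "paths": ["dist/*"], "threshold": 1, "terminating": True},
    {"name": "crates", "paths": ["crates/*"], "threshold": 1, "terminating": True},
]


def responsible_role(target_path, delegations):
    """Return the name of the first delegated role whose path patterns cover the target."""
    for role in delegations:
        if any(fnmatchcase(target_path, pattern) for pattern in role["paths"]):
            return role["name"]
    return None
```

Because each delegated role can only sign for paths inside its own namespace, keys for `crates` cannot vouch for anything under `dist/`, while a common root can still rotate the keys of either role.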

Member

Agree, see also #3724 (comment).

## Repository Workflows
[tuf-on-ci](https://github.com/theupdateframework/tuf-on-ci) shall be used for workflows on each repository.

tuf-on-ci is a set of CI tools which are integrated with GitHub Actions for providing TUF quorum and hardware key support for managing TUF repositories via pull requests on GitHub. This is a ready made and production ready suite of CI that is used by the sigstore root signing for managing TUF quorums.

Disclaimer: I'm one of the maintainers of tuf-on-ci. But feel free to also mention that GitHub is also using tuf-on-ci to manage our TUF root for artifact attestations.

#### Terminology

- `tuf`: The Update Framework and its specification
- `targets`: The actual content and files distributed and to be signed
Contributor

"target" is already an overloaded term in the Rust/Cargo world; we may want to be careful how we use it in the documentation of this RFC, to avoid introducing an additional meaning we need to disambiguate.


### Controlling enablement of TUF

This RFC does not specify how to handle non-`crates.io` repositories. Cargo can choose to enable TUF for third-party repositories in the future, or may default to only using TUF for crates.io unless otherwise configured. Cargo might choose to use an environment variable (e.g. `CARGO_TUF_DISABLE`) to disable all usage of TUF on third-party repositories.
Contributor

In general I agree with this. The full TUF process involves the Cargo/Rustup binary shipping with an initial public key on which to base the root of trust. It is hard to imagine a system where Cargo ships with a root public key for arbitrary third-party registries. On the other hand, the TAP-16 mechanism is fundamentally useful independent of the rest of TUF. It allows for efficient listing of all package names, and efficient scanning for changed packages. I think that functionality should be documented clearly enough that third-party registries can use it.

- Creation of `rust-lang/tuf-root` and `rust-lang/tuf-crates` repositories on GitHub
- Initiation of the root signing ceremony via tuf-on-ci on each repository
- Facilitate the initial and subsequent signing events
- Determine how to mirror and distribute both repositories via CDN. We recommend that synchronization of the repository for the `rust-tuf-lib` should be done via HTTPS downloads from the CDN to prevent the end-user downloads being reliant on `git` or `GitHub`
Contributor

HTTP; there is no need for the S. The entire point of TUF is that you don't need to trust any of the computers you received the files from in order to verify that they are authentic and up-to-date.

@ctz ctz Nov 1, 2024

> HTTP there is no need for the S.

Does cargo have any privacy goals? Is it OK to publish that a user has downloaded, eg, https://crates.io/crates/buttplug ?

Member

I think I would still like to serve everything under rust-lang.org and crates.io as HTTPS.


All members of all signing quorums within the Rust Project will require hardware keys, the expenses for which will be covered by the Rust Foundation.

## Root Quorum Model
@woodruffw woodruffw Nov 1, 2024

Some general notes about hardware-backed quorum models, based on our experience on PyPI when attempting to deploy TUF on PyPI (i.e. PEP 458):

  1. Hardware (i.e. HSM) backed tokens are difficult to operationalize over expected key lifetimes:
    • You'll need to establish a chain of trust for the hardware itself (some, but not all, HSM models have baked-in device attestations).
    • You'll need to perform a secure offline signing and enrollment ceremony (we designed a runbook for PyPI, but it's pretty old at this point and was also constrained by HSM vendor limitations that should be re-evaluated).
    • You'll need a testable compromise and rotation process, to prevent/limit normalization of deviance around key management and enrollment into the quorum.
    • Each quorum party will ideally physically secure their HSM in a way that stymies a medium-complexity physical adversary: PyPI chose tamper-evident bags, and in practice each quorum member will need multiple bags and a tag-in-tag-out procedure for removing their key from their bag if occasional key operations are expected.
  2. This section is currently a little light on cryptographic details: it specifies the size of the quorum, but it doesn't say which types and sizes of keys are permitted in the root set, or how the community will verify that a key is actually enrolled within a particular HSM. For PyPI we stipulated a mix of P-256 and P-384 keys due to the limitations of HSMs at the time, but it might be possible to do Ed25519 keypairs with current commercial HSMs. We also prepared HSM-level attestations of key possession, although in practice hardware limitations meant that only the YubiHSMs actually supported root key attestation (versus attestation of the HSM itself).

As a whole, these were nontrivial issues to address in our initial attempt to implement TUF on PyPI, and IMO they're a large part of why TUF (in the form of PEP 458) hasn't materialized on PyPI. So as part of this RFC I recommend that Rust's Leadership Council think about the over/under on a hardware quorum model versus something simpler (e.g. a soft root key in a cloud HSM) or even a different, smaller-footprint architecture (like a transparency scheme). I'll leave a separate comment on the latter.

To add to what @woodruffw wrote, for our HSM enrolment we require each key-holder to upload a device and key attestation that can be verified up to the manufacturer's root CA. Verification of this is done via GitHub actions, so for each new key added we get automatic verification.

We also hook this into the TUF verification process, so that each time a TUF metadata document is updated we run a workflow that verifies the TUF signatures via the HSM's attested device key. This may be slightly overkill, but it makes it simple and foolproof to see that no extra keys have been added that are not known and approved.

- [Trusted Publishing support for crates.io](https://github.com/rust-lang/rfcs/pull/3691)

Other posts on TUF usage:
- [Python PEP #458: Secure PyPI downloads with signed repository metadata](https://peps.python.org/pep-0458/)

I mentioned this in the comment about hardware-backed quorums, but to highlight it here as well: PEP 458 was never fully rolled out on PyPI, in part because we had trouble operationalizing the signing model. So that doc specifies an architecture for TUF on PyPI, but it's worth noting that the implementation itself isn't in place.

@woodruffw

Thank you for making this RFC! I think it's a tangible improvement over the original "homegrown" PKI proposal.

At the same time, I want to issue a caution about self-deployed PKIs, even ones that adopt a well-defined structure (like TUF has). I made a similar comment on the original RFC, but to refresh it:

PKI standup and maintenance are operationally complex and expensive (in terms of technical maturity and human hours), and are historically subject to "normalization of deviance" risks. The most common form of deviance around PKI construction is developing a (necessarily) complicated secure enrollment scheme, which is then difficult to rapidly turn around when key compromise occurs. The alternative to this is a "soft" PKI e.g. with software/cloud HSMs, which is significantly easier to operationalize but has a much weaker (strictly online) security model.

On the PyPI side of things, we've learned some hard lessons around trying to operationalize both end-user and index-wide signing. Based on our experiences, we're largely moving in the direction of cryptographic transparency as a way to isolate operational complexity:

  1. For end user signing, we currently support PEP 740, which is built up around Trusted Publishing and Sigstore. The underlying idea behind it is that users already have a trusted identity when they enroll a Trusted Publisher, so we use that same identity (via OIDC) as a signing identity through Sigstore's ability to bind machine identities to short-lived, auditable signing keys. The end result of this is that we have about 20,000 projects doing codesigning without having to think about it at all, or manage any long-lived signing keys.
  2. For index-level signing, we're currently looking at index-wide transparency approaches similar to Go's checksum database. These approaches fall under the general category of binary transparency, and have a few key security and operational benefits from our perspective:
    • Minimal-to-no-PKI requirements: in the simplest possible form, a binary transparency log only has one active public key (for the log itself). This key can be distributed by baking it into the client itself (go or rustup), or through a well-known URL on the public Web PKI, or through a TUF repo.
    • Consistency and auditability on top of integrity and authenticity: TUF provides strong authenticity and integrity guarantees modulo trust in the root set, but doesn't provide consistency or auditability properties on its own: a compromised TUF repository can serve "split views" wherein an attacker serves different-but-valid signatures to different audiences (e.g. a subset of intended victims for a targeted attack), and there's no publicly auditable consistency proof for signatures listed in a TUF repo (meaning that there's no strong immutable public mapping of package identifiers to their content hashes). Transparency schemes provide both of these properties (via a publicly auditable, verifiable transparency log) while also providing integrity and authenticity. From a quick look TAP 16 appears to introduce similar primitives to TUF as a transparency log has (specifically, a Merkle tree), but I'm not clear on whether they provide the same monitoring/witnessing capabilities as a normal transparency scheme does.
    • Client simplicity: with a binary transparency scheme, the client side verification of index authenticity/integrity/consistency is small and cryptographically parsimonious: the client verifies a single inclusion proof + ECDSA/EdDSA signature for the artifact (crate, or Rust distribution) being installed, plus zero or more co-signatures from ecosystem witnesses (these are optional, but make it even harder for the index to engage in split-views/inconsistency). This makes the "stack" on the client side very small: there's no X.509 or other format parsing, just ECC signature verification and a small amount of Merkle tree manipulation.
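To illustrate the "split view" concern in the list above, here is a small sketch of how two observers (for example a client and a witness) might compare the log heads they were each served. The dictionary shape and the function name are hypothetical; real transparency logs also use consistency proofs to compare heads of different sizes, which this sketch deliberately leaves out.

```python
def compare_heads(head_a, head_b):
    """Compare two observed log heads, each a dict with 'size' and 'root' (hex string).

    Returns 'consistent' or 'split-view' when the sizes match, and 'needs-proof'
    when the sizes differ and a consistency proof would be required to decide.
    """
    if head_a["size"] != head_b["size"]:
        return "needs-proof"
    return "consistent" if head_a["root"] == head_b["root"] else "split-view"


# Hypothetical heads as seen by two different observers.
seen_by_client = {"size": 1024, "root": "ab" * 32}
seen_by_witness = {"size": 1024, "root": "cd" * 32}
```

Two heads with the same size but different roots cannot both describe the same append-only log, so exchanging heads gives observers a cheap way to notice that they were shown different histories.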

TL;DR: I think this RFC is a marked improvement over the original PKI RFC, but I want to re-iterate my opinion that there is an "iceberg" of complexity in PKI deployment - even for TUF - that represents a non-trivial operational risk versus a transparency-style model.

@trishankkarthik

> PKI standup and maintenance are operationally complex and expensive (in terms of technical maturity and human hours), and are historically subject to "normalization of deviance" risks. The most common form of deviance around PKI construction is developing a (necessarily) complicated secure enrollment scheme, which is then difficult to rapidly turn around when key compromise occurs. The alternative to this is a "soft" PKI e.g. with software/cloud HSMs, which is significantly easier to operationalize but has a much weaker (strictly online) security model.

TUF does not require that you use HSMs for root keys. TAP 18 allows TUF to use ephemeral keys (Sigstore's Fulcio). For PEP 458, @kairoaraujo and I plan to allow PyPI admins to use the YubiHSMs you prepared as well as Fulcio.

  • For end user signing, we currently support PEP 740, which is built up around Trusted Publishing and Sigstore. The underlying idea behind it is that users already have a trusted identity when they enroll a Trusted Publisher, so we use that same identity (via OIDC) as a signing identity through Sigstore's ability to bind machine identities to short-lived, auditable signing keys. The end result of this is that we have about 20,000 projects doing codesigning without having to think about it at all, or manage any long-lived signing keys.

The problem is that there will eventually be a long-lived signing key somewhere. (Even Fulcio depends on relatively long-lived signing keys distributed by the Sigstore root of trust.) Each of the 20,000 projects uses a slightly different identity (e.g., a GitHub repo). How is a client expected to know which identity is supposed to be trusted for which in-toto attestation [1] for which package?

  • Minimal-to-no-PKI requirements: in the simplest possible form, a binary transparency log only has one active public key (for the log itself). This key can be distributed by baking it into the client itself (go or rustup), or through a well-known URL on the public Web PKI, or through a TUF repo.

This is true only for as long as you handwave away the problem of key distribution for multiple attestations as pointed out above. In fact, when using a TUF repo (as the Sigstore root of trust does), you can use a single set of root keys to "slash-and-burn" the keys to the whole system (including the root keys themselves) without end-users noticing anything [2].

  • Consistency and auditability on top of integrity and authenticity: TUF provides strong authenticity and integrity guarantees modulo trust in the root set, but doesn't provide consistency or auditability properties on its own: a compromised TUF repository can serve "split views" wherein an attacker serves different-but-valid signatures to different audiences (e.g. a subset of intended victims for a targeted attack)

We discussed long ago how TUF and Transparent/Tamper-Evident/whathaveyou logs are complementary to each other. One idea is to record TUF timestamp metadata onto Transparent Logs so that you can audit TUF repos for, say, split views (aka forking attacks, which are relatively unfruitful attacks to carry out during an actual repository compromise). Speaking of monitoring, the auditability of Transparent Logs depends on active, independent monitoring, which is not the case right now as far as I'm aware.

  • and there's no publicly auditable consistency proof for signatures listed in a TUF repo (meaning that there's no strong immutable public mapping of package identifiers to their content hashes).

Not true if you, for example, record the inclusion proofs within the TUF metadata, or the TUF timestamp metadata onto Transparent Logs as discussed above.
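
As a rough illustration of auditing for split views once timestamp metadata is recorded somewhere public: a monitor can collect what different vantage points were served and flag any metadata version that appears with more than one digest. This is only a sketch; the observation tuple layout, the vantage-point names, and the digests are hypothetical and not taken from TUF or any transparency log API.

```python
from collections import defaultdict


def find_split_views(observations):
    """Return versions that were seen with more than one digest.

    observations: iterable of (vantage_point, version, snapshot_digest).
    The result maps version -> {digest: [vantage points that saw it]}.
    """
    seen = defaultdict(lambda: defaultdict(set))
    for vantage, version, digest in observations:
        seen[version][digest].add(vantage)
    return {
        version: {digest: sorted(who) for digest, who in by_digest.items()}
        for version, by_digest in seen.items()
        if len(by_digest) > 1  # same version, different content
    }


# Made-up observations: version 42 was served with two different digests.
observations = [
    ("monitor-eu", 41, "aa11"),
    ("monitor-us", 41, "aa11"),
    ("monitor-eu", 42, "bb22"),
    ("targeted-client", 42, "cc33"),
]
print(find_split_views(observations))
```

The check is only as good as the set of independent observers feeding it, which is the monitoring concern raised above.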

  • From a quick look TAP 16 appears to introduce similar primitives to TUF as a transparency log has (specifically, a Merkle tree), but I'm not clear on whether they provide the same monitoring/witnessing capabilities as a normal transparency scheme does.

TAP 16 is for scaling TUF snapshot metadata, not for replacing Transparent Logs.

  • Client simplicity: with a binary transparency scheme, the client side verification of index authenticity/integrity/consistency is small and cryptographically parsimonious: the client verifies a single inclusion proof + ECDSA/EdDSA signature for the artifact (crate, or Rust distribution) being installed, plus zero or more co-signatures from ecosystem witnesses (these are optional, but make it even harder for the index to engage in split-views/inconsistency). This makes the "stack" on the client side very small: there's no X.509 or other format parsing, just ECC signature verification and a small amount of Merkle tree manipulation.

You are describing a system that does not exist yet AFAICT (at least for PyPI or Homebrew). Furthermore, it is no longer true that the ecosystem will use only a single key (from a Transparent Log) for verification if you also need to somehow verify signatures from these "witnesses" (which is apparently the case if you need stronger assurance, but never a guarantee, that a package version has indeed been included at most once). You are then back to square one with some sort of PKI, and I suspect our only real argument is for what exactly.


Objectively speaking: TUF has had a reputation of being "too complicated" for two reasons, one more valid than the other.

The first is documentation and tooling (especially for bootstrapping, key management, and updating metadata), which is relatively easy to solve [3], especially with projects like Rugged, tuf-on-ci, and RSTUF. The real genius of Sigstore is that it is a managed service, and I suspect few would have called it "simple" otherwise. There is absolutely no reason why TUF couldn't also be another managed service, perhaps on Sigstore itself.

The second real problem then, to me, is scalability, especially as learned by those who have actually deployed it in production at scale. What I think is really worth discussing is whether TUF can and ought to handle a large enough number of updates [4] at a fast enough rate.


Finally, let me end with which use cases would need something like TUF, especially for OSS package registries. Index signing is just the foundational step. If you use TUF simply to sign your package indices, then there might be a good argument that TUF is "overkill" for the problem.

However, TUF shines the moment you have multiple keys in the system, you need revocation (both of which are just a question of when not if), and you have dependencies between packages (an underrated problem). The biggest downside to Transparent Logs is precisely due to their upside: they do not allow for easy revocation due to immutability. Once you have a bad package version with something like Binary Transparency [5], you are forever stuck with it, and need out-of-band mechanisms in order to know to avoid it in the first place. Think of TUF as an auditable, mutable layer over an immutable one.

Here is how I think TUF should be used to secure OSS package registries (in increasing order of importance):

  1. Protect dependencies between packages from mix-and-match and rollback attacks
  2. Securely map which in-toto policies (aka layouts) apply to which versions of packages: this is what I suspect independent, third-party witnesses would care about
  3. Securely map the public keys used to verify threshold signatures for Verification Summary Attestations (VSAs) from witnesses for a given package version: this is what I suspect package managers would care about
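
For readers unfamiliar with the first item, here is a simplified sketch of what rollback and mix-and-match checks look like on the client. It assumes metadata is identified by a name and a monotonically increasing version number; the function name and the dictionaries are hypothetical, and signature and expiry checks are omitted.

```python
def check_metadata_versions(trusted: dict[str, int],
                            snapshot: dict[str, int],
                            fetched: dict[str, int]) -> list[str]:
    """Return a list of problems; an empty list means the update is consistent.

    trusted:  versions the client accepted on its previous update
    snapshot: versions listed in the newly downloaded (and verified) snapshot
    fetched:  versions of the metadata files the mirror actually served
    """
    problems = []
    for name, listed in snapshot.items():
        # Rollback: the repository claims an older version than we already trust.
        if listed < trusted.get(name, 0):
            problems.append(
                f"rollback: {name} listed at v{listed}, already trusted v{trusted[name]}")
        # Mix-and-match: the served file is not the one the snapshot describes.
        if fetched.get(name) != listed:
            problems.append(
                f"mix-and-match: {name} served at v{fetched.get(name)}, snapshot lists v{listed}")
    return problems


trusted = {"serde": 7, "tokio": 12}
snapshot = {"serde": 8, "tokio": 12}
print(check_metadata_versions(trusted, snapshot, {"serde": 8, "tokio": 12}))  # consistent
print(check_metadata_versions(trusted, snapshot, {"serde": 8, "tokio": 11}))  # stale tokio served
```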

Apologies if I was curt or inaccurate anywhere, as I am busy traveling in an opposite time zone at the moment. I look forward to our continued discussion, and how we (the TUF community) can help Cargo with using TUF and Transparent Logs.

Footnotes

  1. Complicated by the fact that there will eventually be multiple attestations per package.

  2. Assuming no bugs in the TUF specification or implementation.

  3. Actually, TLS (even v1.3) is vastly more complicated than TUF, yet you no longer hear anyone complaining about it. Why? Better tooling like Caddy and Let's Encrypt.

  4. Of what exactly: packages, package indices, policies (described below), or something even higher-level?

  5. Never mind the possible confusion: what's the difference between Transparent/Tamper-Evident Logs, Sigstore, Binary Transparency, and whatever new transparency logs arise in the future?

@woodruffw

TUF does not require that you use HSMs for root keys. TAP 18 allows TUF to use ephemeral keys (Sigstore's Fulcio). For PEP 458, @kairoaraujo and I plan to allow PyPI admins to use the YubiHSMs you prepared as well as Fulcio.

Sorry, didn't mean to imply this was a universal feature of TUF -- that was in direct reference to what this current RFC language prescribes. If they don't go with a hardware-backed approach, the complexity calculations change.

This is true only for as long as you handwave away the problem of key distribution for multiple attestations as pointed out above. In fact, when using a TUF repo (as the Sigstore root of trust does), you can use a single set of root keys to "slash-and-burn" the keys to the whole system (including the root keys themselves) without end-users noticing anything [2].

This was probably unclear: in that context, I was talking about binary transparency as a separate scheme, not something that someone would build using Sigstore. In a "traditional" BT deployment (e.g. of the sort Go does), there are no attestations per se and only one public key (the log's key).

(You're right that with witnessing, you then have N keys, one for each independent witness. That maps to a key & metadata discovery problem, and IMO TUF or .well-known key discovery would be a good fit for it.)

FWIW, I happen to agree that Sigstore is very complicated 🙂. I probably should have left the references to PEP 740, etc. out, since that was more for context than a technical argument for an exact equivalent in Rust. For Rust, I think my baseline position is "it's easy to prematurely discount the cost of maintaining a PKI, and so technologies that reduce the amount of PKI done are a good fit."

### (tuf-root + tuf-crates) Crates.io Membership Change

- The crates.io team will update the root role in the `tuf-crates` repository, triggering a signing event that the existing crates.io team must sign via Pull Request.
- An update to the tuf-crates-root.json file will occur in the `tuf-root` repository, which shall trigger a new singing event Pull Request, which the root quorum must perform.
Member

*signing

Member

Suggested change
- An update to the tuf-crates-root.json file will occur in the `tuf-root` repository, which shall trigger a new singing event Pull Request, which the root quorum must perform.
- An update to the tuf-crates-root.json file will occur in the `tuf-root` repository, which shall trigger a new signing event Pull Request, which the root quorum must perform.

(to make it easier to fix 😉)

@trishankkarthik

For Rust, I think my baseline position is "it's easy to prematurely discount the cost of maintaining a PKI, and so technologies that reduce the amount of PKI done are a good fit."

To which my answer remains the same: better tooling (a largely one-time cost).

@SantiagoTorres

TL;DR: I think this RFC is a marked improvement over the original PKI RFC, but I want to re-iterate my opinion that there is an "iceberg" of complexity in PKI deployment - even for TUF - that represents a non-trivial operational risk versus a transparency-style model.

Sorry, jumping in with two quick nits. I think transparency systems are sometimes considered opposite of e.g., TUF or anything based off of cryptographic signatures, but that is not the case. Server-side signing (be it on a tlog or an NPM registry) does not provide the same security argument as producer/client-side signing (e.g., as done by TUF targets, PGP, in-toto, a signed SBOM, etc).

This was probably unclear: in that context, I was talking about binary transparency as a separate scheme, not something that someone would build using Sigstore. In a "traditional" BT deployment (e.g. of the sort Go does), there are no attestations per se and only one public key (the log's key).

Nit 1: Which I believe achieves the exact same security properties as the current cargo index on a git repository. I'm not sure why adding another historic-MHT backend (or any hash-chained variant thereof) to store metadata would achieve anything new. At best, you could argue that adding a CI job to sign commits would suffice and provide the same fundamental win as a "traditional BT system".

(You're right that with witnessing, you then have N keys, one for each independent witness. That maps to a key & metadata discovery problem, and IMO TUF or .well-known key discovery would be a good fit for it.)

Witnessing (providing protection against fork*/split-view attacks) is not the same as monitoring (studying the semantic properties of a log entry to identify maliciously-written entries), and, as far as we know, there is near 0 independent monitoring of most BT deployments --- there is almost no knowledge of what a good entry looks like!

Further, I find it strange to push back against PKI, given that operationalizing a k-n witness/monitor system that notifies and/or blocks known-bad entries is also an open research problem, and also requires a PKI-like system for enrollment of monitors and gossiping between them. Using TUF or .well-known is effectively a "the secure turtles are all the way down" type of argument: if you maintain a TUF repository to deliver witness trust metadata, why not deliver package trust metadata using the same mechanism (nit 2)?

I don't mean to be snarky, but I sometimes wonder whether, if SSL was being worked on in 2020+, people would argue that we just need a "html and javascript transparency" and do away with the PKI.

@woodruffw

Nit 1: Which I believe achieves the exact same security properties as the current cargo index on a git repository.

This might be a misunderstanding of what you mean (in which case I apologize), but I believe that the two aren't analogous in this case: the fact that the current cargo index is on git is an implementation detail, and intentional security design dictates that we shouldn't assign security properties to incidental design choices. Plus, to my understanding, there's a long term - unrelated - objective of moving towards the sparse index protocol which is (TMI) not inherently underlaid by git.

(You're right that git provides a variant of the same properties! But git doesn't have an equivalent of inclusion proofs or witnessing per se that make a transparency scheme advantageous.)

Witnessing (providing protection against fork*/split-view attacks) is not the same as monitoring (studying the semantic properties of a log entry to identify maliciously-written entries), and, as far as we know, there is near 0 independent monitoring of most BT deployments --- there is almost no knowledge of what a good entry looks like!

This is a true and valid criticism. Apart from Go's sumdb, there are scant examples of real-world BT deployments with claimant models/personas to reason about. The only one besides Go that I'm aware of is Homebrew's use of Sigstore in an effectively-BT setting, which almost certainly has no independent monitors at the moment (besides myself, which wouldn't be fair to count 🙂).

At the same time, I think this is also true in practice for packaging ecosystem deployments of TUF -- we don't have PEP 458 yet for PyPI, and to my understanding the RubyGems' TUF implementation from 2013 didn't fully materialize (I apologize if I'm mischaracterizing things there).

if you maintain a TUF repository to deliver witness trust metadata, why not deliver package trust metadata using the same mechanism (nit 2)?

The main argument here is that transparency is an independently valuable property, one that TUF can't (at present?) provide.

The secondary argument (to your point above) is that, given a choice between a hardware backed k-of-n PKI and a k-n distributed witness/monitor PKI, the former is harder for the index persona to operationalize. That doesn't mean that the latter is easy (or, on net, even exactly as hard), but that it's easier for the index itself while achieving similar cryptographic properties, plus transparent properties.

I don't mean to be snarky, but I sometimes wonder whether, if SSL was being worked on in 2020+, people would argue that we just need a "html and javascript transparency" and do away with the PKI.

I think that would be silly, so I appreciate the snark. However, there's an underlying truth that's been revealed by the last 30 years of operational failures in the Web PKI: we need a PKI for the public web, but the public web has also become more secure as we've reduced the number of independent PKI vendors on it and forced them into Certificate Transparency.

In other words: in 2020+, I think it would be a correct observation that adding auditability to a set of smaller PKIs is a better ecosystem-level design decision than standing up a new PKI.


As a commenting note: this is a concrete RFC, so I don't want to drag the thread into a more abstract non-crates discussion about alternatives. I think I've registered my concerns to a degree that I personally consider appropriate and I appreciate that they've been responded to in a detailed, thoughtful, and considerate manner. With that, I'll cease my posting and let this RFC run through with the concrete consideration it deserves.

@tarcieri

tarcieri commented Nov 4, 2024

The main argument here is that transparency is an independently valuable property, one that TUF can't (at present?) provide.

I know Sigstore has already been discussed, but I believe it can provide this capability, either via a self-hosted deployment as described in bring your own TUF, or via signing TUF metadata files which I believe can work via Sigstore-as-a-service (I'm a bit confused, because I swear TUF and in-toto used to be explicitly listed as artifact formats natively supported by cosign in addition to OCI, eBPF, WASM, etc but now I can't find a reference to that anymore).

I could be mistaken as I'm not a Sigstore expert, so I would be curious to hear from others who have opined on Sigstore, as well as the authors of the Sigstore-related RFCs (cc @lulf).

If I understand correctly, it seems like something easy to adopt incrementally as a retroactive add-on, and thus something which this RFC doesn't need to directly concern itself with other than a potential mention for future work.

@traviscross
Contributor

traviscross commented Nov 5, 2024

RFCs usually consider the credible alternatives, and binary transparency seems to be the main one to consider as @woodruffw discussed above. It'd be good if the RFC could discuss this specifically in its text (right now it does not).

The way I'd most prefer to see that presented would be for this RFC to start by laying out the specific security and operational goals it hopes to achieve (and why), and the specific security claims it wants to make to end users (and what assumptions those security claims make), and to then analyze and compare the proposed instantiation of TUF and of some reasonable binary transparency scheme on how they might meet those goals and support those security claims.

It would be good too for the drawbacks to discuss the required and ongoing operational maturity that running a PKI demands, and it would be good for the RFC to compare TUF with BT here also. I find myself wondering whether any formal analysis has been done as to the startup and ongoing costs here, in skilled personnel and other things, that we'd be committing ourselves to by adopting this.

If the idea is that we should do TUF and also do binary transparency, as has also been discussed in this thread, then some details around that would be good to discuss also in the future possibilities section.

@traviscross
Contributor

traviscross commented Nov 5, 2024

Above, @woodruffw discusses the experience of PyPI in adopting TUF. The experience of PyPI here seems highly relevant. We'd hate to repeat any mistakes they've made or have to relearn lessons that they've learned. If we want a broader set of perspectives, perhaps we could reach out to others from that community also to collect experiences.

(Perhaps, @woodruffw, you could suggest others on the relevant team that might have valuable experiences to share? I'm reaching out myself to some people who may be able to share experiences with adopting TUF or BT in similar systems with us.)

In any case, it seems that it would be valuable for this RFC to try to collect and capture the experiences of PyPI and others in adopting this framework (e.g. in the prior art section), and to discuss any ways in which our situation is different or ways that we've adjusted our own approach so as to avoid any problems or hardships that others have encountered.

Comment on lines +25 to +26
`rust-lang/tuf-crates`
- Crates.io crate index
Member

Suggested change
`rust-lang/tuf-crates`
- Crates.io crate index
`rust-lang/tuf-crates`
- Crates.io crate index

(avoids a markdown rendering bug)


##### Release (Stable/Beta) Role

The Release role shall have the authority to only sign stable rust releases. We propose this role also consist of a quorum model, consisting of all members of the release team. This role should have a 3 member threshold, and always consist of all members of the release team. At the time a new stable release is being compiled and shipped, a signing quorum must be conducted for this release.
Member

Suggested change
The Release role shall have the authority to only sign stable rust releases. We propose this role also consist of a quorum model, consisting of all members of the release team. This role should have a 3 member threshold, and always consist of all members of the release team. At the time a new stable release is being compiled and shipped, a signing quorum must be conducted for this release.
The Release role shall have the authority to only sign stable and beta rust releases. We propose this role also consist of a quorum model, consisting of all members of the release team. This role should have a 3 member threshold, and always consist of all members of the release team. At the time a new stable release is being compiled and shipped, a signing quorum must be conducted for this release.

unless the heading of this section is wrong?

Comment on lines +90 to +92
###### Snapshots & Timestamps Role

These roles shall be a single-member role with a key utilized for automation.
Member

what are these roles used for? in other words: what does "snapshots & timestamps" mean in this context?

Contributor

They are technical terms of art from the TUF specification. The snapshot is a single representation of every file in the index with the current hash for that file. This is used to ensure that the mirror/MITM is not holding back some individual package while keeping the others up to date. The timestamp is a very small file that points to the latest version of the other files, whose signature is very short-lived. This prevents the mirror/MITM from staying stuck on an old snapshot.
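
A minimal sketch of how a client might use the two files, following the description above rather than the exact TUF metadata layout: the field names (`expires`, `snapshot_sha256`, `files`) and the index path are assumptions made for illustration, and signature verification is deliberately left out.

```python
import hashlib
import json
from datetime import datetime, timedelta, timezone


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def load_snapshot(timestamp: dict, snapshot_bytes: bytes, now: datetime) -> dict:
    """Return the parsed snapshot, or raise ValueError if a check fails.

    NOTE: verifying the signatures on both files is omitted from this sketch.
    """
    # A short-lived timestamp stops a mirror from replaying old state forever.
    if now >= datetime.fromisoformat(timestamp["expires"]):
        raise ValueError("timestamp expired: mirror may be stuck on old state")
    # The timestamp pins exactly one snapshot.
    if sha256_hex(snapshot_bytes) != timestamp["snapshot_sha256"]:
        raise ValueError("snapshot bytes do not match the timestamp")
    return json.loads(snapshot_bytes)


def check_index_file(snapshot: dict, path: str, content: bytes) -> bool:
    # True only if the served file matches the hash pinned in the snapshot,
    # so a mirror cannot hold back one package while updating the others.
    return snapshot["files"].get(path) == sha256_hex(content)


# Made-up data for demonstration.
now = datetime(2024, 11, 5, tzinfo=timezone.utc)
entry = b'{"name":"demo","vers":"1.0.0"}\n'
snapshot_bytes = json.dumps({"files": {"de/mo/demo": sha256_hex(entry)}}).encode()
timestamp = {
    "expires": (now + timedelta(hours=1)).isoformat(),
    "snapshot_sha256": sha256_hex(snapshot_bytes),
}

snapshot = load_snapshot(timestamp, snapshot_bytes, now)
print(check_index_file(snapshot, "de/mo/demo", entry))
```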


### `rust-lang/tuf-crates`

The actual target for tuf-crates shall be the crates index and not the artifacts themselves. This means that the TUF repository for crates.io is performed on much smaller payloads, which still provides us with cryptographic security due to the fact the index contains SHA-256 hashes of the crate file artifacts. Given the index already consists of SHA-512 signatures of all files, we are then utilizing TUF to validate the index, which in turn is utilized to validate the actual downloaded artifacts. This allows us to perform validation on index updates and not on final downloads, also reducing the overhead of performing multiple hashing and validation procedures on the larger crate artifact files.
Member

Suggested change
The actual target for tuf-crates shall be the crates index and not the artifacts themselves. This means that the TUF repository for crates.io is performed on much smaller payloads, which still provides us with cryptographic security due to the fact the index contains SHA-256 hashes of the crate file artifacts. Given the index already consists of SHA-512 signatures of all files, we are then utilizing TUF to validate the index, which in turn is utilized to validate the actual downloaded artifacts. This allows us to perform validation on index updates and not on final downloads, also reducing the overhead of performing multiple hashing and validation procedures on the larger crate artifact files.
The actual target for tuf-crates shall be the crates index and not the artifacts themselves. This means that the TUF repository for crates.io is performed on much smaller payloads, which still provides us with cryptographic security due to the fact the index contains SHA-256 hashes of the crate file artifacts. Given the index already consists of SHA-256 signatures of all files, we are then utilizing TUF to validate the index, which in turn is utilized to validate the actual downloaded artifacts. This allows us to perform validation on index updates and not on final downloads, also reducing the overhead of performing multiple hashing and validation procedures on the larger crate artifact files.
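
The second hop of that chain (index entry to crate file) can be sketched as follows. Each line of a crate's index file is a JSON object whose `cksum` field holds the SHA-256 of the `.crate` file; the helper name and sample bytes below are made up, and the sketch assumes the index line itself has already been validated through TUF.

```python
import hashlib
import json


def crate_matches_index(index_line: str, crate_bytes: bytes) -> bool:
    # Compare the downloaded artifact against the checksum in its index entry.
    entry = json.loads(index_line)
    return hashlib.sha256(crate_bytes).hexdigest() == entry["cksum"]


# Made-up artifact and a matching index line.
crate_bytes = b"pretend this is demo-1.0.0.crate"
index_line = json.dumps({
    "name": "demo",
    "vers": "1.0.0",
    "deps": [],
    "cksum": hashlib.sha256(crate_bytes).hexdigest(),
    "features": {},
    "yanked": False,
})

print(crate_matches_index(index_line, crate_bytes))
print(crate_matches_index(index_line, crate_bytes + b"tampered"))
```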


## Crates.io changes

- Prior to updating the index, crates.io shall perform the online signing of the index entry to update the targets and sign the index entry, saving this as a like-pathed artifact in the TUF repository.
Member

from just this sentence alone I unfortunately have no clue what that actually means for us 😅

side note: what happens if the update of the TUF repository is successful but then the index repo update fails? or if the sparse index upload fails? what happens if both indexes are out-of-sync?

the indexes are treated as eventually consistent by crates.io. I'm not sure if we can actually guarantee a TUF repo update happening before and close in time to the two separate index updates.

Contributor

Yes, I think implementation-wise the TUF copy of the index is going to have to be a 3rd eventually consistent source. At least at first.

At some point we could start maintaining the Git index using an entirely separate server that watches for changes in the TUF index then downloads them then commits them. Similarly we could have a separate service that watches the TUF index and uploads the latest index files to S3 and does CDN invalidations. But this is all fundamentally "eventually consistent".

S3 does have some APIs for referring to old versions of the file. So we could customize TUF to refer to the historical files already uploaded to the sparse index. But this would require a custom variation of TUF for our purposes, which does not seem to be the approach in this RFC.



## Root Quorum Model

The root key shall follow a `5-of-9` authentication model for all operations. We consider this a reasonable middle ground of quorum to prevent malicious activity, while allowing for rapid response to an event requiring a quorum. These events are iterated below in [When the Quorum will be needed][when-the-quorum-will-be-needed].
Member

@Turbo87 Turbo87 Nov 5, 2024

just to play devil's advocate here: what would happen if the majority of keyholders met at a large event and were all suddenly incapacitated? is there a way to recover from such a worst-case scenario?
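
The arithmetic behind that worst case, as a small sketch: with the `5-of-9` threshold quoted above, the root role can tolerate losing four keyholders but not five. The keyholder names are placeholders, and this says nothing about what recovery procedure the RFC might define.

```python
def quorum_reached(available: set[str], keyholders: set[str], threshold: int) -> bool:
    # Only distinct, recognised keyholders count toward the threshold.
    return len(available & keyholders) >= threshold


KEYHOLDERS = {f"keyholder-{i}" for i in range(1, 10)}  # 9 placeholder names
THRESHOLD = 5


def can_still_sign(lost: int) -> bool:
    # Remove `lost` keyholders and see whether the rest can still form a quorum.
    remaining = set(sorted(KEYHOLDERS)[lost:])
    return quorum_reached(remaining, KEYHOLDERS, THRESHOLD)


for lost in range(0, 7):
    print(lost, "lost ->", "quorum possible" if can_still_sign(lost) else "no quorum")
```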

@SantiagoTorres

This might be a misunderstanding of what you mean (in which case I apologize), but I believe that the two aren't analogous in this case: the fact that the current cargo index is on git is an implementation detail, and intentional security design dictates that we shouldn't assign security properties to incidental design choices.

git is also a protocol, with intentional security properties and design choices. Using it as a building block is as reasonable as choosing a vanilla merkle hash tree or any other authenticated data structure.

Plus, to my understanding, there's a long term - unrelated - objective of moving towards the sparse index protocol which is (TMI) not inherently underlaid by git.

sparse-index is implemented using sparse checkouts from git, afair, but I haven't followed this work too closely I'll admit.

(You're right that git provides a variant of the same properties! But git doesn't have an equivalent of inclusion proofs or witnessing per se that make a transparency scheme advantageous.)

Of course you can! Hell, I implemented inclusion proof commitments on this paper in the browser, in javascript, and with minimal overhead (compared to a github API call). I'm not sure what's so special about a TL that people assume this is not doable with any other scheme. A git commit is an irregular MHT as any other, including a TL.
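
For context on the "git is a hash tree" point, this sketch recomputes git object IDs by hand (SHA-1 object format): a blob ID commits to file content, and a tree ID commits to the names and IDs of its entries, so changing any file changes every hash above it. The helper names are made up, and the tree helper only handles a flat list of regular files.

```python
import hashlib


def git_object_id(kind: str, body: bytes) -> str:
    # Git hashes "<type> <size>\0" followed by the object body.
    header = f"{kind} {len(body)}".encode() + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


def git_tree_body(entries: list[tuple[str, str, str]]) -> bytes:
    # entries: (mode, name, hex object id). Simplified: flat list of files,
    # sorted by name, which matches git's ordering for plain ASCII file names.
    out = b""
    for mode, name, oid in sorted(entries, key=lambda e: e[1]):
        out += f"{mode} {name}".encode() + b"\x00" + bytes.fromhex(oid)
    return out


blob_v1 = git_object_id("blob", b"hello world\n")
blob_v2 = git_object_id("blob", b"hello world!\n")

tree_v1 = git_object_id("tree", git_tree_body([("100644", "README", blob_v1)]))
tree_v2 = git_object_id("tree", git_tree_body([("100644", "README", blob_v2)]))

print(blob_v1)
print(tree_v1 != tree_v2)  # the parent hash changes when a child changes
```

A commit object in turn hashes its tree ID and parent commit IDs, which is what gives the history its hash-chained structure.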

This is a true and valid criticism. Apart from Go's sumdb, there are scant examples of real-world BT deployments with claimant models/personas to reason about. The only one besides Go that I'm aware of is Homebrew's use of Sigstore in an effectively-BT setting, which almost certainly has no independent monitors at the moment (besides myself, which wouldn't be fair to count 🙂).

This is what I'm trying to get at: a vanilla BT implementation barely exists, or if it exists doesn't implement the same security properties as TUF (or any code signing solution for that matter). My original point was to make sure we stop making this non-sequitur of an argument. You can have BT with code signing. Hell, you should have BT with code signing. Sigstore is one option --- which is perfectly compatible with TUF, as you know :) Ironically, one of the main motivations for the speranza paper was that some cargo users were doxed by kiwifarms using historical cargo metadata (which I think was cited on a previous cargo RFC, #3403)

As per the monitoring, I believe we should be upfront about it. At times I try to warn the community about upselling a TL as if it was some sort of magical blockchain --- hell, if you came here arguing for people to "just use a blockchain" you'd probably be laughed out of the room.

At the same time, I think this is also true in practice for packaging ecosystem deployments of TUF -- we don't have PEP 458 yet for PyPI, and to my understanding the RubyGems' TUF implementation from 2013 didn't fully materialize (I apologize if I'm mischaracterizing things there).

I'm afraid to add that you also are not operating a transparency log in either the PyPI or the Homebrew case. I believe it's perfectly reasonable for a community repository to rely on hosted solutions (such as a hosted Sigstore instance), but arguing that hosting a BT log is easier than rolling PKI is disingenuous.

Let me put it differently: every instance of a large transparency or codesigning solution (be it Sigstore, a TL, whatever) that we can contrast against has a large team of engineers at a MANGA company or at a large not-for-profit with multiple stakeholders (such as the Linux Foundation):

  1. Homebrew: Sigstore (LF, engineers from Google)
  2. NuGet: Microsoft
  3. Sumdb: Google
  4. NPM: GitHub + Sigstore
  5. Dockerhub(?):

This is in contrast with the PyPI and Rust cases, which as far as I know don't have the continuous backing of a large company with people running this infra as $DAYJOB. I may be missing a success story, but outside of the Linux distro case (where people work tirelessly to provide both codesigning and transparency) I really can't come up with any that is not an ad hoc, stick-it-in-a-TL BT deployment (as Firefox's used to be). I believe this is important to highlight if the authors decide to make a comparative study of deployments of e.g., TUF or BT.

The main argument here is that transparency is an independently valuable property, one that TUF can't (at present?) provide.

That's also my main argument. If transparency is an independently valuable property, why argue against another independently valuable property that's being proposed? Metaphorically speaking, it'd be like telling someone they shouldn't get the burger because they are also ordering a coke.

To reiterate: every time somebody argues for CT/TL on a codesigning issue, they make people confuse the two, for no particular reason.

The secondary argument (to your point above) is that, given a choice between a hardware-backed k-of-n PKI and a k-of-n distributed witness/monitor PKI, the former is harder for the index persona to operationalize. That doesn't mean that the latter is easy (or, on net, even exactly as hard), but that it's easier for the index itself while achieving similar cryptographic properties, plus transparent properties.

This is very close to a division by zero because we have exactly 0 k-of-n distributed witness/monitor PKI deployments, let alone "operationalized".

In other words: in 2020+, I think it would be a correct observation that adding auditability to a set of smaller PKIs is a better ecosystem-level design decision than standing up a new PKI.

I like this, but I really want to highlight the word "adding" that you used, because it's the correct one.

As a commenting note:

I agree with this. I was also hesitant to drag things on, but I think that ship has sailed now that people are conflating TL and code signing (again, sigh).

Moving on to reply to @tarcieri (not sure if I should be posting twice, apologies if I violate etiquette)

I know Sigstore has already been discussed, but I believe it can provide this capability, either via a self-hosted deployment as described in bring your own TUF, or via signing TUF metadata files which I believe can work via Sigstore-as-a-service (I'm a bit confused, because I swear TUF and in-toto used to be explicitly listed as artifact formats natively supported by cosign in addition to OCI, eBPF, WASM, etc but now I can't find a reference to that anymore).

Certainly! They were not removed as first-class types, but now the log doesn't serve these payloads (it just holds the hashes instead). This connects to the conversation above about reducing mission creep between the projects. You can still submit TUF and in-toto types into the log using the library/tools. However, you'll have to store the payload on your side. Sorry for linking to a PR, but we're undergoing a rewrite of the docs. See here for the types supported.

If I understand correctly, it seems like something easy to adopt incrementally as a retroactive add-on, and thus something which this RFC doesn't need to directly concern itself with other than a potential mention for future work.

Yep, this is why I'm making such a big fuss about the fact that these are tangential (as @woodruffw put it), or rather complementary (as I would've liked to say). Ideally, you want to have the properties of the TL/BT log, while managing trust information using something like TUF. You can then sprinkle in-toto attestations as another incremental portion, but I don't want to get ahead of myself --- this is a TUF/codesigning RFC after all.

@bjorn3
Member

bjorn3 commented Nov 5, 2024

Plus, to my understanding, there's a long term - unrelated - objective of moving towards the sparse index protocol which is (TMI) not inherently underlaid by git.

sparse-index is implemented using sparse checkouts from git, afair, but I haven't followed this work too closely I'll admit.

It is not. It uses a file for each crate presented by the crates.io api. There is no git repo involved with it at all. The sparse registry has been the default for a while now.
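
As an illustration of the file-per-crate layout, the sketch below computes where a crate's index file lives, following the crates.io index convention: one- and two-character names go under `1/` and `2/`, three-character names under `3/<first letter>/`, and everything else under two two-character prefix directories. The helper names `index_path` and `index_url` are hypothetical; the base URL is assumed to be the public crates.io sparse index endpoint.

```python
SPARSE_INDEX_BASE = "https://index.crates.io"  # assumed public sparse index endpoint


def index_path(crate_name: str) -> str:
    """Return the relative path of a crate's index file."""
    name = crate_name.lower()  # index file names are lowercase
    if not name:
        raise ValueError("crate name must not be empty")
    if len(name) == 1:
        return f"1/{name}"
    if len(name) == 2:
        return f"2/{name}"
    if len(name) == 3:
        return f"3/{name[0]}/{name}"
    return f"{name[:2]}/{name[2:4]}/{name}"


def index_url(crate_name: str) -> str:
    """Return the URL a client would fetch over plain HTTP(S); no git involved."""
    return f"{SPARSE_INDEX_BASE}/{index_path(crate_name)}"


for crate in ["a", "syn", "serde", "cargo"]:
    print(crate, "->", index_url(crate))
```

Each of these files is newline-delimited JSON with one line per published version, which is what makes per-crate fetching possible without cloning a repository.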

@trishankkarthik

trishankkarthik commented Nov 5, 2024

In short, I agree with @SantiagoTorres that transparency in all its forms (whether Sigstore, BT, etc) is usually misunderstood as a red herring in these discussions because you really want both (producer-side) codesigning and transparency, and they are not contradictory. To briefly see why, consider: transparency advocates usually assume that package registries are untrusted, and so you need transparent logs (TLs) to audit them, but what stops these untrusted registries (or even the TLs themselves) from simply spamming the TLs with new malicious package versions? Congratulations: you now have to reactively catch and block this immutable, transparent malware. With codesigning such as with TUF, attackers can be prevented from tampering with packages even after a registry compromise, given the right PKI setup.

I agree with @tarcieri and @traviscross that we can and should talk about both TUF and TLs (whether Sigstore, BT, etc), especially with a threat model, the different problems they solve, a security analysis, and how they can work together.

As for experience operating TUF metadata repositories at scale, I recommend talking to people who have done it such as the Uptane community (including Datadog Remote Configuration) and Drupal (@ergonlogic). IIUC the RubyGems TUF integration was unfortunately never merged due to reviewer bandwidth [1]. Similarly, to the best of my understanding, the initial PyPI PEP 458 integration was not completed due to reviewer, contractor, and contributor bandwidth, and it simply fell through the cracks (as is typical in OSS). As Santiago mentioned, TUF integrations for package registries so far have unfortunately simply never received the kind of commercial support TLs have, thereby lending the illusion that the latter is "simpler" than the former. You can only do so much with part-time volunteer and contract work. RSTUF (@kairoaraujo and friends) solves this problem by abstracting away TUF as a collection of services you can run on registries themselves, but we still need a lot of support here, perhaps from the Rust community. My thinking is that we can and should host managed TUF metadata repositories for OSS package registries, and there are ways to do this as securely as possible.

Anyway, I'm excited about this RFC, and will make it a point to review it now that I'm back in a similar time zone 🙂

Footnotes

  1. Although @simi is changing this with RSTUF as discussed below.

@djc
Contributor

djc commented Nov 5, 2024

I agree with @tarcieri and @traviscross that we can and should talk about both TUF and TLs (whether Sigstore, BT, etc), especially with a threat model, the different problems they solve, a security analysis, and how they can work together.

Trying to follow the jargon-laden discussions on transparency logs vs PKIs and having skimmed the RFC to see if I could make sense of all this, I think it would be good to do this and work from a more conceptual-level threat model on towards the point where the trade-offs between TLs and TUF become clearer.


## (cargo-tuf-lib) Standard TUF Implementation

We propose creating a new crate, `cargo-tuf-lib`, which shall be used by both Cargo and Rustup for doing TUF synchronization and update procedures. This library shall be a shim wrapper around the `rust-tuf` crate (https://github.com/rustfoundation/rust-tuf), providing a simplified and shared interface for doing synchronization and verification of the TUF repositories and their files.
Contributor

How much influence/ownership do we have over rust-tuf? Cargo ends up needing to be rather opinionated about... a lot of things. What its dependencies are, and how they can be configured, and what their licenses are, etc.

Is https://github.com/rustfoundation/rust-tuf supposed to be a fork of https://github.com/theupdateframework/rust-tuf which hasn't been created yet? It 404s.

cc @erickt

It is a fork. It is just private right now, I believe, until this gets approved and implementation gets going. Nothing has been done on the fork just yet. (cc: @walterhpearce)

We'd be happy to accept merges if you have anything to upload. I haven't been super active on rust-tuf since we haven't needed any changes for Fuchsia in a while. Also, please check out Amazon's library https://crates.io/crates/tough; it is pretty far along too.

@SantiagoTorres

@djc re: jargon/etc. I think this would be valuable. Is this something where patches are welcome? I'm not very familiar with the rust RFC process, but I'd be happy to throw in some text to help further the discussion/evaluation/understanding of the proposal

@djc
Contributor

djc commented Nov 7, 2024

@djc re: jargon/etc. I think this would be valuable. Is this something where patches are welcome? I'm not very familiar with the rust RFC process, but I'd be happy to throw in some text to help further the discussion/evaluation/understanding of the proposal

I'd leave that to the RFC authors to decide.

(I'll just point out that I don't think the solution here is to just add a bunch of glossary to explain all the terms of art, but rather to work from a threat model and conceptual practices towards concrete algorithms and systems.)

Comment on lines +216 to +218
- Creation of `rust-lang/tuf-root` and `rust-lang/tuf-crates` repositories on GitHub
- Initiation of the root signing ceremony via tuf-on-ci on each repository
- Facilitate the initial and subsequent signing events
Member

As far as I understand it, none of these three items actually require the involvement of the Infrastructure Team. The repositories can be created using the team repo, and a team with write access to them can configure tuf-on-ci in these repositories autonomously. Only eventual interactions with cloud-based resources (e.g. AWS KMS) would require support by the Infrastructure Team.

Given that the Infrastructure Team is already understaffed and has (in my opinion) way too many existing responsibilities, I'm wondering if it would make sense to create a new t-signing (sub) team (or similar) that owns the implementation and maintenance of TUF.

The infra-team has a successful track record of collaborating with other teams to provide cloud resources for them, so I have no concerns about working on the CDN together. But I'd like to see the signing effort owned by people who are passionate about the subject and can dedicate the necessary amount of time and effort to make it a success. 🙂


##### Root Role

The root role of the tuf-root shall be a TUF role consisting of 9 members with a 5 member threshold for signing (5-of-9); please reference the Root Quorum Model section below for details on how this role should be managed and its members selected. The sole purpose of this role shall be delegating authority to the other roles within the tuf-root repository (when members of these roles change). Finally, this role shall also be used for signing the tuf-crates root.json - thus protecting the chain of trust between tuf-root and tuf-crates.
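
To make the 5-of-9 rule concrete, here is a minimal sketch of a threshold check: a piece of root metadata is accepted only when at least five distinct trusted members have produced a valid signature over it. Everything here is an assumption made for illustration: the member names, the `toy_sign` and `meets_threshold` helpers, and especially the use of HMAC-SHA256 as a stand-in for real public-key signatures. A real TUF client verifies signatures against the public keys listed in root.json.

```python
import hashlib
import hmac

ROOT_THRESHOLD = 5  # 5-of-9, as proposed for the tuf-root root role


def toy_sign(secret: bytes, payload: bytes) -> str:
    """Stand-in for a real signature: HMAC-SHA256 over the payload."""
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


def meets_threshold(payload, signatures, trusted_keys, threshold=ROOT_THRESHOLD):
    """Return True if at least `threshold` distinct trusted members signed.

    `signatures` is a list of (member_id, signature) pairs; `trusted_keys`
    maps member_id -> key material for the current root role.
    """
    valid_members = set()
    for member_id, signature in signatures:
        secret = trusted_keys.get(member_id)
        if secret is None:
            continue  # signer is not a member of the root role
        if hmac.compare_digest(toy_sign(secret, payload), signature):
            valid_members.add(member_id)  # a set, so repeats count once
    return len(valid_members) >= threshold


# Nine hypothetical root members and a toy metadata payload.
trusted_keys = {f"member-{i}": f"secret-{i}".encode() for i in range(9)}
payload = b'{"_type": "root", "version": 2}'

five = [(m, toy_sign(trusted_keys[m], payload)) for m in list(trusted_keys)[:5]]
four = five[:4]
print(meets_threshold(payload, five, trusted_keys))
print(meets_threshold(payload, four, trusted_keys))
```

Counting distinct members rather than raw signatures matters: otherwise a single compromised key could satisfy the threshold by signing five times.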

Finally, this role shall also be used for signing the tuf-crates root.json - thus protecting the chain of trust between tuf-root and tuf-crates.

So IIUC both repos will share the same threshold of root keys at any given time? You should also think about how to rotate the root keys, and keep both repos in sync.


### `rust-lang/tuf-crates`

The actual target for tuf-crates shall be the crates index and not the artifacts themselves. This means that TUF validation for crates.io is performed on much smaller payloads, while still providing cryptographic security because the index contains SHA-256 hashes of the crate file artifacts. Given that the index already contains SHA-256 checksums of all crate files, we utilize TUF to validate the index, which in turn is utilized to validate the actual downloaded artifacts. This allows us to perform validation on index updates and not on final downloads, also reducing the overhead of performing multiple hashing and validation procedures on the larger crate artifact files.
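
The two-step check described above can be sketched as follows: first the downloaded index file is compared against the hash recorded in trusted TUF targets metadata, then the crate file is compared against the `cksum` (SHA-256) field of its index entry. The function names, the sample crate, and the way the trusted hash is obtained are assumptions for illustration; a real client would take the expected hash from verified TUF metadata rather than compute it locally.

```python
import hashlib
import json


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def verify_index_file(index_bytes: bytes, trusted_sha256: str) -> bool:
    """Step 1: the index file must match the hash vouched for by TUF."""
    return sha256_hex(index_bytes) == trusted_sha256


def verify_crate(index_bytes: bytes, version: str, crate_bytes: bytes) -> bool:
    """Step 2: the crate file must match the `cksum` of its index entry.

    An index file is newline-delimited JSON, one object per published version.
    """
    for line in index_bytes.splitlines():
        if not line.strip():
            continue
        entry = json.loads(line)
        if entry["vers"] == version:
            return entry["cksum"] == sha256_hex(crate_bytes)
    return False  # unknown version: reject


# Toy data standing in for a real index file and .crate archive.
crate_bytes = b"pretend this is demo-1.0.0.crate"
entry = {"name": "demo", "vers": "1.0.0", "cksum": sha256_hex(crate_bytes)}
index_bytes = (json.dumps(entry) + "\n").encode()

# In a real client this value comes from signed TUF targets metadata.
trusted_index_hash = sha256_hex(index_bytes)

ok = verify_index_file(index_bytes, trusted_index_hash) and verify_crate(
    index_bytes, "1.0.0", crate_bytes
)
print(ok)
```

Only the small index file needs a TUF-backed hash; the larger crate archive is covered transitively through `cksum`.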

This makes a lot of sense to me: by signing the indices, you could reduce the metadata overhead by at least 10x (e.g., 600K project indices vs 6M releases on PyPI).


## TUF Management

We propose the adaptation and implementation of TUF-on-CI (https://github.com/theupdateframework/tuf-on-ci) to manage roots and signing events via GitHub CI. This provides a GitHub-centric workflow for performing signing ceremonies via Pull Requests directly on the TUF repositories in question.

I think even @jku would say that tuf-on-ci is better suited for the tuf-root rather than the tuf-crates repo.

For the latter, you probably want to use Repository Service for TUF (RSTUF). We are thinking about running RSTUF as a managed service on behalf of OSS package repos like PyPI, and would love to collaborate on testing it for Crates, too.

Cc @kairoaraujo @SantiagoTorres

Member

Based on rust-lang/rust#133638 (comment), it sounds like we will still need a long-lived(?) key for Debian and other distros to verify signatures on distributed tarballs, or work with them to integrate with TUF. Is there some standard approach that we should be expecting to pursue here?

Perhaps a separate delegation from tuf-crates that handles the direct signing of crates. Specifically, they would include crates along with, say, detached GPG signatures.

Labels
I-council-nominated Indicates that an issue has been nominated for prioritizing at the next council meeting. T-cargo Relevant to the Cargo team, which will review and decide on the RFC. T-crates-io Relevant to the crates.io team, which will review and decide on the RFC. T-infra Relevant to the infrastructure team, which will review and decide on the RFC.
Projects
Status: RFC needs review