tag2upload - Integrity risk assessment

Introduction

tag2upload necessarily involves a new service, able to upload source packages to the Debian archive. This naturally raises concerns about the integrity of the archive. It seems to me that these concerns should be captured and evaluated in a fairly formal risk assessment structure. That's what this document is.

Principles

All relevant concerns (in this case, concerns about package integrity) are recorded and evaluated in the context of existing practices. This includes even concerns where analysis shows that there is no additional risk. Categorisation of risks (as distinct, or not) is not intended to be exact; rather, distinctions are made when it helps the assessment. Risks are not complete scenarios: any particular bad scenario would involve one or more adverse risk events. Generally, benefits are not fully considered and this is not a cost/benefit analysis: the purpose is to spot all possible risks, so that they can be appropriately considered and maybe controlled, and as a "cost" input to cost/benefit decisions.

Note regarding the comparator, "Degree to which already accepted"

Most packages are now maintained in git somehow. Shared fast-forwarding git branches are regarded as primary by the maintainers. In these existing workflows, the archive is treated as an output format; a tool (usually a git-buildpackage rune) is run, ad-hoc, on the uploader's machine, to generate a .dsc for upload. Maintainers (when not using dgit push) generally do not do additional auditing or checking of the .dsc, other than that implied by any local checks, or functional tests, of .debs from formal binary builds.

For clarity and concreteness, the risks of tag2upload are assessed in comparison to these specific existing workflows, rather than in comparison to the full range of existing Debian uploader workflows.

Note regarding "additional checks ... in dak"

The basic design does not require any changes to dak or to archive behaviour. We propose to initially deploy tag2upload without dak changes.

However, the design proposal includes optional components which could increase the integrity if implemented in dak. Where relevant these are mentioned in the risk table. We discuss these proposals and their implications in detail, below the risk table.

Risk Table

Columns: Risk; Degree to which accepted in existing arrangements; Control measures and mitigations; Analysis (notably, additional risk?)

Risk: tag2upload service key might be compromised/leaked.
  Degree to which accepted: Risk is comparable to that posed by buildds.
  Controls and mitigations: Service key is on a hardware token. Privilege separation: network access and source package creation occur in an environment with no key access.
  Analysis: Additional risk is minimal.

Risk: tag2upload service might corrupt source code as it passes through, turning good git tags into bad packages.
  Degree to which accepted: Existing dscs are built from git data on uncontrolled uploader machines, with no audit trail, usually not in a clean environment, with uncontrolled tooling, and therefore unreproducibly.
  Controls and mitigations: tag2upload verifies the git tag before attempting source package construction, so the complex part of the service is exposed only to authorised input. Source package construction is done in a throwaway VM. Source packages are reproducible from git tags. Artifacts in the ftpmaster archive and messages to the tag2upload reporting list can be used for auditing and any post-event analysis.
  Analysis: The newly introduced risks are small. On balance tag2upload is a significant risk reduction.

Risk: tag2upload service might produce bad source packages unrelated to incoming signed git tags.
  Controls and mitigations: Privilege separation: the trusted service component that has access to the key will only sign dscs with the same source package name and version as the incoming git tag. The signed git tag data includes the source package and version and (if this is implemented in dak) is included in the upload and verified by the archive; so no upload need be accepted without a corresponding git tag.
  Analysis: Additional risk is minimal; there is no additional risk if the additional check is implemented in dak.

Risk: tag2upload might be compromised and then bypass existing archive access control policy.
  Controls and mitigations: Privilege separation: the trusted service component replicates the DD/DM permission check using official keyring and dm.txt data, and will not sign a source package for an unauthorised DM. tag2upload includes information about the original tag signer in the .dsc and the .changes. dak could repeat the permissions check using this data. tag2upload could include the actual signed git tag in the upload, which dak could verify for itself.
  Analysis: Additional risk is minimal; there is no additional risk if the additional checks are implemented in dak.

Risk: tag2upload might use out-of-date or wrong keyring.
  Controls and mitigations: tag2upload uses the DSA-supplied keyrings from /srv/keyring.debian.org, like the archive does.
  Analysis: No additional risk.

Risk: tag2upload might use out-of-date or wrong dm.txt (access control information about which DM may upload which package).
  Degree to which accepted: Currently the dgit git server regularly downloads dm.txt from https://ftp-master.debian.org, to implement the existing (archive-equivalent) `dgit push' access control. (dm.txt is not published in /srv/keyring.debian.org)
  Controls and mitigations: The proposed tag metadata check, in dak, would eliminate this risk for the archive.
  Analysis: Risk of a DM exceeding their authority this way is low. With the additional dak check, there is no additional risk to the archive.

Risk: git SHA-1 hash collision.
  Degree to which accepted: This risk is largely already accepted, through our intensive use of git for packaging work.
  Controls and mitigations: git has a specific countermeasure which is effective to block the currently known attack on SHA-1. tag2upload service obtains its actual git objects from salsa, which is usually the same place the maintainer is using, so in case of any collision everyone would usually see the same hash preimage. Tag data specifying source package and version is directly verifiable without relying on SHA-1. (git upstream are working on replacing SHA-1, but that is a very long term project.)
  Analysis: Little if any additional risk over constructing source packages on uploader machines.

Risk: tag2upload service might make some mistake and we might not be able to tell what happened.
  Degree to which accepted: Existing source package construction (from git on uploader machines) is completely uncontrolled.
  Controls and mitigations: Incoming tags sent to public list before processing. Processing logs sent to public list. git tag metadata included in .dsc and tag data in .changes. Source packages are reproducible from git tags.
  Analysis: tag2upload is a significant improvement.

Risk: tag2upload key might be misused to upload binary packages.
  Degree to which accepted: Maintainer-uploaded binary packages are generally not audited or inspected.
  Controls and mitigations: Privilege separation: the trusted service component that has access to the key will only sign source-only uploads. Binary uploads by tag2upload could be rejected by dak. Any binaries would not enter testing regardless.
  Analysis: Additional risk is minimal; there is no additional risk if the additional check is implemented in dak.

Risk: Data needed to understand where the .dscs came from might later become unavailable.
  Degree to which accepted: Most existing methods do not guarantee discoverability or permanence of git history, so we can (and sometimes do) lose track of the git data from which the source packages are constructed.
  Controls and mitigations: Input git objects (referenced by the incoming tag) are transferred to the Debian (DSA-managed) dgit git service before the source package goes to the archive. The mailing list containing audit data will be archived (on Debian systems).
  Analysis: tag2upload is a significant improvement.

Risk: Communications (eg emails and tracking web pages) which currently go to (or for the attention of) the signing uploader might go to the wrong place.
  Degree to which accepted: Some mails (eg REJECTs) are currently not made public, which is arguably wrong.
  Controls and mitigations: The uploader's identity will be included in new metadata fields, which could be used by services (including dak).
  Analysis: Until changes are made to dak, etc., it will appear as if the tag2upload service itself was the "sponsor" of each upload. Mails will go to the service's public audit trail mailing list, and not to the real uploader. ddpo etc. will be wrong. Only uploaders who choose to use the tag2upload service will suffer these problems.
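
Several of the controls in the table above come down to the same idea: the component holding the key refuses to sign unless a few simple facts line up. The following is a minimal sketch of such a gate, not the actual tag2upload code. The class names, the function `refusal_reasons`, and the in-memory shape of the DD and DM data are all hypothetical, and the tag is assumed to have had its signature verified already.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TagInfo:
    """Facts taken from an already-verified signed git tag (hypothetical shape)."""
    source: str
    version: str
    signer_fingerprint: str


@dataclass(frozen=True)
class Upload:
    """Facts about the candidate upload built in the sandbox (hypothetical shape)."""
    source: str
    version: str
    architectures: frozenset  # values from the Architecture field of the .changes


def refusal_reasons(tag, upload, dd_fingerprints, dm_allowed):
    """Return reasons to refuse signing; an empty list means the upload may be signed.

    dd_fingerprints: set of DD key fingerprints (assumed derived from the keyring).
    dm_allowed: dict mapping DM fingerprint -> set of package names
                (assumed derived from dm.txt; parsing is not shown here).
    """
    reasons = []
    if upload.source != tag.source:
        reasons.append("source package name differs from the tag")
    if upload.version != tag.version:
        reasons.append("version differs from the tag")
    if upload.architectures != frozenset({"source"}):
        reasons.append("upload is not source-only")
    fpr = tag.signer_fingerprint
    if fpr not in dd_fingerprints and tag.source not in dm_allowed.get(fpr, set()):
        reasons.append("tag signer is not authorised for this package")
    return reasons


# Example: a DM who is allowed to upload "hello" (all values are made up).
tag = TagInfo(source="hello", version="1.0-1", signer_fingerprint="FPR-DM")
good = Upload(source="hello", version="1.0-1", architectures=frozenset({"source"}))
print(refusal_reasons(tag, good, set(), {"FPR-DM": {"hello"}}))
```

Returning a list of reasons, instead of a bare yes or no, fits the audit-trail controls above: a refusal can be reported with its cause.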

Discussion of "additional checks ... in dak"

The "additional checks" referred to above are not described in detail in the primary design document, because we are proposing that the service be deployed without these additional checks. However, the design leaves room for additional integrity checks by dak which would reduce or eliminate many of the risks discussed above.

Broadly, our idea is that:

  1. tag2upload would include the signed git tag object contents (the output of git cat-file tag debian/version) in a file in the upload (eg a file ....git-tag mentioned in the .changes and uploaded to the archive).
  2. dak would verify the signature on this tag, using the DD and DM keyrings.
  3. dak would parse the metadata in the tag to extract and cross-check the following information:
  4. dak would apply its own DD/DM permissions check, to check that the key which signed the tag is authorised to upload that source package. (This repeats a check already done by tag2upload.)

If implemented, this would enable dak to independently verify the key metadata information about the upload, including the identity of the original tag signer (the DD or DM). This verification could be done without any further external data and does not rely on git's SHA-1 object system.
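
To illustrate why this needs nothing beyond the tag file itself, here is a minimal sketch of how the output of git cat-file tag could be split into the signed payload and the detached PGP signature, and how the tag header could then be cross-checked. It assumes the usual layout of a PGP-signed tag in a SHA-1 repository, where the signature is appended to the tag message. The function names and the sample tag are hypothetical, the sample message is a placeholder and not tag2upload's real metadata format, and the signature verification itself (against the DD and DM keyrings) is not shown.

```python
PGP_BEGIN = b"-----BEGIN PGP SIGNATURE-----"


def split_signed_tag(raw):
    """Split raw tag object contents into (signed payload, signature block)."""
    idx = raw.find(b"\n" + PGP_BEGIN)
    if idx < 0:
        raise ValueError("tag carries no PGP signature")
    # The payload runs up to and including the newline before the signature.
    return raw[: idx + 1], raw[idx + 1 :]


def parse_tag_payload(payload):
    """Return (headers, message) from the signed payload of a tag object."""
    head, _, message = payload.partition(b"\n\n")
    headers = {}
    for line in head.decode("utf-8").splitlines():
        key, _, value = line.partition(" ")
        headers[key] = value
    return headers, message.decode("utf-8")


def tag_name_problems(headers, expected_tag_name):
    """Cross-check the tag name; the caller supplies the expected name,
    since mapping a Debian version to a tag name is not modelled here."""
    if headers.get("tag") != expected_tag_name:
        return ["tag name %r does not match expected %r"
                % (headers.get("tag"), expected_tag_name)]
    return []


# Made-up sample; the object id and signature body are placeholders.
raw = (
    b"object 0123456789abcdef0123456789abcdef01234567\n"
    b"type commit\n"
    b"tag debian/1.0-1\n"
    b"tagger A U Thor <author@example.org> 1700000000 +0000\n"
    b"\n"
    b"hello release 1.0-1\n"
    b"-----BEGIN PGP SIGNATURE-----\n\nAAAA\n-----END PGP SIGNATURE-----\n"
)
payload, signature = split_signed_tag(raw)
headers, message = parse_tag_payload(payload)
print(headers["tag"], tag_name_problems(headers, "debian/1.0-1"))
```

The payload returned by split_signed_tag is exactly the byte string the signature covers, so a verifier can check it directly without consulting any git repository.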

It would not enable dak to independently verify that the contents of the source package are those that the original tag signer intended. The mapping from tag to source package is complicated and involves feeding the incoming git tree and history to a variety of sophisticated source manipulation tools. This is why in tag2upload this process is sandboxed. It would be unwise to do this conversion again somewhere close to the dak database and master archive signing key.

We propose that the deployed tag2upload include all the proposed information which dak currently tolerates; and that there be tested code to supply the rest, ready to be enabled when dak is changed to tolerate it. Thus these additional controls can be implemented as and when ftpmaster consider them necessary.

In particular, we believe that dak would currently reject the new .git-tag file if it were included in uploads. So we will arrange to be able to provide it once dak allows it, but will not enable it right away.