Enforce package (deb and rpm) signatures

Stephen_Herr · February 14, 2023, 4:41pm

I’d like to propose (and help contribute) a new feature where Pulp is able to enforce a policy of THOU SHALT NOT UPLOAD UNSIGNED (deb and rpm) PACKAGES. I’m mostly interested in an installation-wide enforcement of that policy; however, if it were configurable so that you could selectively enable it only for particular repos, that’s fine too.

I started the discussion in pulp-dev and was told to bring it here, so I’ll just briefly summarize / link things that have already been talked about:

There exists today a SigningService, which allows you to call out to a shell script to sign packages. Maybe that could be extended to provide verification services too?
Would people want different signatures for eg beta repos? Maybe a default key and you allow per-repo overrides?
A way to make the verification script location more obvious / less confusing?

quba42 · June 23, 2022, 10:16pm

Ok, I just saw the discussion on matrix, and I have a few thoughts to add:

In the Debian world, it is pretty unusual to sign individual packages (though I believe it is possible), but almost unheard of not to sign your repository metadata. This provides integrity for the packages as well, since they are all referenced by checksum from the repository metadata. As a result, I am a little dubious about how widespread a use case this is for pulp_deb (of course I am not opposed to providing options).
Second: pulp_deb already provides some amount of repository metadata signature verification if the user provides a GPG key to a remote. The mechanism is currently pretty basic and could be improved in various ways, but it does already exist. (The way it works is that if a gpgkey is provided during sync, any Release files that cannot be verified via that gpgkey are simply discarded, which can cause nothing to be synced.)
I don’t think the analogy between signing service and some kind of proposed “verification service” makes sense. Signing services, with their messy script placement, only exist because we did not want Pulp to take responsibility for handling users’ secret keys, and that is the interface we came up with. For verification, we don’t need secret keys, so we don’t need to jump through any hoops. Simply having a field on some model for users to provide keys to verify with is sufficient. What that verification actually entails will most likely be plugin specific. It will be for plugin authors to decide, so no external scripts are needed.

dralley · June 23, 2022, 11:28pm

There’s an existing issue for the RPM plugin here, it hasn’t gotten much attention as of yet but it’s certainly a reasonable feature request and we’d be glad to work with you on implementing it.

I don’t know how much infrastructure could be shared between Deb and RPM. Probably not much, although the general approaches can be shared.

Some open questions:

Do we want to enforce that you can’t ever upload a raw artifact without a signature, or that you can’t create a Package content without a signature? The former would be much more challenging.
There is no good API currently for verifying RPM signatures, so probably you need to shell out to rpm --checksig on the command line from within Pulp. But, I don’t immediately see any need for this to require a shell-script based API or anything like that.

In Debian world, it is pretty unusual to sign individual packages (though I believe it is possible), but almost unheard of not to sign your repository metadata. This provides integrity for the packages as well, since they are all referenced by checksum from the repository metadata.

In the RPM world it’s nearly the exact opposite, RPMs are typically signed, but the metadata itself is often not (although it is becoming more common over time). The idea being, wherever you get the package from you can be confident in its authenticity.

I don’t think the analogy between signing service and some kind of proposed “verification service” makes sense.

Agreed about this. There shouldn’t be any overlap between signing and verification.

x9c4 · June 24, 2022, 6:25am

You are taking the words right out of my mouth. I’m usually phrasing: Signing is hard. Verifying is easy. Let’s not make verifying hard for the sake of symmetry.
The plugin is really the place where the knowledge about a certain type of content and what a signature looks like converges. So I’d expect us to go the Pulp way, by looking at the pulp_deb implementation (verifying Release.gpg and InRelease), develop the feature in the corresponding plugins (ansible, container, deb and rpm should be in the loop here) and see if there are any valuable primitives to factor out into pulpcore.
Now that I’m thinking about this, I think we have some validation in pulp_ansible and pulp_container already.
What I really like is the idea of defining verification policies. We should try to make them consistent across plugins.

bmbouter · June 24, 2022, 2:19pm

+1 to the idea of using the strategy of pioneering this in the plugins and if there are commonalities, later moving them to pulpcore. If we do this, it will allow @Stephen_Herr to basically work with the RPM and DEB teams to define this feature and implement it more quickly than consuming the code from pulpcore.

Big +1 to no external scripts. They don’t make sense for verification; they just aren’t needed. I expect it’ll basically be plugin-provided code that takes in a public key and verifies it based on the semantics of how the signature is provided.

Stephen_Herr · February 14, 2023, 4:41pm

Something I learned after I submitted this is that Microsoft’s policy is that we should be upfront about working for Microsoft when interacting with open source communities, so let me state that right now: I work for Microsoft and this request stems from that fact. Microsoft has a very deep interest in solving the “supply chain security” problem.

So yes, we will be signing debs, even if the community doesn’t typically check for signatures on individual debs (debsig-verify is maintained by dpkg.org, so yes there is tooling for that). We will also be signing yum repodata, even if the yum world doesn’t typically check for that. We will sign all the things, and want to enforce signing all the things as a matter of policy.

I am unopinionated on this topic. In my use-case the packages are built and signed before people would be interested in pushing them to Pulp, so it seems easier to me to just reject the package upload immediately. But I understand that not everyone will have the same workflow.

This is my understanding too (thanks for the issue link). Which means that if the pulp_rpm plugin wants to validate signatures directly, the project will have a dependency on rpm being installed. Which maybe isn’t that big of a deal, rpm is actually available on deb-based systems too.

On the deb side debsig-verify is not available on rpm-based systems, however deb signing is actually a relatively simplistic thing, it just cats the content of the ar archive together and adds a detached signature. You can recreate the check that debsig-verify does with a 3-line bash script that requires only gpg and ar (and I think there are python libs that can replace ar, so you probably don’t even need that dependency) that looks something like this:

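# Unpack the ar archive (assumes exactly one .deb in the current directory)
ar x *.deb
# Concatenate the signed members in the order debsig-verify uses
cat debian-binary control.* data.* > combined
# Check the detached signature stored in the _gpgorigin member
gpg --verify _gpgorigin combined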
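
The same steps can be sketched in Python without the ar binary. This is only a sketch under stated assumptions: the function names are hypothetical, the parser handles only the common ar layout with short member names (no GNU long-name table), and it stops at producing the signed blob and the detached signature. The actual OpenPGP check (gpg --verify or a library) is left out.

```python
AR_MAGIC = b"!<arch>\n"


def read_ar_members(data: bytes) -> dict[str, bytes]:
    """Parse an in-memory ar archive into {member name: content}."""
    if not data.startswith(AR_MAGIC):
        raise ValueError("not an ar archive")
    members = {}
    offset = len(AR_MAGIC)
    while offset + 60 <= len(data):
        header = data[offset:offset + 60]
        if header[58:60] != b"`\n":
            raise ValueError("corrupt ar member header")
        # Names are space padded; GNU ar also appends a trailing "/".
        name = header[0:16].decode("ascii").rstrip(" ").rstrip("/")
        size = int(header[48:58].decode("ascii").strip())
        start = offset + 60
        members[name] = data[start:start + size]
        # Member data is padded to an even length.
        offset = start + size + (size % 2)
    return members


def debsig_payload(deb_bytes: bytes, sig_member: str = "_gpgorigin") -> tuple[bytes, bytes]:
    """Return (signed blob, detached signature) for a debsig-style signed .deb.

    Hypothetical helper: mirrors `cat debian-binary control.* data.*`.
    """
    members = read_ar_members(deb_bytes)
    if sig_member not in members:
        raise ValueError(f"package has no {sig_member} member, so it is unsigned")
    control = [n for n in members if n.startswith("control.")]
    payload = [n for n in members if n.startswith("data.")]
    if "debian-binary" not in members or len(control) != 1 or len(payload) != 1:
        raise ValueError("unexpected .deb layout")
    combined = members["debian-binary"] + members[control[0]] + members[payload[0]]
    return combined, members[sig_member]
```

The two returned byte strings correspond to the `combined` file and the `_gpgorigin` member in the shell version, so they can be handed to whatever signature check the plugin ends up using.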

x9c4 · June 24, 2022, 2:43pm

As part of the verification effort in pulp_container, we have already moved a helper into pulpcore: from pulpcore.plugin.util import verify_signature.

dralley · June 24, 2022, 3:24pm

It might also be possible to leverage https://github.com/Richterrettich/rpm-rs with some Python bindings, which I have some experience with doing. But let’s call that a backup plan for now, since the exact method of doing the verification is more of an implementation detail than the enforcement mechanisms Pulp would need to build around it.

quba42 · June 25, 2022, 8:37am

Now that the broad approach has been mostly hammered out, I have a few more thoughts on implementation:

I gather from the above discussion, there need to be at least four possible values for something called verification_policy:

none: Do not require verification for anything (probably the default)
packages_only: Require signatures for packages but not for metadata.
metadata_only: Require signatures for metadata but not for packages
strict: Require signatures for both metadata and packages
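
To make the four values concrete, here is a minimal sketch of how they might be modelled. The class name and the two helper properties are hypothetical and are not part of any Pulp plugin; only the four value strings come from the list above.

```python
from enum import Enum


class VerificationPolicy(str, Enum):
    """Hypothetical model of the proposed verification_policy values."""

    NONE = "none"
    PACKAGES_ONLY = "packages_only"
    METADATA_ONLY = "metadata_only"
    STRICT = "strict"

    @property
    def requires_package_signatures(self) -> bool:
        # Packages must be signed under packages_only and strict.
        return self in (VerificationPolicy.PACKAGES_ONLY, VerificationPolicy.STRICT)

    @property
    def requires_metadata_signatures(self) -> bool:
        # Metadata must be signed under metadata_only and strict.
        return self in (VerificationPolicy.METADATA_ONLY, VerificationPolicy.STRICT)
```

A plugin could then branch on `VerificationPolicy(value).requires_package_signatures` instead of comparing strings in several places.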

Obviously not all of these make sense for all plugins (but they probably do for both deb and rpm). Some open questions:

Does that cover everything?
- Is there perhaps a use case for differing verification policies depending on whether a given package is obtained via sync or via upload?
- Where it is possible for things to be signed using multiple, or different GPG keys (within a single upstream repo), do users need to be able to specify that a specific key needs to have been used for a specific thing? (This could get really complicated fast…)
Where should the verification policy be set? Possible places might include “Pulp instance wide setting”, “Plugin wide setting”, on a repository, on a remote. (Remote does not capture package upload, so repository probably makes more sense?) A combination is also plausible, so perhaps users could set a plugin wide setting, and then override that plugin wide setting for individual repositories or remotes.
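
The "combination" option could look roughly like the following sketch, where the most specific setting wins. Everything here is an assumption for illustration: the function name, the parameter names, and the precedence order (repository, then plugin, then instance) are not decided anywhere in this thread.

```python
from typing import Optional

VALID_POLICIES = ("none", "packages_only", "metadata_only", "strict")


def effective_policy(
    repository_policy: Optional[str] = None,
    plugin_policy: Optional[str] = None,
    instance_policy: Optional[str] = None,
) -> str:
    """Return the first policy that is set, from most to least specific."""
    for candidate in (repository_policy, plugin_policy, instance_policy):
        if candidate is None:
            continue  # not set at this level, fall through to the next one
        if candidate not in VALID_POLICIES:
            raise ValueError(f"unknown verification policy: {candidate!r}")
        return candidate
    # Nothing configured anywhere: fall back to the proposed default.
    return "none"
```

For example, `effective_policy(None, "packages_only", "strict")` resolves to the plugin-wide value, while a repository that sets its own value overrides both.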

Whatever we choose here, I feel the interface for “verification policies” between pulp_deb and pulp_rpm should try to be identical.

My other thought concerns the implementation on the pulp_deb side: The first thing to check is whether the python-debian library we are using already provides functions for verifying signed packages. If yes, we can simply use that. If no, we might even consider opening an MR on the library, rather than simply adding that code to pulp_deb. (I have successfully contributed a small change to this library before. The only caveat is that the release cycle is a bit slow.) Even if we don’t end up going this route, we will almost certainly want to use the library to extract any constituent parts from the .deb package file.

Regarding the verification of pulp_deb repo metadata during sync, I think pulp_deb currently only allows a single GPG key to be specified for checking the signatures. However, it is perfectly legal for different distributions/releases within a single APT repo to be signed using different keys. I think it is also possible for a single distribution/release to be signed with multiple keys. This is just something to keep in mind while working on this feature.

@Stephen_Herr If you do start working on this, I am probably the person to ask any pulp_deb specific questions you may have. I would also encourage you to open a draft PR against pulp_deb early (even with a non-functional initial state), just to start that conversation.

Stephen_Herr · January 26, 2023, 4:39pm

I have some extra time again, so I’m starting to look at this feature once more. I’m taking to heart the advice to focus on specific features of specific plugins, so I’m starting with pulp_rpm. Signing Services of course already exist, so the most fundamental feature that is still missing is the ability to specify / validate / enforce that RPMs are signed by a particular key(s). I’m starting to look at potential implementations, and have some thoughts that people can chime in on.

My first thought on implementation was that we could add a new text field on RpmRepository that is a (concatenated list of) public key(s) that we expect packages in this repo to be signed with, and then add an extra validation step in the finalize_new_version hook that shells out to rpmkeys to verify that the added packages are actually signed by one of those keys. This has the benefit of being relatively simple with straightforward semantics; it only applies to changes made after the user has set a public key. The downside though is that it requires Pulp to write the RPMs to the local disk every time they are added to a repository, which seems sub-optimal with all the different variety of storage backends Pulp supports.
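
A rough sketch of the rpmkeys approach, under several assumptions: the function names are hypothetical, rpmkeys is on the PATH, and the output is assumed to follow the summary format of recent rpm versions ("pkg.rpm: digests signatures OK"). One detail worth handling explicitly is that an unsigned package can still report "digests OK" with a zero exit status, so the exit status alone is not enough.

```python
import subprocess
import tempfile
from pathlib import Path


def signature_check_passed(checksig_output: str) -> bool:
    """Interpret one summary line of `rpmkeys --checksig` (assumed format)."""
    summary = checksig_output.strip().rsplit(":", 1)[-1].split()
    # "digests signatures OK" passes; "digests OK" (unsigned) and
    # "digests SIGNATURES NOT OK" (bad or unknown key) do not.
    return "signatures" in summary and "NOT" not in summary and summary[-1:] == ["OK"]


def verify_rpm(package_path: str, public_keys: list[str]) -> bool:
    """Check a package against the given ASCII-armored keys in a throwaway keyring."""
    with tempfile.TemporaryDirectory() as workdir:
        dbpath = Path(workdir) / "rpmdb"
        dbpath.mkdir()
        for index, key in enumerate(public_keys):
            key_file = Path(workdir) / f"key-{index}.asc"
            key_file.write_text(key)
            subprocess.run(
                ["rpmkeys", "--dbpath", str(dbpath), "--import", str(key_file)],
                check=True,
            )
        result = subprocess.run(
            ["rpmkeys", "--dbpath", str(dbpath), "--checksig", package_path],
            capture_output=True,
            text=True,
        )
    return result.returncode == 0 and signature_check_passed(result.stdout)
```

Using a temporary --dbpath keeps the repository's trusted keys separate from whatever keys the host system has imported, which is the property the per-repository key list needs.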

Another option would be to store some kind of thumbprint for the RPM on the Package, and then allow the user to set a list of acceptable thumbprints on the repo (or, potentially, upload a list of acceptable public keys and generate the thumbprints ourselves). Then the finalize_new_version hook can simply compare the package metadata with the list, and doesn’t need to fetch the RPM itself. However, this approach seems to have a lot more unknowns and corner cases. Is there a field in the RPM metadata already that would serve as a consistent and reliable thumbprint of the public key? Do we care about verifying the thumbprint, or just trusting it? (If verifying, does that require that we have imported all the public keys we’ll care about before uploading rpms?) What about packages that were created before the patch was applied?
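
The comparison step of the thumbprint approach might be sketched as follows. The function names and the shape of the inputs are hypothetical; the only external fact relied on is that an OpenPGP v4 key ID is the low-order 64 bits (the last 16 hex digits) of the 40-digit fingerprint.

```python
from typing import Optional


def key_id_from_fingerprint(fingerprint: str) -> str:
    """Return the 16-hex-digit key ID of a v4 OpenPGP fingerprint."""
    cleaned = fingerprint.replace(" ", "").lower()
    if len(cleaned) != 40 or any(c not in "0123456789abcdef" for c in cleaned):
        raise ValueError("expected a 40-digit hexadecimal v4 fingerprint")
    return cleaned[-16:]


def unapproved_packages(
    package_key_ids: dict[str, Optional[str]], allowed_key_ids: set[str]
) -> list[str]:
    """List packages that are unsigned (None) or signed by a key not on the list."""
    allowed = {key_id.lower() for key_id in allowed_key_ids}
    return sorted(
        name
        for name, key_id in package_key_ids.items()
        if key_id is None or key_id.lower() not in allowed
    )
```

A hook could reject the new repository version whenever `unapproved_packages(...)` returns a non-empty list, without fetching any RPM from storage.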

If anyone has suggestions or additional points in the process that I should consider I’m all ears.

Conan_Kudo · January 27, 2023, 12:51pm

RPM metadata not being signed is mostly caused by Red Hat infrastructure not being capable of doing it, despite many customers asking for it over the years. Same goes for Fedora. SLE and openSUSE sign their repository metadata because their infrastructure can just do it.

This is actually quite similar to the BZ I have on DNF about gpgcheck policy. I would be very happy to see this in Pulp.

Stephen_Herr · February 14, 2023, 4:41pm

I have a PR available for validating the signatures of RPMs before adding them to pulp_rpm repos. Feedback welcome. This invites a discussion about Signature Verification vs Thumbprint Checking.

Actual cryptographically-secure Signature Verification is harder for generic pulp_rpm installations. It requires:

That you have the rpm/rpmkeys binary installed, which is not currently required. It’s not clear to me how easy or acceptable that requirement would be.
You’d have to have the actual public key imported into pulp before you ingest RPMs. It would then create a new keyring for rpmkeys, import the public key(s), and then verify that the packages are signed appropriately.

Similarly for debs, you have to either install a binary (which is not available on rpm-based systems) or recreate the process that the binary follows: unpack the deb, cat the contents back together in a particular order, and then gpg --verify that blob with the detached signature in the deb.

Thumbprint checking by contrast is simpler. The signature of signed rpms is available in the rpm header and inside the unpacked deb, so you just have to read it with PGPy or similar to fetch the public key ID that generated the signature.

The gpgcheck=1 option in the repodata already requires yum/dnf to cryptographically verify that the packages are signed by an already-installed public key. So maybe all we really care about at upload time is a “data sanity-check” to ensure that no one accidentally uploads an unsigned package or something.

If all we’re doing is Thumbprint Checking then a motivated attacker could potentially get bad packages into Pulp in two ways:

They could generate a keypair whose 16-digit hex public key ID collides with one that you’ve set as trusted for the repo. That’s an address space of about 18.4 quintillion (2^64) possible key IDs, but RFC 4880 does explicitly say that you should not assume they’re unique. The attacker would then also have to get the clients to import their public key, or turn off gpg checking on the repo.
We are not actually verifying the signature, just reading it, so it may actually be possible that the attacker just edited the public key id to be whatever they wanted. I am not a crypto expert so I don’t know if that’s possible. In that case though the package should not be validatable anywhere regardless of what pub keys you’ve imported.
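
For scale, the number of distinct 16-hex-digit key IDs can be computed directly. This is plain arithmetic and says nothing about how hard it is in practice to generate a key with a chosen ID.

```python
# A key ID is 16 hexadecimal digits, i.e. 64 bits.
key_id_hex_digits = 16
key_id_space = 16 ** key_id_hex_digits

assert key_id_space == 2 ** 64
print(key_id_space)           # 18446744073709551616
print(f"{key_id_space:.3e}")  # 1.845e+19
```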

But if we’re assuming that this is just a server-side data-sanity checking feature and the actual security verification is done client-side then maybe that’s an acceptable risk.

x9c4 · February 12, 2023, 2:28pm

For starters, the final validation needs to be done by the client. I think there is no safe alternative to that. For proper metadata validation performed in pulp_deb, we require the user to put the whole key onto the repository model (or is it the remote?), so there is no key-id-space attack.
So the question I read from your post is: should we (on upload of the detached signature?) unpack the package and verify the signature? On sync, we do not have a chance to do just that, because in deferred download land, we do not get to see the artifacts to verify.

BTW, there is a util in pulpcore to help with the key handling for verification: https://github.com/pulp/pulpcore/blob/4b5c5c0b91aea140094430ed12cdf6a14bc49189/pulpcore/app/util.py#L218

Stephen_Herr · February 14, 2023, 4:35pm

Agreed, I think anything we do in pulp needs to be in addition to the gpg verification in the clients.

That gpg_verify method would be useful for verifying deconstructed debs, but unfortunately I don’t think it’s useful for rpms. And I would assume that we’d want the behavior to be consistent across both plugins.

Can you (or someone else) tell me more about the “deferred download” sync? The pulp_rpm PR above does assume that the file is actually available at one point in the sync. If that’s not a reliable assumption then what could I do instead?

ipanova · February 14, 2023, 6:18pm

@Stephen_Herr https://github.com/pulp/pulpcore/blob/main/docs/workflows/on-demand-downloading.rst Deferred download does not download the RPM binary data; it only parses the repo metadata (repomd, primary, etc.). Based on this information it creates RPM content without an artifact, but with a reference to a RemoteArtifact that contains all the information about how and where to retrieve the binary data. So when a client asks for a certain RPM that is not immediately available, Pulp will, with the help of the RemoteArtifact, download the data and serve it back to the client. That’s why we need to add a check and refuse sync task initiation given certain remote/repo option combinations: https://github.com/pulp/pulp_rpm/pull/2954#discussion_r1106058257
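
Such a pre-sync check might look roughly like this sketch. The function, the exception, and the `require_package_signatures` flag are hypothetical names for illustration; "immediate", "on_demand" and "streamed" are the download policy values a Pulp remote can have.

```python
# Download policies under which Pulp does not fetch package files during sync.
DEFERRED_DOWNLOAD_POLICIES = {"on_demand", "streamed"}


class IncompatibleSyncOptions(ValueError):
    """Raised when a sync cannot honour the requested signature checks."""


def check_sync_options(remote_download_policy: str, require_package_signatures: bool) -> None:
    """Refuse to start a sync whose packages could never be verified."""
    if require_package_signatures and remote_download_policy in DEFERRED_DOWNLOAD_POLICIES:
        raise IncompatibleSyncOptions(
            "package signatures cannot be verified during sync with download "
            f"policy {remote_download_policy!r}; use 'immediate' instead"
        )
```

Failing before the task starts gives the user a clear error, rather than a repository version whose packages were never actually checked.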

Chr1st0f · June 12, 2024, 2:22pm

Hello Quba42,
I have a question regarding this verification_policy feature.
I have installed Pulp on k8s with the operator.
I have the metadata issue in my deb repositories, so I found your message very interesting.
However, I couldn’t find how to set this value.
I have only found a similar parameter for the operator (Content Checksums - Pulp Operator), which has a different name (allowed_content_checksum).
I have changed this value and the creation of indexes is in progress.

Do you think that is correct? Is it the same parameter?
Thank you for your help.
Chr1st0f

quba42 · June 12, 2024, 2:28pm

There is no verification_policy parameter. It is a proposal we have discussed in this forum.