Unable to upload packages to an RPM repo if they are already stored in Pulp

Problem:
Hi folks,
I am trying to create custom “Tools” repos, where we provide a collection of packages for different OS versions.

Example:
Tools_RHEL8
Tools_RHEL9

If I try to add a package to both repos, the upload fails with the error “There is already a package with …”
and the package is not added to the repo.

I’m using pulp-cli to do that:
“pulp rpm content upload --file $filename --repository $repo_name”

Example:
UPLOAD : of file /repos/EL-9-ULS-ABX/Packages/a/atop-2.7.1-1.el9.x86_64.rpm FAILED!
Started background task /pulp/api/v3/tasks/504f743d-0265-4433-ba00-bb05b07d86c9/
Error: Task /pulp/api/v3/tasks/504f743d-0265-4433-ba00-bb05b07d86c9/ failed: ‘{‘non_field_errors’: [ErrorDetail(string=‘There is already a package with: arch=x86_64, checksum_type=sha256, epoch=0, name=atop, pkgId=eacccf6df00b444c68259f15ff65db2c99b35cd571ce18fee463e02cfe4247ce, release=1.el9, version=2.7.1.’, code=‘invalid’)]}’

Note: The example here is a package from the EPEL repo, which we sync via a remote, so I am not able to upload this package and add it to the repo. But there are also custom packages that are identical across several OS versions.

Expected outcome:
I expect the upload to complete (maybe with a warning). Since the package was identified as a duplicate, it should still be added to the repo, but of course not stored twice!

Do I really have to deal with artifacts / content manually?

pulp-cli version:
pulp-cli 0.19.0

Pulpcore version:
"versions": [
{
"component": "core",
"version": "3.23.3",
"package": "pulpcore",
"domain_compatible": true
},

Pulp plugins installed and their versions:
{
"component": "rpm",
"version": "3.19.5",
"package": "pulp-rpm",
"domain_compatible": false
}

Operating system - distribution and version:
RHEL9

Other relevant data:

There is a feature provided by pulpcore that plugins can opt into. It changes the upload behavior from “If the thing the user is trying to create exists already, I will throw an error” to “If it already exists, I return its href to the user and do the things the user expects me to do”.

IMHO pulp_rpm (and all plugins) should implement this ASAP, since offloading this logic onto users is highly inconvenient and counterintuitive. If all plugins implemented this, a bunch of highly complicated logic could be stripped from Katello as well. Please consider prioritizing this!

Edit: This should be handled directly in Pulp and not on the Pulp CLI level either!

Have any plugins opted into this feature? I would expect such a change to require bumping the major version, since plugins should be using semver and this would be a backwards incompatible change. For one thing, it would break our system, as it expects to encounter an error if a duplicate exists. I do like the idea that Pulp just returns the existing content unit if it exists, but my concern is whether plugins are abiding by semver.

@davidd pulp_deb has this merged to the main branch, but it is not yet released. See: https://github.com/pulp/pulp_deb/pull/669

I just retested using this endpoint, though several more pulp_deb content endpoints use this change.

The behavior is that no matter how often you try to create the same content, the task will always report back having created that content. If you also provided the repository parameter, and the repository does not yet have the content in its latest version, the result will look like this:

{
  ...
  "created_resources": [
    "/pulp/api/v3/repositories/deb/apt/695b6158-36b1-4a7a-bf28-c97e77961960/versions/1/",
    "/pulp/api/v3/content/deb/packages/e3fd9afe-bc66-4e7e-bbf1-9073e721e86b/"
  ],
  ...
}

If the repository already has the content, no new repo version will be created, and no new version_href will be reported back.
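For anyone consuming this from a script, here is a minimal sketch of how the two cases above could be told apart. It only looks at the `created_resources` list from the task shown above; the function name `split_created_resources` is made up for illustration, and the href matching is an assumption based purely on the shape of the example hrefs.

```python
from typing import Iterable, List, Optional, Tuple


def split_created_resources(
    created_resources: Iterable[str],
) -> Tuple[List[str], Optional[str]]:
    """Split a task's created_resources into content hrefs and a repo version href.

    Hypothetical helper: it classifies hrefs by their path segments, which is
    an assumption derived from the example output in this thread.
    """
    content_hrefs: List[str] = []
    repo_version_href: Optional[str] = None
    for href in created_resources:
        if "/repositories/" in href and "/versions/" in href:
            # A new repository version was created for this upload.
            repo_version_href = href
        elif "/content/" in href:
            # The content unit (reported even if it already existed).
            content_hrefs.append(href)
    return content_hrefs, repo_version_href


if __name__ == "__main__":
    example = [
        "/pulp/api/v3/repositories/deb/apt/695b6158-36b1-4a7a-bf28-c97e77961960/versions/1/",
        "/pulp/api/v3/content/deb/packages/e3fd9afe-bc66-4e7e-bbf1-9073e721e86b/",
    ]
    content, version = split_created_resources(example)
    print("content:", content)
    print("new repo version:", version)
```

If the second return value is `None`, the repository already had the content and no new version was created.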

I did not consider that this change would require a major version bump according to semver conventions. I suppose if somebody is using the upload error to check for the existence of the package, then this will break that workflow. On the other hand, wouldn't the intuitive way to check for the existence of a package be to query for it? I find this “try to upload and catch the error” workflow to be super painful myself.
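To illustrate the "query for it" approach, here is a rough sketch for the RPM case from the original post. Treat the details as assumptions: I am assuming the RPM package list endpoint lives at `/pulp/api/v3/content/rpm/packages/` and accepts a `sha256` filter, the helper names are made up, and authentication is left out entirely.

```python
import hashlib
import json
import urllib.request
from typing import Optional
from urllib.parse import urlencode

# Assumed endpoint path for RPM package content.
PACKAGES_ENDPOINT = "/pulp/api/v3/content/rpm/packages/"


def sha256_of_file(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Return the hex sha256 digest of a local file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def package_query_url(base_url: str, sha256: str) -> str:
    """Build the list URL, filtering by checksum (filter name is an assumption)."""
    query = urlencode({"sha256": sha256})
    return f"{base_url.rstrip('/')}{PACKAGES_ENDPOINT}?{query}"


def first_href(listing: dict) -> Optional[str]:
    """Pick the pulp_href of the first result of a paginated list response."""
    results = listing.get("results") or []
    return results[0]["pulp_href"] if results else None


def find_package_href(base_url: str, path: str) -> Optional[str]:
    """Query Pulp for a package matching the local file (no auth handling)."""
    url = package_query_url(base_url, sha256_of_file(path))
    with urllib.request.urlopen(url) as response:
        return first_href(json.load(response))
```

Note that with RBAC in place such a query may return nothing even though the content exists, so this is not a reliable replacement for the upload behavior discussed here.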

Thanks for the heads up. I 100% agree that erroring when uploading a duplicate package is super painful. This change simplifies our code, so I’m also in favor of it, but our code currently depends on an error being raised, so I’ll need to update it.
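As a rough idea of what such an update could look like, here is a sketch of client code that tolerates both behaviors. It is not our actual code. It assumes the finished task is available as a dict with `state` and, on failure, an `error` entry containing a `description` string, and it matches on the error text quoted at the top of this thread; the function name is made up.

```python
# Text taken from the error message quoted in the original post.
DUPLICATE_MARKER = "There is already a package with"


def classify_upload_task(task: dict) -> str:
    """Classify a finished upload task as 'uploaded' or 'duplicate'.

    Hypothetical helper: the task field names ('state', 'error',
    'description') are assumptions about the task representation.
    """
    state = task.get("state")
    if state == "completed":
        # New behavior: duplicates also complete successfully.
        return "uploaded"
    if state == "failed":
        description = (task.get("error") or {}).get("description", "")
        if DUPLICATE_MARKER in description:
            # Old behavior: the duplicate is reported as an error.
            return "duplicate"
        raise RuntimeError(f"upload failed: {description}")
    raise ValueError(f"task is not finished yet (state={state!r})")


if __name__ == "__main__":
    print(classify_upload_task({"state": "completed", "created_resources": []}))
    print(
        classify_upload_task(
            {
                "state": "failed",
                "error": {"description": "There is already a package with: name=atop"},
            }
        )
    )
```

Code written this way keeps working whether or not the plugin has opted into the new behavior, at the cost of matching on an error string.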

I see that pulp_file has already shipped this change. Again, I would have expected a bump in the major version to signal the behavior change, but a “removal” entry in the changelog would have sufficed (not sure if plugins are using the latter, but I know pulpcore does). I filed an issue against pulp_file to ask for greater visibility when features with backwards incompatible changes are released.

Plugins can define removal changelog messages, e.g. see the Changelog page of the Pulp deb Support 2.21.0.dev documentation. Since this has not yet been released for pulp_deb, we could still add one. On my part (and probably in the pulp_file case as well) this was simply an oversight. I did not think through the implications.

Maybe to add some context: In the face of RBAC, you may not even be able to see that the content you want to upload exists in Pulp, so throwing an error would completely prevent you from adding that content (which someone else owns) to your repositories. That is why I believe we needed to make this change.
Maybe what you are saying is that adopting RBAC in a plugin (accompanied by this change) is a good reason to bump the major version.

Yes, or at least a “removal” entry.

See here: https://github.com/pulp/pulp_deb/pull/759

@davidd Do you want to have a look at it before I merge it?