About rpm publications and cleanup

In our Pulp 3 instance, a cron job syncs the repos once per day. retain_repo_versions is set to 1. For each rpm repo, the cron job does the following:

  • sync
  • publication (at this step a new publication is created even when nothing in the repo has changed)
  • distribution

For the moment, we have 39 rpm repos. pulp rpm distribution list shows 39 rpm distributions, which matches the total number of repos.

pulp rpm publication list shows 431 publications, far more than that, which is expected since the cron job creates a publication on every run. Repos that change less frequently accumulate more "duplicate" publications. For example, rocky8_extras has 2 versions, but 47 rpm publications point to the same version 1:

     1     "repository_version": "/pulp/api/v3/repositories/rpm/rpm/019097b6-a5db-7d0d-bf25-4cf2f9aa0c4a/versions/0/",
    47     "repository_version": "/pulp/api/v3/repositories/rpm/rpm/019097b6-a5db-7d0d-bf25-4cf2f9aa0c4a/versions/1/",
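
A tally like the one above can be produced from the CLI output. This is a minimal sketch, assuming a configured pulp-cli whose JSON output carries "repository_version" fields as shown; the helper name count_publications_per_version is made up for illustration, and --limit is passed because list results are paginated.

```shell
# Count publications per repository version from pulp-cli JSON on stdin.
# Works on both pretty-printed and single-line JSON.
count_publications_per_version() {
  grep -o '"repository_version": *"[^"]*"' | sort | uniq -c | sort -n
}

# Usage (needs a reachable Pulp instance):
#   pulp rpm publication list --limit 1000 | count_publications_per_version
```

Versions with a count above 1 are the ones carrying duplicate publications.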

I am wondering: is it worth cleaning up the rpm publications to remove the duplicates? Or should I reduce the sync frequency for these repos? Or am I worrying for nothing because this is normal and I can just leave it as it is?

I suppose an rpm publication doesn’t take much disk space?

Pulpcore version:
"component": "core",
"version": "3.54.1",
"package": "pulpcore",
"module": "pulpcore.app",
"domain_compatible": true

Pulp plugins installed and their versions:
{
"component": "rpm",
"version": "3.27.1",
"package": "pulp-rpm",
"module": "pulp_rpm.app",
"domain_compatible": true
},

Operating system - distribution and version:
RedHat 9
Other relevant data:

Any reason you are not using the autopublish feature? It automatically creates a new publication whenever a sync creates a new repo version. Publications created by autopublish are then made available automatically as soon as they are created.

This way you will avoid all the publication duplicates.

Hello @ipanova, thanks for your reply.

I didn’t understand how autopublish works, which is why I didn’t use it :stuck_out_tongue:

So, according to your explanation, I understand it like this:

  1. create a repo with --autopublish enabled, a remote, and retain_repo_versions 1
  2. then run rpm repository sync:
     • if the sync brings new updates, a new repo version is created, and thus a new rpm publication
     • if the sync brings no new updates, there is no new repo version and no new rpm publication
     • how do I get the repo version href? With pulp rpm repository version list?
  3. now how do I do the rpm distribution part? Should I use --publication or --repository?
     The pulp-cli rpm distribution help mentions auto-distributing. Can you explain more about it?

The pulp-cli help says:
pulp rpm distribution update --help
Usage: pulp rpm distribution update [OPTIONS]

  Update a rpm distribution.

Options:
  --href TEXT                     HREF of the rpm distribution
  --name TEXT                     Name of the rpm distribution
  --distribution TEXT             A resource to look for identified by <name>
                                  or by <href>.
  --publication TEXT              Publication to be served. This will unset
                                  the 'repository' and disable auto-
                                  distribute.
  --generate-repo-config / --no-generate-repo-config
                                  Option specifying whether ``*.repo`` files
                                  will be generated and served.
  --repository TEXT               Repository to be used for auto-distributing.
                                  When set, this will unset the 'publication'.
                                  Specified as '[[<plugin>:]<type>:]<name>' or
                                  as href.
  --content-guard TEXT            Content Guard used to protect the
                                  distribution. Specified as
                                  '<plugin>:<type>:<name>' or as href.
  --labels TEXT                   JSON dictionary of labels to set on rpm
                                  distribution (or @file containing a JSON
                                  dictionary)
  --base-path TEXT
  --help                          Show this message and exit.
  1. create repo, set autopublish=true, retain_repo_versions=1
  2. create remote
  3. assign remote to repo
  4. create distribution and assign the created repo to it via:
--repository TEXT               Repository to be used for auto-distributing.
                                  When set, this will unset the 'publication'.
                                  Specified as '[[<plugin>:]<type>:]<name>' or
                                  as href.
  5. sync repo.

That’s all. Whenever new content is available, a new repo version, and hence a new publication, is created, and the distribution automatically serves the latest publication.
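
The five steps can be sketched with pulp-cli roughly as follows. This assumes a configured pulp-cli; the function name, its two arguments (repo name and upstream URL), and the choice to reuse the repo name for the remote, the distribution, and the base path are illustrative assumptions, not something prescribed in this thread.

```shell
# Sketch only: create an autopublished repo from scratch.
# $1 = name reused for repo, remote and distribution (assumption)
# $2 = upstream repository URL
setup_autopublished_repo() {
  local name="$1" url="$2"
  # 1. repo with autopublish on, keeping only the latest version
  pulp rpm repository create --name "$name" --autopublish --retain-repo-versions 1 || return 1
  # 2. remote pointing at the upstream repo
  pulp rpm remote create --name "$name" --url "$url" || return 1
  # 3. assign the remote to the repo
  pulp rpm repository update --name "$name" --remote "$name" || return 1
  # 4. distribution that follows the repo instead of a fixed publication
  pulp rpm distribution create --name "$name" --base-path "$name" --repository "$name" || return 1
  # 5. sync; a new repo version triggers a new publication
  pulp rpm repository sync --name "$name"
}

# Usage (hypothetical values):
#   setup_autopublished_repo rocky8_extras https://example.com/rocky/8/extras/
```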

In your case, so that you do not have to create objects from scratch, just update the repo to set autopublish to true, and update the distribution to point to the repo href instead of a publication href.
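
For an existing setup, that change might look like the sketch below. It assumes a configured pulp-cli and that each distribution has the same name as its repo; the function name switch_to_autopublish is hypothetical.

```shell
# Sketch only: convert an existing repo/distribution pair to autopublish.
# $1 = name shared by the repo and its distribution (assumption)
switch_to_autopublish() {
  local name="$1"
  # Publish automatically whenever a new repo version is created
  pulp rpm repository update --name "$name" --autopublish || return 1
  # Point the distribution at the repo; per the CLI help this unsets 'publication'
  pulp rpm distribution update --name "$name" --repository "$name"
}

# Usage:
#   switch_to_autopublish rocky8_extras
```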

Hello @ipanova, sorry for the late reply, I was occupied with other things for the last two weeks.

I made the change as you said, and autopublish works!!! Now a cron job that only syncs the repos is enough. That simplifies a lot of things, thanks :slight_smile:

I have another question. For a repo where we only upload rpm packages manually, with no sync, this is what I am doing now:
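
A sync-only cron script could be as small as the sketch below. It assumes a configured pulp-cli; the function name sync_repos and passing repo names as arguments are illustrative choices.

```shell
# Sketch only: sync each named repo; publication and distribution
# are left to autopublish. Returns non-zero if any sync failed.
sync_repos() {
  local failed=0 name
  for name in "$@"; do
    pulp rpm repository sync --name "$name" || {
      echo "sync failed: $name" >&2
      failed=1
    }
  done
  return "$failed"
}

# Usage:
#   sync_repos rocky8_extras
```

Continuing past a failed sync keeps one broken upstream from blocking the remaining repos.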

  1. pulp rpm content upload --file "${RPM_FILE}" --repository "${REPO_NAME}"
  2. get the created package href
  3. pulp rpm repository content add --repository "${REPO_NAME}" --package-href "${PACKAGE_HREF}"
  4. create the publication: pulp rpm publication create --repository "${REPO_NAME}"
  5. update the distribution with the created publication href: pulp rpm distribution update --name "${REPO_NAME}" --publication "${PUBLICATION_HREF}"

My questions are:

  • Is it possible to use autopublish in this case to skip the publication and distribution steps?
  • What is the difference between step 1 and step 3?

Ah, maybe I am using an old doc? The latest doc (Upload Content - Pulp Project) mentions only:
pulp rpm content -t package upload --file "${PACKAGE}" --repository "${REPOSITORY}"

I just did the test: autopublish also works for my second case.
When autopublish is true and the distribution points to the repo href, there is no need to do the publication and distribution manually after content is uploaded.
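
With autopublish in place, the manual upload workflow reduces to the upload command from the latest doc. A minimal sketch, assuming a configured pulp-cli and a repo already set up as described; the function name upload_rpms is hypothetical.

```shell
# Sketch only: upload one or more RPM files into a repo.
# $1 = repo name, remaining arguments = RPM file paths
upload_rpms() {
  local repo="$1" file
  shift
  for file in "$@"; do
    pulp rpm content -t package upload --file "$file" --repository "$repo" || return 1
  done
}

# Usage:
#   upload_rpms "${REPO_NAME}" ./*.rpm
```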

Step 3 is redundant after step 1 (step 1 already uploads the package into the destination repo), and yes, autopublish works in the upload case too, but you figured that out on your own!
