Design question: Why do publications not have a unique name?

quba42 · January 4, 2022, 2:08pm

Almost every entity in Pulp 3 has a unique name, except publications.

Even distributions have one, although they don’t really need it: they already have a unique base_path that could just as well serve as a unique name.

One of the great advantages of pulp-cli is that I no longer need to parse any pulp_href from API answers, except, that is, when it comes to nameless publications! The following complete sync workflow using pulp-cli-deb illustrates the problem:

# Constants:
ENTITIES_NAME='test'
SIGNING_SERVICE_NAME='Pulp_QE'

# Test the API:
pulp status

# Create signing service:
curl -L https://github.com/pulp/pulp-fixtures/raw/master/common/GPG-PRIVATE-KEY-pulp-qe | gpg --import
echo "6EDF301256480B9B801EBA3D05A5E6DA269D9D98:6:" | gpg --import-ownertrust
pulpcore-manager add-signing-service --class 'deb:AptReleaseSigningService' "${SIGNING_SERVICE_NAME}" ~/devel/pulp_deb/pulp_deb/tests/functional/sign_deb_release.sh '6EDF301256480B9B801EBA3D05A5E6DA269D9D98'

# Some test case:
REMOTE_OPTIONS='--url=http://ftp.de.debian.org/debian/ --distribution=bullseye-backports --component=non-free --policy=on_demand'

# Sync workflow steps using CLI:
pulp deb remote create --name=${ENTITIES_NAME} ${REMOTE_OPTIONS}
pulp deb repository create --name=${ENTITIES_NAME} --remote=${ENTITIES_NAME}
pulp deb repository sync --name=${ENTITIES_NAME}
APT_PUBLICATION_HREF=$(pulp deb publication create --repository=${ENTITIES_NAME} --simple=True --structured=True --signing-service="${SIGNING_SERVICE_NAME}" | jq -r '.pulp_href')
VERBATIM_PUBLICATION_HREF=$(pulp deb publication --type=verbatim create --repository=${ENTITIES_NAME} | jq -r '.pulp_href')
pulp deb distribution create --name=${ENTITIES_NAME} --base-path=${ENTITIES_NAME} --publication=${APT_PUBLICATION_HREF}
pulp deb distribution create --name=${ENTITIES_NAME}_verbatim --base-path=${ENTITIES_NAME}_verbatim --publication=${VERBATIM_PUBLICATION_HREF}

This isn’t the only issue I have with how publications vs. distributions work in Pulp.
The reason there still isn’t any autodistribution feature for pulp_deb is that I can’t figure out how to handle empty distributions that don’t know what kind of publication they should use (why are empty distributions even a thing?).

Every other thing relating to publications and distributions feels awkward and upside down to me.

So I have a couple of questions:

What is the design logic behind these two entities?
Do others experience these design issues or is it just me? (The above pulp-cli-deb workflow definitely feels wrong to me).
What, if anything, can or should be changed going forward?

wibbit · January 4, 2022, 2:54pm

I can answer one of those questions.

Why are empty distributions a thing?

I believe those are there to allow for people to effectively “reserve” a URL in advance.

quba42 · January 4, 2022, 3:27pm

That does sound like a valid use case. The only way I can think of to combine that with autopublish is for the empty distribution to record what kind of publication it should contain, which does not seem very elegant. (Then what is the point of having separate publications and distributions?) Maybe I can think of something else…

x9c4 · January 4, 2022, 3:39pm

Publications not having a name really is annoying. I agree.

But separating distributions from publications is at the core of Pulp3’s design goals:

It makes the step of making a publication (or in the future maybe a set of publications) available atomic. (The publication process may take a very long time for some content types.)
It allows the same publication to be served in different locations.
It allows a location to be reserved.
It allows a distribution (rel_path on the content server) to be switched from one repository version to another in an instant.
It allows distributions to serve a repository version without a publication if the content allows it.

quba42 · January 4, 2022, 4:04pm

Given these design requirements, I guess the question becomes: Is it possible to provide users with an interface to uniquely pick out the publication they want that is NOT the pulp_href (and that is simple enough to actually reduce, rather than increase user confusion)?

Looking at the pulp_deb case, the alternative to the pulp_href for uniquely identifying a publication is:

repo_version + publication type (apt vs verbatim) + publication mode (simple and/or structured, only needed for the APT publisher)

This is clearly a pretty complicated set of uniquely identifying parameters. Hence, a recipe for user confusion!
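
To make the size of that composite identifier concrete, here is a minimal sketch of a helper that composes such a key. The function name publication_key, the key format, and the example href are all made up for illustration; nothing like this exists in pulp-cli as far as this thread goes.

```shell
# Hypothetical helper: compose a lookup key for a pulp_deb publication.
# Arguments: repo version href, publication type (apt|verbatim),
# and, for apt only, the simple and structured flags (true|false).
publication_key() {
    case "$2" in
        apt)
            # The mode flags only matter for the APT publisher.
            printf '%s|apt|simple=%s|structured=%s\n' "$1" "${3:-false}" "${4:-false}"
            ;;
        verbatim)
            printf '%s|verbatim\n' "$1"
            ;;
        *)
            echo "unknown publication type: $2" >&2
            return 1
            ;;
    esac
}

# Example repository version href (made up for illustration):
REPO_VERSION='/pulp/api/v3/repositories/deb/apt/0123/versions/1/'
publication_key "${REPO_VERSION}" apt true true
publication_key "${REPO_VERSION}" verbatim
```

Even in this toy form, the user has to supply up to four values to pick out one publication, which is the confusion described above.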

x9c4 · January 4, 2022, 4:10pm

Do publications have labels?
What if we tell the publishing code to apply a specific label to the publications, then we could discriminate by (repo_version, label).
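
As a rough sketch of what discriminating by (repo_version, label) could look like on the client side, assuming publications carried such a label: the input format (one publication per line, whitespace-separated href, repository version, and label), the function name find_publication, and the sample hrefs are all hypothetical.

```shell
# Hypothetical input: one publication per line as
#   <pulp_href> <repository_version> <label>
# separated by whitespace. Prints the href of every matching publication.
find_publication() {
    awk -v rv="$1" -v label="$2" '$2 == rv && $3 == label { print $1 }'
}

# Made-up sample records, piped into the lookup:
printf '%s\n' \
    '/pubs/a/ /repo/versions/1/ apt' \
    '/pubs/b/ /repo/versions/1/ verbatim' \
    '/pubs/c/ /repo/versions/2/ apt' |
    find_publication /repo/versions/1/ verbatim
# prints /pubs/b/
```

The function prints every match rather than assuming exactly one, since nothing here guarantees that the pair is unique.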

markgoddard · December 12, 2022, 10:19am

On the RPM side, repo version is insufficient to uniquely identify a publication (as mentioned in https://github.com/pulp/pulpcore/issues/3450#issuecomment-1344440315), since metadata can be updated in the upstream repo. This leads to the inability to identify publications by anything other than href.