Matching Ubuntu Repository Directory Structure in pulp_deb

I’ve been working with pulp_deb to mirror upstream Ubuntu repositories and noticed a difference in the directory structure.

Upstream Ubuntu repos typically look like this:

dists/<release>/

But in Pulp, since each distribution requires a unique base_path, the resulting structure ends up being something like:

<release>/dists/<release>/
e.g. noble/dists/noble, bionic/dists/bionic

This adds an extra level compared to the upstream layout.

My question is: Is there a way (or future plan) to mirror the exact upstream directory structure (dists/<release>) in Pulp distributions? Or is the current base_path constraint the expected and only supported approach?

Thanks!

I am not sure I understand the question. The general form of an APT repository structure is:

<URL>/<repository_base_path>/dists/<distribution>/<release_files>

This is true of upstream Ubuntu, e.g.:

http://archive.ubuntu.com/ubuntu/dists/bionic/InRelease
<URL> = http://archive.ubuntu.com/
<repository_base_path> = ubuntu
<distribution> = bionic
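
As a rough illustration of that decomposition, the following shell sketch splits the example URL into the three parts using only POSIX parameter expansion. It assumes the base path is a single path segment and that the distribution name contains no slash, which holds for this example but not for every APT repository.

```shell
#!/bin/sh
# Split an APT release-file URL into <URL>, <repository_base_path>, <distribution>.
# Assumptions: single-segment base path, no "/" inside the distribution name.
url='http://archive.ubuntu.com/ubuntu/dists/bionic/InRelease'

prefix="${url%%/dists/*}"      # everything before "/dists/"
rest="${url#*/dists/}"         # everything after "/dists/"

base_url="${prefix%/*}/"       # strip the last path segment, keep the trailing slash
base_path="${prefix##*/}"      # the last path segment before "/dists/"
distribution="${rest%%/*}"     # the first path segment after "/dists/"

printf '<URL> = %s\n' "$base_url"
printf '<repository_base_path> = %s\n' "$base_path"
printf '<distribution> = %s\n' "$distribution"
```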

Are you asking if <repository_base_path> can be the empty string in Pulp? If so, the answer is no. Otherwise the Pulp content app could only ever serve a single repository. There are no plans to change the Pulp requirement for a non-empty base path.

Am I missing something?

Thanks for clarifying. Let me show what I meant with two possible repo structures:

Structure 1 (current layout I was testing):

<URL>/
├── noble/
│   └── dists/
│       └── noble/
│           └── <Release files>
├── focal/
│   └── dists/
│       └── focal/
│           └── <Release files>
└── bionic/
    └── dists/
        └── bionic/
            └── <Release files>

Structure 2 (with ubuntu as repository base path):

<URL>/
└── ubuntu/
    └── dists/
        ├── noble/
        │   └── <Release files>
        ├── focal/
        │   └── <Release files>
        └── bionic/
            └── <Release files>

So my question was really about whether Structure 1 (where <repository_base_path> is effectively the distribution name itself, which has to be unique per repo) is supported in Pulp, or if only Structure 2 (with a shared non-empty <repository_base_path>, e.g. ubuntu) is valid.

It looks like after mirroring upstream, the repo structure in Pulp always follows the pattern:

<base_path>/dists/<distribution>

For example:

<base_path>/dists/noble

So <base_path> is mandatory and unique, and <distribution> sits under dists as expected.

So in case this is still unclear, and also for future reference: both Structure 1 and Structure 2 are possible. To get Structure 1 you would create three different Pulp repositories with three different base paths (“noble”, “focal”, “bionic”) that contain one distribution each. Structure 2 would be one Pulp repository with base path “ubuntu”, containing three distributions (“noble”, “focal”, “bionic”). In other words, assuming you are getting your content via sync, you would be synchronizing multiple distributions from the upstream Ubuntu repository into just one Pulp repository.
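
For illustration, Structure 2 might be set up roughly as in the sketch below: one remote listing three distributions, one repository, and one Pulp distribution with base path “ubuntu”. The name ubuntu-mirror, the amd64 architecture filter, and the on_demand policy are made up for the example, and the subcommands and flags are assumed to match what pulp-cli-deb provides, so check them against pulp deb --help first. The script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Structure 2 sketch: one remote, one repository, one base path, three distributions.
# Dry run by default: commands are printed, not executed. Set DRY_RUN=0 to run them.
pulp_cmd() {
  if [ "${DRY_RUN:-1}" = 1 ]; then
    echo "pulp $*"
  else
    pulp "$@"
  fi
}

NAME='ubuntu-mirror'   # hypothetical name, reused for every Pulp object

structure_2() {
  # One remote that lists all three upstream distributions.
  pulp_cmd deb remote create --name="$NAME" \
    --url=http://archive.ubuntu.com/ubuntu/ \
    --distribution=noble --distribution=focal --distribution=bionic \
    --architecture=amd64 --policy=on_demand
  # One repository, synced from that remote.
  pulp_cmd deb repository create --name="$NAME" --remote="$NAME"
  pulp_cmd deb repository sync --name="$NAME"
  # Publish it and serve it under the shared base path "ubuntu".
  pulp_cmd deb publication create --repository="$NAME"
  pulp_cmd deb distribution create --name="$NAME" --base-path=ubuntu --repository="$NAME"
}

structure_2
```

Structure 1 would repeat the same sequence three times, each with a single --distribution and its own base path.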


Thanks for clarifying the difference between Structure 1 and Structure 2.

I went ahead and tested syncing multiple distributions (each with its own remote) into a single Pulp repository, and the resulting layout does match the upstream Ubuntu structure. That’s exactly what we’re aiming for since we’d like to maintain a one-to-one mirror of the upstream directory layout.

Just to confirm: when combining multiple distributions into a single repository like this, we can still use the mirror and optimize sync options as usual, correct?

Then I have one more clarification for you: You can do this using a single remote. A remote stores a list of distributions, components, and architectures it should sync. If you are using Pulp CLI to create the remote, you can simply supply the command with multiple --distribution= flags. For example, you can do something like:

REMOTE_OPTIONS='--url=https://fixtures.pulpproject.org/debian/ --distribution=ragnarok --distribution=ginnungagap --architecture=ppc64 --architecture=armeb --policy=on_demand'
pulp deb remote create --name='my_test_remote' ${REMOTE_OPTIONS}

I suppose you could also use separate remotes and sync them into a single repo one after the next using policy “additive”. I believe this should end up with the same result, but I have not done a lot of testing, because we always assumed a single remote with multiple distributions would be used.
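
A sketch of that untested alternative, under the same caveat: one remote per distribution, each synced in turn into the same repository. The remote and repository names are hypothetical, the flags are assumed to match pulp-cli-deb, and the sync is left in its default non-mirror mode, which is assumed here to be the additive behaviour. The script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Alternative sketch: separate remotes, synced one after another into one repository.
# Dry run by default: commands are printed, not executed. Set DRY_RUN=0 to run them.
pulp_cmd() {
  if [ "${DRY_RUN:-1}" = 1 ]; then
    echo "pulp $*"
  else
    pulp "$@"
  fi
}

REPO='ubuntu-mirror'   # hypothetical repository name

additive_sync() {
  pulp_cmd deb repository create --name="$REPO"
  for dist in noble focal bionic; do
    # One remote per upstream distribution (names like "ubuntu-noble" are made up).
    pulp_cmd deb remote create --name="ubuntu-$dist" \
      --url=http://archive.ubuntu.com/ubuntu/ \
      --distribution="$dist" --architecture=amd64 --policy=on_demand
    # No mirror flag is passed, so each sync is expected to add to the repository.
    pulp_cmd deb repository sync --name="$REPO" --remote="ubuntu-$dist"
  done
}

additive_sync
```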

Thanks for the clarification, that helps.

I also noticed the documentation here:
https://pulpproject.org/pulp_deb/docs/user/guides/sync/

It recommends:

“Use a single --distribution per remote. While it is possible to set multiple distributions on a single remote, and sync them into a single Pulp repository, this can easily lead to huge confusing repositories with performance issues.”

This seems to contradict the suggestion that we should just use a single remote with multiple --distribution flags.

One concern we have with the alternative approach (separate remotes + sync with policy additive) is that the resulting repository would not exactly match upstream if any packages were removed upstream: they would remain in our Pulp repo.

Could you help clarify the recommended best practice here? Specifically:

  • Is using one remote with multiple distributions the preferred approach today?
  • Or is one-distribution-per-remote still the safe/best practice for long-term maintainability and performance?
  • And what’s the recommended way to ensure that our repo mirrors upstream exactly, even when upstream removes packages?

Thanks!

That best practice recommendation is not intended to be absolute. If you want that repository structure, by all means use a remote with multiple distributions. Just be aware that if you sync a dozen large distributions in a single run, that sync will take some time.

Maybe we should rephrase the recommendation to start with “All else being equal, …”. In your case (you specifically want a certain structure as the result), all else is not equal.