Multi-site pulp3 implementation

Hi all,

I’m looking to implement Pulp 3 across 4 geographic regions, but I’m unable to find anything about multi-site implementations.

Do I just have a full instance at each location and manage all the config with ansible to keep them in sync?

Is there a remote-server install available, like a Red Hat Satellite Capsule server?

Looking for any reference architectures or real life implementation feedback.

Thanks in advance !!!


Hi, there is no general way to do this, and I believe it depends on your use cases.
In general, Pulp is able to sync from another Pulp. This should allow you to represent the same repositories in multiple Pulp installations spanning multiple data centers. With import/export this should even be possible across an air gap. In any case, that downstream Pulp would just be a full, independent installation receiving its content from the upstream Pulp.
There is not yet an official automated way to replicate a Pulp instance, but it could be something along the lines of [0].

[0] https://github.com/mdellweg/squeezer/tree/replicate_pulp
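
To make the “Pulp can sync from another Pulp” part concrete, here is a minimal sketch using plain REST calls via requests. The hostnames, credentials and the rpm-unstable repository are all placeholders, and it assumes the upstream already serves that repository through a distribution; the downstream’s remote simply points at that published URL.

```python
import requests

# Hypothetical hosts and credentials -- adjust for your installation.
DOWNSTREAM_API = "https://pulp-eu.example.com/pulp/api/v3"
AUTH = ("admin", "changeme")
# Assumes the upstream Pulp has a distribution with base_path "rpm-unstable".
UPSTREAM_CONTENT = "https://pulp-us.example.com/pulp/content/rpm-unstable/"

def create_or_get(endpoint, payload):
    """Create an object, or look it up by name if it already exists."""
    r = requests.post(f"{DOWNSTREAM_API}{endpoint}", json=payload, auth=AUTH)
    if r.status_code == 400:  # most likely "name already exists"
        r = requests.get(f"{DOWNSTREAM_API}{endpoint}",
                         params={"name": payload["name"]}, auth=AUTH)
        return r.json()["results"][0]
    r.raise_for_status()
    return r.json()

# 1. Remote on the downstream that points at the upstream's published content.
remote = create_or_get("/remotes/rpm/rpm/",
                       {"name": "rpm-unstable-from-us",
                        "url": UPSTREAM_CONTENT,
                        "policy": "on_demand"})

# 2. Repository bound to that remote.
repo = create_or_get("/repositories/rpm/rpm/",
                     {"name": "rpm-unstable", "remote": remote["pulp_href"]})

# 3. Kick off the sync; Pulp answers with a task href you can poll.
task = requests.post(f"https://pulp-eu.example.com{repo['pulp_href']}sync/",
                     json={}, auth=AUTH).json()
print("sync task:", task["task"])
```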

I’m doing just this.

We currently have 5 Pulp instances across the globe with ~6TB of content, predominantly RPM-based, but some file content too, and soon we are likely to have containers.

As has been stated, Pulp has no native support for this, so we built our own tooling using Pulp’s Python API client libraries.

For us, since we already needed to write our own tooling to deal with our release process, supporting multiple installations wasn’t too much overhead.

We have a separate DB that stores a list of pulp servers and repositories we want to manage.

When we need to create a new repository to mirror (or just an internally managed one), the script adds the DB entries and then goes around the Pulp instances creating said repositories. The primary node points upstream; the secondaries point back to the primary.

When adding or removing content, we do everything on the primary first, and then the tooling ensures the secondaries are kept up to date.
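
For anyone curious what that tooling can look like, below is a rough, simplified sketch of the two steps just described: creating a repository on every instance (the primary pointing upstream, the secondaries pointing back at the primary) and then fanning a sync out, primary first. The PULP_SERVERS list stands in for our internal DB, and every hostname, credential and URL is a placeholder, not our actual setup.

```python
import time

import requests

AUTH = ("admin", "changeme")  # placeholder credentials

# Stand-in for our internal DB: the primary is listed first, then the secondaries.
PULP_SERVERS = [
    {"name": "primary",      "host": "https://pulp-us.example.com"},
    {"name": "secondary-eu", "host": "https://pulp-eu.example.com"},
    {"name": "secondary-ap", "host": "https://pulp-ap.example.com"},
]

def api(server):
    return f"{server['host']}/pulp/api/v3"

def setup_repo(repo_name, upstream_url):
    """Create the repo on every instance: the primary syncs from the real upstream,
    the secondaries sync from the primary's published content (this assumes the
    primary has a distribution whose base_path equals the repo name).
    Error handling and idempotency are elided to keep the sketch short."""
    primary_content = f"{PULP_SERVERS[0]['host']}/pulp/content/{repo_name}/"
    for server in PULP_SERVERS:
        url = upstream_url if server["name"] == "primary" else primary_content
        remote = requests.post(f"{api(server)}/remotes/rpm/rpm/",
                               json={"name": repo_name, "url": url},
                               auth=AUTH).json()
        requests.post(f"{api(server)}/repositories/rpm/rpm/",
                      json={"name": repo_name, "remote": remote["pulp_href"]},
                      auth=AUTH).raise_for_status()

def sync(server, repo_name, wait=False):
    """Trigger a sync on one server, optionally blocking until the task finishes."""
    repo = requests.get(f"{api(server)}/repositories/rpm/rpm/",
                        params={"name": repo_name}, auth=AUTH).json()["results"][0]
    task_href = requests.post(f"{server['host']}{repo['pulp_href']}sync/",
                              json={}, auth=AUTH).json()["task"]
    while wait:
        state = requests.get(f"{server['host']}{task_href}", auth=AUTH).json()["state"]
        if state in ("completed", "failed", "canceled"):
            break
        time.sleep(5)

setup_repo("rpm-unstable", "https://mirror.example.com/centos/7/os/x86_64/")
# The primary must finish syncing before the secondaries can pick up the new content.
sync(PULP_SERVERS[0], "rpm-unstable", wait=True)
for secondary in PULP_SERVERS[1:]:
    sync(secondary, "rpm-unstable")
```

The ordering is the important detail: the primary’s sync task has to finish before the secondaries sync, otherwise they would just re-pull the old content.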

It works reasonably well.

I’m curious to investigate alternative options that make better use of cloud-based object storage that can replicate itself, with locally based Pulp content nodes pointing at that replicated object storage. However, I don’t have the time to invest in that at the moment (I don’t know if it would even work).
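
For what it’s worth, the storage side of that idea is mostly a settings.py concern, since Pulp can store its artifacts via django-storages; whether bucket replication plus local content nodes would actually work end to end is the part I haven’t tested. A sketch of what the settings could look like (bucket, region and credentials are placeholders, and newer pulpcore releases configure this through a STORAGES dict instead of DEFAULT_FILE_STORAGE):

```python
# /etc/pulp/settings.py (excerpt) -- S3-backed artifact storage via django-storages.
# All values below are placeholders; adjust for your bucket and credentials.
DEFAULT_FILE_STORAGE = "storages.backends.s3boto3.S3Boto3Storage"
MEDIA_ROOT = ""                          # left empty when artifacts live in S3
AWS_STORAGE_BUCKET_NAME = "pulp3-content-us"
AWS_S3_REGION_NAME = "us-east-1"
AWS_ACCESS_KEY_ID = "AKIA..."            # or rely on an instance profile instead
AWS_SECRET_ACCESS_KEY = "..."
```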


Have you thought about using labels to mark repositories for sync?

Hey, maybe I didn’t say it explicitly enough: we are very interested in hearing about your specific use case.

Me? I did a small presentation on it for the 2021 pulpcon.

That may not address your questions though, happy to talk to you about it.

I’m not sure what you mean by using labels to mark repositories for syncing. We might have been able to use those instead of our internal DB, but that’s so baked in now that I doubt I’d look to change it.

I meant the original poster when asking for specifics about the use case. Somehow Discourse didn’t want me to reply directly to the first post in the topic. :frowning:

Anyway, for the labels: yes, I thought it might be interesting to add a label like sync_downstreams: "A,D,E", so any sync-up script could look for these labels to identify the repositories worth syncing to the other Pulps.
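
As a sketch of how a sync-up script could consume that, the snippet below filters repositories on the primary by that label and parses the value. pulp_label_select is Pulp’s label filter on list endpoints; the hostname, credentials, label name and downstream letters are all invented for illustration.

```python
import requests

PRIMARY_API = "https://pulp-primary.example.com/pulp/api/v3"  # placeholder
AUTH = ("admin", "changeme")                                  # placeholder
MY_DOWNSTREAM = "D"  # which downstream this sync-up script is running for

# List only the RPM repositories that carry the sync_downstreams label at all.
resp = requests.get(f"{PRIMARY_API}/repositories/rpm/rpm/",
                    params={"pulp_label_select": "sync_downstreams"},
                    auth=AUTH)
resp.raise_for_status()

for repo in resp.json()["results"]:
    targets = repo["pulp_labels"]["sync_downstreams"].split(",")
    if MY_DOWNSTREAM in [t.strip() for t in targets]:
        print(f"{repo['name']} should be synced to downstream {MY_DOWNSTREAM}")
        # ...trigger the sync against this downstream here...
```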

That makes far more sense; I’m fairly certain you were aware of my use case :wink:


Thanks for the replies all !!!

To be honest, I haven’t even installed and started using it yet. We have three different geographic regions that we have compute in, mostly Windows, Linux (CentOS, RHEL, Ubuntu) and K8s. We have Satellite for the RHEL servers, and I was thinking Pulp would help with the CentOS and Ubuntu patching as well as any images being used in K8s.

My main requirement is to be able to set a patching baseline, whether weekly or monthly, so that I know all servers are running the same kernel and package set.

I think I’ll just go with a server in each region for now and have each pull its own content from a local mirror rather than creating extra traffic across our links.
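
In case it helps with the baseline requirement: the usual Pulp pattern is to take the repository version you have after a sync, publish it, and hang a dated distribution off that publication, so every server in a region installs from the same frozen snapshot until you cut the next one. A rough sketch, with the host, credentials and repository name invented:

```python
import time
from datetime import date

import requests

HOST = "https://pulp-region1.example.com"   # placeholder Pulp server
API = f"{HOST}/pulp/api/v3"
AUTH = ("admin", "changeme")                # placeholder credentials
REPO_NAME = "centos7-base"                  # placeholder repository name

def wait_for(task_href):
    """Poll a task until it finishes and return the final task record."""
    while True:
        task = requests.get(f"{HOST}{task_href}", auth=AUTH).json()
        if task["state"] in ("completed", "failed", "canceled"):
            return task
        time.sleep(5)

# The latest repository version is the state of the repo after its last sync.
repo = requests.get(f"{API}/repositories/rpm/rpm/",
                    params={"name": REPO_NAME}, auth=AUTH).json()["results"][0]

# Publish that exact version; the publication is immutable, which is what makes
# it usable as a weekly/monthly baseline.
pub_task = requests.post(f"{API}/publications/rpm/rpm/",
                         json={"repository_version": repo["latest_version_href"]},
                         auth=AUTH).json()["task"]
publication_href = wait_for(pub_task)["created_resources"][0]

# Serve the baseline under a dated base_path, e.g. .../centos7-base-2024-06-01/,
# and point the servers in this region at that path until the next baseline.
baseline = f"{REPO_NAME}-{date.today().isoformat()}"
requests.post(f"{API}/distributions/rpm/rpm/",
              json={"name": baseline, "base_path": baseline,
                    "publication": publication_href},
              auth=AUTH)
```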

Thanks again for the help !!!