Problem:
I have two Pulp instances deployed in the AWS USE1 region. Each instance runs all Pulp components and uses its own S3 bucket as its storage backend, since it was confirmed to us in a different request that this was the correct approach. We wanted to balance the load on the Pulp content service, so we configured an ALB for that purpose and also to serve a valid certificate. We are mirroring RHEL and CentOS repos using autopublish. However, we recently found out that Pulp generates different metadata filenames on each instance, so we hit an issue where a client requested repomd.xml and then tried to fetch, for example, the primary file:
<data type="primary">
<location href="repodata/d2032611d893bf5957c313fb352bc9cc7999cf69-primary.xml.gz"/>
But the ALB then routed the next request to the second server, where this filename is different, causing an HTTP 404 response. We tried sticky sessions, but they seem to have no effect, I assume because the clients do not support ALB cookies. Is there any option for us to avoid this behavior? Do you have any experience with load balancing the content service in cloud environments?
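To make the mismatch easier to reproduce, here is a minimal Python sketch that extracts the metadata hrefs from two repomd.xml documents and reports which types differ. The function names and the second sample href are hypothetical placeholders, not real output from our servers; in practice the two documents would be fetched directly from each instance, bypassing the ALB.

```python
import xml.etree.ElementTree as ET

# XML namespace used by repomd.xml
REPO_NS = {"repo": "http://linux.duke.edu/metadata/repo"}


def metadata_hrefs(repomd_xml):
    """Map each <data type="..."> entry in a repomd.xml to its location href."""
    root = ET.fromstring(repomd_xml)
    hrefs = {}
    for data in root.findall("repo:data", REPO_NS):
        location = data.find("repo:location", REPO_NS)
        if location is not None:
            hrefs[data.get("type")] = location.get("href")
    return hrefs


def mismatched_types(repomd_a, repomd_b):
    """Return the sorted metadata types whose hrefs differ between two instances."""
    a = metadata_hrefs(repomd_a)
    b = metadata_hrefs(repomd_b)
    return sorted(t for t in a.keys() | b.keys() if a.get(t) != b.get(t))


# Sample from the first instance (href taken from the excerpt above).
SAMPLE_A = """<repomd xmlns="http://linux.duke.edu/metadata/repo">
  <data type="primary">
    <location href="repodata/d2032611d893bf5957c313fb352bc9cc7999cf69-primary.xml.gz"/>
  </data>
</repomd>"""

# Sample for the second instance; this href is a made-up placeholder.
SAMPLE_B = """<repomd xmlns="http://linux.duke.edu/metadata/repo">
  <data type="primary">
    <location href="repodata/ffffffffffffffffffffffffffffffffffffffff-primary.xml.gz"/>
  </data>
</repomd>"""

if __name__ == "__main__":
    print(mismatched_types(SAMPLE_A, SAMPLE_B))
```

Any type listed in the result is one where a client that read repomd.xml from one instance would get a 404 if its follow-up request landed on the other.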
The only options we see are:
- Use sync with mirror_complete, which seems to fetch the metadata as it is in the remote repo. However, we have concerns, since the documentation warns of limited support for some repos.
- Remove the ALB and use standalone instances as part of a yum mirrorlist, with a reverse proxy set up on each instance to serve a valid SSL certificate.
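For the first option, this is roughly the sync request I have in mind, sketched in Python. It assumes the pulp_rpm sync endpoint accepts a `sync_policy` field with the value `mirror_complete`; the helper name, host, and hrefs are hypothetical placeholders, and the sketch only builds the request without sending it.

```python
import json


def build_mirror_sync_request(base_url, repository_href, remote_href):
    """Build the URL and JSON body for a sync that mirrors upstream metadata.

    Assumes repository_href looks like /pulp/api/v3/repositories/rpm/rpm/<uuid>/
    and that the sync action lives at <repository_href>sync/.
    """
    url = base_url.rstrip("/") + repository_href + "sync/"
    body = {"remote": remote_href, "sync_policy": "mirror_complete"}
    return url, json.dumps(body)


if __name__ == "__main__":
    # Placeholder host and hrefs, not values from our deployment.
    url, body = build_mirror_sync_request(
        "https://pulp.example.com",
        "/pulp/api/v3/repositories/rpm/rpm/REPO-UUID/",
        "/pulp/api/v3/remotes/rpm/rpm/REMOTE-UUID/",
    )
    print(url)
    print(body)
```

The same request would be sent to both instances, the idea being that each would then serve the upstream metadata filenames instead of generating its own.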
Thanks in advance for your input.
Expected outcome:
Pulpcore version:
"core": "3.48.0"
Pulp plugins installed and their versions:
"versions": {
"rpm": "3.25.1",
"core": "3.48.0",
"file": "3.48.0",
"certguard": "3.48.0"
},
Operating system - distribution and version:
RHEL 9
Other relevant data: