Question on Component Interdependency

bearrito · May 27, 2022, 2:38pm

Hello All,

I’m exploring Pulp for my company. I have a few simple questions.

Some context:

I am not using the Ansible installer. There are various reasons, but mainly it doesn’t fit our architecture. I’m “porting” the work done in the Podman example from the Kong folks to AWS ECS. In the end it will be similar to the setup described in “Clustering - Pulp Installer”.

We have a scenario where some users will access the API and content from the internet, while other users will access content internally. For compliance reasons, I need to keep internal users’ traffic inside our VPC.

In general, this would be straightforward because AWS ECS targets can be shared by Application Load Balancers. So I could have two load balancers, one external and one internal, pointing at the same set of containers. However, I believe CONTENT_ORIGIN will be an issue. The external and internal API/content containers will need different CONTENT_ORIGIN values, which means running two differently configured sets of API/content containers. Not a big deal.
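To make the two-configuration idea concrete, here is a minimal Python sketch of the environment I have in mind for each container set. It assumes Pulp settings can be supplied as `PULP_`-prefixed environment variables; the hostnames and the `container_env` helper are hypothetical, and the shared dictionary stands in for whatever database and storage settings both sets would have in common.

```python
# Sketch only: hostnames and the helper name are hypothetical.
# Assumes settings are passed as PULP_-prefixed environment variables.

# Settings common to both container sets (same database, same S3 bucket, etc.).
SHARED_ENV = {
    "PULP_CONTENT_PATH_PREFIX": "/pulp/content/",
}


def container_env(content_origin: str) -> dict[str, str]:
    """Return the shared environment plus a per-deployment CONTENT_ORIGIN."""
    env = dict(SHARED_ENV)  # copy so the shared mapping is never mutated
    env["PULP_CONTENT_ORIGIN"] = content_origin
    return env


# One set of API/content containers behind each load balancer.
external_env = container_env("https://pulp.example.com")
internal_env = container_env("https://pulp.internal.example")

# Keys whose values differ between the two deployments.
differing_keys = {k for k in external_env if external_env[k] != internal_env[k]}
```

The point of the sketch is that the only intended difference between the two sets is the origin; everything that touches the data layer stays identical.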

Questions

Is there any interdependency between components (api, content, worker) that isn’t mediated through the data layer (Postgres and S3 in my case)?
Will the existence of independent API/content containers sharing a data layer be an issue? My concern is that something written to the database or S3 might reference CONTENT_ORIGIN as a path or key in some way.

Please let me know if my question isn’t clear and I can clarify.

Thanks.

x9c4 · May 30, 2022, 7:35am

To my knowledge, CONTENT_ORIGIN is used in the Distribution serializers so that they can display the correct URL. pulp_container also uses the setting to issue redirects from the API to the content app (though not in the case of external cloud storage).
What you are trying to accomplish has probably not been attempted before and is certainly not supported, but it may just work in certain configurations.

bmbouter · May 31, 2022, 6:37pm

Rolling your own deployment I think is a fine thing. I’d like to improve the documentation there some too, so if you have any suggestions on revisions I could help with please let me know.

As long as the settings are configured correctly, deploying the services separately should be fine. Other than forming responses that refer traffic from one service to another, the services only ever speak to S3 and/or PostgreSQL (and optionally Redis for caching if you want).

I don’t think the CONTENT_ORIGIN value is stored anywhere in the db, since I believe Pulp uses only relative paths internally. You’d have to test it, and if there are any issues please let us know.
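As a rough illustration of why that should work, here is a small Python sketch (not Pulp's actual code) of a URL being composed at response time from a relative path held in the shared database and the CONTENT_ORIGIN of whichever container answers. The `Distribution` class, the `base_url` helper, the hostnames, and the `/pulp/content/` prefix value are all assumptions made for the example.

```python
from dataclasses import dataclass

# Assumed value for the content path prefix; adjust to your settings.
CONTENT_PATH_PREFIX = "/pulp/content/"


@dataclass(frozen=True)
class Distribution:
    """Hypothetical stand-in for a stored record: only a relative path."""
    base_path: str


def base_url(distribution: Distribution, content_origin: str,
             prefix: str = CONTENT_PATH_PREFIX) -> str:
    """Join the serving container's origin with the stored relative path."""
    origin = content_origin.rstrip("/")
    middle = prefix.strip("/")
    path = distribution.base_path.strip("/")
    return f"{origin}/{middle}/{path}/"


# The same stored record, rendered by two differently configured containers.
shared = Distribution(base_path="rpm/prod")
external_url = base_url(shared, "https://pulp.example.com")
internal_url = base_url(shared, "https://pulp.internal.example")
```

If the real behavior matches this sketch, the origin only ever appears in responses and never in stored data, so two container sets with different origins could share one database and bucket. That is still something to confirm by testing.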

If this doesn’t answer your question, please let us know!