How to install Pulp on AWS EKS Cluster with S3 Storage

I am unable to attach a zip or tar file; it looks like this site only allows image formats. Can you please share a link where I can upload the logs?

You could use https://paste.centos.org/ to share the logs

Thank you @decko , here are the log links:
https://paste.centos.org/view/eeae2c3d
https://paste.centos.org/view/c1375944

Here is the link for the pod logs:

https://paste.centos.org/view/4590252c

Thank you for the outputs.
It seems like the Deployments have not been created. Can you please send the logs from the operator and the Events? They will probably be in /tmp/cluster-info/test/pulp-operator-controller-manager-5d5fb5bbc4-2rft7/logs.txt and /tmp/cluster-info/test/events.json
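
If you don't have that cluster-info dump handy, you can also pull both live; a sketch, assuming the default operator Deployment name shown in the path above:

kubectl -ntest logs deployment/pulp-operator-controller-manager
kubectl -ntest get events --sort-by='.lastTimestamp'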


Somehow I am unable to attach further links, so I put the logs at the links below. Please have a look.
@hyagi

https://textdoc.co/LRfwZMhGkP1ySxcT
https://textdoc.co/hfbWc90QVN15lasP

Hmm… from the operator logs, the test-s3 Secret seems to be missing fields:

{"Secret.Namespace": "test", "Secret.Name": "test-s3", "error": "could not find "s3-access-key-id" key in test-s3 secret"}

when you tried to run

kubectl -ntest apply -f- <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: 'test-s3'
stringData:
  s3-access-key-id: "<S3_ACCESS_KEY>"
  s3-secret-access-key: "<S3_SECRET_ACCESS_KEY>"
  s3-bucket-name: "<S3_BUCKET_NAME>"
  s3-region: "<S3_REGION>"
EOF

did you get any errors?

you can double-check if the Secret has been created:

kubectl -n test get secrets test-s3
NAME      TYPE     DATA   AGE
test-s3   Opaque   4      5h42m

and you can also check the content through:

kubectl -ntest get secrets test-s3 -ojsonpath='{.data}'|jq
{
  "s3-access-key-id": "...",
  "s3-bucket-name": "...",
  "s3-region": "...",
  "s3-secret-access-key": "..."
}
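
and, assuming your Pulp CR is named test (as in the logs above), you can confirm it points at this Secret through the object_storage_s3_secret field:

kubectl -ntest get pulp test -ojsonpath='{.spec.object_storage_s3_secret}'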

Great! I just updated the config with s3-secret-access-key and s3-access-key-id and it started working. Thank you so much @hyagi. Initially I did not add them, as EKS and S3 are both in AWS and I assumed the keys were not needed.


Now, do we have a way to use external PostgreSQL & Redis instead of using a containerised PostgreSQL & Redis?

Nice! Glad to know that it worked :smiley:

Initially I did not add them, as EKS and S3 are both in AWS and I assumed the keys were not needed.

Unfortunately, we don’t have that level of integration yet, so we need to pass the credentials manually.

Now, do we have a way to use external PostgreSQL & Redis instead of using a containerised PostgreSQL & Redis?

Yes, we do. In this case, you can create Secrets pointing to the external services and reference them in the Pulp CR.

  • create a secret for postgres:
kubectl -ntest create secret generic external-database \
        --from-literal=POSTGRES_HOST=my-postgres-host.example.com  \
        --from-literal=POSTGRES_PORT=5432  \
        --from-literal=POSTGRES_USERNAME=pulp-admin  \
        --from-literal=POSTGRES_PASSWORD=password  \
        --from-literal=POSTGRES_DB_NAME=pulp \
        --from-literal=POSTGRES_SSLMODE=prefer
  • create a secret for redis. Make sure to define all of the keys (REDIS_HOST, REDIS_PORT, REDIS_PASSWORD, REDIS_DB) even if the Redis cluster has no authentication, as in this example:
kubectl -ntest create secret generic external-redis \
        --from-literal=REDIS_HOST=my-redis-host.example.com  \
        --from-literal=REDIS_PORT=6379  \
        --from-literal=REDIS_PASSWORD=""  \
        --from-literal=REDIS_DB=""
  • update Pulp CR with them:
kubectl -ntest patch pulp test --type=merge -p '{"spec": {"database": {"external_db_secret": "external-database"}, "cache": {"enabled": true, "external_cache_secret": "external-redis"} }}'

pulp-operator should notice the new settings and start to redeploy the pods.
After all the pods reach the READY state, you can verify the status with:

kubectl -ntest exec deployment/test-api -- curl -sL localhost:24817/pulp/api/v3/status | jq '{"postgres": .database_connection, "redis": .redis_connection}'
{
  "postgres": {
    "connected": true
  },
  "redis": {
    "connected": true
  }
}

Here are the docs for more information:
https://docs.pulpproject.org/pulp_operator/configuring/database/#configuring-pulp-operator-to-use-an-external-postgresql-installation
https://docs.pulpproject.org/pulp_operator/configuring/cache/#configuring-pulp-operator-to-use-an-external-redis-installation


Awesome, I will surely give it a try. One last question: is there a way to expose the UI through a Kubernetes load balancer instead of only listening on localhost?

Awesome, I will surely give it a try.

Cool! We would like to hear about your test results.

One last question: is there a way to expose the UI through a Kubernetes load balancer instead of only listening on localhost?

Yes. Since this is an EKS cluster, I can think of 3 ways to do this (a sketch of the first option follows the list):

  1. .spec.ingress_type: loadbalancer
    the operator will create a k8s Service of type LoadBalancer (the k8s integration with the cloud provider will provision the load balancer for you)

  2. .spec.ingress_type: ingress
    the operator will deploy a k8s Ingress resource with the defined IngressController (more info in Reverse Proxy - Pulp Operator)

  3. .spec.ingress_type: nodeport
    the operator will create a k8s Service of type NodePort and you will need to manually create the AWS load balancer pointing to the k8s <nodes>:<node-port>
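
For example, a minimal sketch of the first option, using the same merge-patch style as before (the Pulp CR is named test, as in the earlier examples):

kubectl -ntest patch pulp test --type=merge -p '{"spec": {"ingress_type": "loadbalancer"}}'
kubectl -ntest get svc

After the patch, the LoadBalancer Service should receive an AWS-provisioned external hostname.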


Definitely. I will keep you posted.

While I was going through my artifacts on S3, a question came up about the protocol used to upload/download artifacts to the S3 bucket. Does it use s3fs in the backend? I am asking because s3fs is slower than direct S3 API calls: it mounts the bucket as a filesystem and uses that mount for uploads and downloads. Can you please clarify this as well?

Pulp uses the S3 APIs via the django-storages backend.
Unless configured otherwise, it will redirect client requests for artifacts directly to S3 using presigned redirect URLs.
Does this answer your question?
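
To see what such a redirect target looks like, you can generate a presigned URL yourself with the AWS CLI (the bucket and key here are hypothetical, purely for illustration):

aws s3 presign s3://my-pulp-bucket/some/artifact/key --expires-in 3600

The client then downloads the artifact straight from S3 via that signed URL, so the content app does not stream the file itself.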


Yep, thank you so much.

To install Pulp on an AWS EKS cluster with S3 storage, you’ll need to follow these general steps:

Set up AWS EKS Cluster:
    Create an EKS cluster using the AWS Management Console or AWS CLI.
    Ensure your IAM roles and policies grant necessary permissions for EKS, S3, and other AWS services you'll be using.

Prepare S3 Bucket:
    Create an S3 bucket to store your Pulp content.
    Set appropriate permissions and access controls for the bucket.

Deploy Pulp on EKS:
    Create Kubernetes manifests or Helm charts for deploying Pulp on your EKS cluster.
    Configure Pulp to use S3 storage by providing the S3 bucket details in the configuration.

Configure Networking:
    Set up network policies and ensure proper communication between your EKS cluster and the S3 bucket.

Test and Verify:
    Deploy Pulp resources to your EKS cluster.
    Verify that Pulp is running correctly and can access the S3 storage.

Monitor and Maintain:
    Implement monitoring and logging for your Pulp deployment.
    Regularly update and maintain your EKS cluster and Pulp installation.

Backup and Disaster Recovery:
    Implement backup strategies for your Pulp content stored in S3.
    Plan for disaster recovery scenarios and ensure data integrity.

Remember to refer to the official documentation of Pulp, AWS EKS, and S3 for detailed instructions specific to your setup and requirements.
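
For a very rough command-line sketch of the first three steps (all names here are hypothetical, and the Pulp CR apiVersion depends on your operator version, so confirm it with kubectl api-resources | grep -i pulp):

eksctl create cluster --name pulp-demo --region us-east-1
aws s3 mb s3://my-pulp-bucket --region us-east-1
# create the test-s3 Secret as shown earlier in this thread, then:
kubectl -ntest apply -f- <<EOF
apiVersion: repo-manager.pulpproject.org/v1beta2
kind: Pulp
metadata:
  name: test
spec:
  object_storage_s3_secret: test-s3
EOF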


Do we have an option to attach a service account (that will have access to S3) to the pod?

Do we have an option to attach a service account (that will have access to S3) to the pod?

Unfortunately, no, we don’t. Right now, we are configuring the pods with a ServiceAccount based on the Pulp CR name, and we don’t provide a way to modify the SA for the pods.
As a workaround, it is possible to put the operator in an unmanaged state and manually configure the Deployments with the expected ServiceAccount/serviceAccountName; a sketch follows.
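
A sketch of the Deployment side of that workaround, using a hypothetical ServiceAccount name (for example, one set up for IRSA); the operator must already be unmanaged, otherwise it will revert this change:

kubectl -ntest patch deployment test-api --type=merge \
        -p '{"spec": {"template": {"spec": {"serviceAccountName": "my-irsa-sa"}}}}'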


Thanks for the info. I will try it once the rest of the setup is sorted out.
For now, I was trying to configure Pulp and check the flow of deb packages with S3 as the backend storage (using the credentials themselves). I encountered the error below while accessing the content:
File "/usr/local/lib/python3.9/site-packages/django/core/files/storage/base.py", line 132, in path
    raise NotImplementedError("This backend doesn't support absolute paths.")
Any clue on this error would help.

To get a better understanding of this issue, would you mind providing the following outputs in https://pastebin.centos.org?

kubectl get pulp -ojson
kubectl get pods -ojson
kubectl logs deployment/pulp-content