How to install Pulp on an AWS EKS cluster with S3 storage

Awesome, I will definitely give it a try.

Cool! We would like to hear about your test results.

One last question I have is about exposing the UI using a Kubernetes load balancer instead of listening on localhost. Do we have a way to do this as well?

Yes. Since this is an EKS cluster, I can think of three ways to do this:

  1. .spec.ingress_type: loadbalancer
    the operator will create a k8s Service of type LoadBalancer (the k8s integration with the cloud provider will create the lb for you)

  2. .spec.ingress_type: ingress
    the operator will deploy a k8s Ingress resource with the defined IngressController (more info in Reverse Proxy - Pulp Operator).

  3. .spec.ingress_type: nodeport
    the operator will create a k8s Service of type NodePort, and you will need to manually create the AWS load balancer pointing to the k8s <nodes>:<node-port>
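
As a sketch, the three options map to a single field in the Pulp CR. Field names follow the operator's ingress docs; the ingress_class_name value below is an assumption for an EKS setup, so verify it against your cluster:

```yaml
# Sketch: choosing how the operator exposes Pulp (pick one ingress_type).
spec:
  ingress_type: loadbalancer   # operator creates a Service of type LoadBalancer
  # ingress_type: ingress      # operator creates an Ingress resource
  # ingress_class_name: alb    # assumed value: which IngressController to use
  # ingress_type: nodeport     # operator creates a NodePort Service;
  #                            # point your own AWS LB at <node>:<node-port>
```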


Definitely. I will keep you posted.

While I was going through my artifacts on S3, I had a question about the protocol used to upload/download artifacts to the S3 bucket. Does it use s3fs in the backend? I am asking because s3fs is slower than direct S3 API calls: it mounts the bucket as a filesystem and routes uploads/downloads through that mount. Can you please clarify this as well?

Pulp uses the S3 APIs via the django-storages backend.
Unless configured otherwise, it will redirect client access to the artifacts directly to S3 using presigned redirect URLs.
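
As a sketch of the setting behind that behavior, assuming the operator's pulp_settings passthrough (the redirect setting defaults to enabled, so this is only needed if you want to change it):

```yaml
# Sketch: Pulp CR fragment. redirect_to_object_storage: true (the default)
# makes the content app answer with presigned S3 URLs instead of
# streaming artifact bytes through the pod itself.
spec:
  pulp_settings:
    redirect_to_object_storage: true
```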
Does this answer your question?


Yep, thank you so much.

To install Pulp on an AWS EKS cluster with S3 storage, you’ll need to follow these general steps:

Set up AWS EKS Cluster:
    Create an EKS cluster using the AWS Management Console or AWS CLI.
    Ensure your IAM roles and policies grant necessary permissions for EKS, S3, and other AWS services you'll be using.

Prepare S3 Bucket:
    Create an S3 bucket to store your Pulp content.
    Set appropriate permissions and access controls for the bucket.

Deploy Pulp on EKS:
    Create Kubernetes manifests or Helm charts for deploying Pulp on your EKS cluster.
    Configure Pulp to use S3 storage by providing the S3 bucket details in the configuration.

Configure Networking:
    Set up network policies and ensure proper communication between your EKS cluster and the S3 bucket.

Test and Verify:
    Deploy Pulp resources to your EKS cluster.
    Verify that Pulp is running correctly and can access the S3 storage.

Monitor and Maintain:
    Implement monitoring and logging for your Pulp deployment.
    Regularly update and maintain your EKS cluster and Pulp installation.

Backup and Disaster Recovery:
    Implement backup strategies for your Pulp content stored in S3.
    Plan for disaster recovery scenarios and ensure data integrity.

Remember to refer to the official documentation of Pulp, AWS EKS, and S3 for detailed instructions specific to your setup and requirements.
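
The steps above can be sketched as a single Pulp CR plus its storage Secret. The apiVersion, secret name, and secret keys here are assumptions based on current operator docs; check the documentation for your operator version before using them:

```yaml
# Sketch: Pulp CR using an S3 bucket as the backing storage.
apiVersion: repo-manager.pulpproject.org/v1beta2  # may differ per operator version
kind: Pulp
metadata:
  name: pulp
spec:
  ingress_type: loadbalancer
  object_storage_s3_secret: pulp-s3-credentials   # Secret with bucket + access keys
---
# Secret referenced above (key names per the operator docs; verify for your version).
apiVersion: v1
kind: Secret
metadata:
  name: pulp-s3-credentials
stringData:
  s3-access-key-id: "<access-key-id>"
  s3-secret-access-key: "<secret-access-key>"
  s3-bucket-name: "<bucket-name>"
  s3-region: "<region>"
```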


Do we have an option to attach a service account (that will have access to S3) to the pod?

Do we have an option to attach a service account (that will have access to S3) to the pod?

Unfortunately, no, we don’t. Right now, we configure the pods with a ServiceAccount based on the Pulp CR name, and we don’t provide a way to modify the SA for the pods.
As a workaround, it is possible to put the operator in an unmanaged state and manually configure the Deployments with the expected ServiceAccount/ServiceAccountName.
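
A sketch of that workaround, with assumed field and resource names (verify the unmanaged field and the ServiceAccount name against your CR and deployments):

```yaml
# 1) In the Pulp CR, stop the operator from reconciling the deployments:
spec:
  unmanaged: true
---
# 2) Then edit each Deployment (api/content/worker) so its pod template
#    uses your own ServiceAccount (the name here is hypothetical):
spec:
  template:
    spec:
      serviceAccountName: pulp-s3-access
```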


Thanks for helping with the info. Will try it when the rest of the setup is sorted.
I was trying to for now configure pulp and check the flow of deb packages with s3 as backend storage (using the creds itself). I encountered the below error while accessing the content
File “/usr/local/lib/python3.9/site-packages/django/core/files/storage/base.py”, line 132, in path raise NotImplementedError(“This backend doesn’t support absolute paths.”)
Any clue on this error would help.

To better understand this issue, would you mind providing the following outputs in https://pastebin.centos.org?

kubectl get pulp -ojson
kubectl get pods -ojson
kubectl logs deployment/pulp-content

Thanks for your time. Somehow, reinstalling Pulp solved the problem.

I faced another issue while installing the pulp-operator directly from git. The operator installation through Helm works fine.
I have raised a separate issue as well: Pulp operator directly from git isn't installing the rest of the pods. The same steps had worked a few days back, though.

Any guidance on this would be helpful.

Currently, while configuring S3 as the only storage for Pulp in EKS, the only way to give the pods access to S3 is through access keys specified in a k8s Secret.
I did try granting the necessary S3 permissions to a service account role and attaching that role's ARN to the pods using the spec.sa_annotations option in the Pulp CR. But the Pulp CR always expects spec.object_storage_s3_secret, so I wasn't able to use only the service account option (without an S3 secret). As mentioned earlier, this is probably not supported currently; my request is: could this support be added to the Pulp CR, so that the access key secret can be avoided? Especially when the Pulp deployment is managed through git, storing the access keys in git is somewhat challenging and can't be fully automated.
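
For reference, this is roughly what was attempted with IRSA via sa_annotations. The annotation key is the standard EKS IRSA one; the role ARN is a placeholder:

```yaml
# Sketch: annotating the operator-created ServiceAccount for IRSA.
spec:
  sa_annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/pulp-s3-role
  # Not sufficient on its own today: the CR still requires
  # object_storage_s3_secret, which is the subject of this request.
```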

Hi @tarangini-shetty

As mentioned earlier, this is probably not supported currently; my request is: could this support be added to the Pulp CR, so that the access key secret can be avoided?

Yes, it seems like a good idea to support S3 installation providing a SA instead of a Secret. Would you mind opening an issue in https://github.com/pulp/pulp-operator/issues/new/choose?


Thanks @hyagi for looking into this.
As suggested, I have opened an issue for this: https://github.com/pulp/pulp-operator/issues/1424.
