How to install Pulp on an AWS EKS Cluster with S3 Storage

Problem:
I am trying to evaluate Pulp for storing artifacts. This is something I am doing to remove our JFrog dependency. We need to install Pulp on an EKS cluster with S3 storage. I couldn’t find any documentation that covers this.

Expected outcome:
Pulp should be installed on an EKS cluster with S3 as the storage backend.

Operating system - distribution and version:
AWS EKS

I don’t know much about EKS. Maybe the operator is the best bet there.
But for configuring S3 as a storage backend there are docs:
https://docs.pulpproject.org/pulpcore/installation/storage.html#amazon-s3
and
https://django-storages.readthedocs.io/en/latest/backends/amazon-S3.html

I believe the necessary packages are already included in our container builds, so only the configuration is left on your side.
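
For orientation, the configuration those docs describe boils down to a handful of django-storages settings in Pulp’s settings.py. A rough sketch with placeholder values (when using the operator you normally don’t edit this file by hand; the operator takes care of the equivalent settings based on the S3 Secret shown further down):

# settings.py sketch based on the pulpcore/django-storages docs above (placeholder values)
DEFAULT_FILE_STORAGE = "storages.backends.s3boto3.S3Boto3Storage"
MEDIA_ROOT = ""
AWS_ACCESS_KEY_ID = "<S3_ACCESS_KEY>"
AWS_SECRET_ACCESS_KEY = "<S3_SECRET_ACCESS_KEY>"
AWS_STORAGE_BUCKET_NAME = "<S3_BUCKET_NAME>"
AWS_S3_REGION_NAME = "<S3_REGION>"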

I tried the Operator and it only installed the pulp-operator pod. Do I need to install the pulp content, web, api, and worker pods individually on the cluster?

I’m really not an expert in the operator, but I believe the operator pod should manage all the other resources once deployed and “operated”!?

Got it. Looks like I need to install all the required pods on the cluster, and once the pods are up, the Operator will take care of managing them, right?

I’m the wrong one to answer here.
@mikedep333 @decko @hyagi any advice on this?

Can you give us more details on what you mean by “installed pulp operator pod”? I guess you ran a pod with the quay.io/pulp/pulp-operator image, correct?

Here are some “getting started” steps if you’d like to test pulp-operator:

  • create the namespace to run pulp
kubectl create ns test
  • install the operator
git clone https://github.com/pulp/pulp-operator.git /tmp/pulp-operator
cd /tmp/pulp-operator
NAMESPACE=test make install deploy
  • create a secret with the S3 credentials
kubectl -ntest apply -f- <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: 'test-s3'
stringData:
  s3-access-key-id: "<S3_ACCESS_KEY>"
  s3-secret-access-key: "<S3_SECRET_ACCESS_KEY>"
  s3-bucket-name: "<S3_BUCKET_NAME>"
  s3-region: "<S3_REGION>"
EOF
  • create an instance of pulp-operator
kubectl -ntest apply -f- <<EOF
apiVersion: repo-manager.pulpproject.org/v1beta2
kind: Pulp
metadata:
  name: test
spec:
  object_storage_s3_secret: test-s3
  api:
    replicas: 1
  content:
    replicas: 1
  worker:
    replicas: 1
  web:
    replicas: 1
EOF
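
You can watch the operator reconcile the new resources while the pods come up:

kubectl -ntest get pods -w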

after that the operator should deploy the database, api, content, worker, and web pods:

$ kubectl -ntest get pods
NAME                                                READY   STATUS    RESTARTS   AGE
pulp-operator-controller-manager-5596b5c575-9vxpg   2/2     Running   0          93s
test-api-76fc65bb95-pgfnj                           1/1     Running   0          89s
test-content-7ddbc78647-l56bl                       1/1     Running   0          89s
test-database-0                                     1/1     Running   0          90s
test-web-5ff7666c9-v7bqp                            1/1     Running   0          88s
test-worker-d997b87b7-f92f2                         1/1     Running   0          88s
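
Once everything is READY, a quick sanity check (assuming the instance is named test as in this example) is to query the status API from inside the api pod:

kubectl -ntest exec deployment/test-api -- curl -sL localhost:24817/pulp/api/v3/status | jq .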

For more information on specific configurations:
https://docs.pulpproject.org/pulp_operator/configuring/storage/#configure-aws-s3


Thank you @hyagi and @x9c4. These steps were really helpful. Now when I list pods I see only two pods running (screenshot attached). The API, Content, Worker, and Web pods are not spun up. Not sure if I am missing anything.

Hi @Chandan_Mishra,

Nice! Now you have the operator running, but we need to understand why these pods are missing.
In this case, can you please provide us with the cluster-info dump so we can get more information on the Deployment and ReplicaSet status?

kubectl cluster-info dump -ntest --output-directory=/tmp/cluster-info
tar cvaf cluster-info.tar.gz /tmp/cluster-info/

I am unable to attach a zip or tar file. It looks like this site only allows various image formats. Can you please share a link where I can upload the logs?

You could use https://paste.centos.org/ to share the logs

Thank you @decko, here are the log links:
https://paste.centos.org/view/eeae2c3d
https://paste.centos.org/view/c1375944

Here is the link for the pod logs:

https://paste.centos.org/view/4590252c

Thank you for the outputs.
It seems like the Deployments have not been created. Can you please send the logs from the operator and the Events? They will probably be in /tmp/cluster-info/test/pulp-operator-controller-manager-5d5fb5bbc4-2rft7/logs.txt and /tmp/cluster-info/test/events.json
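
If it is easier than digging through the dump, you can also pull those directly. A sketch, assuming the default operator-sdk container name (manager); adjust if yours differs:

kubectl -ntest logs deployment/pulp-operator-controller-manager -c manager
kubectl -ntest get events --sort-by='.lastTimestamp'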


Somehow I am unable to attach further links, so I put the logs at the links below. Please have a look.
@hyagi

https://textdoc.co/LRfwZMhGkP1ySxcT
https://textdoc.co/hfbWc90QVN15lasP

Hmm … from the operator logs, the test-s3 Secret seems to be missing fields:

{"Secret.Namespace": "test", "Secret.Name": "test-s3", "error": "could not find "s3-access-key-id" key in test-s3 secret"}

when you tried to run

kubectl -ntest apply -f- <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: 'test-s3'
stringData:
  s3-access-key-id: "<S3_ACCESS_KEY>"
  s3-secret-access-key: "<S3_SECRET_ACCESS_KEY>"
  s3-bucket-name: "<S3_BUCKET_NAME>"
  s3-region: "<S3_REGION>"
EOF

did you get any errors?

you can double-check if the Secret has been created:

kubectl -n test get secrets test-s3
NAME      TYPE     DATA   AGE
test-s3   Opaque   4      5h42m

and you can also check the content through:

kubectl -ntest get secrets test-s3 -ojsonpath='{.data}'|jq
{
  "s3-access-key-id": "...",
  "s3-bucket-name": "...",
  "s3-region": "...",
  "s3-secret-access-key": "..."
}
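
If any of the four keys turn out to be missing, you can recreate the Secret with all of them set (same key names as in the manifest above) and let the operator reconcile again:

kubectl -ntest delete secret test-s3
kubectl -ntest create secret generic test-s3 \
        --from-literal=s3-access-key-id=<S3_ACCESS_KEY> \
        --from-literal=s3-secret-access-key=<S3_SECRET_ACCESS_KEY> \
        --from-literal=s3-bucket-name=<S3_BUCKET_NAME> \
        --from-literal=s3-region=<S3_REGION>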

Great! I just updated the config with s3-secret-access-key and s3-access-key-id and it started working. Thank you so much @hyagi. Initially I did not add them, as EKS & S3 are in AWS and I assumed that the keys were not needed.


Now, do we have a way to use external PostgreSQL & Redis instead of using a containerised PostgreSQL & Redis?

Nice! Glad to know that it worked :smiley:

Initially I did not add them, as EKS & S3 are in AWS and I assumed that the keys were not needed.

Unfortunately, we don’t have such a level of integration yet, so we need to pass the credentials manually.

Now, do we have a way to use external PostgreSQL & Redis instead of using a containerised PostgreSQL & Redis?

Yes, we do. You can create a secret pointing to the external databases in this case.

  • create a secret for postgres:
kubectl -ntest create secret generic external-database \
        --from-literal=POSTGRES_HOST=my-postgres-host.example.com  \
        --from-literal=POSTGRES_PORT=5432  \
        --from-literal=POSTGRES_USERNAME=pulp-admin  \
        --from-literal=POSTGRES_PASSWORD=password  \
        --from-literal=POSTGRES_DB_NAME=pulp \
        --from-literal=POSTGRES_SSLMODE=prefer
  • create a secret for redis (make sure to define all the keys (REDIS_HOST, REDIS_PORT, REDIS_PASSWORD, REDIS_DB) even if the Redis cluster has no authentication, as in this example):
kubectl -ntest create secret generic external-redis \
        --from-literal=REDIS_HOST=my-redis-host.example.com  \
        --from-literal=REDIS_PORT=6379  \
        --from-literal=REDIS_PASSWORD=""  \
        --from-literal=REDIS_DB=""
  • update Pulp CR with them:
kubectl -ntest patch pulp test --type=merge -p '{"spec": {"database": {"external_db_secret": "external-database"}, "cache": {"enabled": true, "external_cache_secret": "external-redis"} }}'

pulp-operator should notice the new settings and start to redeploy the pods.
After all the pods reach a READY state, you can verify the status through:

kubectl -ntest exec deployment/test-api -- curl -sL localhost:24817/pulp/api/v3/status | jq '{"postgres": .database_connection, "redis": .redis_connection}'
{
  "postgres": {
    "connected": true
  },
  "redis": {
    "connected": true
  }
}

Here are the docs for more information:
https://docs.pulpproject.org/pulp_operator/configuring/database/#configuring-pulp-operator-to-use-an-external-postgresql-installation
https://docs.pulpproject.org/pulp_operator/configuring/cache/#configuring-pulp-operator-to-use-an-external-redis-installation


Awesome, I will surely give it a try. One last question: I’d like to expose the UI using a Kubernetes Load Balancer instead of only listening on localhost. Do we have a way to do this as well?