Preserving client IP in pulp-web logs

We’ve set up a Pulp EKS cluster with the AWS ALB Ingress Controller, and an Nginx proxy in front that forwards requests to the Pulp ALB. Everything works as expected, except for one issue: when a client downloads an artifact, the client IP appears correctly in the Nginx access logs, but in the pulp-web logs, it shows the AWS ALB’s private IP instead (the ALB is internal-facing).

We’ve already configured the necessary proxy headers in Nginx (proxy_set_header X-Real-IP $remote_addr, proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for, and proxy_set_header X-Forwarded-Proto $scheme). Is there something we need to enable in pulp-web to ensure the original client IP is forwarded all the way to the Pulp web application?

There might be a better answer, but I think you need to change the gunicorn access_log_format of the content app to get the value from the header. [0] Maybe something like this:

'%({x-real-ip}i)s %(l)s %(u)s %(t)s "%(r)s" %(s)s %(b)s "%(f)s" "%(a)s"'

This is the default string with the first item (the request’s IP) replaced by the header. How you would specify this value for the content app through the Pulp operator, I do not know. @hyagi, any ideas?

[0] Settings — Gunicorn 23.0.0 documentation
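
For reference, a minimal sketch of how that format string could be set in a Gunicorn config file, which is plain Python. The file name and the idea of loading it with `gunicorn -c gunicorn.conf.py` are assumptions for illustration; nothing in this thread confirms that the Pulp images pick up such a file. Also worth verifying: if the content app runs under an aiohttp-based worker, the Gunicorn-style `%(...)s` placeholders may not be accepted there, and aiohttp's own access log syntax (for example `%{X-Real-IP}i`) might be needed instead.

```python
# gunicorn.conf.py (hypothetical file name and location, not confirmed for Pulp)

# Send the access log to stdout so it shows up in the container logs.
accesslog = "-"

# Gunicorn's default format with the first field (%(h)s, the peer address)
# replaced by the value of the X-Real-IP request header.
access_log_format = (
    '%({x-real-ip}i)s %(l)s %(u)s %(t)s "%(r)s" %(s)s %(b)s "%(f)s" "%(a)s"'
)
```

Note that this only changes what is logged; the application still sees the load balancer's address as the peer.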


Hi @sheshi!

“[…] AWS ALB Ingress Controller, and an Nginx proxy in front that forwards requests to the Pulp ALB […]”

Hmm… sorry, I could not understand your setup.
Can you please confirm if it is something like this?

[1] AWS ALB (k8s ingress controller) -> [2] Nginx -> [3] Pulp ALB (is this another AWS ALB?) -> [4] pulp-web -> pulpcore pods (api and content)

From what I could understand, your [4] pulp-web logs are not showing the real IP addresses of the clients, but the [3] AWS ALB IP instead. If that is the case, we could check whether the issue is in the AWS ALB configuration: HTTP headers and Application Load Balancers - Elastic Load Balancing

“Now how you specify this value for the content app inside the Pulp operator I do not know, @hyagi any ideas?”

To define the gunicorn access_log_format, we would need to add it to the container args, but we can’t do that through pulp-operator.
As a workaround, we can set the operator to unmanaged and manually modify the Deployment.

We figured out the issue.

Here’s our setup:

[End User] → [Public NLB] → [External NGINX] → [Internal ALB (Pulp)] → [pulp-web]

We use this configuration to apply Nexus rewrite rules and route traffic to the Pulp backend.

The fix was to update the pulp-web NGINX configuration to enable the Real IP module. This required editing the pulp-web ConfigMap and recreating the pulp-web Pod. After doing so, the original client IP is now correctly preserved in the pulp-web logs.

This is what we ended up adding to the pulp-web nginx configuration:

set_real_ip_from <ALB_SUBNET_CIDR>; # Trust the Pulp ALB
set_real_ip_from <EXTERNAL_NGINX_VPC_CIDR>; # Trust the External Nginx VPC CIDR
real_ip_header X-Forwarded-For;
real_ip_recursive on;
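
To illustrate what these four directives do together, here is a rough Python sketch of the selection logic: if the connecting peer is trusted, walk X-Forwarded-For from right to left and take the first address that is not in a trusted range. This is a simplified model, not nginx's actual implementation, and the CIDRs and addresses in the example are hypothetical stand-ins for the placeholders above.

```python
import ipaddress


def resolve_client_ip(remote_addr, x_forwarded_for, trusted_cidrs):
    """Simplified model of real_ip_header X-Forwarded-For with real_ip_recursive on.

    Malformed addresses raise ValueError; real servers handle that more gracefully.
    """
    trusted = [ipaddress.ip_network(cidr) for cidr in trusted_cidrs]

    def is_trusted(addr):
        ip = ipaddress.ip_address(addr)
        return any(ip in net for net in trusted)

    # The header is only honored when the direct peer is a trusted proxy.
    if not is_trusted(remote_addr):
        return remote_addr

    hops = [h.strip() for h in x_forwarded_for.split(",") if h.strip()]
    client = remote_addr
    # Walk from the nearest proxy back toward the client and stop at the
    # first address that is not one of our own proxies.
    for hop in reversed(hops):
        client = hop
        if not is_trusted(hop):
            break
    return client


# Hypothetical values: 10.0.0.0/16 stands in for <ALB_SUBNET_CIDR>,
# 10.1.0.0/16 for <EXTERNAL_NGINX_VPC_CIDR>.
TRUSTED = ["10.0.0.0/16", "10.1.0.0/16"]
print(resolve_client_ip("10.0.1.5", "203.0.113.7, 10.1.2.3", TRUSTED))  # prints 203.0.113.7
```

Without the recursive step, the rightmost entry (the external Nginx address in this example) would be logged instead of the end user's address, which is why both proxy ranges need to be trusted.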
