Sizing WORKING_DIRECTORY (/var/lib/pulp/tmp) when artifacts are stored in S3 (post-#1936 fix)

We previously hit pulpcore issue #1936 during remote syncs (RPM plugin): when artifact storage was not on the same filesystem (we use S3 for artifacts), temporary downloaded files accumulated and were not cleared until the end of the sync job.

That issue is now closed and we’ve upgraded to a version where it should be fixed, but we’re still trying to size local disk/ephemeral storage appropriately for staging downloads in WORKING_DIRECTORY (/var/lib/pulp/tmp).

Even with artifacts stored in S3, syncs still stage downloads locally, and we sometimes sync multiple large repos in parallel. We want to avoid worker pods/nodes running out of local/ephemeral storage.

Questions:

  1. For a setup with artifact storage in S3 (STORAGES default backend = S3), what’s the recommended way to size local WORKING_DIRECTORY storage? Any rule of thumb (per worker, per concurrent sync, etc.)?

  2. In current pulpcore, is a temp file deleted immediately after it’s successfully moved/uploaded to the artifact storage backend (S3), or can it persist until the end of a batch or task?

  3. Are there tunables that materially reduce peak temp disk usage?

    • number of workers / concurrent sync tasks
    • Remote download_concurrency
    • MAX_CONCURRENT_CONTENT
    • anything else that affects how many artifacts can be staged locally at once
  4. The "Hardware requirements" page in the Pulp Project docs mentions that “30GB should be enough in the majority of cases” when using S3 for artifact storage. Does that assume a particular concurrency (number of workers / parallel syncs) or a certain repo profile?
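To make question 1 concrete, here is the back-of-envelope model we're currently using. Every parameter value and the formula itself are our assumptions about worst-case staging, not anything from the Pulp docs; we'd love to know if this is even the right shape:

```python
# Assumption: each worker can hold up to `download_concurrency`
# in-flight downloads staged on local disk at once, and artifacts
# are deleted shortly after upload to S3. `headroom` pads for
# batching and cleanup lag.
def peak_temp_bytes(workers: int,
                    download_concurrency: int,
                    avg_artifact_bytes: int,
                    headroom: float = 2.0) -> int:
    return int(workers * download_concurrency * avg_artifact_bytes * headroom)

# e.g. 8 workers, download_concurrency=10, ~200 MiB average artifact
estimate = peak_temp_bytes(8, 10, 200 * 2**20)
print(f"~{estimate / 2**30:.1f} GiB of WORKING_DIRECTORY")
```

If temp files can instead persist for a whole batch or task (question 2), the right multiplier would presumably be closer to "largest repo synced concurrently" than "in-flight downloads", which changes the sizing dramatically.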

Any guidance or real-world sizing examples would be appreciated.

Thanks!