Orphan_cleanup Task Fails with “portal does not exist” After DB Reset (pulpcore 3.95.3)

Hi all,

We are running pulpcore 3.95.3.

After resetting the Pulp database, we are seeing failures in the automatic orphan_cleanup task.

Below is the task detail:

{
  "pulp_href": "/pulp/api/v3/tasks/019caca3-6ab5-7339-be5b-209463b26221/",
  "state": "failed",
  "name": "pulpcore.app.tasks.orphan.orphan_cleanup",
  "started_at": "2026-03-02T04:18:51.844101Z",
  "finished_at": "2026-03-02T04:18:51.907020Z",
  "error": {
    "description": "portal \"_django_curs_140137546135360_sync_1\" does not exist"
  }
}

Stack Trace

Contents of the task's "traceback" field, with the escaped newlines expanded:

  File "/usr/local/lib/python3.11/site-packages/pulpcore/tasking/tasks.py", line 72, in _execute_task
    result = task_function()
             ^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/pulpcore/app/tasks/orphan.py", line 62, in orphan_cleanup
    for bulk_content in queryset_iterator(content):
  File "/usr/local/lib/python3.11/site-packages/pulpcore/app/tasks/orphan.py", line 34, in queryset_iterator
    primary_key_buffer.append(next(iterator))
                              ^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/django/db/models/query.py", line 518, in _iterator
    yield from iterable
  File "/usr/local/lib/python3.11/site-packages/django/db/models/query.py", line 287, in __iter__
    for row in compiler.results_iter(
               ^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/django/db/models/sql/compiler.py", line 1513, in results_iter
    results = self.execute_sql(
              ^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/django/db/models/sql/compiler.py", line 1562, in execute_sql
    cursor.execute(sql, params)
  File "/usr/local/lib/python3.11/site-packages/django/db/backends/utils.py", line 67, in execute
    return self._execute_with_wrappers(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/django/db/backends/utils.py", line 80, in _execute_with_wrappers
    return executor(sql, params, many, context)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/django/db/backends/utils.py", line 84, in _execute
    with self.db.wrap_database_errors:
  File "/usr/local/lib/python3.11/site-packages/django/db/utils.py", line 91, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "/usr/local/lib/python3.11/site-packages/django/db/backends/utils.py", line 89, in _execute
    return self.cursor.execute(sql, params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/psycopg/_server_cursor.py", line 98, in execute
    raise ex.with_traceback(None)

"description": "portal \"_django_curs_140137546135360_sync_1\" does not exist"

Questions

What could cause the portal "... does not exist" error during orphan_cleanup?

We are using:

  • pulpcore 3.95.3
  • PostgreSQL 13
  • External Postgres and Redis

Any guidance would be appreciated.

Thanks.

A little digging shows that “portal does not exist” is a pretty low-level Postgres error. It appears to happen if an attempt is made to use a cursor after committing a transaction (which closes the ‘portal’ the cursor is using). It’s definitely not something I/we have seen before.
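To make the explanation above concrete, here is a toy model in plain Python. It is only a sketch, not how PostgreSQL or psycopg are implemented: `ToySession`, `declare`, `fetch`, and `commit` are hypothetical names invented for illustration. It assumes two things about portals: an ordinary cursor's portal is dropped when its transaction commits, and a portal is visible only in the session that created it.

```python
class PortalDoesNotExist(Exception):
    """Toy stand-in for the database error seen in the task output."""


class ToySession:
    """Hypothetical model of one database session and its named portals."""

    def __init__(self):
        self.portals = {}

    def declare(self, name, rows):
        # Opening a server-side cursor creates a named portal in this session.
        self.portals[name] = iter(rows)

    def fetch(self, name):
        # Fetching only works while the portal still exists in *this* session.
        if name not in self.portals:
            raise PortalDoesNotExist(f'portal "{name}" does not exist')
        return next(self.portals[name], None)

    def commit(self):
        # Assumption of this model: committing drops every open portal.
        self.portals.clear()


session = ToySession()
session.declare("curs_1", [10, 20, 30])
first = session.fetch("curs_1")  # works: the portal is still open

session.commit()
try:
    session.fetch("curs_1")
    error_after_commit = None
except PortalDoesNotExist as exc:
    error_after_commit = str(exc)

# A different session never sees the portal at all.
other = ToySession()
try:
    other.fetch("curs_1")
    error_other_session = None
except PortalDoesNotExist as exc:
    error_other_session = str(exc)

print(first, error_after_commit, error_other_session, sep="\n")
```

In this model, anything that ends the transaction or moves the client onto a different session between opening the cursor and reading from it produces the same kind of failure.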

What do you mean by “resetting the pulp database” here, btw?

By “resetting the Pulp database,” I mean that we dropped and recreated the external PostgreSQL database entirely as part of a fresh installation. After recreating the database, we ran the standard Pulp initialization/migrations and brought the services back up.

The orphan_cleanup failure occurred after this fresh DB setup.

Let me know if there’s any additional information about the DB initialization steps or deployment setup that would help narrow this down.

So you’re running orphan-cleanup on an essentially empty DB? That…may never have been tried before :) You only need it to remove artifacts from the filesystem that are for content that is no longer in any repository, and a brand-new installation has no content/repositories/artifacts to clean up.

Thanks — understood on the intent of orphan cleanup.

Just to clarify: even on a fresh install / empty DB, we’re consistently able to reproduce the same behavior.

What we’re seeing

  • We use S3 storage, and for this test the bucket is empty (no objects at all).
  • We dropped and recreated the external Postgres DB for a clean installation.
  • After starting the first pulp-worker, an orphan_cleanup task is created and sits in waiting.
  • Once we start a second worker instance, the orphan_cleanup task is picked up and then fails with the same low-level Postgres error: portal "_django_curs_..." does not exist
  • After that failure, any subsequent tasks we try to run (e.g. repo sync) remain in waiting forever and never get picked up by a worker.

Why this is confusing

We previously ran with a bucket that already contained objects/files and did not see this issue. This only happens now that we’re doing a truly clean install with an empty bucket/empty DB.

Questions

  1. Is orphan_cleanup expected to run automatically on startup even when there is no content/artifacts yet? If so, should it be safe/no-op in this scenario?
  2. Could this be related to our DB setup (e.g., PgBouncer transaction pooling, server-side cursors, etc.) causing the “portal does not exist” failure?
  3. Why would a failed orphan_cleanup cause all subsequent tasks to stay in waiting indefinitely?
  4. Is there a recommended way to recover from this state (clear stuck reservations / reset worker state), and/or a recommended way to disable orphan cleanup until after initial content is synced?

Quick update: this looks like it was caused by PgBouncer.

When we pointed Pulp at our PgBouncer endpoint, we consistently saw the orphan-cleanup failure with:

portal "_django_curs_..." does not exist

After switching the DB endpoint to go through HAProxy directly to Postgres (bypassing PgBouncer), the error disappeared and tasks started running normally again.

So it seems related to PgBouncer behavior (likely pooling mode / cursor handling). If anyone has guidance on the recommended PgBouncer settings for Pulp/Django (e.g., session pooling vs disabling server-side cursors), we’d appreciate it.
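For reference, Django documents a per-database option, `DISABLE_SERVER_SIDE_CURSORS`, intended for connections that go through a pooler in transaction pooling mode. The fragment below is only a sketch of where that option would go, assuming the Pulp settings file accepts a standard Django `DATABASES` dictionary. The host, port, database name, and user are hypothetical placeholders, and nothing in this thread confirms that this setting alone makes Pulp work correctly behind PgBouncer.

```python
# Sketch of a Django-style DATABASES entry (values are placeholders).
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "pulp",                         # hypothetical database name
        "USER": "pulp",                         # hypothetical role
        "HOST": "pgbouncer.example.internal",   # hypothetical pooler endpoint
        "PORT": "6432",                         # hypothetical pooler port
        # Documented Django option: makes QuerySet.iterator() stop using
        # server-side cursors on this connection.
        "DISABLE_SERVER_SIDE_CURSORS": True,
    }
}
```

The other option mentioned above, session pooling, keeps each client on a single server connection for the lifetime of the client connection, which avoids the cursor problem but gives up most of the connection sharing that transaction pooling provides.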

Well, this sure is a new way to fail, but PgBouncer has silently broken existing contracts between Pulp and PostgreSQL (as described in the PostgreSQL documentation) before.
I don’t think there is a safe way to set this up that does not also defeat the very reason you might want to use an external connection pool.

Thanks — that makes sense.

The main reason we introduced PgBouncer is that our internal DBA-supported database offering standardizes on a pooled access pattern, and PgBouncer is the supported way to provide stable connectivity/HA characteristics from the platform side. We’re trying to stay within that supported model.

Given this behavior, we’ll plan to engage our DBA team to review options.