Multiple Repo Syncs Fail When Run Concurrently

We’re seeing failures when running multiple repository syncs concurrently. When several sync tasks are started at the same time, they fail with errors instead of completing successfully.

Environment

pulpcore: 3.95.5
pulp_deb: 3.8.0

What we’re doing

We trigger syncs for multiple Ubuntu distribution repositories in parallel (each sync targets a different repository).
When run individually, sync works fine. When run concurrently, we start seeing failures.

Errors observed

  1. “error”: {
    “traceback”: " File “/usr/local/lib/python3.11/site-packages/pulpcore/tasking/tasks.py”, line 72, in _execute_task\n result = task_function()\n ^^^^^^^^^^^^^^^\n File “/usr/local/lib/python3.11/site-packages/pulp_deb/app/tasks/synchronizing.py”, line 223, in synchronize\n DebDeclarativeVersion(first_stage, repository, mirror=mirror).create()\n File “/usr/local/lib/python3.11/site-packages/pulpcore/plugin/stages/declarative_version.py”, line 163, in create\n loop.run_until_complete(pipeline)\n File “/usr/lib64/python3.11/asyncio/base_events.py”, line 653, in run_until_complete\n return future.result()\n ^^^^^^^^^^^^^^^\n File “/usr/local/lib/python3.11/site-packages/pulpcore/plugin/stages/api.py”, line 220, in create_pipeline\n await asyncio.gather(*futures)\n File “/usr/local/lib/python3.11/site-packages/pulpcore/plugin/stages/api.py”, line 41, in __call__\n await self.run()\n File “/usr/local/lib/python3.11/site-packages/pulpcore/plugin/stages/artifact_stages.py”, line 319, in run\n await self._handle_remote_artifacts(batch)\n File “/usr/local/lib/python3.11/site-packages/pulpcore/plugin/stages/artifact_stages.py”, line 419, in _handle_remote_artifacts\n raise ValueError(\n",
    “description”: “No declared artifact with relative path “pool/universe/g/gcc-defaults/gdc-for-build_13.2.0-7ubuntu2_all.deb” for content “<Package (pulp_type=deb.package): pk=019cba69-cc19-7836-99fb-cac255e2caa2>””

  2. "error": {
    “traceback”: " File “/usr/local/lib/python3.11/site-packages/pulpcore/tasking/tasks.py”, line 72, in _execute_task\n result = task_function()\n ^^^^^^^^^^^^^^^\n File “/usr/local/lib/python3.11/site-packages/pulp_deb/app/tasks/synchronizing.py”, line 223, in synchronize\n DebDeclarativeVersion(first_stage, repository, mirror=mirror).create()\n File “/usr/local/lib/python3.11/site-packages/pulpcore/plugin/stages/declarative_version.py”, line 163, in create\n loop.run_until_complete(pipeline)\n File “/usr/lib64/python3.11/asyncio/base_events.py”, line 653, in run_until_complete\n return future.result()\n ^^^^^^^^^^^^^^^\n File “/usr/local/lib/python3.11/site-packages/pulpcore/plugin/stages/api.py”, line 220, in create_pipeline\n await asyncio.gather(*futures)\n File “/usr/local/lib/python3.11/site-packages/pulpcore/plugin/stages/api.py”, line 41, in __call__\n await self.run()\n File “/usr/local/lib/python3.11/site-packages/pulpcore/plugin/stages/content_stages.py”, line 198, in run\n await sync_to_async(process_batch)()\n File “/usr/local/lib/python3.11/site-packages/asgiref/sync.py”, line 439, in __call__\n ret = await asyncio.shield(exec_coro)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File “/usr/lib64/python3.11/concurrent/futures/thread.py”, line 58, in run\n result = self.fn(*self.args, **self.kwargs)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File “/usr/local/lib/python3.11/site-packages/asgiref/sync.py”, line 493, in thread_handler\n return func(*args, **kwargs)\n ^^^^^^^^^^^^^^^^^^^^^\n File “/usr/local/lib/python3.11/site-packages/pulpcore/plugin/stages/content_stages.py”, line 105, in process_batch\n with transaction.atomic():\n File “/usr/local/lib/python3.11/site-packages/django/db/transaction.py”, line 263, in __exit__\n connection.commit()\n File “/usr/local/lib/python3.11/site-packages/django/utils/asyncio.py”, line 26, in inner\n return func(*args, **kwargs)\n ^^^^^^^^^^^^^^^^^^^^^\n File “/usr/local/lib/python3.11/site-packages/django/db/backends/base/base.py”, line 336, in commit\n self._commit()\n File “/usr/local/lib/python3.11/site-packages/django/db/backends/base/base.py”, line 311, in _commit\n with debug_transaction(self, “COMMIT”), self.wrap_database_errors:\n File “/usr/local/lib/python3.11/site-packages/django/db/utils.py”, line 91, in __exit__\n raise dj_exc_value.with_traceback(traceback) from exc_value\n File “/usr/local/lib/python3.11/site-packages/django/db/backends/base/base.py”, line 312, in _commit\n return self.connection.commit()\n ^^^^^^^^^^^^^^^^^^^^^^^^\n File “/usr/local/lib/python3.11/site-packages/psycopg/connection.py”, line 305, in commit\n self.wait(self._commit_gen())\n File “/usr/local/lib/python3.11/site-packages/psycopg/connection.py”, line 484, in wait\n return waiting.wait(gen, self.pgconn.socket, interval=interval)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File “psycopg_binary/_psycopg/waiting.pyx”, line 241, in psycopg_binary._psycopg.wait_c\n File “/usr/local/lib/python3.11/site-packages/psycopg/_connection_base.py”, line 585, in _commit_gen\n yield from self._exec_command(b"COMMIT")\n File “/usr/local/lib/python3.11/site-packages/psycopg/_connection_base.py”, line 483, in _exec_command\n raise e.error_from_result(result, encoding=self.pgconn._encoding)\n",
    “description”: “insert or update on table “core_contentartifact” violates foreign key constraint “core_contentartifact_artifact_id_f5d9a66b_fk_core_arti”\nDETAIL: Key (artifact_id)=(019cbadd-3a10-780d-bfc9-9215f61eff3c) is not present in table “core_artifact”.”

Observations

  • These errors only occur when multiple syncs run at the same time
  • Single sync runs complete successfully
  • Failures appear related to missing artifacts or inconsistent DB state

Questions

  1. Is it expected that concurrent syncs could lead to missing artifact or FK constraint issues?
  2. Could this indicate a race condition or concurrency issue in artifact handling?
  3. Are there recommended limits or configurations for running multiple syncs in parallel?
  4. Would increasing worker count or adjusting DB settings help, or should syncs be serialized?

I have not heard of symptoms like this before. I wonder whether the packages you are getting the errors on are present in more than one of the repositories you are syncing in parallel. I am not sure how Pulp is meant to handle a situation where two identical packages are being added simultaneously by two independent syncs. Since this should result in only one artifact (de-duplication on disk is a core Pulp feature), I suppose one could imagine some race conditions in this situation?
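One rough way to check the overlap idea: given the package paths each repository contains, list the paths that occur in more than one repository. This is only a sketch. The function name `shared_packages` and the input mapping are made up for illustration and are not part of any Pulp API; how you obtain the per-repository package lists is up to you.

```python
from collections import defaultdict


def shared_packages(repo_packages):
    """Return {relative_path: sorted repo names} for paths found in 2+ repos.

    `repo_packages` is a hypothetical mapping of repository name to an
    iterable of package relative paths. It is not a Pulp API object.
    """
    seen = defaultdict(set)
    for repo, paths in repo_packages.items():
        for path in paths:
            seen[path].add(repo)
    # Keep only the paths that appear in more than one repository.
    return {path: sorted(repos) for path, repos in seen.items() if len(repos) > 1}
```

If the paths from the failed tasks all show up in the result, that would support the idea that overlapping content between parallel syncs is involved.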

How reliably are you getting these effects? Are they happening every time or just sometimes? Always on the same packages or different packages?
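To answer that, it may help to tally the paths named in the failed tasks' error descriptions over several runs. A minimal sketch, assuming you have collected the `description` strings yourself: the regular expression is written against the "No declared artifact" message quoted above, and the helper name `count_failing_paths` is hypothetical.

```python
import re
from collections import Counter

# Matches the path in: No declared artifact with relative path "<path>" for content ...
# Accepts straight or typographic quotes, since forum software may alter them.
_PATH_RE = re.compile(r'No declared artifact with relative path ["“]([^"”]+)["”]')


def count_failing_paths(descriptions):
    """Count how often each relative path appears in the given error descriptions."""
    counts = Counter()
    for text in descriptions:
        match = _PATH_RE.search(text)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Descriptions that do not match (such as the foreign key error) are simply skipped, so those would need to be counted separately.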

This is speculation, but system resource bottlenecks could be a factor here (for example, an especially slow or fast disk or DB). Sometimes decreasing the number of workers helps more than increasing it, because it avoids bottlenecks. In some cases this can even increase overall speed. Of course I can’t say whether this is the case for your system, but it may be worth experimenting in both directions.
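If you want to experiment from the client side as well, one option is to cap how many syncs your own script starts at once. The sketch below is not Pulp configuration and does not change the worker count. It assumes a hypothetical coroutine `sync_one(repo)` that triggers a sync and waits for the task to finish; you would have to supply that yourself.

```python
import asyncio


async def run_limited(repos, sync_one, max_parallel):
    """Run `sync_one(repo)` for every repo, at most `max_parallel` at a time.

    `sync_one` is a hypothetical coroutine function supplied by the caller.
    Results are returned in the same order as `repos`.
    """
    semaphore = asyncio.Semaphore(max_parallel)

    async def guarded(repo):
        # Wait for a free slot before starting this repo's sync.
        async with semaphore:
            return await sync_one(repo)

    return await asyncio.gather(*(guarded(repo) for repo in repos))
```

Calling it with `max_parallel=1` serializes the syncs completely. Raising the limit step by step would show at what level of parallelism the errors start to appear.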
