I have tried to capture as many logs as possible here. Please let me know if more are required.
#################
Pod details:
[be3075@yb1404 ~]$ oc get pods
NAME READY STATUS RESTARTS AGE
pulp-api-6bbc4df7b5-lbqjf 1/1 Running 0 19h
pulp-content-774594bd8d-bhhrh 1/1 Running 0 20h
pulp-content-774594bd8d-fwmzg 1/1 Running 0 20h
pulp-database-0 1/1 Running 0 20h
pulp-operator-controller-manager-598fbc76b7-cvhx6 2/2 Running 15 (4h3m ago) 2d21h
pulp-redis-584d45fffb-ggwqp 1/1 Running 0 20h
pulp-worker-7c9cddbfb-5mbsw 1/1 Running 1 (2m37s ago) 15m
pulp-worker-7c9cddbfb-qnmnv 1/1 Running 1 (2m36s ago) 16m
#####################################################
API pod logs:
(‘pulp [f482273951244785bdb48ab8a3ea048f]: ::ffff:10.140.12.1 - admin [18/Apr/2024:07:04:54 +0000] “GET /pulp/api/v3/tasks/018eeffd-a278-703a-9515-237845ba2819/ HTTP/1.1” 500 145 “-” “Pulp-CLI/0.24.1”’,)
pulp [None]: pulpcore.app.entrypoint:INFO: Api App ‘14@pulp-api-6bbc4df7b5-lbqjf’ failed to write a heartbeat to the database, sleeping for ‘45.0’ seconds.
pulp [None]: pulpcore.app.entrypoint:INFO: Api App ‘14@pulp-api-6bbc4df7b5-lbqjf’ failed to write a heartbeat to the database, sleeping for ‘45.0’ seconds.
pulp [None]: pulpcore.app.entrypoint:INFO: Api App ‘14@pulp-api-6bbc4df7b5-lbqjf’ failed to write a heartbeat to the database, sleeping for ‘45.0’ seconds.
pulp [None]: pulpcore.app.entrypoint:INFO: Api App ‘15@pulp-api-6bbc4df7b5-lbqjf’ failed to write a heartbeat to the database, sleeping for ‘45.0’ seconds.
pulp [None]: pulpcore.app.entrypoint:INFO: Api App ‘14@pulp-api-6bbc4df7b5-lbqjf’ failed to write a heartbeat to the database, sleeping for ‘45.0’ seconds.
pulp [8878b12e0b59403399ec985b937c5601]: django.request:ERROR: Internal Server Error: /pulp/api/v3/status/
Traceback (most recent call last):
File “/usr/local/lib/python3.9/site-packages/django/db/backends/base/base.py”, line 289, in ensure_connection
self.connect()
File “/usr/local/lib/python3.9/site-packages/django/utils/asyncio.py”, line 26, in inner
return func(*args, **kwargs)
File “/usr/local/lib/python3.9/site-packages/django/db/backends/base/base.py”, line 270, in connect
self.connection = self.get_new_connection(conn_params)
File “/usr/local/lib/python3.9/site-packages/django/utils/asyncio.py”, line 26, in inner
return func(*args, **kwargs)
File “/usr/local/lib/python3.9/site-packages/django/db/backends/postgresql/base.py”, line 275, in get_new_connection
connection = self.Database.connect(**conn_params)
File “/usr/local/lib/python3.9/site-packages/psycopg/connection.py”, line 748, in connect
raise last_ex.with_traceback(None)
psycopg.OperationalError: connection failed: FATAL: the database system is in recovery mode
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File “/usr/local/lib/python3.9/site-packages/django/core/handlers/exception.py”, line 55, in inner
response = get_response(request)
File “/usr/local/lib/python3.9/site-packages/django/core/handlers/base.py”, line 185, in _get_response
response = middleware_method(
File “/usr/local/lib/python3.9/site-packages/pulpcore/middleware.py”, line 35, in process_view
domain = Domain.objects.get(name=domain_name)
File “/usr/local/lib/python3.9/site-packages/django/db/models/manager.py”, line 87, in manager_method
return getattr(self.get_queryset(), name)(*args, **kwargs)
File “/usr/local/lib/python3.9/site-packages/django/db/models/query.py”, line 633, in get
num = len(clone)
File “/usr/local/lib/python3.9/site-packages/django/db/models/query.py”, line 380, in __len__
self._fetch_all()
File “/usr/local/lib/python3.9/site-packages/django/db/models/query.py”, line 1881, in _fetch_all
self._result_cache = list(self._iterable_class(self))
File “/usr/local/lib/python3.9/site-packages/django/db/models/query.py”, line 91, in __iter__
results = compiler.execute_sql(
File “/usr/local/lib/python3.9/site-packages/django/db/models/sql/compiler.py”, line 1560, in execute_sql
cursor = self.connection.cursor()
File “/usr/local/lib/python3.9/site-packages/django/utils/asyncio.py”, line 26, in inner
return func(*args, **kwargs)
File “/usr/local/lib/python3.9/site-packages/django/db/backends/base/base.py”, line 330, in cursor
return self._cursor()
File “/usr/local/lib/python3.9/site-packages/django/db/backends/base/base.py”, line 306, in _cursor
self.ensure_connection()
File “/usr/local/lib/python3.9/site-packages/django/utils/asyncio.py”, line 26, in inner
return func(*args, **kwargs)
File “/usr/local/lib/python3.9/site-packages/django/db/backends/base/base.py”, line 289, in ensure_connection
self.connect()
File “/usr/local/lib/python3.9/site-packages/django/db/utils.py”, line 91, in __exit__
raise dj_exc_value.with_traceback(traceback) from exc_value
File “/usr/local/lib/python3.9/site-packages/django/db/backends/base/base.py”, line 289, in ensure_connection
self.connect()
File “/usr/local/lib/python3.9/site-packages/django/utils/asyncio.py”, line 26, in inner
return func(*args, **kwargs)
File “/usr/local/lib/python3.9/site-packages/django/db/backends/base/base.py”, line 270, in connect
self.connection = self.get_new_connection(conn_params)
File “/usr/local/lib/python3.9/site-packages/django/utils/asyncio.py”, line 26, in inner
return func(*args, **kwargs)
File “/usr/local/lib/python3.9/site-packages/django/db/backends/postgresql/base.py”, line 275, in get_new_connection
connection = self.Database.connect(**conn_params)
File “/usr/local/lib/python3.9/site-packages/psycopg/connection.py”, line 748, in connect
raise last_ex.with_traceback(None)
django.db.utils.OperationalError: connection failed: FATAL: the database system is in recovery mode
(‘pulp [8878b12e0b59403399ec985b937c5601]: ::ffff:10.141.12.1 - - [18/Apr/2024:07:04:57 +0000] “GET /pulp/api/v3/status/ HTTP/1.1” 500 145 “-” “kube-probe/1.26”’,)
pulp [None]: pulpcore.app.entrypoint:INFO: Api App ‘15@pulp-api-6bbc4df7b5-lbqjf’ failed to write a heartbeat to the database, sleeping for ‘45.0’ seconds.
pulp [None]: pulpcore.app.entrypoint:INFO: Api App ‘15@pulp-api-6bbc4df7b5-lbqjf’ failed to write a heartbeat to the database, sleeping for ‘45.0’ seconds.
(‘pulp [20c81711c0174f90b4a1f159b0b28e3e]: ::ffff:10.141.12.1 - - [18/Apr/2024:07:05:17 +0000] “GET /pulp/api/v3/status/ HTTP/1.1” 200 4111 “-” “kube-probe/1.26”’,)
(‘pulp [a292b3a8cd574bb1809725c77d344a20]: ::ffff:10.141.12.1 - - [18/Apr/2024:07:05:37 +0000] “GET /pulp/api/v3/status/ HTTP/1.1” 200 4111 “-” “kube-probe/1.26”’,)
(‘pulp [d008d208280d490f878e268a005213e3]: ::ffff:10.141.12.1 - - [18/Apr/2024:07:05:57 +0000] “GET /pulp/api/v3/status/ HTTP/1.1” 200 4111 “-” “kube-probe/1.26”’,)
(‘pulp [1d648ed8c5154204b49e23eea08581fa]: ::ffff:10.141.12.1 - - [18/Apr/2024:07:06:17 +0000] “GET /pulp/api/v3/status/ HTTP/1.1” 200 4111 “-” “kube-probe/1.26”’,)
(‘pulp [89994f437629458fb185dba3b0848188]: ::ffff:10.141.12.1 - - [18/Apr/2024:07:06:37 +0000] “GET /pulp/api/v3/status/ HTTP/1.1” 200 4111 “-” “kube-probe/1.26”’,)
(‘pulp [bb05b7aad4fc4c4e8b1a63cbf5b12df6]: ::ffff:10.141.12.1 - - [18/Apr/2024:07:06:57 +0000] “GET /pulp/api/v3/status/ HTTP/1.1” 200 4111 “-” “kube-probe/1.26”’,)
(‘pulp [fe2995194f05416287c4697421e6119b]: ::ffff:10.141.12.1 - - [18/Apr/2024:07:07:17 +0000] “GET /pulp/api/v3/status/ HTTP/1.1” 200 4111 “-” “kube-probe/1.26”’,)
(‘pulp [ab14a9d1547041a1b9415990268d5b72]: ::ffff:10.141.12.1 - - [18/Apr/2024:07:07:37 +0000] “GET /pulp/api/v3/status/ HTTP/1.1” 200 4111 “-” “kube-probe/1.26”’,)
#################################################################################################
Worker 1:
[be3075@yb1404 ~]$ oc logs pod/pulp-worker-7c9cddbfb-5mbsw --previous
Waiting on postgresql to start…
Postgres started.
Checking for database migrations
error: Failed to initialize NSS library
Database migrated!
error: Failed to initialize NSS library
pulp [None]: pulpcore.tasking.entrypoint:INFO: Starting distributed type worker
pulp [None]: pulpcore.tasking.worker:INFO: New worker ‘1@pulp-worker-7c9cddbfb-5mbsw’ discovered
Traceback (most recent call last):
File “/usr/local/bin/pulpcore-worker”, line 8, in <module>
sys.exit(worker())
File “/usr/local/lib/python3.9/site-packages/click/core.py”, line 1157, in __call__
return self.main(*args, **kwargs)
File “/usr/local/lib/python3.9/site-packages/click/core.py”, line 1078, in main
rv = self.invoke(ctx)
File “/usr/local/lib/python3.9/site-packages/click/core.py”, line 1434, in invoke
return ctx.invoke(self.callback, **ctx.params)
File “/usr/local/lib/python3.9/site-packages/click/core.py”, line 783, in invoke
return __callback(*args, **kwargs)
File “/usr/local/lib/python3.9/site-packages/pulpcore/tasking/entrypoint.py”, line 43, in worker
PulpcoreWorker().run(burst=burst)
File “/usr/local/lib/python3.9/site-packages/pulpcore/tasking/worker.py”, line 413, in run
self.sleep()
File “/usr/local/lib/python3.9/site-packages/pulpcore/tasking/worker.py”, line 300, in sleep
connection.connection.execute(“SELECT 1”)
File “/usr/local/lib/python3.9/site-packages/psycopg/connection.py”, line 891, in execute
raise ex.with_traceback(None)
psycopg.OperationalError: consuming input failed: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
##################################################################################################################
Worker 2:
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File “/usr/lib64/python3.9/multiprocessing/process.py”, line 315, in _bootstrap
self.run()
File “/usr/lib64/python3.9/multiprocessing/process.py”, line 108, in run
self._target(*self._args, **self._kwargs)
File “/usr/local/lib/python3.9/site-packages/pulpcore/tasking/_util.py”, line 156, in perform_task
execute_task(task)
File “/usr/local/lib/python3.9/site-packages/pulpcore/tasking/tasks.py”, line 54, in execute_task
contextvars.copy_context().run(_execute_task, task)
File “/usr/local/lib/python3.9/site-packages/pulpcore/tasking/tasks.py”, line 78, in _execute_task
task.set_failed(exc, tb)
File “/usr/local/lib/python3.9/site-packages/pulpcore/app/models/task.py”, line 199, in set_failed
rows = Task.objects.filter(pk=self.pk, state=TASK_STATES.RUNNING).update(
File “/usr/local/lib/python3.9/site-packages/django/db/models/query.py”, line 1206, in update
rows = query.get_compiler(self.db).execute_sql(CURSOR)
File “/usr/local/lib/python3.9/site-packages/django/db/models/sql/compiler.py”, line 1984, in execute_sql
cursor = super().execute_sql(result_type)
File “/usr/local/lib/python3.9/site-packages/django/db/models/sql/compiler.py”, line 1560, in execute_sql
cursor = self.connection.cursor()
File “/usr/local/lib/python3.9/site-packages/django/utils/asyncio.py”, line 26, in inner
return func(*args, **kwargs)
File “/usr/local/lib/python3.9/site-packages/django/db/backends/base/base.py”, line 330, in cursor
return self._cursor()
File “/usr/local/lib/python3.9/site-packages/django/db/backends/base/base.py”, line 306, in _cursor
self.ensure_connection()
File “/usr/local/lib/python3.9/site-packages/django/utils/asyncio.py”, line 26, in inner
return func(*args, **kwargs)
File “/usr/local/lib/python3.9/site-packages/django/db/backends/base/base.py”, line 289, in ensure_connection
self.connect()
File “/usr/local/lib/python3.9/site-packages/django/db/utils.py”, line 91, in __exit__
raise dj_exc_value.with_traceback(traceback) from exc_value
File “/usr/local/lib/python3.9/site-packages/django/db/backends/base/base.py”, line 289, in ensure_connection
self.connect()
File “/usr/local/lib/python3.9/site-packages/django/utils/asyncio.py”, line 26, in inner
return func(*args, **kwargs)
File “/usr/local/lib/python3.9/site-packages/django/db/backends/base/base.py”, line 270, in connect
self.connection = self.get_new_connection(conn_params)
File “/usr/local/lib/python3.9/site-packages/django/utils/asyncio.py”, line 26, in inner
return func(*args, **kwargs)
File “/usr/local/lib/python3.9/site-packages/django/db/backends/postgresql/base.py”, line 275, in get_new_connection
connection = self.Database.connect(**conn_params)
File “/usr/local/lib/python3.9/site-packages/psycopg/connection.py”, line 748, in connect
raise last_ex.with_traceback(None)
django.db.utils.OperationalError: connection failed: FATAL: the database system is in recovery mode
###############################################################################################################
Database pod logs:
2024-04-18 07:04:54.306 UTC [1] LOG: server process (PID 162302) was terminated by signal 9: Killed
2024-04-18 07:04:54.306 UTC [1] DETAIL: Failed process was running: INSERT INTO “rpm_package” (“content_ptr_id”, “name”, “epoch”, “version”, “release”, “arch”, “pkgId”, “checksum_type”, “summary”, “description”, “url”, “changelogs”, “files”, “requires”, “provides”, “conflicts”, “obsoletes”, “suggests”, “enhances”, “recommends”, “supplements”, “location_base”, “location_href”, “rpm_buildhost”, “rpm_group”, “rpm_license”, “rpm_packager”, “rpm_sourcerpm”, “rpm_vendor”, “rpm_header_start”, “rpm_header_end”, “size_archive”, “size_installed”, “size_package”, “time_build”, “time_file”, “is_modular”, “_pulp_domain_id”) VALUES (‘018ef004c972770c8664e2ea7f46a113’::uuid, ‘java-latest-openjdk-src-slowdebug’, ‘1’, ‘22.0.0.0.36’, ‘1.rolling.el8’, ‘x86_64’, ‘10ca285cee269d505314ac69b5286f38b77d6d87bfadcd7c241cb9a9375a59c7’, ‘sha256’, ‘OpenJDK 22 Source Bundle for packages with debugging on and no optimisation’, 'The java-22-openjdk-src-slowdebug sub-package contains the complete OpenJDK 22
class library source code for use by IDE indexers and debuggers, for packages with debugging on and
2024-04-18 07:04:54.306 UTC [1] LOG: terminating any other active server processes
2024-04-18 07:04:54.306 UTC [164118] WARNING: terminating connection because of crash of another server process
2024-04-18 07:04:54.306 UTC [164118] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2024-04-18 07:04:54.306 UTC [164118] HINT: In a moment you should be able to reconnect to the database and repeat your command.
2024-04-18 07:04:54.306 UTC [162299] WARNING: terminating connection because of crash of another server process
2024-04-18 07:04:54.306 UTC [162299] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2024-04-18 07:04:54.306 UTC [162299] HINT: In a moment you should be able to reconnect to the database and repeat your command.
2024-04-18 07:04:54.307 UTC [161847] WARNING: terminating connection because of crash of another server process
2024-04-18 07:04:54.307 UTC [161847] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2024-04-18 07:04:54.307 UTC [161847] HINT: In a moment you should be able to reconnect to the database and repeat your command.
2024-04-18 07:04:54.307 UTC [160732] WARNING: terminating connection because of crash of another server process
2024-04-18 07:04:54.307 UTC [160732] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2024-04-18 07:04:54.307 UTC [160732] HINT: In a moment you should be able to reconnect to the database and repeat your command.
2024-04-18 07:04:54.307 UTC [160731] WARNING: terminating connection because of crash of another server process
2024-04-18 07:04:54.307 UTC [160731] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2024-04-18 07:04:54.307 UTC [160731] HINT: In a moment you should be able to reconnect to the database and repeat your command.
2024-04-18 07:04:54.307 UTC [160734] WARNING: terminating connection because of crash of another server process
2024-04-18 07:04:54.307 UTC [160734] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2024-04-18 07:04:54.307 UTC [160734] HINT: In a moment you should be able to reconnect to the database and repeat your command.
2024-04-18 07:04:54.307 UTC [160735] WARNING: terminating connection because of crash of another server process
2024-04-18 07:04:54.307 UTC [160735] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2024-04-18 07:04:54.307 UTC [160735] HINT: In a moment you should be able to reconnect to the database and repeat your command.
2024-04-18 07:04:54.308 UTC [161727] WARNING: terminating connection because of crash of another server process
2024-04-18 07:04:54.308 UTC [161727] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2024-04-18 07:04:54.308 UTC [161727] HINT: In a moment you should be able to reconnect to the database and repeat your command.
2024-04-18 07:04:54.406 UTC [160707] WARNING: terminating connection because of crash of another server process
2024-04-18 07:04:54.406 UTC [160707] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2024-04-18 07:04:54.406 UTC [160707] HINT: In a moment you should be able to reconnect to the database and repeat your command.
2024-04-18 07:04:54.408 UTC [164122] FATAL: the database system is in recovery mode
2024-04-18 07:04:54.409 UTC [164123] FATAL: the database system is in recovery mode
2024-04-18 07:04:54.410 UTC [164124] FATAL: the database system is in recovery mode
2024-04-18 07:04:54.507 UTC [164125] FATAL: the database system is in recovery mode
2024-04-18 07:04:54.508 UTC [164126] FATAL: the database system is in recovery mode
2024-04-18 07:04:54.513 UTC [164127] FATAL: the database system is in recovery mode
2024-04-18 07:04:54.607 UTC [164128] FATAL: the database system is in recovery mode
2024-04-18 07:04:54.707 UTC [164129] FATAL: the database system is in recovery mode
2024-04-18 07:04:54.709 UTC [164130] FATAL: the database system is in recovery mode
2024-04-18 07:04:54.712 UTC [1] LOG: all server processes terminated; reinitializing
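In case it helps anyone triaging the crash sequence above, here is a minimal Python sketch that scans PostgreSQL log lines for backends terminated by a signal and counts the connections rejected during recovery. It is not part of Pulp or the operator; the function names `find_signal_kills` and `count_recovery_rejections` are hypothetical, and the regular expression assumes the log line format shown in this post. Signal 9 is SIGKILL on Linux; these logs alone do not confirm who sent it.

```python
import re

# Assumed format, taken from the database log in this post, e.g.:
# 2024-04-18 07:04:54.306 UTC [1] LOG: server process (PID 162302) was terminated by signal 9: Killed
SIGNAL_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3} UTC) \[\d+\] LOG:\s+"
    r"server process \(PID (?P<pid>\d+)\) was terminated by signal (?P<sig>\d+)"
)

RECOVERY_MSG = "FATAL: the database system is in recovery mode"


def find_signal_kills(lines):
    """Return (timestamp, pid, signal) for each backend killed by a signal."""
    kills = []
    for line in lines:
        match = SIGNAL_RE.match(line.strip())
        if match:
            kills.append((match["ts"], int(match["pid"]), int(match["sig"])))
    return kills


def count_recovery_rejections(lines):
    """Count connection attempts rejected while the server was recovering."""
    return sum(1 for line in lines if RECOVERY_MSG in line)


# A few lines copied from the database log above, used as sample input.
SAMPLE = [
    "2024-04-18 07:04:54.306 UTC [1] LOG: server process (PID 162302) was terminated by signal 9: Killed",
    "2024-04-18 07:04:54.306 UTC [1] LOG: terminating any other active server processes",
    "2024-04-18 07:04:54.408 UTC [164122] FATAL: the database system is in recovery mode",
    "2024-04-18 07:04:54.409 UTC [164123] FATAL: the database system is in recovery mode",
    "2024-04-18 07:04:54.712 UTC [1] LOG: all server processes terminated; reinitializing",
]

for ts, pid, sig in find_signal_kills(SAMPLE):
    print(f"{ts}: backend PID {pid} terminated by signal {sig}")
print("rejected during recovery:", count_recovery_rejections(SAMPLE))
```

To run it against the real log, the lines could be read from a saved copy of the `oc logs` output for the database pod instead of `SAMPLE`.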
The database pod also logs:
2024-04-18 09:26:46.307 UTC [1325] ERROR: relation “core_artifact” does not exist at character 28
2024-04-18 09:26:46.307 UTC [1325] STATEMENT: SELECT count(pulp_id) FROM core_artifact WHERE sha224 IS NULL
2024-04-18 09:26:47.327 UTC [1327] ERROR: relation “core_artifact” does not exist at character 28
2024-04-18 09:26:47.327 UTC [1327] STATEMENT: SELECT count(pulp_id) FROM core_artifact WHERE sha224 IS NULL
2024-04-18 09:26:50.137 UTC [1330] ERROR: relation “core_artifact” does not exist at character 28
2024-04-18 09:26:50.137 UTC [1330] STATEMENT: SELECT count(pulp_id) FROM core_artifact WHERE sha224 IS NULL
2024-04-18 09:26:50.176 UTC [1329] ERROR: relation “core_artifact” does not exist at character 28
2024-04-18 09:26:50.176 UTC [1329] STATEMENT: SELECT count(pulp_id) FROM core_artifact WHERE sha224 IS NULL
2024-04-18 09:26:50.761 UTC [1333] ERROR: relation “core_artifact” does not exist at character 28
2024-04-18 09:26:50.761 UTC [1333] STATEMENT: SELECT count(pulp_id) FROM core_artifact WHERE sha224 IS NULL
2024-04-18 09:26:53.809 UTC [1350] ERROR: relation “core_artifact” does not exist at character 28
2024-04-18 09:26:53.809 UTC [1350] STATEMENT: SELECT count(pulp_id) FROM core_artifact WHERE sha224 IS NULL
2024-04-18 09:26:55.911 UTC [1352] ERROR: relation “core_artifact” does not exist at character 28
2024-04-18 09:26:55.911 UTC [1352] STATEMENT: SELECT count(pulp_id) FROM core_artifact WHERE sha224 IS NULL
2024-04-18 09:26:56.117 UTC [1354] ERROR: relation “core_artifact” does not exist at character 28
2024-04-18 09:26:56.117 UTC [1354] STATEMENT: SELECT count(pulp_id) FROM core_artifact WHERE sha224 IS NULL
2024-04-18 09:26:56.712 UTC [1356] ERROR: relation “core_artifact” does not exist at character 28
2024-04-18 09:26:56.712 UTC [1356] STATEMENT: SELECT count(pulp_id) FROM core_artifact WHERE sha224 IS NULL