Create a single remote for Galaxy roles

Problem:
I get an error when trying to mirror Galaxy roles:
$ pulp ansible repository sync --name "roles"
Started background task /pulp/api/v3/tasks/0b2fe108-218d-4949-9317-a5a2dbc321a1/
…Error: Task /pulp/api/v3/tasks/0b2fe108-218d-4949-9317-a5a2dbc321a1/ failed: '404, message='Not Found', url=URL('https://github.com/0x5a17ed/ansible-role-netbox/archive/v0.7.8.tar.gz')

Expected outcome:
I expect the sync to continue.

Pulpcore version:
3.21.2

Pulp plugins installed and their versions:
pulp-ansible 0.16.0

Operating system - distribution and version:
RHEL 8.6

Other relevant data:
I am trying to use one remote to mirror all roles, with the URL "https://galaxy.ansible.com/api/v1/roles/".

I could create a remote for each role in Galaxy, and that works, but I would like to mirror all Galaxy roles with a single remote.

Hey, can you post the remote's details and check that it is a Role remote? pulp_ansible has three remote types: Role, Collection, and Git. To sync roles, you must create a Role remote and pass it to the sync endpoint. With the CLI, the default remote type is Collection; to create a Role remote, pass -t role to the command. See the roles workflow docs: "Role Workflows" in the Pulp Ansible 0.17.0 documentation.

From what I read, the sync starts, but Galaxy is missing a file, so the sync fails and you end up with nothing. Can you confirm that? And yes, commands to reproduce are always useful.

I created a role repo:

  {
    "pulp_href": "/pulp/api/v3/repositories/ansible/ansible/09db43c0-d10e-42c8-b687-193cd186f8cd/",
    "pulp_created": "2023-03-28T12:28:49.750467Z",
    "versions_href": "/pulp/api/v3/repositories/ansible/ansible/09db43c0-d10e-42c8-b687-193cd186f8cd/versions/",
    "pulp_labels": {},
    "latest_version_href": "/pulp/api/v3/repositories/ansible/ansible/09db43c0-d10e-42c8-b687-193cd186f8cd/versions/2/",
    "name": "roles",
    "description": null,
    "retain_repo_versions": null,
    "remote": "/pulp/api/v3/remotes/ansible/role/b6711ad1-5f26-4329-8d4f-2bf012078ca8/",
    "last_synced_metadata_time": null,
    "gpgkey": null
  }

with a default remote:

$ pulp ansible remote -t "role" list
...
  {
    "pulp_href": "/pulp/api/v3/remotes/ansible/role/b6711ad1-5f26-4329-8d4f-2bf012078ca8/",
    "pulp_created": "2023-03-28T12:28:41.199329Z",
    "name": "roles",
    "url": "https://galaxy.ansible.com/api/v1/roles/",
    "ca_cert": null,
    "client_cert": null,
    "tls_validation": true,
    "proxy_url": "http://myproxy",
    "pulp_labels": {},
    "pulp_last_updated": "2023-03-28T12:28:41.199362Z",
    "download_concurrency": 10,
    "max_retries": null,
    "policy": "immediate",
    "total_timeout": null,
    "connect_timeout": null,
    "sock_connect_timeout": null,
    "sock_read_timeout": null,
    "headers": null,
    "rate_limit": 8
  }

Below is how to reproduce the error:

pulp ansible repository create --name "roles" --remote "role:roles"
pulp ansible distribution create --name "roles" --base-path "roles" --repository "roles"
pulp ansible repository sync --name "roles"

@bli111 I have tried out your commands and have replicated the issue. Currently there is no way to get around this error so I’ve filed an issue to track this: https://github.com/pulp/pulp_ansible/issues/1415. Sorry about that.


I noticed that issue #1543 in pulp/pulp_ansible, "Sync should not fail if namespace's avatar fails to download", was closed last year. Does the fix for #1543 also resolve issue #1415, "Investigate ways to prevent syncs from failing when artifact download links return 404 or are invalid"? We have the latest versions of pulpcore and pulp-ansible installed and are still experiencing the failure when syncing roles:
"error": {
"traceback": " File \"/usr/local/lib/python3.8/site-packages/pulpcore/tasking/tasks.py\", line 60, in _execute_task\n result = func(*args, **kwargs)\n File \"/usr/local/lib/python3.8/site-packages/pulp_ansible/app/tasks/roles.py\", line 53, in synchronize\n return d_version.create()\n File \"/usr/local/lib/python3.8/site-packages/pulpcore/plugin/stages/declarative_version.py\", line 161, in create\n loop.run_until_complete(pipeline)\n File \"/usr/lib64/python3.8/asyncio/base_events.py\", line 616, in run_until_complete\n return future.result()\n File \"/usr/local/lib/python3.8/site-packages/pulpcore/plugin/stages/api.py\", line 220, in create_pipeline\n await asyncio.gather(*futures)\n File \"/usr/local/lib/python3.8/site-packages/pulpcore/plugin/stages/api.py\", line 41, in __call__\n await self.run()\n File \"/usr/local/lib/python3.8/site-packages/pulpcore/plugin/stages/artifact_stages.py\", line 186, in run\n pb.done += task.result() # download_count\n File \"/usr/local/lib/python3.8/site-packages/pulpcore/plugin/stages/artifact_stages.py\", line 241, in _handle_content_unit\n await asyncio.gather(*downloaders_for_content)\n File \"/usr/local/lib/python3.8/site-packages/pulpcore/plugin/stages/models.py\", line 119, in download\n raise e\n File \"/usr/local/lib/python3.8/site-packages/pulpcore/plugin/stages/models.py\", line 111, in download\n download_result = await downloader.run(extra_data=self.extra_data)\n File \"/usr/local/lib/python3.8/site-packages/pulpcore/download/http.py\", line 269, in run\n return await download_wrapper()\n File \"/usr/local/lib/python3.8/site-packages/backoff/_async.py\", line 151, in retry\n ret = await target(*args, **kwargs)\n File \"/usr/local/lib/python3.8/site-packages/pulpcore/download/http.py\", line 254, in download_wrapper\n return await self._run(extra_data=extra_data)\n File \"/usr/local/lib/python3.8/site-packages/pulpcore/download/http.py\", line 290, in _run\n self.raise_for_status(response)\n File \"/usr/local/lib/python3.8/site-packages/pulpcore/download/http.py\", line 187, in raise_for_status\n response.raise_for_status()\n File \"/usr/local/lib64/python3.8/site-packages/aiohttp/client_reqrep.py\", line 1059, in raise_for_status\n raise ClientResponseError(\n",
"description": "404, message='Not Found', url=URL('https://github.com/hellofresh/ansible-deployment/archive/0.0.3.tar.gz')"
},

What workaround would you suggest? There are 33433 roles available in total, so creating a remote for each role doesn't seem feasible.
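
When collecting failures like this one, it can help to pull the offending URL out of the task's error description automatically. The following is a minimal sketch, assuming the description always has the `url=URL('...')` form seen in this thread; the function name `failed_url` is hypothetical and not part of Pulp or its CLI.

```python
import re
from typing import Optional

# Quote characters that may surround the URL. Typographic quotes are
# included because forum software often rewrites straight quotes.
_QUOTES = "'\"‘’“”"

# Matches url=URL('<url>') as it appears in the error descriptions above.
_URL_RE = re.compile(r"url=URL\([%s]([^%s)]+)[%s]\)" % (_QUOTES, _QUOTES, _QUOTES))


def failed_url(description: str) -> Optional[str]:
    """Return the URL named in a task error description, or None if absent."""
    match = _URL_RE.search(description)
    return match.group(1) if match else None


if __name__ == "__main__":
    desc = (
        "404, message='Not Found', "
        "url=URL('https://github.com/hellofresh/ansible-deployment/archive/0.0.3.tar.gz')"
    )
    print(failed_url(desc))
```

The extracted URLs could then be gathered into a list of broken role archives to report upstream.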

Not a great workaround, but you could switch the remote to on_demand and this will allow Pulp to finish the sync by avoiding downloading the role’s artifacts. I’ll try to figure out all the roles that have broken links and inform the Galaxy team about them in the meantime.

As for a long-term solution we could try what we did for Namespace Avatar 404s and ignore any download failure. For these namespace logos we switch the artifact to on-demand and let the galaxy-ui handle any future 404s. However, for role downloads would this be the appropriate behavior or should we remove the role from the synced content if its artifact fails to download?
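
Finding the roles with broken links, as mentioned above, could be scripted roughly as follows. This is only a sketch using the Python standard library: the names `http_status` and `broken_links` are hypothetical, and it assumes the candidate archive URLs have already been collected (for example from failed task descriptions). It is not how pulp_ansible itself checks links.

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable, List


def http_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code the server gives for url.

    A HEAD request is sent first; urllib follows redirects on its own.
    4xx/5xx responses are returned as codes instead of being raised.
    Network-level failures (urllib.error.URLError) still propagate.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        return exc.code


def broken_links(
    urls: Iterable[str],
    probe: Callable[[str], int] = http_status,
) -> List[str]:
    """Return the URLs for which probe reports 404, in input order."""
    return [url for url in urls if probe(url) == 404]
```

The `probe` parameter exists so the check can be exercised without network access, for instance `broken_links(urls, probe=lambda u: 404)` marks every URL as broken. For a list as large as the full Galaxy role catalog, some form of rate limiting would be needed on top of this.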

Thanks Gerrod. Unfortunately, unlike the rpm plugin, the pulp ansible remote doesn't accept on_demand as a policy; it looks like the only option is "immediate". Do you have any other suggestions for how we can mirror the roles?


I also tried creating a remote for each owner_name. I tried a few different values, such as datadog and geerlingguy, and got the same error. I don't quite understand why all these roles have the same dependency. I remember being able to do this a while back with older versions of pulpcore and the ansible plugin.

Here is what I ran:
pulp ansible remote -t "role" create --name "elastic" \
  --url "https://galaxy.ansible.com/api/v1/roles/?owner__name=elastic" \
  --download-concurrency 10 --rate-limit 8 --proxy-url="someproxy"
pulp ansible repository sync --name "roles" --remote "role:elastic"

Started background task /pulp/api/v3/tasks/018e713e-fef1-79cf-afa4-b55db820a45e/
…Error: Task /pulp/api/v3/tasks/018e713e-fef1-79cf-afa4-b55db820a45e/ failed: '404, message='Not Found', url=URL('https://github.com/0x5a17ed/ansible-role-netbox/archive/v0.7.7.tar.gz')'

I upgraded the plugin to 0.21.3 and am still getting the same error.

pulp-ansible 0.21.3
pulpcore 3.46.0