pulp export pulp run: table of contents (TOC) lists no files if --chunk-size is specified

Problem:
The TOC generated by pulp export pulp run has an empty files section if --chunk-size is specified.

When running the following export command:

pulp export pulp run --exporter "$REPO_EXPORTER" --chunk-size 1GB

It runs successfully and all the chunks are created, but the files section of the generated TOC JSON is empty:
{"meta": {"checksum_type": "crc32", "chunk_size": 10485760}, "files": {}}

Expected outcome:
According to:
https://pulpproject.org/pulpcore/docs/admin/guides/import-export-repos/

The chunk file list should be included in the TOC, e.g.:
"files": {
  "export-780822a4-d280-4ed0-a53c-382a887576a6-20200522_2325.tar.0000": "8156874798802f773bcbaf994def6523888922bde7a939bc8ac795a5cbb25b85",
  "export-780822a4-d280-4ed0-a53c-382a887576a6-20200522_2325.tar.0001": "e52fac34b0b7b1d8602f5c116bf9d3eb5363d2cae82f7cc00cc4bd5653ded852",
  "export-780822a4-d280-4ed0-a53c-382a887576a6-20200522_2325.tar.0002": "df4a2ea551ff41e9fb046e03aa36459f216d4bcb07c23276b78a96b98ae2b517",
  "export-780822a4-d280-4ed0-a53c-382a887576a6-20200522_2325.tar.0003": "27a6ecba3cc51965fdda9ec400f5610ff2aa04a6834c01d0c91776ac21a0e9bb",
  "export-780822a4-d280-4ed0-a53c-382a887576a6-20200522_2325.tar.0004": "f35c5a96fccfe411c074463c0eb0a77b39fa072ba160903d421c08313aba58f8",
  "export-780822a4-d280-4ed0-a53c-382a887576a6-20200522_2325.tar.0005": "13458b10465b01134bde49319d6b5cba9948016448da9d35cb447265a25e3caa",
  "export-780822a4-d280-4ed0-a53c-382a887576a6-20200522_2325.tar.0006": "a1986a0590943c9bb573c7d7170c428457ce54efe75f55997259ea032c585a35"
}

Pulpcore version:
3.75.2
Pulp plugins installed and their versions:
"deb": "3.5.1",
"npm": "0.3.3",
"rpm": "3.29.1",
"core": "3.75.2",
"file": "3.75.2",
"maven": "0.10.0",
"ostree": "2.4.7",
"python": "3.14.0",
"ansible": "0.24.5",
"certguard": "3.75.2",
"container": "2.25.0"

Operating system - distribution and version:
Alma Linux 8 - Docker Compose - Multi-container

Other relevant data:
It works as expected when chunk-size is not specified (i.e. single file):
{"meta": {"checksum_type": "crc32"}, "files": {"export-01963b40-71f8-792c-8442-055619975e65-20250415_2100.tar": "603c5604"}}

That feels like a regression - I’ll do some testing locally and get back to you.


Thanks

I’m on core/3.76.0.dev, and I can’t recreate that problem at all. Are you seeing any errors in the logs from the export task? I’m trying to figure out what kind of failure could possibly result in this error without failing the whole task!

We are seeing it on 3.75.2 (stable), but there are no errors in the logs.

Here are the steps:
pulp status

"versions": {
  "deb": "3.5.1",
  "npm": "0.3.3",
  "rpm": "3.29.1",
  "core": "3.75.2",
  "file": "3.75.2",
  "maven": "0.10.0",
  "ostree": "2.4.7",
  "python": "3.14.0",
  "ansible": "0.24.5",
  "certguard": "3.75.2",
  "container": "2.25.0"
}

Get repository HREF

export REPO_HREF=$(pulp python repository show --name "TEST-PyPI" | jq -r '.pulp_href')
echo $REPO_HREF
/pulp/api/v3/repositories/python/python/01963eda-d225-7705-9ab2-1efeb85b3b58/

Creating exporter

pulp exporter pulp create --name "TEST-PyPI-exporter" --repository-href "$REPO_HREF" --path "/tmp/PyPI"
{
  "pulp_href": "/pulp/api/v3/exporters/core/pulp/01963edd-3640-70de-bf45-916d750eaded/",
  "prn": "prn:core.pulpexporter:01963edd-3640-70de-bf45-916d750eaded",
  "pulp_created": "2025-04-16T13:50:06.913213Z",
  "pulp_last_updated": "2025-04-16T13:50:06.913230Z",
  "name": "TEST-PyPI-exporter",
  "path": "/tmp/PyPI",
  "repositories": [
    "/pulp/api/v3/repositories/python/python/01963eda-d225-7705-9ab2-1efeb85b3b58/"
  ],
  "last_export": null
}

testing WITHOUT specifying chunk size

pulp export pulp run --exporter "TEST-PyPI-exporter"
{
  "pulp_href": "/pulp/api/v3/exporters/core/pulp/01963edd-3640-70de-bf45-916d750eaded/exports/01963ee2-d71e-7a78-81df-0379d151c8fe/",
  "prn": "prn:core.pulpexport:01963ee2-d71e-7a78-81df-0379d151c8fe",
  "pulp_created": "2025-04-16T13:56:15.775582Z",
  "pulp_last_updated": "2025-04-16T13:56:19.348691Z",
  "task": "/pulp/api/v3/tasks/01963ee2-d6b9-7abd-addb-fca0fb7055ac/",
  "exported_resources": [
    "/pulp/api/v3/repositories/python/python/01963eda-d225-7705-9ab2-1efeb85b3b58/versions/1/"
  ],
  "params": {
    "full": true
  },
  "output_file_info": {
    "/tmp/PyPI/export-01963ee2-d71e-7a78-81df-0379d151c8fe-20250416_1356.tar": "e5ec2a55",
    "/tmp/PyPI/export-01963ee2-d71e-7a78-81df-0379d151c8fe-20250416_1356-toc.json": "a293aeed28859bbad888cdccc557a9330bcf8c4b52b176d7f3509ca83e90a82a"
  },
  "toc_info": {
    "file": "/tmp/PyPI/export-01963ee2-d71e-7a78-81df-0379d151c8fe-20250416_1356-toc.json",
    "sha256": "a293aeed28859bbad888cdccc557a9330bcf8c4b52b176d7f3509ca83e90a82a"
  }
}

ls -lh
total 290M
-rw-r--r-- 1 700 700 290M Apr 16 13:56 export-01963ee2-d71e-7a78-81df-0379d151c8fe-20250416_1356.tar
-rw-r--r-- 1 700 700 124 Apr 16 13:56 export-01963ee2-d71e-7a78-81df-0379d151c8fe-20250416_1356-toc.json

cat export-01963ee2-d71e-7a78-81df-0379d151c8fe-20250416_1356-toc.json
{"meta": {"checksum_type": "crc32"}, "files": {"export-01963ee2-d71e-7a78-81df-0379d151c8fe-20250416_1356.tar": "e5ec2a55"}}

As can be seen above, the tar file name has been correctly captured in the TOC file.

testing WITH specifying chunk size

pulp export pulp run --exporter "TEST-PyPI-exporter" --chunk-size 50MB
{
  "pulp_href": "/pulp/api/v3/exporters/core/pulp/01963edd-3640-70de-bf45-916d750eaded/exports/01963ee6-ac09-7955-904b-d543b534f44f/",
  "prn": "prn:core.pulpexport:01963ee6-ac09-7955-904b-d543b534f44f",
  "pulp_created": "2025-04-16T14:00:26.891043Z",
  "pulp_last_updated": "2025-04-16T14:00:28.107996Z",
  "task": "/pulp/api/v3/tasks/01963ee6-ab83-731d-a128-58a5e1e68ca5/",
  "exported_resources": [
    "/pulp/api/v3/repositories/python/python/01963eda-d225-7705-9ab2-1efeb85b3b58/versions/1/"
  ],
  "params": {
    "full": true,
    "chunk_size": "50MB"
  },
  "output_file_info": {
    "/tmp/PyPI/export-01963ee6-ac09-7955-904b-d543b534f44f-20250416_1400-toc.json": "f2e1713b12f455e213221a282ccad47a7af79abf90476ab558a4743da2fe208e"
  },
  "toc_info": {
    "file": "/tmp/PyPI/export-01963ee6-ac09-7955-904b-d543b534f44f-20250416_1400-toc.json",
    "sha256": "f2e1713b12f455e213221a282ccad47a7af79abf90476ab558a4743da2fe208e"
  }
}

ls -lh
total 290M
-rw-r--r-- 1 700 700 50M Apr 16 14:00 export-01963ee6-ac09-7955-904b-d543b534f44f-20250416_1400.tar.0000
-rw-r--r-- 1 700 700 50M Apr 16 14:00 export-01963ee6-ac09-7955-904b-d543b534f44f-20250416_1400.tar.0001
-rw-r--r-- 1 700 700 50M Apr 16 14:00 export-01963ee6-ac09-7955-904b-d543b534f44f-20250416_1400.tar.0002
-rw-r--r-- 1 700 700 50M Apr 16 14:00 export-01963ee6-ac09-7955-904b-d543b534f44f-20250416_1400.tar.0003
-rw-r--r-- 1 700 700 50M Apr 16 14:00 export-01963ee6-ac09-7955-904b-d543b534f44f-20250416_1400.tar.0004
-rw-r--r-- 1 700 700 40M Apr 16 14:00 export-01963ee6-ac09-7955-904b-d543b534f44f-20250416_1400.tar.0005
-rw-r--r-- 1 700 700 73 Apr 16 14:00 export-01963ee6-ac09-7955-904b-d543b534f44f-20250416_1400-toc.json

cat export-01963ee6-ac09-7955-904b-d543b534f44f-20250416_1400-toc.json
{"meta": {"checksum_type": "crc32", "chunk_size": 52428800}, "files": {}}

As can be seen above, the tar file chunk names have not been captured in the TOC file, i.e. .files is empty.
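
For anyone who wants to check their own exports for the same mismatch, here is a minimal Python sketch. It is not part of Pulp: the function name unlisted_chunks is made up for illustration, and it assumes the chunks sit next to the TOC and share its base name (export-...-toc.json alongside export-....tar.NNNN), as in the listing above. It only compares file names and does not verify checksums.

```python
import json
from pathlib import Path


def unlisted_chunks(toc_path):
    """Return names of .tar files/chunks beside the TOC that its "files" section omits."""
    toc_path = Path(toc_path)
    toc = json.loads(toc_path.read_text())
    listed = set(toc.get("files", {}))
    # Assumption: chunks share the TOC's base name, e.g.
    # export-<id>-<stamp>-toc.json  ->  export-<id>-<stamp>.tar[.NNNN]
    base = toc_path.name[: -len("-toc.json")]
    on_disk = {p.name for p in toc_path.parent.glob(base + ".tar*")}
    return sorted(on_disk - listed)
```

Run against the chunked export above, this would be expected to return the six .tar.000N names, since the TOC lists none of them; an empty list means every chunk on disk is recorded.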

It works for pulp_rpm just fine, so there may be a pulp-python-specific issue going on. Investigating that will take me some time; I don’t have a python-enabled env to work with right now.


Thanks.

Ok, that is repository type-specific then.

Please let me know if you need more information from me in the meantime.

Running pulp_python/3.14.0 and everything worked here:

(oci-env) ~/github/Pulp3/pulpcore $ pulp status  | jq .versions
Notice: Cached api is outdated. Refreshing...
[
  {
    "component": "core",
    "version": "3.75.2",
    "package": "pulpcore",
    "module": "pulpcore.app",
    "domain_compatible": true
  },
  {
    "component": "rpm",
    "version": "3.30.0.dev",
    "package": "pulp-rpm",
    "module": "pulp_rpm.app",
    "domain_compatible": true
  },
  {
    "component": "python",
    "version": "3.14.0",
    "package": "pulp-python",
    "module": "pulp_python.app",
    "domain_compatible": true
  },
  {
    "component": "certguard",
    "version": "3.75.2",
    "package": "pulpcore",
    "module": "pulp_certguard.app",
    "domain_compatible": true
  },
  {
    "component": "file",
    "version": "3.75.2",
    "package": "pulpcore",
    "module": "pulp_file.app",
    "domain_compatible": true
  }
]
(oci-env) ~/github/Pulp3/pulpcore $ pulp python remote create --name bar --url https://pypi.org/ --includes '["shelf-reader"]'
{
  "pulp_href": "/pulp/api/v3/remotes/python/python/01963f80-df30-750e-adcd-49f287846258/",
  "prn": "prn:python.pythonremote:01963f80-df30-750e-adcd-49f287846258",
  "pulp_created": "2025-04-16T16:48:52.529334Z",
  "pulp_last_updated": "2025-04-16T16:48:52.529346Z",
  "name": "bar",
  "url": "https://pypi.org/",
  "ca_cert": null,
  "client_cert": null,
  "tls_validation": true,
  "proxy_url": null,
  "pulp_labels": {},
  "download_concurrency": null,
  "max_retries": null,
  "policy": "on_demand",
  "total_timeout": null,
  "connect_timeout": null,
  "sock_connect_timeout": null,
  "sock_read_timeout": null,
  "headers": null,
  "rate_limit": null,
  "hidden_fields": [
    {
      "name": "client_key",
      "is_set": false
    },
    {
      "name": "proxy_username",
      "is_set": false
    },
    {
      "name": "proxy_password",
      "is_set": false
    },
    {
      "name": "username",
      "is_set": false
    },
    {
      "name": "password",
      "is_set": false
    }
  ],
  "includes": [
    "shelf-reader"
  ],
  "excludes": [],
  "prereleases": true,
  "package_types": [],
  "keep_latest_packages": 0,
  "exclude_platforms": []
}

(oci-env) ~/github/Pulp3/pulpcore $ pulp python repository create --name foo --remote bar
{
  "pulp_href": "/pulp/api/v3/repositories/python/python/01963f81-7319-705b-b70c-5efa0197e4ac/",
  "prn": "prn:python.pythonrepository:01963f81-7319-705b-b70c-5efa0197e4ac",
  "pulp_created": "2025-04-16T16:49:30.395057Z",
  "pulp_last_updated": "2025-04-16T16:49:30.402942Z",
  "versions_href": "/pulp/api/v3/repositories/python/python/01963f81-7319-705b-b70c-5efa0197e4ac/versions/",
  "pulp_labels": {},
  "latest_version_href": "/pulp/api/v3/repositories/python/python/01963f81-7319-705b-b70c-5efa0197e4ac/versions/0/",
  "name": "foo",
  "description": null,
  "retain_repo_versions": null,
  "remote": "/pulp/api/v3/remotes/python/python/01963f80-df30-750e-adcd-49f287846258/",
  "autopublish": false
}
(oci-env) ~/github/Pulp3/pulpcore $ pulp python repository sync --name foo
Started background task /pulp/api/v3/tasks/01963f82-16d0-758e-8c1e-ad2d167bcda4/
.Done.
(oci-env) ~/github/Pulp3/pulpcore $ pulp export pulp run --exporter foo --chunk-size 1GB
Started background task /pulp/api/v3/tasks/01963f85-ef2a-7ddf-ba17-9010afad0dfc/
Done.
{
  "pulp_href": "/pulp/api/v3/exporters/core/pulp/01963f83-9e30-7cf2-b68c-d820254a1e23/exports/01963f85-ef80-7265-ba16-30672afa58d4/",
  "prn": "prn:core.pulpexport:01963f85-ef80-7265-ba16-30672afa58d4",
  "pulp_created": "2025-04-16T16:54:24.385342Z",
  "pulp_last_updated": "2025-04-16T16:54:24.481777Z",
  "task": "/pulp/api/v3/tasks/01963f85-ef2a-7ddf-ba17-9010afad0dfc/",
  "exported_resources": [
    "/pulp/api/v3/repositories/python/python/01963f81-7319-705b-b70c-5efa0197e4ac/versions/1/"
  ],
  "params": {
    "full": true,
    "chunk_size": "1GB"
  },
  "output_file_info": {
    "/src/exports/export-01963f85-ef80-7265-ba16-30672afa58d4-20250416_1654-toc.json": "399a92afbd60d51f5d634449a60d27c486af8317115eef8a04e60b28c00a60db",
    "/src/exports/export-01963f85-ef80-7265-ba16-30672afa58d4-20250416_1654.tar.0000": "ce8348c8"
  },
  "toc_info": {
    "file": "/src/exports/export-01963f85-ef80-7265-ba16-30672afa58d4-20250416_1654-toc.json",
    "sha256": "399a92afbd60d51f5d634449a60d27c486af8317115eef8a04e60b28c00a60db"
  }
}

(oci-env) ~/github/Pulp3/exports $ cat export-01963f85-ef80-7265-ba16-30672afa58d4-20250416_1654-toc.json | jq
{
  "meta": {
    "checksum_type": "crc32",
    "chunk_size": 1073741824
  },
  "files": {
    "export-01963f85-ef80-7265-ba16-30672afa58d4-20250416_1654.tar.0000": "ce8348c8"
  }
}
(oci-env) ~/github/Pulp3/exports $ 

I notice the following in your --chunk-size 50MB task-output above:

"output_file_info": {
  "/tmp/PyPI/export-01963ee6-ac09-7955-904b-d543b534f44f-20250416_1400-toc.json": "f2e1713b12f455e213221a282ccad47a7af79abf90476ab558a4743da2fe208e"
},

Note the difference in the “working/no-chunk-size” task output:

"output_file_info": {
  "/tmp/PyPI/export-01963ee2-d71e-7a78-81df-0379d151c8fe-20250416_1356.tar": "e5ec2a55",
  "/tmp/PyPI/export-01963ee2-d71e-7a78-81df-0379d151c8fe-20250416_1356-toc.json": "a293aeed28859bbad888cdccc557a9330bcf8c4b52b176d7f3509ca83e90a82a"
},

It looks like the task doesn’t know the chunks were created - the TOC is correct, given where the task thinks we ended up. If you look in your Pulp logs from the minute around 20250416_1400, do you see any errors from the task? Also, how big is your /tmp filesystem? Could you be running out of space for the .tar pieces, while having enough for the .json?

(Note that I still have no explanation for “something clearly went wrong, but the task says it completed successfully” - that shouldn’t be possible. But it’s clear that the task thinks no files were generated)
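
If you want to rule out the disk-space theory quickly, a small Python check along these lines would do. This is only a sketch: has_room is a made-up helper name, and the 290M figure is taken from the ls output earlier in the thread.

```python
import shutil


def has_room(path, needed_bytes):
    """True if the filesystem holding `path` has at least `needed_bytes` free."""
    return shutil.disk_usage(path).free >= needed_bytes


if __name__ == "__main__":
    # Roughly the size of the unchunked export seen earlier (290M).
    print(has_room("/tmp", 290 * 1024 * 1024))
```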

Thanks a lot for your help. There was an issue with our scripts.

Since we are running a Docker multi-container setup, the directory where the export files are generated inside the container (/tmp/PyPI) is mapped to a host machine directory (export/PyPI). This location needs to be accessible to the user running inside the Pulp Docker container, which requires chmod o+r on the host mount directory. Before we raised the issue with you, the directory was missing read permissions.

I have not observed any related errors in the Pulp logs regarding this issue though.
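
In case it helps others with a similar bind-mount setup, this is roughly the pre-flight check we could have run on the host. It is a sketch only: others_can_read is a made-up name, and it assumes the container user falls under the "other" permission class on the host directory, which is what chmod o+r addresses.

```python
import os
import stat


def others_can_read(path):
    """True if the 'other' read bit (the one chmod o+r sets) is on for `path`."""
    return bool(os.stat(path).st_mode & stat.S_IROTH)
```

Calling others_can_read("export/PyPI") on the host before running the export would have flagged the missing permission in our case.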


When chunking, we split the tarfile as it gets created. I wonder if the popen() is failing in this instance in a way we don’t notice (i.e. it doesn’t throw an Exception), so that the loop that creates the files stanza finds no files.

Definitely something to investigate and fix… eventually. And by “fix” I mean “fail the task with some useful error message” :)
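
To make "fail the task with a useful error" concrete, here is a rough Python sketch of the idea. To be clear, this is not pulpcore's actual code: split_or_fail and its parameters are invented for illustration. It pipes data into a splitter subprocess, then raises if the process exits non-zero or if no chunk files show up afterwards.

```python
import subprocess
from pathlib import Path


def split_or_fail(cmd, data, out_dir, pattern):
    """Pipe `data` into a splitter command; raise if it fails or writes no chunks."""
    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE, cwd=out_dir)
    proc.communicate(data)  # writes to stdin, closes it, and waits for exit
    if proc.returncode != 0:
        raise RuntimeError(f"splitter exited with status {proc.returncode}")
    chunks = sorted(Path(out_dir).glob(pattern))
    if not chunks:
        raise RuntimeError(f"splitter wrote no files matching {pattern!r} in {out_dir}")
    return chunks
```

With GNU split, a call might look like split_or_fail(["split", "-d", "-a", "4", "-b", "50M", "-", "export.tar."], data, "/tmp/PyPI", "export.tar.*"). The second check matters for a case like this thread's, where the command may appear to succeed while the task still cannot see any chunks.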

For the moment, I’m glad the discussion helped us figure out the cause, and that you’re making progress!
