Delete content units that haven't been requested for a certain amount of time

Problem:
I need to retain deb packages until they have gone unrequested for a certain amount of time, but as far as I can tell Pulp doesn’t have this functionality. If the last-requested timestamp of each artifact were available, something could probably be built on top of it, but that information doesn’t seem to be exposed. Are there any alternatives for what I want to do?

Expected outcome:
Have a way to clean up content units that haven’t been requested for a specified amount of time.

Pulpcore version:
3.110.1

Pulp plugins installed and their versions:
pulp-ansible 0.29.8
pulp-certguard 1.8.0
pulp-container 2.27.8
pulp-deb 3.8.1
pulp-file 1.16.0
pulp-glue 0.39.1
pulp-hugging-face 0.3.0
pulp-maven 0.12.0
pulp-npm 0.7.1
pulp-ostree 2.6.0
pulp-python 3.29.0
pulp-rpm 3.36.0

Operating system - distribution and version:
Ubuntu 22.04.2 LTS

Other relevant data:

So, you want to keep published repositories around, but you want artifacts from those repositories to be deleted if they have not been requested in some specified amount of time?
Do I understand you correctly?

If so, the obvious question is: what should happen if an artifact is requested after it has been deleted? Should it be re-downloaded? In that case, it sounds like you want the ability to turn an already downloaded artifact back into an on_demand artifact.

The closest thing that currently exists is streamed sync mode, which is different in that it always downloads artifacts from the upstream source whenever they are requested.

The usual workflow for getting rid of old content is to keep re-syncing and serving new states, then delete old repo versions and run orphan cleanup. Any packages that are no longer being served and are no longer part of any repo version are “orphaned” and will be cleaned up.
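As a rough illustration of that workflow, here is a minimal Python sketch that only plans the REST calls and sends nothing. It assumes repo versions are represented the way the REST API lists them (`number`, `pulp_created`, `pulp_href`), and that orphan cleanup is a POST to `/pulp/api/v3/orphans/cleanup/` taking `orphan_protection_time` in minutes. Both should be checked against the API docs for your installed version. The helper names and the `keep_latest` / `older_than_days` knobs are made up for this example.

```python
from datetime import datetime, timedelta, timezone

# Assumed endpoint path; verify it against your Pulp instance's API docs.
ORPHAN_CLEANUP_PATH = "/pulp/api/v3/orphans/cleanup/"


def parse_timestamp(value):
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z' for UTC."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def versions_to_delete(versions, keep_latest, older_than_days, now):
    """Return hrefs of repo versions that are old and not among the newest."""
    cutoff = now - timedelta(days=older_than_days)
    newest_first = sorted(versions, key=lambda v: v["number"], reverse=True)
    return [
        v["pulp_href"]
        for v in newest_first[keep_latest:]  # always spare the newest ones
        if parse_timestamp(v["pulp_created"]) < cutoff
    ]


def build_cleanup_plan(versions, keep_latest=1, older_than_days=30,
                       orphan_protection_minutes=0, now=None):
    """Build (method, path, body) tuples: version deletes, then orphan cleanup."""
    now = now or datetime.now(timezone.utc)
    hrefs = versions_to_delete(versions, keep_latest, older_than_days, now)
    plan = [("DELETE", href, None) for href in hrefs]
    # Orphan cleanup goes last, so content freed by the deletes can be removed.
    plan.append(
        ("POST", ORPHAN_CLEANUP_PATH,
         {"orphan_protection_time": orphan_protection_minutes})
    )
    return plan
```

Each tuple in the returned plan could then be sent with whatever HTTP client you already use. Note that deleting a repo version is an asynchronous task in Pulp, so a real script would wait for those tasks to finish before triggering orphan cleanup.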

Does any of that sound like something you can work with?
