Is a database backup enough to restore a Pulp server?

Hello, here is the context:
I used the all-in-one image to set up a Pulp server. Under /opt/pulp it has the directories containers, pgsql, pulp_storage, and settings:

[xxx@xxx]# ls /opt/pulp/
drwxr-xr-x 2 xxx xxx   6 Feb  6 17:38 containers
drwxr-x--- 3  26  26  39 Feb  8 10:19 pgsql
drwxr-xr-x 6 700 700  59 Feb  8 10:10 pulp_storage
drwxr-xr-x 3 xxx xxx 332 Mar 10 17:41 settings

For the restore plan:
I would rather not back up pulp_storage, because it mainly holds RPM repos synchronized from remotes.
If I back up only containers, pgsql, and settings, is it possible to restore this Pulp server from that backup after a failure (e.g., a disk failure or an OS crash)?

Thanks in advance

Hello!

The short answer is no. Restoring the Pulp server from only a partial backup, as you describe, is problematic.

If you do not back up pulp_storage, which holds the synchronized RPM packages, you risk losing content data. You mentioned that these repositories are mainly synchronized from remotes, but there could still be locally uploaded content or metadata that cannot be recovered without pulp_storage. Am I right? If so, a straightforward restore would not be possible without it.

However, there are scenarios where a partial backup could suffice. Suppose you synchronized all the repositories with the on-demand policy. If the RPM plugin does not store any content data for the respective packages on the storage and downloads them only when needed (it would be interesting to hear this from the plugin developers themselves), you should be fine as long as the reference to the remote server is still in place. After the restore, the server would resemble a repository whose remote artifacts point to the remote source and on which you ran the reclaim space task to free up disk space.
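
To see which remotes fall into the on-demand scenario, one option is to look at the `policy` field each remote reports. The following is only a minimal sketch: it assumes the remote list was fetched from the RPM remotes endpoint (`/pulp/api/v3/remotes/rpm/rpm/`) and has the usual paginated shape with a `results` list. The helper name `split_by_policy` and the sample data are made up for illustration.

```python
from collections import defaultdict


def split_by_policy(remote_list):
    """Group remote names by their download policy.

    `remote_list` is assumed to be the parsed JSON of a remote list
    response: a dict with a "results" list whose items carry "name"
    and "policy" keys.
    """
    groups = defaultdict(list)
    for remote in remote_list.get("results", []):
        groups[remote.get("policy", "unknown")].append(remote["name"])
    return dict(groups)


# Hypothetical sample data standing in for a real API response.
sample = {
    "results": [
        {"name": "baseos", "policy": "on_demand"},
        {"name": "appstream", "policy": "immediate"},
        {"name": "epel", "policy": "on_demand"},
    ]
}

if __name__ == "__main__":
    for policy, names in split_by_policy(sample).items():
        print(policy, names)
```

Remotes that show up under `immediate` had their packages downloaded at sync time, so those are the ones whose files would need to be fetched again after a restore without pulp_storage.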

You may also want to consider using the import/export workflow as part of a backup plan. This is not what we usually advise, but it should work out of the box. Note that an export bundles part of pulp_storage into a single archive.
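
As a rough illustration of the export side of that workflow, the sketch below only builds the request payloads and does not talk to a server. It assumes the Pulp exporter REST endpoints (`/pulp/api/v3/exporters/core/pulp/` to create an exporter, then `exports/` under the returned exporter href to run it) and that the export directory is permitted by the server's `ALLOWED_EXPORT_PATHS` setting. The function names, exporter name, directory, and hrefs are hypothetical.

```python
EXPORTER_ENDPOINT = "/pulp/api/v3/exporters/core/pulp/"


def exporter_payload(name, export_dir, repository_hrefs):
    """Body for creating an exporter that covers the given repositories."""
    return {
        "name": name,
        "path": export_dir,
        "repositories": list(repository_hrefs),
    }


def export_url(exporter_href):
    """URL to POST to in order to start an export for an existing exporter."""
    return exporter_href.rstrip("/") + "/exports/"


if __name__ == "__main__":
    # Placeholder values; real hrefs come from the API responses.
    payload = exporter_payload(
        "nightly-backup",
        "/var/lib/pulp/exports/nightly",
        ["/pulp/api/v3/repositories/rpm/rpm/<repo-uuid>/"],
    )
    print("POST", EXPORTER_ENDPOINT, payload)
    print("POST", export_url(EXPORTER_ENDPOINT + "<exporter-uuid>/"))
```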

For further insights and discussions on disaster recovery strategies with Pulp, you might find this thread helpful: Proof of Concept (PoC) for the Pulp project and its implementation in a high availability (HA) setup with a focus on disaster recovery (DR).

Feel free to ask if you have any more questions or need further clarification.

If all of your content is synced RPMs and you don’t upload RPM packages into Pulp, then you will be able to recover from a backup of the database plus a backup of your database encryption key (/etc/pulp/certs).

The procedure for recovery would be the following:

  1. Restore the database
  2. Restore /etc/pulp
  3. Start Pulp
  4. Run the repository repair task to re-download all the missing files [0].

[0] Repair Pulp - Pulp Project
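
Step 4 can be triggered through the REST API. The following is a minimal sketch, assuming the global repair endpoint `/pulp/api/v3/repair/` with a `verify_checksums` option and HTTP basic authentication; the host, credentials, and the helper name `build_repair_request` are placeholders. It builds the request without sending it, so nothing runs against a server until you uncomment the last lines.

```python
import base64
import json
import urllib.request


def build_repair_request(base_url, username, password, verify_checksums=True):
    """Build (but do not send) a POST request to the global repair endpoint."""
    url = base_url.rstrip("/") + "/pulp/api/v3/repair/"
    body = json.dumps({"verify_checksums": verify_checksums}).encode("utf-8")
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


if __name__ == "__main__":
    # Placeholder host and credentials; replace with your own.
    req = build_repair_request("http://localhost:8080", "admin", "password")
    print(req.get_method(), req.full_url)
    # To actually dispatch the repair task, uncomment:
    # with urllib.request.urlopen(req) as resp:
    #     print(json.load(resp))  # expected to reference the spawned task
```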

Don’t forget about the database encryption key.

In our use case we have mixed content, roughly 90% synced content and 10% our own rebuilt RPMs, so it looks like it is better for us to back up everything.

Thanks for all the information :pray:
