Proof of Concept (PoC) for the Pulp project and its implementation in a high availability (HA) setup with a focus on disaster recovery (DR)

akhilreddynalla · July 18, 2023, 3:08pm

I am writing to discuss the Proof of Concept (PoC) for the Pulp project and its implementation in a high availability (HA) setup with a focus on disaster recovery (DR).

We have been evaluating the Pulp project and its potential benefits for our organization. Our goal is to ensure seamless availability of content repositories while also having a robust disaster recovery plan in place. We believe that the Pulp project can greatly contribute to achieving these objectives.

During our PoC, we would like to explore the following aspects:

High Availability (HA): We aim to implement Pulp in an HA configuration to ensure continuous access to content repositories in the event of hardware failures, network issues, or other similar disruptions. We would like to assess the scalability and performance of the HA setup.
Disaster Recovery (DR): We would like to evaluate the resilience of Pulp in the face of catastrophic events, such as data center failures or natural disasters. Our goal is to have a well-defined DR strategy that includes data backup, replication, and recovery procedures.

We are particularly interested in understanding the best practices, recommended configurations, and any specific considerations for deploying Pulp in an HA environment with robust DR capabilities. Additionally, we would appreciate any guidance or resources you can provide to help us set up the PoC effectively.

We highly value your expertise and would greatly appreciate your support throughout this process. If there are any specific requirements or technical details you would like us to provide, please let us know, and we will be more than happy to assist.

Thank you for your attention to this matter. We look forward to your guidance and collaboration in exploring the possibilities of implementing Pulp with high availability and disaster recovery.

lubosmj · July 19, 2023, 1:57pm

I am copying here what was mentioned in the Matrix channel.

@x9c4:

Let me start by pointing you at some interesting documents. There is a blog post about the general idea of HA in Pulp, "Pulp & High Availability | software repository management", and here is the architecture diagram showing which components in Pulp can be scaled: "Architecture — Pulp Project 3.29.1 documentation".
The general idea is: every pulp* component is scalable. For the reverse proxy, the PostgreSQL database, and the Redis cache, you'd need to find their corresponding clustered installations.
As for disaster recovery, I can only say: all data is either in the database or in the object storage. If you have a combined backup of both, plus the database-level encryption key, you can restore everything. (I don't think we have a walkthrough doc yet.)
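
A minimal sketch of what such a combined backup could look like, assuming filesystem storage rather than an object store. The database name, the media path, the key path, and the backup target are all assumptions for illustration, so check them against your own settings. The script only builds and prints the commands; it does not run them.

```python
import shlex
from dataclasses import dataclass
from pathlib import Path


@dataclass
class BackupPlan:
    """The three pieces named above: database, stored content, encryption key."""

    db_name: str = "pulp"  # assumption: name of the PostgreSQL database
    media_root: Path = Path("/var/lib/pulp/media")  # assumption: filesystem storage path
    key_file: Path = Path("/etc/pulp/certs/database_fields.symmetric.key")  # assumption: key path
    dest: Path = Path("/backup/pulp")  # hypothetical backup target

    def commands(self, stamp: str) -> list[list[str]]:
        """Return the backup commands for one run, labelled by `stamp`."""
        target = self.dest / stamp
        return [
            ["mkdir", "-p", str(target)],
            # Dump the database in PostgreSQL's custom format (restorable with pg_restore).
            ["pg_dump", "--format=custom", f"--file={target / 'pulp.dump'}", self.db_name],
            # Archive the stored content; with an object store, use its own tooling instead.
            ["tar", "-czf", str(target / "media.tar.gz"), "-C", str(self.media_root), "."],
            # Without the key, encrypted database fields cannot be read after a restore.
            ["cp", "-p", str(self.key_file), str(target / self.key_file.name)],
        ]


def render(plan: BackupPlan, stamp: str) -> str:
    """Render the plan as shell lines for review before anything is executed."""
    return "\n".join(shlex.join(cmd) for cmd in plan.commands(stamp))


if __name__ == "__main__":
    print(render(BackupPlan(), "2023-07-19"))
```

The database dump and the content archive belong to the same point in time, so it is probably safest to take both while no tasks are writing to Pulp.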

The only other (third) place that holds knowledge shared across the services is the settings. But you'll probably have a representation of them in your deployment automation.
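
As a rough illustration, a shared settings file rendered identically onto every Pulp node might look like the sketch below. All hostnames and the password are hypothetical placeholders, and the setting names should be verified against the settings documentation for your pulpcore version.

```python
# Sketch of a settings.py that every API, content, and worker node would share.
# All hostnames and credentials below are hypothetical placeholders.

# Public URL of the load balancer in front of the content apps.
CONTENT_ORIGIN = "https://pulp.example.com"

# Every node points at the same (clustered) PostgreSQL endpoint.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "pulp",
        "USER": "pulp",
        "PASSWORD": "change-me",
        "HOST": "pg-cluster.example.com",
        "PORT": "5432",
    }
}

# Every node points at the same (clustered) Redis endpoint for caching.
CACHE_ENABLED = True
REDIS_HOST = "redis-cluster.example.com"
REDIS_PORT = 6379

# The same key file must be present on every node and in every backup.
DB_ENCRYPTION_KEY = "/etc/pulp/certs/database_fields.symmetric.key"

# Shared storage: a path on a shared filesystem here; an object store is the alternative.
MEDIA_ROOT = "/var/lib/pulp/media"
```

Keeping this file in the deployment automation means a restored site can regenerate it rather than recover it from a backup.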

Also have a look at "Getting Help / Getting Involved | software repository management", and maybe this thread: "High Avalability of pulp - #4 by x9c4".