Proposal to focus on single container and improve docs for installers

mikedep333 · July 13, 2022, 5:23pm

Hi Everyone,

The following is a comprehensive proposal for what the pulp installers subteam will focus on over the next few months, and generally from here on out. It is designed to address the following dilemmas:

Containers are what most non-ansible-expert users want.
There was a lot of constructive criticism of pulp_installer at PulpCon 2021.
Most of that feedback is addressable via docs updates and some via engineering changes, but some cannot feasibly be addressed.

The proposal at a high level

The single container will be recommended as the default way to install pulp.
pulp_installer’s “pip” install mode will remain, but will not be the focus of engineering. pulp_installer’s primary use will be “packages” install mode, whether for use by Katello/Galaxy, or by Pulp users.
The pulp_installer docs will clearly explain how to do an “installation” (pulp_services meta role) vs an “orchestration” (pulp_all_services meta role).
The single container will receive a large docs revamp, explaining what we are doing, not just the exact commands to run. See this as an example.
pulp_installer will receive a medium-size docs revamp.
The pulp manual installation docs will receive a medium-size revamp, particularly to cover missing areas and webserver configuration.
Some engineering changes will be made to the installer, focused on usability.
There will be an emphasis on solving missing features in the single container from here on out, rather than missing features in pulp_installer.

The following engineering changes will be made for the installer:

Stop using duplicate variables in the installer. For example, instead of requiring users to set both “pulp_cache_dir” and “pulp_settings.working_directory”, pulp_installer will only recognize and customize the install based on “pulp_settings.working_directory”.
Systemd overrides will be used instead of jinja2 templating the systemd unit files (for the pulp services). This allows for easier customization after installation, and gives users a reference for what a systemd unit file should look like for manual installations.
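
To illustrate the systemd override idea, here is a minimal sketch of what generating a drop-in could look like. systemd reads drop-in files from a `<unit>.d/` directory next to the unit (for example under /etc/systemd/system). The unit name "pulpcore-api.service", the PULP_SETTINGS value, and the helper names below are assumptions for illustration only, not the installer's actual implementation.

```python
from pathlib import PurePosixPath


def dropin_path(unit, name="override.conf", root="/etc/systemd/system"):
    """Return the drop-in location systemd reads for `unit`: <root>/<unit>.d/<name>."""
    return PurePosixPath(root) / f"{unit}.d" / name


def render_dropin(environment):
    """Render a [Service] drop-in that adds Environment= lines to a unit.

    The packaged unit file stays untouched; only this small override changes.
    """
    lines = ["[Service]"]
    for key, value in sorted(environment.items()):
        lines.append(f'Environment="{key}={value}"')
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical unit name and setting, chosen only for the example.
    print(dropin_path("pulpcore-api.service"))
    print(render_dropin({"PULP_SETTINGS": "/etc/pulp/settings.py"}))
```

With this approach a user who wants to customize a service after installation edits (or adds) a small override file rather than a templated unit file that the installer would regenerate.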

The following changes will be made to the installer docs:

The pulp_installer docs will clearly explain how to do an “installation” (pulp_services meta role) vs an “orchestration” (pulp_all_services meta role). When doing an “installation”, the postgresql database must be accessible, and configured via variables passed to the installer (the installer assumes “localhost” by default). The installer will make some changes to the system beyond simple file installation: the pulp services will be started/restarted, “collect static” commands will be run, some OS components like sudo will be configured, and the pulp_health_check role will be run. The webserver will be configured either by the installer (including being started/restarted) or by the user, according to config files provided in the manual installation docs.
We will better advertise the new ability to set settings in /etc/pulp/settings.local.py, which the installer will not override.
When it comes to variables, we will clearly identify which variables must be specified by the installer during the initial install, and cannot be specified in /etc/pulp/settings.local.py after installation. An example is “pulp_settings.working_directory”, which is a folder path that the installer will not move.
When it comes to variables, we will clearly identify the variables that must be specified in order for the installer to integrate with existing database/redis/object storage services.
We will document the supported versions of non-orchestrated redis, postgres and webservers.
We will provide an architectural overview of pulp, including adding diagrams to pages.
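
As a rough illustration of the distinction between install-time variables and settings that can be changed later in /etc/pulp/settings.local.py, the sketch below layers local overrides on top of installer-written settings and reports any keys that can only be set at install time. This is not how Pulp or the installer enforces anything; the function name, the set of install-time-only keys, and the example values are all hypothetical.

```python
# Hypothetical classification; the real list would come from the installer docs.
INSTALL_TIME_ONLY = {"working_directory"}


def apply_local_overrides(installed, local):
    """Layer local overrides on top of the settings the installer wrote.

    Returns (effective_settings, rejected_keys). A key is rejected when it is
    install-time-only and the local file tries to change its value, because
    the installer will not move an existing directory after installation.
    """
    effective = dict(installed)
    rejected = []
    for key, value in local.items():
        if key in INSTALL_TIME_ONLY and installed.get(key) != value:
            rejected.append(key)
            continue
        effective[key] = value
    return effective, sorted(rejected)


if __name__ == "__main__":
    # Example values are made up for illustration.
    installed = {"working_directory": "/var/lib/pulp/tmp", "content_origin": "https://pulp.example.com"}
    local = {"working_directory": "/data/tmp", "content_origin": "https://mirror.example.com"}
    print(apply_local_overrides(installed, local))
```

A table in the docs listing each variable with a "can be changed after install" column would convey the same information to users.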

On a design note, I want to discuss the topic of “installation vs orchestration” further. The original name for pulp_installer was “ansible-pulp”. That may be an appropriate name to re-adopt. However, other than naming, I looked at every single task under the pulp_services meta role. pulp_services needs the database connection to be present, but it generally doesn’t orchestrate the system any further than doing things like starting/restarting the service, and configuring some parts of the OS like sudo. It seems like pulp_services performs the appropriate, limited behavior for an “installer.”

Also, generally an “installer” will include components at specific versions. This is behavior that could be implemented by doing something like:

run a CI job every night to install all the compatible plugins and generate a distribution
test that this new distribution works
save the distribution as a “pip freeze” list
provide the “pip freeze” list in the installer
When users run the installer in the new “pip freeze” mode, always install all the dependencies but do not install the plugins unless requested. (This is because once a plugin is installed, it cannot be uninstalled, even when you upgrade pulpcore to a new version that the plugin may not be compatible with.)
Constantly re-release the installer with a new “pip freeze” list.
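
To make the “pip freeze” mode more concrete, here is a minimal sketch of the selection step: given a frozen list, keep every dependency pin but keep plugin pins only when the user requested them. The rule used to recognize a plugin (a project named "pulp-<something>", with pulpcore excluded) is an assumption for illustration and would misclassify other projects with that prefix; a real implementation would need an explicit plugin list. The function names and version numbers are hypothetical.

```python
import re


def normalize(name):
    """Normalize a project name the way PEP 503 does: lowercase, runs of -_. become -."""
    return re.sub(r"[-_.]+", "-", name).lower()


def is_plugin(name):
    """ASSUMPTION: plugins are projects named pulp-<something>; pulpcore is not a plugin."""
    return normalize(name).startswith("pulp-")


def select_pins(freeze_lines, requested_plugins=()):
    """Keep all dependency pins; keep plugin pins only if explicitly requested."""
    requested = {normalize(p) for p in requested_plugins}
    selected = []
    for raw in freeze_lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        name = line.split("==", 1)[0]
        if is_plugin(name) and normalize(name) not in requested:
            continue  # plugin present in the distribution but not requested
        selected.append(line)
    return selected


if __name__ == "__main__":
    # Made-up versions, for illustration only.
    freeze = ["pulpcore==3.20.0", "pulp-rpm==3.17.5", "pulp_file==1.10.2", "Django==3.2.14"]
    print(select_pins(freeze, requested_plugins=["pulp_rpm"]))
```

The selected pins could then be written to a requirements file and passed to `pip install -r`, so that a plugin is never installed implicitly.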

However, this is a lot of work to do for an installation mechanism that users prefer containers over, and which galaxy_ng and katello have no use for. They provide their own RPMs, and test the distribution for their own needs. Therefore, we are strongly leaning against doing something like this.

markgoddard · August 26, 2022, 11:31am

It’s interesting to see this change of direction, given that pulp in one container is marked as not for production in the docs. Presumably sufficient improvements will be made to make it ready for production.

Where do the podman compose and k8s operator methods fit in here?

bmbouter · August 26, 2022, 2:00pm

Yes, the single container is receiving improvements, and I believe it primarily needs documentation and testing more than specific gaps filled. Feedback on how to make it great is valuable.

I’m not closely involved in the development of these components, but here’s my take. I expect the Pulp project will see growth in usage of the podman compose, k8s operator, and single container, along with DIY Python based installations. The installer will likely continue to be used by those who started their installs with it, but probably won’t see much additional user adoption. The k8s operator will likely be heavily used by more enterprise environments because that’s more or less a precursor for having a k8s environment at all.

I’m hoping those who do work on those components can chime in here also.

davidd · August 29, 2022, 2:44pm

We are using containers to deploy Pulp, so I’m glad to see this shift. We ended up writing our own Containerfiles based on the ones in pulp-operator. Those use ansible, though, which was a bit of a nuisance to reverse engineer. It would be nicer if Pulp shipped official Containerfiles.

bmbouter · September 7, 2022, 4:15pm

@davidd How would you feel about sharing those? A full PR would be great, but I’ll take whatever you have time for (if this is appropriate in your opinion). I’d like to see us publish these into a section of the docs.

bmbouter · September 14, 2022, 6:53pm

I created this issue for us to add the dockerfile to the pulpcore docs. Feedback on how to do this in a way that maximizes usability for users would be great.

hyagi · September 16, 2022, 12:39pm

“We ended up writing our own Containerfiles based on the ones in pulp-operator.”

@davidd In addition to bmbouter’s comments, can you please share why you decided to use your own Containerfiles instead of deploying Pulp through pulp-operator, even though pulp-operator can handle the entire lifecycle of Pulp?
This feedback can help us improve pulp-operator.