RFC: Pulp 101 webpage

A number of users reported that they found it very hard to understand Pulp, but then once they got in, it was easy to use.

I’ve been trying to consolidate aspects of feedback from the community survey into a walkthrough that covers the key concepts and components related to Pulp.

I put this together with the hope that someone could read the page end to end, and have enough in mind to make a real stab at getting started with Pulp.

I don’t mean for this to replace the docs in any way.
I’m just hoping that users might gain enough to then turn to our docs for specifics.

https://melcorr.github.io/pulpproject.org/pulp-workflow-overview/

I had hoped to add more architectural examples. Syncing from one Pulp server to another, that kind of thing. I would still like to do this, but have to figure out how to get there.

Anyways, I consider this draft one. I haven’t even raised a PR as perhaps it’s not even at the PR stage.

Let me know your thoughts and how I can improve it.

Your draft is better than my “production-ready” docs!
I really liked it, thank you for putting this together!

I think we need a doc like this (very close to this) and I’m so glad you’ve put this draft out. I wasn’t sure of the best venue to give feedback, so I’ll leave some thoughts here. Feel free to incorporate or disregard; it’s really just a bunch of little ideas.

I often use the idea of a “toolbox” when I talk about Pulp 3, because it has all these parts and a lot of workflows can be built from them. Perhaps that’s a helpful analogy to present up front?

The order of introducing Remotes first, then repositories, then distributions is the right order to me.

I don’t think the idea of plugins is very helpful at an intro level. One thing alternatives to Pulp do really well is present their software as one thing with everything ready to go out of the box. Pulp’s plugin model is an advantage for developers and users alike, but practically speaking, plugins are more about how Pulp is built than something an end user would think about. To me, this doc is about what Pulp does; only once I’ve decided I want to use it do plugins become important.

---- this is all one idea for a new section -----

It might be helpful to call out a use case and then use the language you’ve introduced (remote, repo, sync, distribute, etc.) to explain a workflow for that specific use case. For example, here are a few use cases that come to mind:

  • PyPI goes down sometimes, so I want an on-premise PyPI to have our servers use that.
  • Docker is rate limiting now, so I want an on-premise registry with a pull-through cache capability to solve that.
  • I mirror the CentOS and Amazon Linux repos nightly and I want an easy way to store nightly versions of these repos so I can roll back in case a new package breaks my servers.
  • We use Ansible and we have some Ansible Collections and Roles that have hostnames and usernames that we don’t want to publish on galaxy.ansible.com.

Anyway, the idea with ^ is to put out an undisputed, common problem and easily show how the toolbox solves it. Each ^ would come with steps, probably numbered, e.g. for the PyPI mirror:

  1. make a remote to sync all of PyPI
  2. make a repository
  3. configure it to sync nightly
  4. configure pip on my servers to pull from Pulp instead of PyPI.
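
To make the four steps above a bit more tangible, here is a rough sketch in plain Python. It is a toy in-memory model, not Pulp’s API: the `Remote` and `Repository` classes, the `fetch` callback, the host `pulp.example.com`, and the package names are all hypothetical, and the `/pypi/<base_path>/simple/` URL layout is an assumption to check against the pulp_python docs. The nightly scheduling itself (e.g. cron) is left out.

```python
from dataclasses import dataclass, field


@dataclass
class Remote:
    """Where content comes from (hypothetical stand-in for a Pulp remote)."""
    name: str
    url: str


@dataclass
class Repository:
    """Holds content as a list of immutable versions; version 0 is empty."""
    name: str
    versions: list = field(default_factory=lambda: [frozenset()])

    def sync(self, remote, fetch):
        """Record a new version only if the upstream content changed."""
        snapshot = frozenset(fetch(remote.url))
        if snapshot != self.versions[-1]:
            self.versions.append(snapshot)
        return len(self.versions) - 1  # number of the latest version


def pip_conf(pulp_host, base_path):
    """Render a pip.conf pointing clients at the mirror (URL layout is assumed)."""
    return f"[global]\nindex-url = https://{pulp_host}/pypi/{base_path}/simple/\n"


# Illustrative upstream content; a real remote would talk to PyPI.
upstream = {"https://pypi.org/": ["requests", "flask"]}

remote = Remote("pypi", "https://pypi.org/")                # step 1
repo = Repository("pypi-mirror")                            # step 2
first = repo.sync(remote, lambda url: upstream[url])        # step 3, night one
unchanged = repo.sync(remote, lambda url: upstream[url])    # night two, nothing new
print(pip_conf("pulp.example.com", "pypi-mirror"))          # step 4
```

Because every sync that changes something adds a version rather than overwriting the last one, the same model also covers the nightly rollback use case: earlier versions stay available.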

----- end of new idea ---------

For the CI/CD section, that’s kind of like one of the ^ use cases in a way. I think we need to make it more concrete by using a specific CI/CD system in the example. GitHub Actions (GHA) seems to be pretty dominant now (to me), so we could rework the example to have artifacts produced by CI/CD pushed into a Pulp server for storage, which would allow builds to be tested post-run, for example. Or we could have the CI/CD pull from the Pulp server hosting the artifacts. At a high level, I believe telling the story in a specific and concrete way (with full examples) would impact more folks than the generalized version. To get there, I suspect pairing up with a developer would work well: a dev can provide the end-to-end GHA example, and you could help tell the story in a better way than an engineer-written technical doc would.

The under the hood section is good. It serves a different kind of purpose, so I wonder about making it its own section? I like this section a lot though, so maybe it should stay with the rest of this content.

Side-note: for the under the hood section, I want to try to have Redis be fully optional. Even with 3.16 it’s not optional but we’re going to fix that very soon (it’s a bugfix).

For the high availability section, I think that could be its own article. To me, users just getting started want to know HA is a solved problem but not much more. Then again, when users sit down to actually perform an HA config they need more than this section has (I think maybe your other section has all the details in the docs?). Anyways, I think shortening this section so folks know it’s a solved problem and where to learn more would allow the rest of the content to stand out more.

Thank you so much for this! Even if this article didn’t change one word I think it’s very good.

Thank you so much @bmbouter

I appreciate you posting here. I would hope that if anyone else is following the discussion, they might join in if they’ve any ideas or agree/disagree with anything we have suggested.

Interesting. I hadn’t thought of that. Easy modification to the image.

I like this idea a lot, and will go ahead with this.

I really like the idea of providing a scenario like this, but I will need help.

I’ll look for a volunteer…

I faffed around with this for a long time, Brian.
I wasn’t sure what to do. I like those modern HTML5 websites that are just one long page. I tried to get around this by including the TOC at the top.
In my head, if you want to install, set up, and build some solution using Pulp, you’ll need to grasp these parts as best you can, as soon as possible.

I had tried to address the feedback in the survey about how difficult it was to find out about all of the concepts and components…
I was tempted to work on something closer to a 411 page rather than a 101, and vacillated between that approach and this, ultimately deciding to leave it less cluttered in the end.

I suppose my main concern is that the website, especially the navbar, is an intense place. I hoped by having one page with a TOC, I might get everything together but at the same time allow for people to hop around as they please. I’m still unsure. I am going to park this and hope that as I develop the rest of the information, the right approach comes to me. Ultimately this just involves chopping the file if and when the time comes.

I’ve been tracking this via the Redmine notifications!

Thanks so much, Brian.
For now I will continue to work on your main points.

Great doc, Melanie! I really like it.

I’m on the fence about plugins. I see Brian’s point about referring to Pulp as a whole and not overloading readers with details right from the start. At the same time, it somewhat helps with the illustrations in your later examples. And in general, when a user reaches out to us or looks for documentation, it’s good for them to know that there is the core and then a bunch of plugins.

+1 to the new section and specific examples.

As for the table of contents and everything on one page vs separate short pages, I think it’s very subjective. Some will prefer one way, others the other. I personally really like when it’s on one page (the way you did) but if most websites and projects try to have multiple short pages, maybe they know something and for the majority of folks it might work better. Maybe worth asking UI folks?
With huge buttons and simple pages, it looks lighter, but I usually struggle to find something I’ve already read because it is in one of those simple pages and a few clicks away :confused:

Thanks a lot for working on it! I would post it to the website soon and if more feedback comes in, we can always adjust later.

you just found one!

+1 for posting and updating as we go

Great idea. I’ll ask their opinion. Thank you!

I have the same experience. I’d much rather CTRL + F

:partying_face: :partying_face: :partying_face:

Melanie, this page is great!

+1 to Brian’s suggestions to show Pulp’s capabilities and fit through concrete examples. The 4 bullets he mentioned outline our main value proposition very well.

I’d keep the idea of the plugins. To be honest, we’ve built Pulp around the idea of having a plugin-based system, and in my opinion this is a cornerstone that deserves to be highlighted.

I do like that it’s written on one page with logical sections that flow flawlessly from top to bottom. Brian is right that some of the sections might need an in-depth explanation and might deserve a separate article that we can add later (like HA). However, I also agree that it is useful to give our users a reasonably wide view so they have the best big picture at once before they come to a final conclusion on whether Pulp is a fit for them or not.

I have a draft for the CI:

It is a workflow that runs once a week. We can do it for the major plugins, so users can see basic workflows from each plugin.

This PR is a PoC, so I added only “dev” and “prod” to show interaction with more than one repository.

I tried to keep the CI as simple as possible; I may need to add comments on some CI steps.

:slight_smile: thanks @fao89
I’ll take a look today!

Howdy @fao89

Thank you so much for putting this together.
This is interesting. We had a lot of feedback on the survey about a lack of automated Day 2 scenarios.
What you’ve provided here is essentially a complete Day 2 scenario for users.

Please forgive my ignorance, but let me try and summarize what is happening here and where there might be some gaps.

So, this GHA script does the following:

  1. Creates two remotes, one pointed at PyPI for dev use, the other pointed at our Pulp instance for production use.
  2. Creates a corresponding repo for each remote: one dev repo and one prod repo.
  3. Synchronizes content from each remote to the corresponding repo.
  4. Creates a publication and distribution for the repos.
  5. Installs dev content using pip on a client pointed at Pulp.
  6. I have no idea what is supposed to be happening here.
  7. Installs packages from the prod repo on a client.

So…I am a bit confused about a few points.

Regarding steps 5-7, is it really a usable GHA script to install on clients like that? I get lost a bit here.
Everything about this GHA up until then seemed, how do I say, real world, but I don’t see where I would declare the client this action takes place on? Or would it be up to the users to adjust this to best fit their own scenario?

The other thing is CI/CD …from what I understand based on the steps in my document (all stolen from @dkliban for the record), what we want here is to use GHA for testing the stability of our repos in Pulp…

So, here are the steps I have: Pulp 101 | software repository management
Very basic, as I said stolen, and then @bmbouter

So from here, I would maybe think that for a CI/CD scenario, we would need to add some steps from the point where content is synced to the dev repo.

  1. New changes are pushed to the dev repo, creating RepoVersion x.
  2. GHA runs tests, pass/fail.
    3a. If the tests fail, an error is given, the user fixes things and pushes again, and the tests rerun.
    3b. If the tests pass, the RepoVersion becomes eligible for consumption in the test repo?
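
A minimal sketch of the gate described in the steps above, again as a toy model rather than real Pulp or GHA code. The function names, the `broken-pkg` check, and the package names are all hypothetical; in practice the test step would be whatever the CI job actually runs.

```python
def promote_if_passing(dev_versions, next_stage_versions, run_tests):
    """Copy the newest dev version to the next stage only if tests approve it."""
    candidate = dev_versions[-1]
    if not run_tests(candidate):
        return False  # 3a: next stage untouched; fix, push again, rerun
    next_stage_versions.append(candidate)  # 3b: now eligible for consumption
    return True


def no_known_bad_packages(version):
    """Stand-in for a real test suite: reject a version with a bad package."""
    return "broken-pkg" not in version


# Each entry is one repository version, modelled as a frozen set of packages.
dev = [frozenset({"app"}), frozenset({"app", "broken-pkg"})]
test_stage = []

rejected = promote_if_passing(dev, test_stage, no_known_bad_packages)
dev.append(frozenset({"app", "fixed-pkg"}))  # user fixes things and pushes again
accepted = promote_if_passing(dev, test_stage, no_known_bad_packages)
print(rejected, accepted, len(test_stage))
```

The point of the sketch is only that the next stage never sees a version that has not passed; which stages exist (dev, test, prod) is a choice for each setup.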

I actually have no clue what I am talking about, but my understanding is that there’s automated testing, plus pushing and pulling of content for consumption… :see_no_evil:

At step 6, I synced content from dev to prod. My idea was: the user has run the tests and the content is now good to go to prod, so we sync from dev to prod.

I’ll try to explain with comments on the PR:

Comparing PR steps with your steps:

  • I synced content instead of uploading (I can change that or add an upload case);
  • Your steps are for 3 stages (dev, test, and prod); the PR currently has 2 (dev and prod);
  • all the tests from your steps are represented in my PR as pip installing from dev;
  • promotion to production, in my PR, is when the content is synced from dev to prod.

In my PR I synced content to the dev repo.
The passing test is poorly represented by pip installing from the dev source.
I need to think of some illustrative test to add there.

Hey @fao89
Thank you for all your work on this.
Talking to @dkliban - it is unrealistic to have GHA connected to a Pulp instance for these integration tests.
It would be better to demo this part in Jenkins.
