Another newcomer perspective

Hi all, I’m looking into package registries and was told to have a look at Pulp. What you’re doing looks really impressive. I had never heard of Pulp before, and I would have loved to see a presentation at KubeCon or FOSDEM. I’ve spent some time going over documentation, past presentations, and YouTube videos to learn about the current state and the background. From one of the videos I got the impression you were looking for outside perspectives, so I thought I’d share mine.

  • I was mainly looking for case-studies of how Pulp is used. I found an example of mirroring OS-level packages (rpm, deb) for scientific use. I wasn’t sure if Pulp would cover the use-case of self-hosted registries and registries for development libraries (npm, pypi, maven). The documentation highlights the support, but I wanted to see proof. I did see some hints of it during the online meetings.
  • I did notice that not all registry types are supported, like nuget, php composer, R. Perhaps some could be supported by the file-repository type, but I’m not sure. I’ve briefly looked into the plugin codebases to get a sense of the effort needed to create and maintain a new registry type plugin. I’m not quite sure how much of the code is boilerplate and how much is specific to the content type.
  • It’s unclear to me what management features exist. Is it possible to have a deny-list or allow-list of packages to sync? My impression is yes, but I haven’t found it in the documentation.
  • Similarly, I’m curious whether there is a hook to check packages before making them available. I assume not, but it could probably be done in a custom way as well.
  • I like the concept of declarative configuration, to be managed via Kubernetes. A GUI would still be very welcome to make it more user-friendly, though.

With the desk research done, I think it is now time to give it a spin and experience it for myself to get a better feel for it.

Pulp is used for content-curation in a number of places/projects/products. While some people are shy about their infrastructure-stack, here are some public ones:

We have plugins for all of the following - although the activity/support varies widely. This is because the core team isn’t an expert in All The Things, and so we have had plugins contributed by the community. You can find all of the following under the Pulp org:

Content-types that see a lot of active work:

  • ansible
  • container
  • deb
  • file
  • python
  • rpm

Ones that may need some TLC, but which work fine:

  • gem
  • maven
  • npm
  • ostree
  • rust

Ones that have been contributed to “scratch someone’s itch” and haven’t had much beyond that:

  • cookbook
  • hugging_face
  • r

Creating a new one requires understanding the specific content’s ecosystem. Our plugin_template has support for building a skeleton of a new plugin, and there are (as you can see) a lot of examples for where to go from there.

Some plugins support allow/deny lists on sync, some don’t. It’s a common request, which we revisit regularly. In some environments, curation works by syncing everything (“grab everything”) and then using the Modify or Copy endpoints to select what you want to expose to users in a public repository that is then managed.
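
To make the “grab everything, then refine” idea concrete, here is a minimal sketch of the selection step only. It does not call Pulp at all: the `curate` function, the glob-style patterns, and the package names are all hypothetical, and in practice the resulting list would be what you feed into a Modify or Copy request for the curated repository.

```python
from fnmatch import fnmatchcase


def curate(package_names, allow=("*",), deny=()):
    """Return the names that match an allow pattern and no deny pattern.

    This is an illustrative helper, not part of Pulp or pulp-cli.
    """
    selected = []
    for name in package_names:
        allowed = any(fnmatchcase(name, pattern) for pattern in allow)
        denied = any(fnmatchcase(name, pattern) for pattern in deny)
        if allowed and not denied:
            selected.append(name)
    return selected


if __name__ == "__main__":
    # Hypothetical content of a repository that mirrors everything upstream.
    mirrored = ["requests", "urllib3", "leftpad-evil", "numpy", "numpy-nightly"]
    exposed = curate(
        mirrored,
        allow=["requests", "urllib3", "numpy*"],
        deny=["*-nightly"],
    )
    print(exposed)
```

Deny patterns win over allow patterns here, which is an assumption about what most curation policies want rather than a description of how any Pulp plugin behaves.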

As far as managing Pulp itself goes, we had a community-member contribute Pulp-Manager - you might want to check that out.

There’s been investigation into things like ClamAV, but doing the scanning at ingest time always ends up being more complicated than one thinks.

So far we have two donated UIs; both are incomplete, alas. The core team is short on UX expertise - we’re a bunch of backend geeks - and bandwidth, as always, is limited.

We do have a solid CLI that is in use by a lot of our users. Check out https://github.com/pulp/pulp-cli !

Answering these questions, and the fact that you had to ask them, shows me we need to free up some sprint-time to get some more/more up-to-date overviews into our documentation stack :slight_smile:
