Planning to remove a feature from the RPM plugin: sqlite metadata

It has come to our attention that the sqlite metadata generation feature is effectively useless.

The only software in the RPM ecosystem which seems to require sqlite metadata is repoview, but:

  • Since Pulp is serving up repositories dynamically, the metadata is useless even in theory. The static pages can’t be injected into the content app for that particular repo. Pulp would have to run repoview itself, and that is not a desirable feature for the reasons listed below.
  • repoview hasn’t been packageable since Fedora 28 / EL 8 because the package itself is unmaintained and depends on ancient, unmaintained libraries. It’s built on Python 2 and uses Python 2-only libraries.
  • Nobody uses repoview anymore. I was only able to find one current example, the RPMFusion repos, and their pages are apparently generated by repoview on Fedora 26. Ouch.
  • Nobody uses it anymore because the auto-index pages generated by various web servers are good enough for browsing around and sites like pkgs.org do a much better job of presenting information (at least for common public repos).

Therefore, it seems sensible to deprecate this functionality and remove it in a future release, unless a compelling, currently unknown use case emerges. This feature has had an outsized impact on code complexity in a few places, so its removal will yield real benefits to maintainability.

To retain API compatibility, the options may need to exist for some time beyond the removal of the backing code.
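A minimal sketch of what such a compatibility shim could look like: the option is still accepted, but it only triggers a deprecation warning and is otherwise ignored. The helper name `normalize_publication_options` and the option key `sqlite_metadata` are assumptions made for illustration, not pulp_rpm's actual code.

```python
import warnings


def normalize_publication_options(options: dict) -> dict:
    """Accept the deprecated option but drop it before publishing.

    Hypothetical helper; the key name "sqlite_metadata" is an assumption.
    """
    cleaned = dict(options)  # copy so the caller's dict is not mutated
    # Only warn when the caller actually asked for sqlite metadata.
    if cleaned.pop("sqlite_metadata", False):
        warnings.warn(
            "sqlite metadata generation is deprecated and is ignored",
            DeprecationWarning,
            stacklevel=2,
        )
    return cleaned
```

With this shape, existing clients that still send the option keep working, and the backing code can be removed independently of the API field.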

I checked our access logs and we receive hundreds of thousands of requests for our sqlite files. I am not sure, though, how they are used (or if they are used at all). Is it possible that dnf is automatically pulling these files?

Legacy yum uses them, but it also works fine without them. DNF does not use them and shouldn’t download them. Full-repo mirrors would download them if Pulp is used, and maybe rsync or other download tools do so.
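To see whether a given repository advertises sqlite metadata at all, one can inspect its `repomd.xml`. The sketch below assumes the usual repomd layout, where sqlite databases appear as `<data>` entries whose `type` ends in `_db` (for example `primary_db`); the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# XML namespace used by repomd.xml
REPO_NS = {"repo": "http://linux.duke.edu/metadata/repo"}


def sqlite_metadata_types(repomd_xml: str) -> list[str]:
    """Return the <data type="..."> values that name sqlite databases."""
    root = ET.fromstring(repomd_xml)
    types = [d.get("type", "") for d in root.findall("repo:data", REPO_NS)]
    # Assumption: sqlite entries are exactly those whose type ends in "_db".
    return [t for t in types if t.endswith("_db")]
```

An empty result means clients of that repository can only use the XML metadata.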

@davidd If I had to guess, it’s the RHEL 7 systems (with legacy yum) which are requesting those files, since you do have some RHEL 7 repos on packages.microsoft.com. Yum seems to prefer the sqlite metadata but falls back to XML if it can’t get it.

Or are those numbers split out by repo? Can you determine if something is downloading .sqlite files in significant quantity from the EL8+ repos?
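A rough way to answer that from the logs, as a sketch only: count requests for `.sqlite` files grouped by the repository path that precedes `/repodata/`. This assumes access logs in the common/combined log format and that metadata lives under a `repodata/` directory; the function name is hypothetical.

```python
import re
from collections import Counter

# Matches the quoted request in a common/combined log line,
# e.g. "GET /some/path HTTP/1.1". Other log formats need a different pattern.
REQUEST_RE = re.compile(r'"(?:GET|HEAD) (?P<path>\S+) HTTP/[^"]*"')


def count_sqlite_requests(log_lines):
    """Count requests for sqlite metadata files, keyed by repository path."""
    counts = Counter()
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if not match:
            continue  # skip lines that are not GET/HEAD requests
        path = match.group("path").split("?", 1)[0]  # drop any query string
        filename = path.rsplit("/", 1)[-1]
        # Covers names such as *-primary.sqlite.bz2 or *-other.sqlite.gz
        if "/repodata/" in path and ".sqlite" in filename:
            repo = path.split("/repodata/", 1)[0]
            counts[repo] += 1
    return counts
```

Calling `count_sqlite_requests(open("access.log"))` and then `.most_common()` on the result would show which repositories still see the most sqlite traffic.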

It’s hard to tell since a lot of our repos are grouped by product (and not broken up by RHEL version). We do have some RHEL 8-specific repos and I see some requests to their sqlite files. If nothing breaks by dropping the sqlite metadata then no objection from me.

@davidd Do you have any more information about this? I’m thinking about going ahead with the change, but I suppose we could put it behind a (Pulp) settings flag for a couple of releases before actually removing the code. That way, if it creates any problems, users have an easy way to move forward.

I responded to you already on Matrix but wanted to post here for visibility/posterity as well. We’re aiming to drop sqlite support for our users by mid-2023 so if pulp_rpm could support it until then (maybe for 1-2 more releases), that’d be great for us. A setting flag would work for us as well.

Do we have a way to have repoview-like functionality in Pulp itself? That would make it very useful for people who want an easy way to browse the packages ingested into Pulp.

@Conan_Kudo Not “repoview”-like, but there are index pages so you can browse it as if there was a normal directory structure.

Having some kind of “repoview”-like view built into Pulp would be tremendously useful for both internal Pulp deployments and Pulp deployments for third-party repositories that wouldn’t get indexed by things like pkgs/repology/etc. COPR and OBS would benefit from it too.

I know there’s a fair bit of community demand for repoview, but actually reimplementing repoview in a way to rip out the dependency on legacy yum is a fair bit of work in itself.

How is this going? Do you have an updated timeline?

I would say that we hope to drop them in August/September. If we need them longer, we can either patch pulp_rpm or just stay on an older version.

@davidd Can you confirm whether this has been completed yet? We’re in no particular hurry, so it’s fine if it hasn’t.

I responded to you on Matrix but am posting here for visibility. We just reached out to the team that is still using sqlite, but we’re hoping to upgrade to Pulp 3.40 once it’s released. After that, I think we should be OK with dropping sqlite support.

We’ve gone ahead and disabled sqlite metadata files for our repos so no objection from us anymore.

It will be gone in 3.25.0