We will stop publishing bindings soon

TL;DR: With the upcoming breaking change release 3.55, we plan to stop publishing the autogenerated Python and Ruby bindings for pulpcore and plugins.

The “bindings”, also called “clients” in the Pulp ecosystem, are packages for different programming languages, automatically generated by pulp-openapi-generator [0] from the auto-generated OpenAPI schema. To this day, we publish these packages in Python and Ruby for pulpcore and most of the plugins with every release.
So why should we stop doing so? Well, there are some serious issues with this approach:

  • The bindings are tightly coupled to the exact version of the Pulp component they were generated from. Even if it may work on occasion, there is no guarantee (and issues have been reported in the past) that the same installation of these clients can talk to servers running different versions of the plugin.
  • Even if you have a way to always match the version of the server’s plugin with the version of the bindings on the client installation, all bets are off as soon as the pulpcore version the plugin bindings were created with does not match your server’s installation.
  • There are some (at least two) settings for Pulp (rerouting and domains) that lead to significant changes in the API schema. Once you use these features, the bindings generated without them stop working.
  • On the more subtle side, the bindings do not complain when you use them against something incompatible. They just fail with cryptic error messages or, even worse, fail silently.

All in all, we see no way to publish generic bindings that work in all the possible combinations, nor a way to publish individual ones for each of them.
But you still want to use the bindings? Apart from them being rather clunky to use (in my opinion), you can. The only thing we ask you to do is to generate the matching bindings for your specific installation yourself, be it a ready server or a single set of published packages with only one version of each plugin (see the settings issue above). All the documentation needed should be found in [0].
If you are working on a Python tool (or in a language able to call into Python) that needs to communicate with Pulp, I recommend looking into pulp-glue [1]. That library is handcrafted, handles tasks for you, and knows how to deal with different versions of Pulp and plugins.

Sorry for the rather short notice (we expect 3.55 in a small number of weeks), but you know you should not have depended on the published bindings anyway, right?

[0] https://github.com/pulp/pulp-openapi-generator
[1] https://staging-docs.pulpproject.org/pulp-glue/docs/dev/learn/architecture/

3 Likes

From a Katello perspective, we may need to generate the Ruby bindings for a while. If we switch to Pulp Glue, we’d likely start with implementing it for smart proxy syncing logic. We have pretty extensive use of the Ruby bindings for Pulp across our codebase, so it might take multiple releases to accomplish a full switch.

Smart proxy content management is the only case where we have mismatched client and server versions; otherwise the Ruby bindings have been perfectly fine for our use case. Over the past Pulp upgrades, we’ve only found a couple of occasions when the newer bindings didn’t meet our needs to manage older smart proxies.

I suppose Katello falls into the category of projects that have implemented a good policy for ensuring the proper clients are being used.

The biggest hindrance to us switching to Glue is the lack of Ruby library bindings. We’d need to come up with a safe, maintainable, and performant way to call the Python code from Ruby. There were some discussions on this before, but I’m not sure if anything promising came up.

3 Likes

Well, that was a rather tongue-in-cheek sign-off to the post… Feels very disrespectful to those who utilize these bindings for consuming your project.

The fact of the matter is, there is a sizable base of Pulp users that relies on these bindings in one way or another. Given such short notice, this will lead to one of two outcomes: either a) the affected use cases will have to version-lock on something older than Pulp 3.55 for a long period of time, or b) a lot of work will have to be done at the last minute to adapt, introducing a lot of risk and instability.

While I get the auto-generated bindings aren’t the best case scenario, they do help eliminate an entire class of work from the equation that will now need to be adopted by build systems and dependent projects. This, as we all know, is not trivial. The ugly truth is equivalent functionality will be needed for the foreseeable future. This is going to cause a major disruption to a lot of teams that are already strained by other workloads and priorities.

While I understand the desire to unilaterally drop this situation in short order, I would strongly suggest that we come together to figure out a better long term solution to this situation that we can all work towards, instead of leaving a subset of the community hanging out to dry on this decision.

I’m sorry to have caused such concerns.
Let me emphasize some points. We are not saying that you should stop using the bindings, nor that they are bad just because they are auto-generated. On the contrary, we still heavily use them (the Python flavor) for our own test suites. That also means we constantly test that the generation works and produces the right bindings. However, we need to generate these packages individually for the various test scenarios, precisely because of the incompatibilities outlined above. And that means the packages we currently publish claim to be something they are not. All we ask is that the generation of the bindings packages is moved one step closer to what will actually run as a Pulp installation.
What I hear (and see) is that currently the biggest hurdle on the way to bindings is that you need a running instance of Pulp to extract the api.json from. Technically, that may not be needed. It should be possible to generate that file with just the Python packages installed (no running services, no reverse proxy, …).

1 Like

Hey, thanks for the heads-up.

Isn’t this a problem of the OpenAPI schema (the JSON file) rather than of the client bindings? How does dropping the generation of OpenAPI clients solve the API incompatibility problem for non-Python languages?

Is this the {domain_href} path parameter problem we talked about on Slack the other day? For the benefit of others: the {domain_href} path parameter is used in many places in the API, and I believe it builds on something that is only vaguely defined in OpenAPI. It appears that path parameters are always URL-encoded, so one cannot pass a parameter that looks like /api/pulp/api/v3/domains/UUID: it gets URL-encoded and the result is a 404. This is a known problem without a solution, and it was postponed to OpenAPI 4.0, which should be coming up in 2024 (the specification, not implementations yet).
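
To make the encoding issue concrete, here is a minimal shell sketch. The function name is made up, and it escapes only the slash as a stand-in for whatever percent-encoding a generated client applies to path parameters; it is not the clients' actual code.

```shell
# Stand-in for a client's path-parameter encoding (assumption: only "/" is
# escaped here, which is enough to show the effect on an href).
encode_path_param() {
  printf '%s' "$1" | sed 's|/|%2F|g'
}

# The href from the paragraph above, with UUID left as a placeholder.
domain_href="/api/pulp/api/v3/domains/UUID"

# The slashes no longer separate path segments, so the server cannot route it.
echo "sent as path parameter: $(encode_path_param "$domain_href")"
# prints: sent as path parameter: %2Fapi%2Fpulp%2Fapi%2Fv3%2Fdomains%2FUUID
```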

Again, how does dropping client generation solve the problem for non-Python languages?

This looks like giving up on the OpenAPI idea completely, which is a sad thing to observe. Instead, what you guys should really be planning is how to fix this, if you plan to stick with OpenAPI for a while. I have no clue how much work this is, because I feel like I only scratched the surface with the domain href path parameter problem; I just wanted to let you know.

The reason I find giving up on this sad is that it is fair to assume that once you have an OpenAPI specification, you can just generate a client and it will work. That is not the case today, and it will definitely not be the case once you drop the bindings. You mentioned that there are cryptic errors and hard-to-solve issues, which I can confirm, as I was asking you about it on the channel yesterday. The time that will be saved may end up being spent on supporting people on the channels or on Discourse.

In my honest opinion, OpenAPI is not great, as it tries to be open and portable to the degree that it is painful to work with. Also, it is REST, which I personally think has been a terrible idea from day one. I like properly designed functions and RPC-like APIs, for which there are many portable options. End of rant. Sigh.

Now, it is what it is. This is not a request for feedback; the subject is clear: you guys are dropping it. I would like to suggest something: could you put together documentation about known issues with OpenAPI and how to solve them in general? You could use the OpenAPI documentation feature itself! Just add a note to each endpoint that has the href path parameter about the URL encoding issue, saying that it is up to the caller to set it properly. Example of the current documentation of one of the endpoints in question:

There is literally zero information for the domain_href path parameter, which is the one causing issues; I would appreciate a word about what to pay attention to. This will also be easier for you once people come asking about the problem: you just paste the link to the docs :)

Cheers.

Thanks for your answer. I still feel misunderstood. The plan is not to drop the concept of generating and using the bindings (no matter which language).
The issue outlined is that it’s impossible to maintain Pulp plugins compatible with a bunch of Pulpcore versions and publish correct bindings along the way.
Pulp will provide an OpenAPI 3 schema for the foreseeable future, and we will try to keep it as compatible as possible. You can generate your clients from there, and in the future that will be the supported way.
The only addition is that we suggest using pulp-glue in case you want to talk to multiple servers from the same codebase and happen to be able to call into Python. But this is not a you-must-change-today.

BTW: The {object_href} thing is a certain quirk of the Pulp API and not related to this topic.

1 Like

I can see why this is a heavy process for Pulp since there is a lot of work going on to produce artifacts that may never be consumed by anyone.

What I’m not fully understanding is “the packages … claim to be something they are not”. @x9c4 does this mean that, for example, generating Pulpcore bindings on a machine with only Pulpcore might produce different results than on a machine that has, say, Pulpcore and pulp-rpm installed?

To @pcreech 's points above, if Pulp stops generating bindings, we’d need time to prepare to build them ourselves for future releases and for backports. If you were to stop generating bindings only for new y-versions of Pulpcore & plugins, I’d say that we could pick up the last release with generated bindings when we start upgrading Pulp in ~ 3 months.

However, if all bindings generation will stop (which is what the proposal is sounding like), then I think we should determine a unified timeline. @pcreech , perhaps we could determine our comfortable timeline for taking on the bindings building and go from there? I think we need to quantify how much of a shake-up for us this will really be.

1 Like

I will just add my 2ct.: We have been using pulp-openapi-generator to generate (and package) our own bindings (at least for some plugins and core) for some time.

In our experience, the somewhat inconvenient thing about using pulp-openapi-generator is that it provides a script that runs container images. This can be problematic when trying to run it inside a pipeline where each job already runs in a container. Container-in-container is not impossible, but it is annoying to set up, and I believe it requires privileged containers, which might violate some people’s policy or infrastructure constraints.

The other challenge is that pulp-openapi-generator requires either a running Pulp instance to pull the api.json from, or an api.json that you have already pulled from a running Pulp instance (or otherwise obtained).

My recommendation: Provide some really good guides/examples on how to use pulp-openapi-generator for people to start experimenting with generating their own bindings. I am not sure how good the docs in the pulp-openapi-generator really are. IMHO this change warrants at least a blog post with a step by step example.

The commands for this would certainly be interesting. Also: How confident are we that this will result in an identical api.json as compared to pulling from the running instance?

5 Likes

After the initial post last week, I took the time to dig in and see what it would take to take over a “client binding generation” process for Katello.

The major bullet points are:

  • Familiarity: The team that would take on this work doesn’t have enough of an understanding of the topic to immediately step in and take over.
  • Capacity: There really isn’t much capacity at the moment for taking on this work in the initially provided timeline to ensure a seamless transition without MAJOR disruption to other priorities for the teams involved.
  • Quality: We are not going to be able to provide a reliable, consistent, automated solution in the 3.55 time frame.

To address the bullet points above sufficiently, we believe we could have it done in 6 months: enough to understand the work involved without disrupting our current top priorities, while ensuring the final solution is of sufficient quality not to induce excess maintenance work for the parties involved. This is a cradle-to-grave estimate that includes some overlap for transitioning responsibilities, so Pulp wouldn’t necessarily be 100% on the hook for the entire 6 months.

The main questions I have for pulp at the moment are:

  • How difficult is it for pulp to maintain the status quo of building and publishing the bindings at the moment?

  • How involved would pulp be willing to be in working with us on the initial phases of getting our process of building the bits off the ground? Mainly concerned about knowledge gaps and best practices.

We can cover more implementation specific questions (i.e. api.json details) at a later time.

3 Likes

Let me know if you are at all interested in the CI code we use to generate bindings. It is a GitLab CI pipeline, but it should at least contain enough information to tell you what all the requirements are.

3 Likes

I am interested for sure.
At the very least, we want to make the process easier.
Am I right that the flavor of bindings you build are the ones to be consumed by Katello?

1 Like

The short answer is: Yes. The bindings we generate are for Katello.

The relevant bits of the pipeline go something like:

  • Stage 1 (just one job): Install Pulp from the relevant Foreman packaging repo (e.g.: Index of /pulpcore/3.39/el8/x86_64). Start Pulp, and pull the api.json for all relevant plugins and core. (The *-api.json files are turned into GitLab artifacts so they are available to the next job.)
  • Stage 2 (one job per client gem): Take the relevant api.json from Stage 1 and use pulp-openapi-generator to generate a client gem from it.
  • Stage 3 (one job per client gem): Take the client gem from stage 2, combine it with the SPEC file from foreman-packaging repo (and yes the client gem spec files are in foreman-packaging, not pulpcore-packaging) and build an RPM package for it, upload this RPM package to our Pulp instance.
  • Stage 4 (one job): Publish and Distribute a new version of our Pulp repo containing the new RPMs.

Some things to note:

  • All jobs run in docker containers (and yes it is still docker, not podman).
  • Stage 1 takes slightly under 2 Minutes.
  • Stage 1 has the additional benefit of testing whether Pulp is still installable and runnable whenever new Pulp RPMs are packaged.
  • Stage 2 jobs require docker in docker, and so need to use a specific runner.
  • Stage 2 jobs take about 1 Minute to generate a gem.
  • Stage 3 jobs take about 1 Minute 25 seconds to package an RPM.

If there are enough runners to run all jobs in each stage in parallel, the whole thing takes roughly 5 minutes until new client gem RPMs are available in the repo.

We did experiment with using the foreman-installers Puppet module to install Pulp, but ultimately gave up on that (mostly because we lack Puppet experience) and just used a simple shell script instead. In spite of this, the pipeline has been running stably with minimal maintenance requirements for a long time now. I think the last time we had to touch the job logic was when switching to a newer pulpcore Y-version that had changed the entry point for starting the Pulp services.

Conclusion: There is an initial cost to setting up automation for client generation. Once that automation has been set up reasonably well, the maintenance burden is pretty minimal.

4 Likes

Can you try this in your delivery pipeline?

2 Likes

Given the amount of discussion around this topic, it is apparent that we need more time to plan this, and we have already agreed that we are not going to hold off the 3.55 release for it (technically demoting it from a release blocker).
So to read the title correctly, it is still “soon” but may not be “very soon”.
Thanks for all the involvement. I’m so glad we have this forum to probe for friction before breaking stuff.

5 Likes

I did some experimenting using the new pulpcore-manager openapi command. In oci-env, I was able to compare it to the “old way” of obtaining API schemas for client generation as follows:

[root@7090d60a8cc9 /]# curl --output pulp_deb-api-curl.json "http://localhost:24817/pulp/api/v3/docs/api.json?bindings&component=deb"
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  496k  100  496k    0     0  1028k      0 --:--:-- --:--:-- --:--:-- 1028k
[root@7090d60a8cc9 /]# pulpcore-manager openapi --component deb --bindings > pulp_deb-api-manager.json
[root@7090d60a8cc9 /]# diff pulp_deb-api-curl.json pulp_deb-api-manager.json
11488c11488
< }
\ No newline at end of file
---
> }
[root@7090d60a8cc9 /]#

=> Successful test: the only difference is that the curl output lacks a newline at the end of the file.
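
For a pipeline check that should not trip over that trailing newline, something along these lines might do. This is only a sketch: the helper name is made up, and the demo uses two throwaway files rather than real schemas.

```shell
# Succeeds when both files are readable and have the same content, ignoring
# trailing newlines (command substitution strips them before the comparison).
same_schema() {
  [ -r "$1" ] && [ -r "$2" ] && [ "$(cat "$1")" = "$(cat "$2")" ]
}

# Demo with two throwaway files that differ only in the final newline.
tmp="$(mktemp -d)"
printf '{"openapi": "3.0.3"}'   > "$tmp/curl.json"
printf '{"openapi": "3.0.3"}\n' > "$tmp/manager.json"

if same_schema "$tmp/curl.json" "$tmp/manager.json"; then
  echo "schemas match"
else
  echo "schemas differ"
fi
rm -rf "$tmp"
```

In the session above, the call would be same_schema pulp_deb-api-curl.json pulp_deb-api-manager.json.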

I have also tried adding the new pulpcore-manager approach to our build pipeline. There I was able to use it against a Pulp instance that was installed (from RPMs) but without a DB, DB migrations, or any running processes. This cut the job run time from about three minutes to about one (compared to the running-instance approach).

However, it did not fully work, because our build pipeline is still on pulpcore 3.39/Katello 4.11. I believe this is as far as I can take it until we update to pulpcore 3.49/Katello 4.13. From what I have seen, I am pretty confident the approach will work once we update.

4 Likes

We determined that this change is not a breaking change with respect to our deprecation policy, so we are not technically bound to line it up with a breaking change release. Hence there is no need to rush it into 3.55. On the other hand, we cannot promise that nothing will change before 3.70 either.
So please tell me if the new openapi command (about to be released now) will fit into your packaging and deploy pipelines.
@quba42 , I take this as a yes already.

1 Like

We released the pulpcore-manager openapi command on all the supported branches.
Also, there is a new gen-client.sh script in pulp-openapi-generator that ingests a provided API spec instead of trying to download it.
Building the bindings should be like this now:

# setup packages in a (fresh) virtual environment
cd .../pulp-openapi-generator
workon pulp_clients
pip install pulpcore

# Generate pulp_file-client wheel
pulpcore-manager openapi --bindings --component "file" > "file-api.json"
./gen-client.sh "file-api.json" "file" python "pulp_file"
pushd "pulp_file-client"
  python setup.py sdist bdist_wheel --python-tag py3
popd
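
For more than one component, the first two steps could be parametrized roughly like this. It is only a sketch: the helper name is made up, the argument order for gen-client.sh is copied from the example above, and it merely prints the commands so they can be reviewed before running them inside the pulp-openapi-generator checkout. The wheel build step is left out.

```shell
# Print the schema and client generation commands for one component.
# Nothing is executed here; the output is meant to be reviewed first.
client_build_commands() {
  component="$1"   # e.g. "file"
  package="$2"     # e.g. "pulp_file"
  language="$3"    # "python" in the example above
  schema="${component}-api.json"
  printf 'pulpcore-manager openapi --bindings --component "%s" > "%s"\n' "$component" "$schema"
  printf './gen-client.sh "%s" "%s" %s "%s"\n' "$schema" "$component" "$language" "$package"
}

client_build_commands file pulp_file python
# prints:
# pulpcore-manager openapi --bindings --component "file" > "file-api.json"
# ./gen-client.sh "file-api.json" "file" python "pulp_file"
```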

Please tell us about your experience.

1 Like

Just to be clear, “all supported branches” means the following pulpcore versions: 3.21, 3.22, 3.28, 3.39, 3.49.

3 Likes

…and everything starting from 3.55

1 Like

I think we have one of the anticipated issues in the wild now:

1 Like