When to allow removal of remotes used for on-demand syncing?

lubosmj · June 14, 2022, 2:59pm

Problem Description

Right now, users are allowed to delete a remote anytime. The deletion can disrupt content artifacts that contain the reference to remote artifacts associated with the remote.

When a remote is deleted (and there is no other backup left - another remote) the publication machinery in pulp_file fails because it cannot fetch the URL for any of the synced content units.

Proposed Solution

To prevent such from happening, we propose adding a check hooked to the PRE_DELETE signal to disallow the removal of a remote when no other remote artifacts are left. This forces users to delete all synced repositories and synced content before deleting the actual remote. As a result, content artifacts will not end up in an inconsistent state.

Disadvantages

The solution adds more burden on users when deleting regular remotes.
The solution may not be performance efficient since we need to check all content artifacts to have at least one remote artifact.
The solution does not guarantee that in the future we will have more than one remote artifact per content artifact (then considering on_delete=PROTECT would be an option as well: Should we keep using FK with on_delete=SET_NULL?).

Workflows

Remote artifacts are created during the sync procedure when the on-demand policy is used. When a user syncs a repository with two remotes pointing to the same destination, two remote artifacts per one content artifact are created:

pulp file remote create --name remo11 --url https://fixtures.pulpproject.org/file/PULP_MANIFEST --policy on_demand

pulp file remote create --name remo1 --url https://fixtures.pulpproject.org/file/PULP_MANIFEST --policy on_demand

pulp file repository create --name repo

pulp file repository sync --name repo --remote remo1

pulp file repository sync --name repo --remote remo11

So, in this case, removing one of the remotes will not cause any harm. Removing both of the remotes will cause the publication machinery to fail. The proposed protection does not allow users to delete both of the remotes.

References

The proposed solution comes from Problem: Deleting Remote deletes RemoteArtifacts - HackMD (1c). The PR that implements the solution is available at https://github.com/pulp/pulpcore/pull/2836.

quba42 · June 20, 2022, 9:44am

I have said this before elsewhere: I have always been conceptually puzzled by the very loose coupling between remotes and repositories/repo versions in Pulp. In theory one could imagine a use case of creating several remotes, and attempting to sync all of them in turn into a single repository (which would be an argument for the loosely coupled status quo).

However, I think in practice users almost always imagine a tight coupling between a particular repo and a particular remote which is always used for syncing to that repo.

Right now, if I sync some remote into a repo the only place that records this link is the sync task, and once that task is deleted (because I need to clean up the gazillion old tasks clogging up my machine), the information is lost. Even if the task still exists it requires effort to retrieve the right task, just to check what remote was used for some repo version.

I am aware that it is possible for users to link up a repo with a remote, but this puts the onus on the user to choose a particular workflow to achieve something which reasonable users would expect to be the default behaviour, and big users (like Katello) don’t do this.

To me, the problem described in this thread is just another example consequence of this underlying conceptual muddle. I think remotes should be tightly coupled to repo versions created using the remote (via a remote field on the repo version model that contains the relevant remote href), and only be deleted once those repo versions no longer exist. (It is not expensive to keep a remote around in the DB).

lubosmj · June 20, 2022, 3:13pm

I agree, coupling remotes with repositories should suffice. So, to remove the remotes, one should first remove the repositories/repository versions used for syncing.

However, there might be a problem with ACS. The remotes referenced from the ACS shall not be coupled.

quba42 · June 21, 2022, 7:19am

I just realized coupling remotes with repo versions (as I suggested) would only make sense if remotes were immutable. That is maybe something one might discuss for Pulp 4 or 5.

(Isn’t mutability another problem for your remote artifacts use case? What happens to remote artifacts when the user alters the URL on the remote?)

So yes, for the immediate problem you describe just requiring remotes that will create remote artifacts to be linked to a repo and disallowing deleting as long as that link exists might work.

x9c4 · June 22, 2022, 5:56am

Yes, indeed mutability is basically the same issue as deletability. But also coupling to the repo-version that was synced may not be enough, because (if not mirrored) this version may contain content of older versions that may be coupled to different remotes. So really the least common denominator is the RemoteArtifact. Both mutability and deletability could be circumvented by creating an immutable only garbage collected copy of a remote at sync time that is invisible to the user. Not sure if this snapshotting is a usable concept.

quba42 · June 22, 2022, 7:44am

Sounds like it would almost be easier to just store the full URL on each and every remote artifact. How expensive would that duplication be?

dralley · June 28, 2022, 2:04pm

We already do this https://github.com/pulp/pulpcore/blob/main/pulpcore/app/models/content.py#L675=

It’s not really a duplication because this URL points to the specific artifact whereas the remote URL points to the root of a repository. So the “remote” on a remote artifact is used as a configuration for the sync, and the URL present there isn’t helpful in terms of finding the artifact.

In the past I was looking at whether it might be possible to copy-on-write the remotes, but DRF isn’t designed for this at all. Also if you change the remote you might want all the remote artifacts to use the new configuration. This is another area where perhaps having the URL be part of what we currently call a Remote might not have been a good idea. How it is treated is a deviation in several ways. re: https://github.com/pulp/pulpcore/issues/2824

dralley · June 28, 2022, 2:07pm

I still have a preference towards “fake deletion”, wherein we would secretly keep the remotes around but hide them from the API by setting a flag on the model. And then orphan cleanup could do a final garbage collection of any remotes for which no RemoteArtifacts exist.

Short term I’m OK with having a better error message. I don’t think we should close the issue though.

bmbouter · July 13, 2022, 7:06pm

Apologies for the long delay, I’ve wanted to followup on this for a while. I agree complete with the problem statement, and something should be done.

I am comfortable with the user experience of preventing a remote that is in use from being deleted and providing an error message to the user when the remote can’t be deleted containing which repo versions contain content that is related to that remote is a reasonable user experience. The user can choose to delete those repo versions, then run orphan cleanup, and then they can delete the Remote.

I am uncomfortable with a fake deletion for a few reasons. When a user says to delete something and Pulp replies with “I deleted it” but in reality it hasn’t been, Pulp’s trustworthiness is at risk. It’s not that we’re duplicitous about it, the docs would be clear, but it’s just an anti-pattern. It gives rises to concerns like “oh no we have to stop using this username/password because we inappropriately were sharing it across our organization and now we need to stop”. It’s a contrived example I know, but the point is there are no corner cases when a user says delete me and Pulp deletes it. Additionally, orphan cleanup is scoped to being only about cleaning up content, and having it’s scope expanded to include hidden objects seems unnecessary.

quba42 · July 14, 2022, 7:39am

I tend to agree with all your points.

In addition, I just wanted to stress one more point (that I think has already been mentioned in the discussion above, but perhaps not as prominently/directly as it should have): I am not sure protecting remotes from deletion is enough, they would surely also need to be protected against modification, no? If so, then “you can’t modify this remote, because it has been used” is a pretty weird design that would irritate me as a user.

x9c4 · July 14, 2022, 7:56am

Then again, updating the username / password on a remote in use may be a valid usecase if you had your access-token (used as a password) revoked, if the proxy setup to reach some external source has changed, …
The only other way (and that may just as well be not-so-bad) is resynching with a new remote. And i think, that will (even if nothing changed upstream) create the new remote artifacts accordingly.

For the “we must stop using that remote now, but we don’t want to loose all the version history” do not know a reasonable solution. Also the stop using that content and orphan cleanup is not a valid answer, because that same content having another remote can still be valid in use. And we do not offer the deletion of specific remote artifacts in the api.

Have we considered doing nothing? Can we maybe have a safe-deletion that can fail as long as the remote is in use, And a force mode that just deletes?

bmbouter · July 14, 2022, 4:58pm

Having a force-delete option I think would be nice for our users.

I believe users should be free to edit all the fields to allow them to update the credentials/settings as necessary. Those new settings should be used at on_demand download time. There are lots of reasons for this (probably good ones for each field) For example, now our Pulp needs to use a http proxy, or the username/password changed (as x9c4 pointed out), or the remote server now has new rate limits, or the client certs changed, etc, etc.

ipanova · July 18, 2022, 1:01pm

I believe oprhan cleanup action is exposed only to admin, is not it?

How this will play in the RBAC world where the creator of the remote wants to DELETE it but there are other repos he does not have any access on, which were synced using this remote by another user who has only READ access to the remote?
RBAC will not allow him to even see those repo versions.

ipanova · July 18, 2022, 1:51pm

I think this approach has been considered in one of the tickets https://github.com/pulp/pulpcore/issues/2153 and https://github.com/pulp/pulpcore/issues/1975 and , given the viable options, I believe having a force flag is fine, just it will be a tough luck for the users who have READ access to the remote and just happen to use it. Maybe we can adjust the error message that will whether show list of affected RVs or simply saying that this remote is in use.

This approach can probably be extended to the other objects too https://github.com/pulp/pulpcore/issues/2153#issue-1113935088

lubosmj · July 28, 2022, 8:59pm

To conclude, we have decided to raise an error with the list of affected repositories’ versions if a user tries to delete a remote that is still in use. Using the --force flag will enable the user to eventually delete a remote regardless of its current state. The attached PR will be updated accordingly.

lubosmj · August 1, 2022, 12:38pm

The only feasible approach to address this problem in rest_framework and drf.spectacular was to introduce a new header. It is not possible to send body data along with a DELETE request through the python bindings (https://github.com/tfranzel/drf-spectacular/issues/431#issuecomment-862738643). Also, it is impossible to specify a customized query parameter since such a parameter is considered one of the filtering options (currently not allowed).

See the implementation at https://github.com/pulp/pulpcore/pull/2836.

ipanova · August 9, 2022, 4:36pm

We’ve discussed some of this approach with @pulp/core and came to the conclusion that specifying extra info in the request headers, also it being a non standard header is not the path we’d want to take.

We’ve being having back and forth conversations on this issue, sorry for that, however it is a non trivial problem to solve.
General consensus seems to be, given the complexity and absence of an explicit need from the community, to not implement any kind of force option for now.
The suggestion is to follow this Should we keep using FK with on_delete=SET_NULL? - #3 by davidd where we’d change to on_delete=PROTECT and give a friendly message to the user with a list of repository versions/repositories that use it. Then user would need himself to inspect and decide whether it is ok to DELETE those, before he can freely DELETE the remote.

Watch out for the RBAC aspect: user should not see hrefs for the resources he has no READ access to. In case it is impossible for that reason to share list of resources then simply returning a generic message that This remote is in use should suffice.

Out of scope of this remote issue, would be to audit other models Should we keep using FK with on_delete=SET_NULL? which might need to have same change.