Outcomes of fuzz testing the Pulp project

lubosmj · March 3, 2023, 10:07am

I played with the idea of running fuzz tests against our API endpoints. Fuzz testing can help identify unhandled corner cases that lead to internal server errors. I used https://github.com/schemathesis/schemathesis to run the tests. It parses the OpenAPI schema and then throws organized garbage at all public endpoints.
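To make the idea concrete, here is a rough sketch of the first step such a tool has to perform: walking the OpenAPI schema to find every operation it could target. This is not how Schemathesis is implemented, only an illustration; the `list_operations` helper and the tiny inline schema are made up for this example.

```python
# Illustration only: enumerate the operations declared in an OpenAPI schema.
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def list_operations(schema: dict) -> list[tuple[str, str]]:
    """Return sorted (METHOD, path) pairs for every operation in the schema."""
    operations = []
    for path, item in schema.get("paths", {}).items():
        for key in item:
            # Path items may also hold non-operation keys such as "parameters".
            if key.lower() in HTTP_METHODS:
                operations.append((key.upper(), path))
    return sorted(operations)


# Hypothetical, heavily trimmed schema; a real Pulp schema is far larger.
example_schema = {
    "openapi": "3.0.3",
    "paths": {
        "/pulp/api/v3/workers/": {"get": {}, "parameters": []},
        "/pulp/api/v3/repair/": {"post": {}},
    },
}

for method, path in list_operations(example_schema):
    print(method, path)
```

A fuzzer would then generate valid and deliberately invalid inputs for each of these operations and watch for 5xx responses.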

I have identified a couple of issues. Most of them are related to running DB queries with data that was not sanitized. A few notable REST calls that reproduce them are listed below:

curl -X GET -H 'Authorization: Basic YWRtaW46cGFzc3dvcmQ=' 'http://localhost:5001/pulp/api/v3/workers/?last_heartbeat=2000-01-01T00%3A00%3A00%2B16%3A00'

curl -X GET -H 'Authorization: Basic YWRtaW46cGFzc3dvcmQ=' 'http://localhost:5001/pulp/api/v3/tasks/?finished_at=2000-01-01T00%3A00%3A00%2B16%3A00'

curl -X POST -H 'Content-Type: application/json' -H 'Authorization: Basic YWRtaW46cGFzc3dvcmQ=' -d '{"name": "2", "password": "2", "url": "22222", "username": "\u0a01", "ca_cert": null, "max_retries": -640}' http://localhost:5001/pulp/api/v3/remotes/rpm/uln/

curl -X POST -H 'Content-Type: application/json' -H 'Authorization: Basic YWRtaW46cGFzc3dvcmQ=' -d '{"name": "0", "url": "0", "pulp_labels": {"0": null}}' http://localhost:5001/pulp/api/v3/remotes/rpm/rpm/

curl -X POST -H 'Content-Type: application/json' -H 'Authorization: Basic YWRtaW46cGFzc3dvcmQ=' -d '{"base_path": "0", "name": "0", "pulp_labels": {"0": null}}' http://localhost:5001/pulp/api/v3/distributions/container/container/

curl -X GET -H 'Authorization: Basic ADo=' http://localhost:5001/pulp/api/v3/roles/
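The encoded values hide what is unusual about some of these requests. A small sketch using only the Python standard library decodes them; the helper name `decode_basic_auth` is made up for illustration.

```python
import base64
from urllib.parse import unquote


def decode_basic_auth(token: str) -> tuple[str, str]:
    """Split a Basic auth token into (username, password)."""
    decoded = base64.b64decode(token).decode("utf-8")
    username, _, password = decoded.partition(":")
    return username, password


# Credentials used by most of the requests.
print(decode_basic_auth("YWRtaW46cGFzc3dvcmQ="))
# Credentials used by the /roles/ request; repr() makes control characters visible.
print(repr(decode_basic_auth("ADo=")))
# Percent-encoded timestamp used by the workers and tasks filters.
print(unquote("2000-01-01T00%3A00%3A00%2B16%3A00"))
```

The two timestamp filters carry a UTC offset of +16:00, which lies outside the range used by real-world time zones (UTC-12:00 to UTC+14:00), and the last request authenticates with a user name consisting of a single NUL character and an empty password.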

My question is whether we should focus on sanitizing inputs on our side or leave the validation to the Python modules we rely on, e.g., by reporting the issues upstream (Django or rest_framework).

ggainey · March 3, 2023, 6:41pm

This is a great idea - thanks for thinking of, and experimenting with, this effort!

Generally speaking, the Security Guy answer to that question is “Yes”. We should sanitize incoming data for our own safety, and report upstream to make things better for everyone else.
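As one possible shape of sanitizing on our side, the sketch below rejects a timestamp filter whose UTC offset falls outside the range of real-world time zones before the value would reach a DB query. The function name and the chosen bounds are assumptions made for illustration, not existing Pulp code.

```python
from datetime import datetime, timedelta

# Assumed bounds: real-world UTC offsets span -12:00 to +14:00.
MIN_OFFSET = timedelta(hours=-12)
MAX_OFFSET = timedelta(hours=14)


def parse_timestamp_filter(value: str) -> datetime:
    """Parse an ISO 8601 timestamp and reject implausible UTC offsets."""
    parsed = datetime.fromisoformat(value)  # raises ValueError on malformed input
    offset = parsed.utcoffset()  # None for timestamps without an offset
    if offset is not None and not (MIN_OFFSET <= offset <= MAX_OFFSET):
        raise ValueError(f"UTC offset out of range: {value!r}")
    return parsed


try:
    parse_timestamp_filter("2000-01-01T00:00:00+16:00")
except ValueError as exc:
    print("rejected:", exc)
```

In a REST API, such a `ValueError` would typically be translated into a 400 response instead of surfacing as an internal server error.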

I’m def interested in fixing whatever problems this effort raises.

lubosmj · March 3, 2023, 9:39pm

That is great to hear!

I was able to detect a couple more errors while running negative testing. In this case, the errors were raised exclusively in Pulp code:

curl -X POST -H 'Content-Type: application/json' -H 'Authorization: Basic YWRtaW46cGFzc3dvcmQ=' -d '[]' http://localhost:5001/pulp/api/v3/pulp_container/namespaces/

curl -X POST -H 'Content-Type: application/json' -H 'Authorization: Basic YWRtaW46cGFzc3dvcmQ=' -d '[]' http://localhost:5001/pulp/api/v3/repair/

curl -X POST -H 'Authorization: Basic YWRtaW46cGFzc3dvcmQ=' -d ca_certificate=0 http://localhost:5001/pulp/api/v3/contentguards/certguard/x509/
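Two of these reproducers send a JSON array where the endpoint presumably expects an object. Assuming that is what trips the handlers (the thread does not confirm the root cause), a guard of the following shape would turn the failure into a validation error. `require_json_object` is a hypothetical helper, not part of Pulp or rest_framework.

```python
import json


def require_json_object(body: str) -> dict:
    """Parse a request body and insist that the top-level value is an object."""
    data = json.loads(body)  # raises json.JSONDecodeError (a ValueError) on bad JSON
    if not isinstance(data, dict):
        raise ValueError(f"expected a JSON object, got {type(data).__name__}")
    return data


for body in ('{"name": "0"}', "[]"):
    try:
        print("accepted:", require_json_object(body))
    except ValueError as exc:
        print("rejected:", exc)
```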

For the sake of completeness, I have uploaded all the erroneous requests here (gdrive). If you want to study the failures, it is sometimes necessary to add -H 'Content-Type: application/json' to get a valid reproducer.

Now, I will open issues in the respective repositories, and this is where my effort ends.

I am also wondering whether it is worth spending time investigating performance issues. I think we could identify performance-related problems because the code has to parse unusual queries. I would suggest this as future work.