Making tests totally independent

While working on some functional test refactors in pulpcore and pulp_file, I’ve learned that several of the tests depend on other tests being run before them. This was done with good intentions, to reduce the amount of setup and arrangement for every test, but it creates problems like:

  1. You can’t run individual tests
  2. Some tests leave data behind, which interferes unexpectedly with other tests unless you also run the right ones afterwards ← this is what I ran into

So after some discussion and input during the pulpcore meeting, I’d like to propose we do two things:

  1. All tests (functional and unit) should be 100% independent
  2. It’s ok to leave behind orphans, but not ok to leave behind any other objects your tests created

The test dependence issue can be a bit tricky to spot, so how do we do this easily? What I came up with is to add code like this to your_plugin/tests/conftest.py. For example, I’m in the process of adding this to pulpcore and pulp_file. Here’s the example for pulp_file:

import time

import pytest

from pulpcore.client.pulpcore import TasksApi
from pulpcore.client.pulp_file import (
    DistributionsFileApi,
    PublicationsFileApi,
    RemotesFileApi,
    RepositoriesFileApi,
)

# gen_file_client / gen_pulpcore_client are the usual helpers for building bindings
# clients; import them from wherever your plugin's test utilities define them.


@pytest.hookimpl(trylast=True)
def pytest_runtest_teardown(item):
    file_client = gen_file_client()
    pulpcore_client = gen_pulpcore_client()

    # Make sure all tasks are finished, because some tests don't properly wait
    # after calling delete on things that generate tasks.
    tasks_api_client = TasksApi(pulpcore_client)
    while True:
        for task in tasks_api_client.list().results:
            if task.state in ["running", "waiting"]:
                time.sleep(0.2)
                break
        else:
            break

    # All the possible types your plugin could leave behind
    types_to_check = [
        RepositoriesFileApi(file_client),
        # ContentFilesApi(file_client),  <--- not checking for content because it's ok to leave behind orphans
        RemotesFileApi(file_client),
        PublicationsFileApi(file_client),
        DistributionsFileApi(file_client),
    ]
    for type_to_check in types_to_check:
        if type_to_check.list().count > 0:
            raise Exception(f"This test left objects behind in {type_to_check}.")

This runs during the teardown phase of every test (when the suite is run with pytest). By adding it to my PR, I was able to easily refactor the non-independent tests to be independent.


There are several issues with this:

  1. It has to run on an empty system
  2. The tests cannot run in parallel

Maybe we should guard this type of add-on somehow so that you’d have to add another option to pytest to enable it.

I agree that cleaning up orphans after each test is resource wasteful; however, some tests rely on having 0 content/artifacts, for example: 1) orphan cleanup tests, 2) on_demand tests that ensure the artifact has been saved after a client request, 3) content repair/reclaim tests, 4) some others I cannot think of at the moment.
For such tests I am suggesting to run orphan cleanup in the setup phase.

We could add an auto prefix to the names of objects created during tests. That way they can be distinguished from the ones that were already in the system. Publications don’t have names, though… so that could be a problem.
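
For illustration, such a helper could be as simple as the sketch below; the `auto-` prefix and the helper name are made up for this example, not an existing fixture.

import uuid

AUTO_PREFIX = "auto-"


def gen_test_name(base="test-obj"):
    # Hypothetical helper: give every test-created object a recognizable, unique name.
    return f"{AUTO_PREFIX}{base}-{uuid.uuid4()}"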

Requiring all tests to run on an empty installation is a no-go for me personally, and a counter-design to running tests in parallel. So yes, if we can add a command-line flag to our test suite (that is actually possible with pytest) to check that tests do not leave traces, one could check individual tests locally. Will it be used? Hard to say.
Maybe it’s easier to teach reviewers to look out for create calls without the auto cleanup fixture.
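
For reference, the cleanup-fixture pattern being referred to looks roughly like the sketch below. pulpcore’s test suite calls its version gen_object_with_cleanup; the signature and body here are simplified assumptions, not the actual implementation.

import pytest


@pytest.fixture
def gen_object_with_cleanup(request):
    def _gen_object_with_cleanup(api_client, body):
        obj = api_client.create(body)
        # Register deletion as a finalizer so the object never outlives the test.
        # In practice the delete usually dispatches a task that should be monitored.
        request.addfinalizer(lambda: api_client.delete(obj.pulp_href))
        return obj

    return _gen_object_with_cleanup

A reviewer would then look for bare api_client.create(...) calls in tests that don’t go through a fixture like this.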

+1 to the general idea that orphan cleanup should be run (in the setup phase) prior to tests that need it, and any test can leave orphans behind. This is why the pytest orphan cleanup fixture only has delete_orphans_pre and there is no delete_orphans_post.
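
A setup-phase cleanup fixture could look roughly like this. The fixture names (pulpcore_client, monitor_task) and the OrphansCleanupApi.cleanup() call are assumptions to check against your pulpcore bindings version; this is a sketch of the idea, not the actual delete_orphans_pre implementation.

import pytest

from pulpcore.client.pulpcore import OrphansCleanupApi


@pytest.fixture
def delete_orphans_pre(pulpcore_client, monitor_task):
    # Dispatch an orphan cleanup task before the test body runs and wait for it,
    # so tests that need 0 content/artifacts start from a known state.
    response = OrphansCleanupApi(pulpcore_client).cleanup({"orphan_protection_time": 0})
    monitor_task(response.task)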

I agree, we can’t require that tests have an empty installation. I believe I can offer this as a pytest CLI argument. We can run it manually every now and then, and I’d be satisfied with that for now at least. It’s really hard to spot this at review time all the time, but it’s ok if we just clean it up periodically (I think).
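
A sketch of how the opt-in could be wired into conftest.py; the flag name --check-leftovers is just illustrative.

import pytest


def pytest_addoption(parser):
    parser.addoption(
        "--check-leftovers",
        action="store_true",
        default=False,
        help="Fail a test if it leaves objects behind after teardown.",
    )


@pytest.hookimpl(trylast=True)
def pytest_runtest_teardown(item):
    if not item.config.getoption("--check-leftovers"):
        return
    # ... run the leftover checks from the snippet at the top of this thread ...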

I think we should also be able to leave behind tasks, at least for now, to make it simpler to move forward without also having to provide cleanup-correctness for tasks. Notice that my example “checker” in the post at the top doesn’t check for tasks.

Finished tasks, for sure. Are you also thinking about running or waiting tasks here?

Yes, I imagine tasks in a completed state, i.e. errored, canceled, or complete. The implication here is that every task dispatched by a test needs to be wrapped in a monitor_task call so that the test doesn’t exit until the task is complete. That would include cleanups too. Sound ok?
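
To illustrate the pattern (repo_api, repo, and monitor_task below are the usual bindings client, an existing repository, and the task-monitoring test helper; the names are assumed for the example):

# Without waiting: the test can finish while the delete task is still running.
repo_api.delete(repo.pulp_href)

# With waiting: monitor_task blocks until the dispatched task reaches a final state.
monitor_task(repo_api.delete(repo.pulp_href).task)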

Sure, I never expected tasks in a final state to be of concern. Having a reason for a pending task to outlive a test would raise an eyebrow, though.