Marking majority of tests to only run nightly

bmbouter · June 15, 2022, 6:45pm

I believe we need to reduce the amount of time it takes to merge a PR, especially a pulpcore PR. It has typical run times at around 40 minutes. There are a variety of ways to reduce the runtime here, but for this discussion I propose we run fewer tests at PR merge time and all tests at nightly.

For pulpcore specifically, I’d like to identify a representory sample of the tests, e.g. 30% and have these run at PR time. Then in the nightlies, let’s run 100% of the tests. To do this each test must have already been converted to a pytest style test and then it needs the pytest.mark.nightly mark see this example. The CI runs already use that mark all we have to do is apply the marks.

We may find the occasional revert, but I think it’s a small price for the ability to merge, e.g. 4x work in the same CI time. We need to look after our nightlies, but we need to do that anyway. I’ve started focusing on getting the various nightlies passing, e.g. pulpcore, pulp_file, pulp_ansible, etc.

What do others thing of this idea? Can we try this idea on pulpcore and pulp_file specifically?

lubosmj · June 16, 2022, 4:28pm

If our test suite had run for more than an hour, I would like the idea. Other than that, I think that we will be reverting commits more frequently. Or, we would outsource running the tests to developers, which is not wanted.

30% is too strict. I would personally vote for 50%+ if I had to.

davidd · June 16, 2022, 10:16pm

Definitely agree that 40 min is too long.

Maybe another option is to run tests at merge time instead of at night? That wouldn’t slow down PRs and you’d get faster feedback if there’s a problem.

bmbouter · July 6, 2022, 7:08pm

I hear there is concern (both here and at the pulpcore meeting) about the reversion rate. I share that concern, but until we try it, I don’t think we know what the tradeoff is in practice. My pitch is: let’s try it and see how it goes for say 30 days. It’s as easy as adding @pytest.mark to tests so this can very easily be reverted. Here’s the experiment I’m hoping we can run.

Mark 50% of the pulpcore and pulp_file tests as nightly (I wanted 80%, but let’s try 50% for this experiment).
Let’s document two things on this thread:

the PR runtime prior to and after making this change.
The reverts that were needed within those 30 days due to things missed at merge-time.

After trying this I think we’ll be in a much better position to agree on a strategy that balances these concerns. Any objections to trying this experiment specifically? At the last pulpcore meeting, I committed to outlining this experiment (this comment).

As an aside regarding a post-merge runtime idea, I agree with the benefit, we would find out we need to revert sooner. My main holdup with that though is that we also have PRs waiting for CI runners due to other PRs occupying the runners, which I think would lower our PR throughput by actually occupying a lot more CI minutes total. Also we’d still have to do the post-merge-revert which I believe is what most folks are worried about.