Tuesday, November 16, 2021

Red Testing Hood

Angie Jones, The Build That Cried Broken

Like the boy who cried wolf, the build that’s repeatedly red for no good reason will not be trusted. Worse, it will likely end up in a persistent red state as people stop looking at the regular failures. Eventually something important will also fail ... and no-one will notice.

At CAST 2021, Angie told us this and other stories about a team she was on, one who found themselves in that bad place but used their wits to escape and lived happily ever after. 

Well, perhaps that’s an exaggeration: they tolerated living in a short-term bearable state where a reliable kernel of tests supported development and deployment and a set of flaky tests were worked on to the side.

Separating off the flakes was a good move, but it was backed up by other measures, including assigning a team member to investigate every failure and capping the percentage of tests allowed in the flaky set: if the cap was exceeded, all other activity stopped. Focus was firmly on fixing today’s problems rather than grand plans for tomorrow.
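To make the flaky-set limit concrete, here is a minimal sketch of how such a gate might be checked. It is not the team's actual tooling: the names `SuiteStatus`, `flaky_percentage`, and `within_flaky_limit` are hypothetical, and the 10% default is an assumed figure, since the talk summary doesn't give the real limit.

```python
from dataclasses import dataclass


@dataclass
class SuiteStatus:
    """Counts for a test suite split into reliable and flaky sets (hypothetical model)."""
    total_tests: int   # every test, reliable and flaky alike
    flaky_tests: int   # tests currently moved aside into the flaky set


def flaky_percentage(status: SuiteStatus) -> float:
    """Return the share of the suite that sits in the flaky set, as a percentage."""
    if status.total_tests == 0:
        return 0.0  # an empty suite has nothing set aside
    return 100.0 * status.flaky_tests / status.total_tests


def within_flaky_limit(status: SuiteStatus, limit_percent: float = 10.0) -> bool:
    """True while the flaky set is at or under the limit.

    The 10% default is an assumption for illustration only.
    """
    return flaky_percentage(status) <= limit_percent


if __name__ == "__main__":
    status = SuiteStatus(total_tests=200, flaky_tests=30)
    if not within_flaky_limit(status):
        print("Flaky set over the limit: stop and fix flakes first.")
```

Run as part of a pipeline, a check like this would turn the limit into an explicit stop signal rather than something people have to remember to look at.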

The team also made Jira tickets for each failure investigation and prioritised each one explicitly. As a rule of thumb, Angie would delete and rewrite from scratch any case that had been edited significantly more than three times. 

If there's one principle over all of these, it's to prefer a small number of relevant, reliable, and important tests over sheer quantity. Very much not "Oh Grandma, what a large test suite you've got!"
