The Association for Software Testing is crowd-sourcing a book, Navigating the World as a Context-Driven Tester, which aims to provide responses to common questions and statements about testing from a context-driven perspective.
It's being edited by Lee Hawkins who is posing questions on Twitter, LinkedIn, Mastodon, Slack, and the AST mailing list and then collating the replies, focusing on practice over theory.
I've decided to contribute by answering briefly, and without a lot of editing or crafting, by imagining that I'm speaking to someone in software development who's acting in good faith, cares about their work and mine, but doesn't have much visibility of what testing can be.
Perhaps you'd like to join me?
--00--
"When the build is green, the product is of sufficient quality to release"
An interesting take, and one I wouldn't agree with in general.
That surprises you? Well, how about this statement: when a student has passed their Computer Science degree, they are sufficiently experienced to write production code.
Hmm, you're not so comfortable with it? That doesn't surprise me.
Can I briefly explain why? OK, thanks.
What do we understand by a green build? Probably it's something like one of these: all of the unit tests ran locally and none showed a failure; unit, integration, and other tests ran in some continuous integration pipeline and all passed; or the product was built and deployed, then end-to-end tests were run across it and other services without obvious error.
The common aspect of what we generally understand by green builds is that some kinds of tests or steps ran and none were reported as unsuccessful, as far as we can see.
There are a few things to flag here. First, what are we trying to test, how broadly, how deeply, to what level of accuracy, covering which parts of the product, according to whose expectations, in what combinations, in what environments, for what purpose?
The second is whether there's a difference between what was intended to be tested and what is actually being tested. You don't think that's likely? Have you ever had a situation where you asked for one feature and another was delivered?
Yeah, right.
Third, what do pass and fail even mean? Maybe try thinking about the kinds of tests we're discussing here less as error detectors and more as change detectors.
When a test "fails" it's saying something like "running this functionality used to produce X but now it produces Y. Is that what you want?" Sometimes you will want Y, because you're changing how a feature works, but if the code still produces X the test will stay green and you'll ship with X rather than the Y you wanted.
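To make that concrete, here's a minimal sketch in Python. The format_price function and its test are made up for illustration, but they show how a test that still describes the old behaviour, X, stays green even though Y was what we wanted to ship:

```python
# A hypothetical function whose behaviour we intended to change from X to Y.
def format_price(pence):
    # Intended new behaviour (Y): include a currency symbol, e.g. "£1.99".
    # Actual behaviour: still the old plain string (X), e.g. "1.99".
    return f"{pence / 100:.2f}"

# The existing unit test still describes the old behaviour, X.
def test_format_price():
    assert format_price(199) == "1.99"  # passes, the build is green, X ships
```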
What we mean by quality is a whole nother BIG question but it will often include intangibles that are hard to test for in a pipeline, such as usability, and in any case the impression of quality of a product will vary across users and contexts.
I really like that you said sufficient quality because that suggests you are thinking about the compromises that exist in a software project. One of the compromises often made is trading time spent on tests for time spent on more features.
If I'm sounding negative here, it's not because I think there's no value in automated regression tests. If you can define metrics
that represent sufficient quality for you, and they can be evaluated
programmatically, then they could certainly be added to a build pipeline.
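For example, and this is just a sketch with made-up numbers rather than a recommendation, a pipeline step could run something like this Python script and fail the build when an agreed response-time budget is exceeded:

```python
# A minimal sketch: fail the build if a response-time metric we have agreed
# represents "sufficient quality" is not met. The data file name and the
# threshold are assumptions for illustration only.
import json
import statistics
import sys

THRESHOLD_MS = 500  # agreed 95th-percentile response time budget

def check_response_times(path="perf_results.json"):
    with open(path) as f:
        timings_ms = json.load(f)  # e.g. [123, 240, 480, ...]
    p95 = statistics.quantiles(timings_ms, n=20)[18]  # 95th percentile
    print(f"95th percentile: {p95:.0f} ms (budget {THRESHOLD_MS} ms)")
    return p95 <= THRESHOLD_MS

if __name__ == "__main__":
    sys.exit(0 if check_response_times() else 1)
```

The point being that the metric and the threshold are choices you make about what sufficient quality means for you; the pipeline only checks them.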
Additionally, exploring the product and crystallising the interesting aspects of that exploration in test suites is a good way to build a description of a product's behavioural envelope, intended or not.
The tests can detect changes to that,
and a human can assess whether they're important and in what way, but that's
not the same as saying that the tests assess all of what makes a product of sufficient quality for our users.
Image: Greg Rosenke on Unsplash