Compare Testing


If you believe that testing is inherently about information then you might enjoy Edward Tufte's take on that term:
"Information consists of differences that make a difference."
We identify differences by comparison, something that as a working tester you'll be familiar with. I bet you ask a classic testing question of someone, including yourself, on a regular basis:
  • Our competitor's software is fast. Fast ... compared to what?
  • We must export to a good range of image formats. Good ... compared to what?
  • The layout must be clean. Clean ... compared to what?
But while comparison is an important tool for getting clarification in conversation, to me testing feels more fundamentally about comparisons.

James Bach has said "all tests must include an oracle of some kind or else you would call it just a tour rather than a test." An oracle is a tool that can help to determine whether something is a problem. And how is the value extracted from an oracle? By comparison with observation!

But we've learned to be wary of treating an oracle as an all-knowing arbiter of rightness. Having something to compare with should not lure you into this appealing trap:
I see X, the oracle says Y. Ha ha! Expect a bug report, developer!
Comparison is a two-way street and driving in the other direction can take you to interesting places:
I see X, the oracle says Y. Ho hum. I wonder whether this is a reasonable oracle for this situation?
Cem Kaner has written sceptically about the idea that the engine of testing is comparison to an oracle:
"As far as I know, there is no empirical research to support the claim that testers in fact always rely on comparisons to expectations ... That assertion does not match my subjective impression of what happens in my head when I test. It seems to me that misbehaviors often strike me as obvious without any reference to an alternative expectation. One could counter this by saying that the comparison is implicit (unconscious) and maybe it is. But there is no empirical evidence of this, and until there is, I get to group the assertion with Santa Claus and the Tooth Fairy. Interesting, useful, but not necessarily true."
While I don't have any research to point to either, and Kaner's position is a reasonable one, my intuition here doesn't match his. (Though I do enjoy how Kaner tests the claim that testing is about comparisons by comparing it to his own experience.) Where we're perhaps closer is in the perspective that not all comparisons in testing are between the system under test and an oracle with a view to determining whether the system behaviour is acceptable.

Comparing oracles to each other might be one example. And why might we do that? As Elaine Weyuker suggests in On Testing Non-testable Programs, partial oracles (oracles that are known to be incomplete or unreliable in some way) are common. To compare oracles we might gather data from each of them; inspect it; look for ways in which each has utility (such as which has more predictive power in scenarios of interest).

And there we are again! The "more" in "which has more predictive power" is relative: it's telling us that we are comparing and, in fact, here we're using comparisons to make a decision about which comparisons might be useful in our testing. I find that testing is frequently non-linear like that.
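To make the idea of comparing oracles a little more concrete, here's a minimal sketch in Python. Everything in it is an assumption for illustration: the two oracles, the response-time thresholds, and the handful of labelled cases are hypothetical, and "agreement with judgements we already trust" is only one possible stand-in for predictive power.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A partial oracle gives a verdict (True = acceptable, False = problem)
# or None when it has nothing to say about an observation.
Oracle = Callable[[float], Optional[bool]]


def compare_oracles(
    oracles: Dict[str, Oracle],
    cases: List[Tuple[float, bool]],
) -> Dict[str, Dict[str, float]]:
    """Score each oracle against cases we already have a judgement for.

    coverage:  fraction of cases on which the oracle gave any verdict
    agreement: fraction of its verdicts that matched our judgement
    """
    report = {}
    for name, oracle in oracles.items():
        verdicts = [(oracle(observed), judged) for observed, judged in cases]
        answered = [(v, j) for v, j in verdicts if v is not None]
        matched = sum(1 for v, j in answered if v == j)
        report[name] = {
            "coverage": len(answered) / len(cases),
            "agreement": matched / len(answered) if answered else 0.0,
        }
    return report


# Hypothetical oracles for a response time in milliseconds.
def spec_oracle(ms: float) -> Optional[bool]:
    if ms > 1000:          # assumed: the spec is silent on very slow responses
        return None
    return ms < 200        # assumed threshold


def competitor_oracle(ms: float) -> Optional[bool]:
    return ms < 350        # assumed: "as fast as the competitor"


# Hypothetical scenarios of interest: (observed ms, was it acceptable to users?)
cases = [(120, True), (250, True), (400, False), (1500, False)]

report = compare_oracles(
    {"spec": spec_oracle, "competitor": competitor_oracle}, cases
)
for name, scores in report.items():
    print(name, scores)
```

The numbers that come out are themselves only more data to compare and think about: an oracle with lower agreement here might still be the more useful one in a different set of scenarios.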

Another way in which comparison is at the very heart of testing is during exploration. Making changes (e.g. to product, data, environment, ...) and seeing what happens as a result is a comparison task. Comparing two states separated by a (believed) known set of actions, irrespective of whether you have an idea about what to expect, is one way of building up knowledge and intuition about the system under test, and of helping to decide what to try next, what to save for later, and what looks uninteresting (for now).

Again this throws up meta tasks: how to know which aspects of a system's state to compare? How to know which variables it is even possible to compare? How to access the state of those at the right frequency and granularity to make them usable? And again there's a potential cycle: gather data on what it might be possible to compare; inspect those possibilities; find ways in which they might have utility.
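As a small illustration of comparing two states, here's a sketch in Python that diffs a "before" and "after" snapshot. It assumes, hypothetically, that the interesting parts of the system's state can be captured as a flat dictionary; the variable names and values are invented, and choosing what goes into the snapshot is exactly the meta task described above.

```python
ABSENT = "<absent>"  # marker for a variable missing from one snapshot


def diff_states(before: dict, after: dict) -> dict:
    """Return {variable: (old, new)} for every variable that differs."""
    changes = {}
    for key in sorted(before.keys() | after.keys()):
        old = before.get(key, ABSENT)
        new = after.get(key, ABSENT)
        if old != new:
            changes[key] = (old, new)
    return changes


# Hypothetical snapshots taken either side of a single action.
before = {"rows": 10, "status": "idle", "cache_size": 3}
after = {"rows": 11, "status": "idle", "last_error": None}

changes = diff_states(before, after)
for name, (old, new) in changes.items():
    print(f"{name}: {old!r} -> {new!r}")
```

The diff doesn't say whether any change is a problem. It only surfaces differences; deciding which of them make a difference is still down to the tester.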

I started here with a Tufte quote about information being differences that make a difference, and said that identifying the differences is an act of comparison. I didn't say so at that point, but identifying the ones that make a difference is also a comparison task. And the same skills and tools can be used for both: testing skills and tools.
Image: https://flic.kr/p/q8zmqn

Comments

Anonymous said…
In a previous role I got used to assessing bugs for severity and impact. So I would often produce bug reports where the bug was reported as being severe but not very impactful (and sometimes I would even warn the dev about it and say something like "You will be marking that as Will Not Fix, I assume. I'm cool with that but don't tell anyone I said so.")

Or I might mark a bug as not very severe at all but highly impactful, such as a typo that was actually a mis-spelling of the CEO's name (in which case, my private message to the dev would be "Fix that NOW or we will all be sacked!")

Traditionally, the thing about the ancient Greek oracles was that their utterances were highly gnomic and subject to interpretation. And that interpretation was down to the interpreter's knowledge of what things were like in the Real World at that moment. On that basis, then, testing oracles are no different. I was comparing the technical nature of a bug and the ways it manifested itself with the likely impact in the Real World. And your likelihood of getting sacked because of an error in the code isn't something that can be easily quantified and definitely can't be automated! :-)
