
Compare Testing


If you believe that testing is inherently about information then you might enjoy Edward Tufte's take on that term:
Information consists of differences that make a difference.
We identify differences by comparison, something that as a working tester you'll be familiar with. I bet you ask a classic testing question of someone, including yourself, on a regular basis:
  • Our competitor's software is fast. Fast ... compared to what?
  • We must export to a good range of image formats. Good ... compared to what?
  • The layout must be clean. Clean ... compared to what?
Comparison is an important tool for getting clarification in conversation but, for me, testing is about comparisons at a more fundamental level.

James Bach has said "all tests must include an oracle of some kind or else you would call it just a tour rather than a test." An oracle is a tool that can help to determine whether something is a problem. And how is the value extracted from an oracle? By comparison with observation!

But we've learned to be wary of treating an oracle as an all-knowing arbiter of rightness. Having something to compare with should not lure you into this appealing trap:
I see X, the oracle says Y. Ha ha! Expect a bug report, developer!
Comparison is a two-way street and driving in the other direction can take you to interesting places:
I see X, the oracle says Y. Ho hum. I wonder whether this is a reasonable oracle for this situation?
Cem Kaner has written sceptically about the idea that the engine of testing is comparison to an oracle:
As far as I know, there is no empirical research to support the claim that testers in fact always rely on comparisons to expectations ... That assertion does not match my subjective impression of what happens in my head when I test. It seems to me that misbehaviors often strike me as obvious without any reference to an alternative expectation. One could counter this by saying that the comparison is implicit (unconscious) and maybe it is. But there is no empirical evidence of this, and until there is, I get to group the assertion with Santa Claus and the Tooth Fairy. Interesting, useful, but not necessarily true.
While I don't have any research to point to either, and Kaner's position is a reasonable one, my intuition here doesn't match his. (Though I do enjoy how Kaner tests the claim that testing is about comparisons by comparing it to his own experience.) Where we're perhaps closer is in the perspective that not all comparisons in testing are between the system under test and an oracle with a view to determining whether the system behaviour is acceptable.

Comparing oracles to each other might be one example. And why might we do that? As Elaine Weyuker suggests in On Testing Non-testable Programs, partial oracles (oracles that are known to be incomplete or unreliable in some way) are common. To compare oracles we might gather data from each of them; inspect it; look for ways in which each has utility (such as which has more predictive power in scenarios of interest).

And there we are again! The "more" in "which has more predictive power" is relative: it's telling us that we are comparing. In fact, here we're using comparisons to make a decision about which comparisons might be useful in our testing. I find that testing is frequently non-linear like that.
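To make the idea of comparing oracles concrete, here is a minimal sketch in Python. Everything in it is invented for illustration: the two oracles, the response-time observations, and the human judgements of whether each observation was really a problem. A partial oracle is modelled as a function that may return `None` when it has no opinion, and each oracle is scored on how often it answers (coverage) and how often its answers match the judgement (agreement).

```python
from typing import Callable, Optional

# Hypothetical types: an observation is a plain dict, and an oracle returns
# True (problem), False (no problem), or None (no opinion: a partial oracle).
Observation = dict
Oracle = Callable[[Observation], Optional[bool]]


def score_oracle(oracle: Oracle, cases: list) -> tuple:
    """Return (coverage, agreement) for an oracle over human-judged cases."""
    verdicts = [(oracle(obs), is_problem) for obs, is_problem in cases]
    answered = [(v, truth) for v, truth in verdicts if v is not None]
    coverage = len(answered) / len(cases) if cases else 0.0
    agreement = (
        sum(v == truth for v, truth in answered) / len(answered)
        if answered
        else 0.0
    )
    return coverage, agreement


def spec_oracle(obs: Observation) -> Optional[bool]:
    # Made-up rule: a spec that only talks about the "search" endpoint.
    if obs["endpoint"] != "search":
        return None
    return obs["ms"] > 500


def previous_release_oracle(obs: Observation) -> Optional[bool]:
    # Made-up rule: flag anything more than 20% slower than the last release.
    return obs["ms"] > 1.2 * obs["previous_ms"]


# Invented data: (observation, was it judged to be a problem?)
cases = [
    ({"endpoint": "search", "ms": 600, "previous_ms": 580}, True),
    ({"endpoint": "search", "ms": 300, "previous_ms": 200}, False),
    ({"endpoint": "export", "ms": 900, "previous_ms": 400}, True),
    ({"endpoint": "export", "ms": 410, "previous_ms": 400}, False),
]

for name, oracle in [("spec", spec_oracle), ("previous", previous_release_oracle)]:
    coverage, agreement = score_oracle(oracle, cases)
    print(f"{name}: coverage={coverage:.2f} agreement={agreement:.2f}")
```

With this toy data neither oracle is simply "better": one answers less often but agrees with the judgements when it does, the other always answers but is wrong half the time. Which matters more depends on the scenarios of interest, which is itself another comparison.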

Another way in which comparison is at the very heart of testing is during exploration. Making changes (e.g. to product, data, environment, ...) and seeing what happens as a result is a comparison task. Comparing two states separated by a (believed) known set of actions, irrespective of whether you have an idea about what to expect, is one way of building up knowledge and intuition about the system under test, and of helping to decide what to try next, what to save for later, and what looks uninteresting (for now).
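As a rough illustration of comparing two states without any prior expectation, here is a small Python sketch. It assumes, purely for the example, that a system's state can be captured as a flat dictionary of named values; the keys and values are made up, and real state is rarely this tidy.

```python
# Sentinel used to distinguish "key not present" from "key present with None".
MISSING = object()


def diff_states(before: dict, after: dict) -> dict:
    """Return {key: (old, new)} for every key whose value differs."""
    changes = {}
    for key in sorted(before.keys() | after.keys()):
        old = before.get(key, MISSING)
        new = after.get(key, MISSING)
        if old != new:
            changes[key] = (old, new)
    return changes


# Hypothetical snapshots taken either side of a single action.
before = {"users": 3, "cache_size": 10, "status": "idle"}
after = {"users": 4, "cache_size": 10, "status": "idle", "last_error": None}

changes = diff_states(before, after)
for key, (old, new) in changes.items():
    old_text = "<absent>" if old is MISSING else repr(old)
    new_text = "<absent>" if new is MISSING else repr(new)
    print(f"{key}: {old_text} -> {new_text}")
```

The diff only reports that something changed. Deciding whether the new `last_error` key is interesting, expected, or worth chasing is still down to the tester.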

Again this throws up meta tasks: how to know which aspects of a system's state to compare? How to know which variables it is even possible to compare? How to access the state of those at the right frequency and granularity to make them usable? And again there's a potential cycle: gather data on what it might be possible to compare; inspect those possibilities; find ways in which they might have utility.

I started here with a Tufte quote about information being differences that make a difference, and said that identifying the differences is an act of comparison. I didn't say so at that point, but identifying the ones that make a difference is also a comparison task. And the same skills and tools can be used for both: testing skills and tools.
Image: https://flic.kr/p/q8zmqn

Comments

  1. In a previous role I got used to assessing bugs for severity and impact. So I would often produce bug reports where the bug was reported as being severe but not very impactful (and sometimes I would even warn the dev about it and say something like "You will be marking that as Will Not Fix, I assume. I'm cool with that but don't tell anyone I said so.")

    Or I might mark a bug as not very severe at all but highly impactful, such as a typo that was actually a mis-spelling of the CEO's name (in which case, my private message to the dev would be "Fix that NOW or we will all be sacked!")

    Traditionally, the thing about the ancient Greek oracles was that their utterances were highly gnomic and subject to interpretation. And that interpretation was down to the interpreter's knowledge of what things were like in the Real World at that moment. On that basis, then, testing oracles are no different. I was comparing the technical nature of a bug and the ways it manifested itself with the likely impact in the Real World. And your likelihood of getting sacked because of an error in the code isn't something that can be easily quantified and definitely can't be automated! :-)
