Tuesday, April 6, 2021

Testing Stories


Testing Stories is a new ebook written by testers about their testing:

Software testing professionals from around the globe have volunteered to each share a story related to software testing, with the aim of inspiring others from their experiences.

The team have decided to donate all royalties sold through leanpub to Open Sourcing Mental Illness, as we collectively value their mission statement and work in mental health in the tech and open source communities.

This is a fun project and, more importantly, a good cause. I was happy to contribute along with Dave Symonds, Richard Hill, Sandhya Krishnan, Feroz Ally, Venkat Ramakrishnan, Louise Harney, Haroon Sheikh, Patricio Miner, Bruce Hughes, Felipe Knorr Kuhn, Louise Woodhams, Mike Harris, Aaron Flynn, Ady Stokes, Kim Cote, Peter Johnson, Ben Dowen, Claire Reckless, Laveena Ramchandani, Andrea Jensen, Steven Devonport, Sapan Parikh, Beth Marshall, Prashant Hegde, Johanna South, Suman Bala, Michael "Fritz" Fritzius, Lisa Crispin, John McGee, NITHIN S.S, and Gemma Ashfield.

Sunday, March 28, 2021

Exploratory Tooling

Last week I started a new job. The team I've joined owns a back-end service and, along with all the usual onboarding process, inevitable IT hassles, and necessary context-gathering, one of my goals for my first week was to get a local instance of it running and explore the API.

Which I did.

Getting the service running was mostly about ensuring the right tools and dependencies were available on my machine. Fortunately the team has wiki checklists for that stuff, and my colleagues were extremely helpful when something was missing, out of date, or needed an extra configuration tweak.

Starting to explore the service was boosted by having ReDoc for the endpoints and a Postman collection of example requests against them. I was able to send requests, inspect responses, compare both to the doc, and then make adjustments to see what effects they had.

If that's testing of any kind, it's probably what I call pathetic testing:

There's this mental image I sometimes have: I'm exploring the product by running my fingers over it gently. Just the lightest of touches. Nothing that should cause any stress. In fact, I might hardly be using it for any real work at all. It's not testing yet really; not even sympathetic testing although you might call it pathetic testing because of itself it's unlikely to find issues.

One of the functions offered by the service is a low-latency search endpoint which enables autocompletion on the client side. You know the kind of thing; as the user types, suggestions appear for them to choose from. 

The doc for this is fine at a high level. I was interested to understand the behaviour at a lower level but found Postman (with my level of expertise) required too many actions between requests and made comparison across requests difficult. The feedback loop was too long for me.

So I wrote a tool. And if that sounds impressive, don't be fooled: I was standing on the shoulders of giants.

What does my tool do? It's a shell script that runs curl to search for a term with a set of parameters, pipes the response through jq to parse the hits out of the JSON payload, and then uses sed to remove quoting. Here's the bare bones of it:

curl --location "$URL" | jq '.results | @csv' | sed 's/[\"\\]//g'
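Fleshed out into something callable as search, it might look like the sketch below. The host and query parameter names (p1, p2, q) are my inventions standing in for the real service's API, and the last few lines run the jq/sed stage alone over a canned response so you can see the shape of the output:

```shell
#!/bin/sh
# Sketch of the search tool. The host and the query parameter names
# (p1, p2, q) are invented stand-ins for the real service's API.
search() {
  curl --silent --location "https://example.test/search?p1=$1&p2=$2&q=$3" \
    | jq '.results | @csv' \
    | sed 's/[\"\\]//g'
}

# The jq/sed stage alone, over a canned response, to show the output shape:
echo '{"results":["term","terms","termstore"]}' \
  | jq '.results | @csv' \
  | sed 's/[\"\\]//g'
```

That last pipeline prints term,terms,termstore: one line per request, quoting stripped, easy to eyeball across runs.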
A series of runs might look like this, if I'm exploring consistency of results as I vary the search term starting with "te":
$ search x y te
$ search x y ter
$ search x y term
$ search x y terms
$ search x y termst
$ search x y terms
$ search x y term
Or perhaps like this if I'm exploring the parameters:
$ search x y test
$ search A y test
$ search A B test

There is nothing technically clever going on here, but practically I get a major benefit: I have abstracted away everything that I don't need so that my next test can be composed and run with minimal friction. I can quickly cycle through variations and compare this and previous experiments easily. Feedback loop tightened.

Actually, when I said "I have abstracted away everything that I don't need" what I really meant was "I have a very specific mission here, which is to look at how search terms and parameters affect search results. Because I'm on that mission, I choose not to view all of the other data returned by the server on each of my requests. I may miss something interesting by doing that but I accept the trade-off".

That aside, there are numerous things that I could do with this tool now that I have it. For example:

  • Write a script with a list of search terms in it and call each of them in turn, collect the results and write them to a file that I could analyse in, say, Excel. 
  • Point it at a production server as well, and compare my test environment to production in every call.
  • Launch hundreds of these searches at the same time from another script as a simple-minded stress test.
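The first of those needs little more than a loop. Here's a sketch; search is stubbed out so the shape runs end to end, and the file name is arbitrary:

```shell
#!/bin/sh
# Sketch of the batch idea: run a list of terms through the search
# tool and collect the results in a CSV file for later analysis.
# "search" is stubbed here so the sketch is self-contained; in real
# use it would be the curl | jq | sed script.
search() { echo "hit1,hit2"; }

OUTFILE="results.csv"
: > "$OUTFILE"                       # start with an empty file

for term in te ter term terms; do
  # one row per term: the term, then the hits it returned
  printf '%s,%s\n' "$term" "$(search x y "$term")" >> "$OUTFILE"
done

cat "$OUTFILE"
```

The stress-test idea is a small variation on the same loop: background each call with & or push the terms through xargs -P to launch them concurrently.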

Or I could just throw it away because it has served its purpose: facilitating familiarisation with a feature of our service at low cost and high speed, initiating source code inspection and conversations, and along the way helping me to find a few inconsistencies that I can feed back to the team.
Image: https://flic.kr/p/8pCp4W
Highlighting: https://pinetools.com/syntax-highlighter

Tuesday, March 2, 2021

Unit Testing? Grow Up!

Yesterday, I attended Unit Testing for Grown ups, a webinar with Gil Zilberfeld. Despite the title, it wasn't a tech-heavy tour through testing strategies, test doubles, or testability but instead a series of business arguments in favour of testing, primarily at the unit level. 

If I had to sum it up in a couple of bullets, it'd be something like this:

  • developers, your work does not start at the first line of production code and end when you push
  • managers, your teams can probably work smarter

And if I had a few sentences, it'd go like this:

The goal of a software development group is to solve problems by writing code that "works" and is maintainable. 

It is usually the case that as a codebase expands unintended side-effects occur more frequently and the costs of integration, testing, and bug fixing grow.

How to reduce these costs? Test first, and mostly at the unit level.

But the tests have to be good tests so developers should train their creative skills for thinking about possible problems, and then provide coverage for the right set of them. 

This will be easier if the code is testable. Which means that testability needs to be part of the design of the system. 

Test-Driven Development could be a great way to address all of the points above but it can be hard to get your head around the cost-benefit payoff (as a developer and manager), hard to begin, and then also hard to maintain the discipline.

Bug Advocacy

This month I've been taking the Bug Advocacy course at the Association for Software Testing. It's been ten years since I took the introductory Foundations course, the first in the Black Box Software Testing series, and with this degree of hindsight I can see how fundamental it was to how I like to test.

I've done plenty of learning in the decade since I started testing, so much of the material in Bug Advocacy is not new to me. That doesn't detract from the value of the course. I've taken the opportunity to refresh my memory, and to look at how the other students interpret the same material and how they go about the practical exercises, and compare that to my own approach.

I love that these courses are run with small cohorts, emphasise practice to reinforce theory but also to ask questions of it, and require that students review each other's work as an aid to learning.

Each week there are exercises that have the students interact with each other and software. We then write reports and reviews which themselves are reviewed and reported on. If that sounds convoluted, or even meta, don't be fooled: on a bug reporting course in particular, the concentration on data gathering, organisation, and dissemination is incredibly rewarding.

Tester credibility and influence are key elements in the course, emphasised repeatedly. If we, and the information we provide, and the actions we take, have the respect of those we work with then we are more likely to be able to help decision-makers make the right kinds of decisions, including which bugs to schedule for fix, and when.

  The course materials can be found at testingeducation.org.

Thursday, February 25, 2021

The Spec, But Why?


I'm in the middle of BBST Bug Advocacy at the Association for Software Testing right now. As you might imagine, on a course with that name there's been plenty of talk about what we mean by terms like bug, value, and quality. One of the great things about the four-week course is the mix of book work and application, so we students are repeatedly challenged with situations in which the learning can be practically applied.

I have a lot of time for both Seth Godin and Shane Parrish so I'd have been listening carefully to Seth's appearance on the Knowledge Project podcast anyway but, given the context I'm in, the passage I've transcribed below stood out. It's about how the concept of quality is concretised as conformance to spec, and how that in turn directly drives physical actions. It starts at around 1:04:45:

There's lots to be said about the spec. First let's talk about Edwards Deming and what spec and quality mean. Quality is not luxury, quality is not expensive, quality is not that you love it, quality is just one thing: it meets spec.

So if I look under an electron microscope at any part of a Lexus, which is by any measure the highest quality car there is, it's filled with defects. But they're not defects that matter, because they're defects that are within spec.

So, we begin by understanding what is the spec of the work we're going to do? If it meets spec, not only is it quality but it is good enough, and good enough is not a slur. Good enough is a definition: it met spec.

So once it's good enough we ship the work. If you're not happy with that, change your spec. 
But let's be really clear about what the spec is. That what it meant for a Lexus to be good enough when they first came out was it had to be a standard deviation better than a Mercedes. That was their definition.

If someone's gonna say "no, no, we can't ship this Lexus because it's not perfect!" the product manager should say "no, no, no, it was never supposed to be perfect, it never can be perfect, it simply met spec."

The hard work was in defining the spec, a spec that will get you to the next step.

What I like about this perspective on quality (and it's similar to Crosby's view) is that it emphasises understanding what you're trying to create, to what standard, for who. 

What I particularly like about the way it was described here is those last few words: "get you to the next step". That other stuff is the what, but this is the why.
Image: https://flic.kr/p/5ZziiY

Friday, February 12, 2021

Secret Agency

I've listened to every episode of Gene Kim's Idealcast, a podcast about "the important ideas changing how organizations compete and win." If that terse statement sounds desert dry to you, then think again: the show is a wide open ocean of practical experience and considered theory.

I particularly enjoyed the one with Elisabeth Hendrickson whose playbook has chapters on management, software development, people development and more, along with the testing chops of Explore It! that she's particularly known for in our field.

Gene's presentation style is that of a knowledgeable friend, making space for his guests to lay out their perspective on a topic, and giving each insight a big "yes, and". He injects verbal sidebars into the podcast from time to time, pausing the interview to zoom in on a point that was made in passing and direct the listener to references that will give background, or talking about how a specific example made the concept click for him, or highlighting a connection to thoughts aired in another of the podcasts.

So, when he tweeted a stream of notes quoting Elisabeth, and pointing to her "FABULOUS DOES20 talk ... Influence > Authority and Other Principles of Leadership" I was ready to put down what I was doing and watch it. 

Which I did, and it is fabulous, and if I can boil the message down into a couple of sentences, it would be something like this: regardless of whether you are in a position of hierarchical power in your organisation, your ability to be a change agent depends on your actions; you do have agency even if you don't realise it yet. To help to effect change you should act to demonstrate the value of the change, praise behaviours that move the change along, visualise the motivation for the change, and amplify the voices that have evidence for that motivation.

Tuesday, February 9, 2021

Bog-Standard Testing

Bear with me, there is testing here but first we need to talk about my toilet. 

Yeah, my toilet. Its flush was getting weak and though I was pleased to have a preliminary diagnosis (siphon membrane) I wasn’t happy about the likely treatment. I’ve changed siphons before and it’s always taken me ages, with a lot of pieces to disassemble and reassemble. 

Worse, I have never managed to get all the bits refitted on my first attempt without a drip appearing somewhere. That’s usually how the second attempt goes too. Worse still, on this toilet all of the moving parts are hidden behind panels. How was I supposed to do anything with that?

But I know from past procrastination that I get nothing accomplished without starting, so I began by looking and feeling, wondering whether I'd need to apply some force to flex a sprung catch, undo a screw thread that hasn't moved since Noah was a lad, or pull a stiff doodah out of a tight whatchamacallit.

I’ve learned over the years that it’s worth compromising between the desire for ingress and the risk of breaking something with misapplied effort. When I find myself reaching for a screwdriver to muscle open a joint that no screwdriver fits into, I try to remember to take a step back and decide consciously to go ahead, or not.

The vertical front panels on my toilet seemed likely to give better access than the horizontal button panel so I inspected them for signs of previous movement or fixings, but found nothing. Pushing and pulling didn’t yield anything either, and the screwdriver stayed firmly in the toolbox.

I moved on to the button panel. With a little light pressure I could tell the plate would open up somehow and eventually determined that it popped off if I slid it right and lifted, freeing four plastic catches holding it down and releasing a spring keeping it in place.

Now, with my big talk about siphons and leaks and all that, I may have sounded knowledgeable, but let me be clear here: my mental model of this toilet was very flawed. I was expecting to see something mechanical under the buttons, like this:


Instead it looked pneumatic, with buttons pushing air down the tubes to what I still assumed was the siphon. (Spoiler: I was wrong about that too.)

I had a look inside the hole to see what access would be like (it was tight, duh!) and noted that the front panels were screwed on and I could remove them later if I wanted or needed to (good). The back of the plate had a maker’s name and some other details:

Decision point. Tinker a little, call a plumber, or see what I could find online for this kind of cistern setup? With no urgent requirement for a solution I decided to spend a few minutes gathering information. 

There was an email address in all the guff at the bottom right of the panel, at the domain asseur.com. Chuckling gently to myself about the rear-endedness of the URL for a toilet company, I ran into this:

Perhaps they had changed to a domain with less unintentional comedy potential? I backed off to a general web search and got a couple of hits on cylex-uk.co.uk, a site I’d never heard of but which turns out to be some kind of company directory. 

The first hit gives up a postal address and a link apparently to Ideal Standard, a company that I know manufactures bathroom fittings.

Click! Bingo! 

I wondered whether Ideal Standard might have taken over American Standard but if they did, they’re not advertising it very prominently:

No matter, there’s a link to a spares site at the bottom of the page but, booooo!, a broken redirect:

The second Cylex hit was a dead end except for a phone number. I saved that for later.

My next line of attack was searching YouTube for American Standard toilets. At that point, I was trying to find something that would help me decide how to proceed. Maybe someone had the same model with the same problem and demonstrated their solution in a video?

No such luck, but I did discover that American Standard is still a going concern and can be found at americanstandard-us.com. That “-us” suffix is interesting; perhaps the UK business was dropped at some point, which would be consistent with an Ideal Standard takeover.

There was a spare parts link here too. No error, but it was sooooo sloooooow that I abandoned that investigation. Again, I could always try again if I needed to.

I continued with YouTube, using search terms related to the parts of the toilet and the problem I had such as cistern, siphon (and syphon), flush weak, not flushing, replace, fix, and so on. 

I wasn’t finding much that was directly relevant but I did skip through some of the videos to see whether they featured any parts that looked similar to what was in front of me.

Next I tried another couple of sources I’ve found useful in the past: Amazon and eBay. The beauty of these sites is that sellers often list parts that fit multiple makes and models of an appliance and list all of the makes and models.

This gave something interesting, some more references to Ideal Standard and a part that looks close to the one I could see through the hole under the button panel, second right in the image below.

Time to pause. I wasn’t getting anywhere fast searching for American Standard and it looked like there might be a link between them and Ideal Standard. If I could understand the link, perhaps I could refine my search. So to Wikipedia!

Yes, here’s the result I was after: 

The kitchen and bathroom division was sold off ... the global business [also] became Ideal Standard.

That event is dated 2007. I’ve only been living in this house for 12 months so I don’t know the age or provenance of the toilet but the amount of dust and grime around the cistern suggests it could’ve been in place for years. 

At that point I decided to work on the assumption that Ideal Standard information and maybe also parts were what I needed. I recalled that the reverse of the button plate had some part numbers on it and hit paydirt with the next search!

In fact, I also found that the same results come up with just the part numbers which means I could have got to this point immediately had I started with them. 

Boooooo? Well, no. I didn’t consider the work I’d done up to then wasted. I didn’t spend many minutes on it and I came away with reasonable confidence that the Ideal Standard parts are relevant. 

Hindsight is a tool for learning, not for beating yourself up with.

In any case, I didn’t yet know what I was looking for. I was still trying to understand the problem, what scope I’d got for fixing it right then, what parts I needed and might be able to replace myself, and whether I'd need to call someone in.

I decided to search for all three of the part numbers on the back of the button plate and then skimmed the descriptions and photos across multiple sites. I’ve found over the years that the product listings, and especially the purchaser reviews, at different vendor sites are often complementary and you can build up a bigger picture by combining them. 

The results below are for SVO4567 and I scanned them for photos from different angles, how-to videos, dimensions, or maybe even manufacturer instruction sheets.

I struck gold in an Amazon product description where, unexpectedly, someone had dropped diagnostic clues:

If your Ideal Standard Pneumatic flush valve isn't flushing properly ... it can be this bellows washer ... You can test [this]. Simply disconnect the Pneumatic tube that it attached to your push button and blow down the tube. If the valve doesn't lift properly as you're blowing down it, it will more than likely be this bellows washer. If the valve lifts with ease, it could very well be an issue with the button instead. (sic)

Visual inspection and comparison of the photos and description to the pieces of toilet in front of me suggested that I was in the right area. Bonus, I learned some terminology; the thing I had been calling a siphon is actually a valve:

Second bonus: the discovery that the term of art for hidden cisterns is concealed. Using that word I could get straight to a video on plate removal that would have been handy when I started. Again, them’s the breaks. 

Having the right lexicon also helps with credibility when talking to others. I’ve found that I get more information and time from an expert when I can show I’ve looked into their area rather than just called them up to say “boo hoo thingy broken costy fixy?”

Time to take stock again. At this point I wasn’t afraid to tinker because I knew that even if I broke the pieces in front of me, parts were available and the toilet could still be flushed with a bucket of water if needed. 

So I dismantled all the accessible pieces, pulling the tubes off the top of the valves, pulling the valve caps off, removing the rubber bellows (the black corrugated discs you can see in the SVO4567 photos above) beneath them, and so on. I found that there’s a cylinder inside each valve and pushing them down provokes a powerful flush so I was confident that the problem was between the button and the valve top somewhere. Better yet, I wouldn’t need a bucket to flush the loo if I did break something.

The two buttons, and correspondingly the valve cylinders, are for full flush and half flush. I couldn’t see a significant difference between the flushes but as I’d read at least one product reviewer saying that they had the same experience I didn’t investigate further.

In passing, I wondered how to compare flushes and found that test poo is a thing (!)

There’s a diagram on the back of the button plate that shows which valve is for full flushes. When I took all the pieces off, I’d noticed that the big (full flush) button was connected to the half-flush cylinder. Did I mention that I took photos before removing any pieces so I know how to put them back?

My next move was to try connecting the buttons up in the recommended configuration on the hypothesis that someone had previously noticed a weak flush and simply swapped the tubes over. Nice idea, I had been wondering if I might be able to do that myself as a quick and dirty fix.

It did appear that the power of the flushes was worse that way, so it’s possible that’s what had happened. There were few enough combinations of button, bellows, and cylinder that I could try them all, so I did, but none showed a significant improvement. 

I also tried blowing down the pneumatic tubes as the diagnostic text at Amazon had suggested. I learned something there too: it takes a lot of air pressure to force the valve cylinder down and the only thing getting flushed was me.

Where had I got to at this point? Well, I was reasonably confident that I knew a set of components that contained the problem and I knew that I could order a service pack (SVO4567) for that set. The existence of a specific pack made me think that it was worth replacing all of them together, and that’s what I decided to do.

I wasn’t quite finished with the research though. I wanted to make sure I bought genuine parts from a vendor with a good reputation and paid the going rate with delivery in a reasonable time. I had a list of sites (yeah, I bookmarked them as I went) so that just took a few minutes and around 30 quid. 

Within a couple of days the parts were at my house and I was confident that I could fit them and that I’d know quickly whether they’d improved the situation. While I was fiddling I again tried fitting the big button to the full-flush cylinder but my impression was that it gave a marginally weaker flush so perhaps the valve is on the way out too. I’m not bothered though, the flush is strong enough that it’s not a problem so I left that alone for now.


Great, you’re probably thinking, nice story and I’m really pleased your toilet is working again, but how is this relevant to me? Where’s the testing here?

Good questions and I’d answer that it’s about the approach, the skills, the tool selection, the framing, the intent, the engagement, the decision-making, the documentation. What I did in the bathroom overlaps considerably with how I approach testing.

Imagine I’d described this scenario:

I was presented with a system I hadn’t seen before and told that it had a problem. I have tested similar products so the observed symptoms gave me a working hypothesis but, without knowledge of the specific system, I had difficulty even getting into a position to frame an experiment.

I moved into discovery mode and explored the external surfaces of the system under test, eventually discovering a way to expose information about its back end. It turned out to be an old technology and I had to do research, which included wandering down some dead ends, before finding that it had an API in common with a modern library.

Unfortunately, the API wasn’t well documented but by carefully stitching together information from a bunch of blogs, open source projects, and YouTube videos while exercising the API using a client I know well, I was able to gather enough information to understand some of what I was looking at.

I could now tell that my original hypothesis about the cause of the issue was incorrect but I also had the tools and knowledge to run some experiments which resulted in the discovery that the system under test’s front-end HTTP module was outdated and so partially incompatible with the back-end. 

Upgrading it was easy enough and now the SUT’s client and server are talking happily again. I also have a reasonable mental model of part of the system and I’ve learned about some technologies I was unfamiliar with.

That sounds more like testing, right? But it’s exactly the structure of the investigation I did for my toilet. It also explains why you hear people make analogies between testers and journalists, detectives, and scientists — those roles, done well, also require the ability to explore a space of possibilities, seek the unknown unknowns, and make connections between disparate data.

Jon Bach drew a wonderful representation of his exploratory testing that speaks to me about this:

The Process of Design Squiggle by Damien Newman has similar characteristics. Designers explore too.

I’ve tried to represent the non-linearity and iteration and reiteration in data gathering and hypothesis generation and experimentation that I see and feel in testing multiple times. Here’s one attempt:

It’s exploration, and that’s how I like to test: by exploring.


The thing about exploration is that it’s not easily predictable, not deterministic, and heavily dependent on who’s exploring, what they’re trying to do, the order they do it, what they encounter along the way, and how they react to all of that.

In any non-trivial exploration there will be choices to make. With each choice there is an opportunity cost: by choosing to do some particular thing (right now) you choose not to do many possible something elses. (“Decision point. Tinker a little, call a plumber, or see what I could find online for this kind of cistern setup?”)

I try to be alert to choice points because intent is a crucial component of testing for me. At each stage I’ll understand what my current mission is, and then use my choices to deliberately guide my exploration towards the completion of that mission. (“At that point, I was trying to find something that would help me decide how to proceed.”)

This kind of guided or managed exploration is a wonderful way to bootstrap knowledge, trading breadth and depth of investigation, avoiding rabbit holes and shallow time-wasting to navigate towards value in the time available. 

One way to create explicit choice points is time-boxing. I think of this as a way of managing my investment. How much time am I prepared to invest in this next step? How important is it to me to find out the thing I am looking for here? If that’s difficult, can I find a smaller step that will help me to answer it and invest a small amount of time in that? (“I decided to spend a few minutes gathering information.”)

When I explore, even with intent, I often find that in retrospect I could have taken a more direct route to some result. That’s normal. I treat it as learning and try to use it to inform my testing choices in the future. (“I could have got to this point immediately.”)

Learning won’t just be for the future, though. Each step, each iteration, each connection made can incrementally add or adjust knowledge (“my mental model of this toilet was very flawed”). Again, this is normal and in fact desirable. There is no shame in finding that a hypothesis was incorrect and updating perceptions as a result. 

Keeping track of what I’ve learned, the routes I’ve tried, and potential new starting points helps to guide the exploration and enhances my ability to cross-reference. (“I took photos before removing any pieces”, “I bookmarked them as I went.”) Different degrees of record-keeping will be appropriate depending on the mission and the explorer.

Using the system under test and other sources as mutual oracles can be immensely valuable. Cross-referencing observations of the SUT with information found elsewhere gives three options: a problem in the system, some duff information, or both. 

Exploration in and of itself says nothing about the tools used to explore with. Knowing tools that I use frequently well, and knowing about the existence of tools for tasks I’ve never performed, helps me to have options. I was telling my friend Neil about the work I’d done and he asked whether I’d used a reverse image search on the button panel. I hadn’t and it hadn’t even occurred to me to try. The tool is now in my kit for next time. 

Analogies are never perfect, though. What might make my toilet story and my testing mission different? Well, there’s no developer character in my story. If I’d called a plumber in, someone with deep hands-on experience, that would have been similar to asking the developer of the system I’m working on. In this case, an experienced plumber would likely have diagnosed and fixed my toilet in about, umm, 30 seconds instead of a couple of hours.

If the toilet had been leaking gallons of water into the bathroom I would definitely have called a plumber. But the stakes weren’t high, and I explored in such a way that the stakes remained low (“the toilet could still be flushed with a bucket of water if needed”).

When I have had tradespeople in, I’ve sometimes asked if it would be OK to watch them work, or asked them to explain how they diagnosed the problem once they’ve finished. I’ve seen developers be amazing at exploration too — I’m not saying it’s a tester superpower — and I’ve learnt much sitting at the shoulder of a colleague while we talk about what the problem is, where the faults might lie, how we might experiment to discover more, how we might work around what we find, and so on.

I’m interested in, and get great pleasure from, the investigation, and in broadening my skills and knowledge. As in life, in testing; my default is generally to look into a product issue I’ve observed before reporting it, for an appropriate amount of time given the context.

If your testing extends from picking up a ticket, through exercising precisely the acceptance criteria as described, to closing the ticket, then I’m afraid you’re not seeing the intellectual, social, and emotional pleasures of exploration; you’re probably treading and retreading the same paths and, at best, you’ll be doing bog-standard testing.


Thank you to Conor Fitzgerald and Neil Younger for the reviews and suggestions.