Thursday, March 15, 2018

Testing For Me

I spoke at UKSTAR 2018 this week, an eight-minute talk in a Storytelling track. This post is a prettied-up version of the notes I made for it along with some of the slides. The full slide deck is on the Media page.

My story is called The Anatomy of a Definition of Testing. It's not a suspense story though, so I'll give you the definition right up front:
Testing is the pursuit of relevant incongruity.
That is, for me, testing is the pursuit of relevant incongruity. But how did I get there?

Well, the journey started with Explore It! by Elisabeth Hendrickson, a great book about exploratory testing which has, near the beginning, this definition:
Tested = Checked + Explored
It's finessed a little by
Neither checking nor exploring is sufficient on its own
and the idea that testing is to
... interact with the software or system, observe its actual behavior, and compare that to your expectations.
Interestingly, the definition doesn't really play a significant part in the rest of the book, but as I was reading I kept on returning to it, asking myself questions like these:

Eventually I thought I'd ask Elisabeth whether she could help me out (she said she would!) and we exchanged a few emails, which helped me to clarify my thoughts. But during the conversation I began to wonder: was this process, of thinking about testing, itself testing? I mean, I was doing things that were consistent with things I do when testing:

But was I really testing? By Elisabeth's definition, I wasn't sure that I could say I was. But it felt a lot like testing. So I went looking for other definitions and found loads!

And I recognise aspects of all of those in my testing, but none of them capture all that testing is for me. Reflecting further, I remembered a talk that Rikard Edgren gave at EuroSTAR 2015 where he said this beautiful thing:
Testing is simple: you understand what is important and then you test it.
Adam Knight has talked and written about Fractal Exploratory testing, and describes it like this:
as each flaw ... is discovered ... [a] mini exploration will result in a more targeted testing exploration around this feature area
To me, they're both talking about how investigation of one area leads to investigation in a subset of that area, and in a subset of that area. A kind of traversal through the testing space where the actions that are performed at any level are similar. And I recognise that too, but it's not all that testing is. For me.

I tried to draw my testing on a feature. It looked like this.

Sometimes multiple activities feed into another. Sometimes one activity feeds into multiple others. Activities can run in parallel, overlap, be serial. A single activity can have multiple intended or accidental outcomes ...

I tried to draw it again. It looked like this:

A vision of an attempt to somehow keep a lid on the ambiguity, and the unknowns, and the complexity in order to be able to get on and test.

A colleague pointed me at a drawing by Jon Bach of his testing.

That scribble on the left is not necessarily confusion and chaos, but cycling, cross-checking, confirming, until a course of action that seems propitious can be identified and followed out to the right with occasions when exploration goes deep down. And, naturally, I recognise that in my testing too. But it isn't all that testing is for me.

So, without a definition but with a lot of thoughts about what I wanted from my definition I tried to list factors of testing and then come up with a definition that covers them all.

And I thought about it for a good long time. I mean, really long. And eventually out popped my definition:
Testing is the pursuit of relevant incongruity.
But that's a bit of a mouthful. Let's unpack it a little:

Incongruity: Oxford Dictionaries define this as "not in harmony or keeping with the surroundings". I interpret lack of harmony as a potential problem and lack of keeping as an actual problem, and those two interpretations are interesting and useful in testing.

Pursuit: Again, there are two senses that capture important aspects of testing for me. You can pursue something that you don't know is there and that you may never find, like a dream. Or you can pursue the solution to a problem that you know you have, that's right in front of you.

Why not some other verb? For me, an investigation presupposes something identified to investigate; finding requires something to be found, and I want to be able to say I tested even if I found no issues; exploration could work, but I don't want my definition to be thought of as a definition of exploratory testing, much as I love it.

Relevant: if this work doesn't matter to anyone, why are we doing it? Whoever it does matter to can help us to understand whether any incongruities we identify are valuable to them, relevant to the project.

So that's my definition:
Testing is the pursuit of relevant incongruity.
Notice that it says nothing about particular techniques or methods, or products, or systems. It exists, deliberately, in a space where it must be applied in context, at a time, for a purpose.

But what do I do with it?

Well, any time I'm working I can ask myself whether what I'm doing is contributing to the pursuit of relevant incongruity. If it is, I'm testing. I still have the question of whether I'm testing the right things, at the right time, but that's a different problem for another story.

If I'm not in pursuit of relevant incongruity I can ask whether I should be, and why. Sometimes it's legit to take off your tester hat and do something else, project housekeeping for example, because it needs doing, because you've got the skills, because you can be most efficient, because it's your turn, or whatever. But sometimes it can provoke me into thinking that I'm not doing what I should be.

Which is great, and I buzzed along very nicely using it. And then I heard Michael Bolton say this on the Quality Remarks podcast:
The goal of testing is identifying problems that matter
And I thought "Hello! That feels pretty familiar!" Although "problems" loses the subtlety of "incongruity", and "finding" I already said I have some reservations about, and note that he's talking about the "goal" of testing, not testing itself. But still, there's a similar sentiment here, and look how much snappier it is than mine!

So I asked him about it, and he said "many things stay in the mind more easily when they can be expressed concisely and snappily."

Which is true to my experience, and also very useful, because it emphasises his different need. He's a teacher, and he wants to share his description and have his pupils remember it. And that's OK, it keeps me humble: I shouldn't impose my view of testing on others, and other views can be just as valid.

And so that's my definition:
Testing is the pursuit of relevant incongruity.
It's my definition to help me do what I need to do in my context. It's for me. But I hope it was of some interest to you...
Image: Jit Gosai (via Twitter)

Wednesday, March 14, 2018

Pen and Storyteller

The latest in my occasional series of experiments in sketchnoting, this time from UKSTAR 2018. The sketchnote/video thing I did as a promo for my own storytelling talk is still available here.

Friday, March 9, 2018

Decision By Precision?

I2E, the flagship product at Linguamatics, is a text mining engine and so sits in the broad space of search tools such as grep, Ctrl-F, and even Google. In that world, evaluating "how good" or "how relevant", or the "correctness" of a set of search results is interesting for a number of reasons, including:

  • it may be hard to define what those terms mean, in general cases.
  • it may be possible to calculate some kind of metric on well-understood, small, data sets but less so at scale. 
  • it may be possible to calculate some kind of metric for simple searches, but less so for complex ones.
  • on different occasions the person searching may have different intent and needs from the same search.

But today we'll concentrate on two standard metrics with agreed definitions: precision (roughly, "how useful the search results are") and recall (roughly, "how complete the results are").

Imagine we want to test our search engine. We have a set of documents and we will search them for the single word "testing". The image below, from Wikipedia, shows how we could calculate the metrics.

There's a lot of information in there, so let's unpack some of it:
  • The square represents the documents. 
  • The solid grey circles are occurrences of the word "testing".
  • The clear grey circles are occurrences of other words.
  • The central black circle is the set of results from the search.
  • The term positive means that a word is in the results.
  • The term negative means that a word is not in the results.
  • The term true means that a word is classified correctly.
  • The term false means that a word is classified incorrectly.
Let's overlay some numbers to make it clear. We inspect the documents (outside of the SUT) and find that there are 1000 words, of which 100 are occurrences of the word "testing" (these are the solid grey circles).

We run our search using the SUT and get back 50 results (the central black circle). We inspect those results and find that 35 are the word "testing" (the true positives) and 15 are something else (the false positives: returned as if correct, but in fact incorrect).

The pictographs at the bottom of the image give us the formulae we need: precision comes only from the set of results we can see, and in this case is 35/50 or 70%. Recall requires knowledge of the whole set of documents, and for us is 35/100 or 35%.
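To make the arithmetic concrete, here's a rough sketch in Python (illustrative only, nothing to do with I2E itself) that computes both metrics from the counts in the example:

    def precision(true_positives, false_positives):
        # Of everything the search returned, how much was correct?
        # precision = TP / (TP + FP)
        return true_positives / (true_positives + false_positives)

    def recall(true_positives, false_negatives):
        # Of everything that should have been returned, how much was?
        # recall = TP / (TP + FN)
        return true_positives / (true_positives + false_negatives)

    # The worked example: 1000 words, 100 of them "testing",
    # 50 results returned, 35 of them correct.
    tp = 35           # occurrences of "testing" that were returned
    fp = 50 - tp      # other words that were returned (15)
    fn = 100 - tp     # occurrences of "testing" that were missed (65)

    print(precision(tp, fp))  # 0.7  -> 70%
    print(recall(tp, fn))     # 0.35 -> 35%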

A striking difference, but which is better? Can one be better? These things are metrics, so can they be gamed?

Well, if the search simply returned every word in the documents its recall would be 100/100, or 100%, but precision would be very low at 100/1000, or 10%, because precision takes the negative content in the search results into account.

So can you get 100% precision? You certainly can: have the search return only those results with an extremely high confidence of being correct. Imagine only one result is returned, and it's a good one, then precision is 1/1 or 100%. Sadly, recall in this case is 1/100 or 1%.
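Plugging those two extremes into the same kind of arithmetic (again, purely illustrative Python) makes the trade-off plain:

    # Game 1: return every word in the documents.
    tp, fp, fn = 100, 900, 0
    print(tp / (tp + fp))  # precision: 0.1  -> 10%
    print(tp / (tp + fn))  # recall:    1.0  -> 100%

    # Game 2: return a single, very high-confidence result.
    tp, fp, fn = 1, 0, 99
    print(tp / (tp + fp))  # precision: 1.0  -> 100%
    print(tp / (tp + fn))  # recall:    0.01 -> 1%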

Which is very interesting, really, but what does it have to do with testing?

Good question; it's background for a rough and ready analogy that squirted out of a conversation at work this week, illustrating the appealing trap of simple confirmatory testing. Imagine that you run your system under test with nominated input, inspect what comes out, and check it against some acceptance criteria. Everything in the output meets the criteria. Brilliant! Job done? Or just 100% precision, with recall unknown?


Saturday, March 3, 2018

Better Averse?

What is testing? In one of the sessions at the Cambridge Software Testing Clinic the other night, the whole group collaborated on a mind map on that very question.

I find it interesting how the mind can latch on to one small aspect of a thing. I attend these kinds of events to find something new, perhaps a new idea, a different perspective on something I already know, or something about the way I act. In this case, under a node labelled mindset, one of the participants proposed risk-averse. I challenged that, and counter-proposed risk-aware. You can see them both on the map on this page, centre-left, near the bottom. And that's the thing I've been coming back to since: accepting that there is a testing mindset (with all the sociological and semantic challenges that might bring), is it reasonable to say that it includes risk aversion or risk awareness?

Let's start here: why did I challenge? I challenged because the interpretation that I took in the moment was that of testers saying "no, let's not ship because we know there are problems." And that's not something that I want to hear testers who work for me saying. At least not generally, because I can think of contexts in which that kind of phrase is entirely appropriate. But I won't follow that train of thought just now.

But do I even have a decent idea what risk-aversion is? I think so: according to Wikipedia's opening sentence on the topic risk-aversion "is a preference for a sure outcome over a gamble with higher or equal expected value." In my scenario, not shipping is a more certain outcome (in terms of risk that those known problems will be found in the product by customers) than shipping. But I can think of other aspects of shipping that might be more certain than in the case of not shipping. Perhaps I'll come back to them.

But then, do I have a decent idea what the person who suggested risk-aversion meant by it? In all honesty, no. I have some evidence, because they were keen on risk-awareness when I suggested it, that they overlap with me here, but I can't be sure. Even if they were with me right now, and we could spend hours talking about it, could we be certain that we view this particular conceptual space in the same way? A good question for another time.

So what did I mean by risk-aware? I meant that I want testers who work for me to be alert to the risks that might be present in any piece of work they are doing. In fact, I want them to be actively looking for those risks. And I want them to be able to characterise the risks in various ways.

For example, if they have found an issue in the product I'd like them to be able to talk to stakeholders about it: what the associated risks are, why it matters and to whom, and ideally something about the likelihood of the issue occurring and its impact if it were to occur. I'd also like my testers to be able to talk to stakeholders in the same way about risk that they observe in the processes we're using to build our product, in the approach taken to testing, and in the results of testing. If I thought harder could I add more to this? Undoubtedly, and perhaps I will, one of these days.

While we're here, this kind of risk assessment is inherently part of testing for me. (Perceived) risk is one of the data points that should be in the mix when deciding what to test at any given time. Actually, I might elaborate on that: risk and uncertainty are two related data points that should be considered when deciding what to test. But I don't really want to open up another front in this discussion, so see Not Sure About Uncertainty for further thoughts on that.

Would it be fair to say that a tester engaging in risk-based testing is risk-averse to some extent or another? Try this: through their testing they are trying to obtain the surest outcome (in terms of understanding of the product) for the stakeholders by exploring those areas of the product that are most risky. So, well, yes, that would seem to satisfy the definition from Wikipedia wouldn't it?

Permission to smirk granted. You are observing me wondering whether I am arguing that a tester who is risk-aware, and uses that risk-awareness to engage in risk-based testing, must necessarily be risk-averse. And now I'm also thinking that perhaps this is possible because the things at risk might be different, but I time-boxed this post (to limit the risk of not getting to my other commitments, naturally) and I've run out of time.

So here's the thing that I enjoy so much: one thought blown up into a big claggy mess of conceptual strands, each of which themselves could lead to other interesting places, but with the knowledge that following one means that I won't be following another, with an end goal that can't be easily characterised, alert for synergy, overlap, contradiction, ambiguity and other relationships, and which might simply reflect my personal biases, with the intellectual challenge of disentanglement and value-seeking, all limited by time.

Hmm. What is testing?
Image: Cambridge Software Testing Clinic

Saturday, February 24, 2018

Transforming Theory and Practice

When Sneha Bhat asked if I'd present with her at CEWT #5 the talk we produced was Theoreticus Prime vs Praktikertron. In this essay we've tidied up the notes we wrote in preparation and included a few of the sketches we made when we were developing our model. The title comes from the Transformers we gave the participants at CEWT to explore in an attempt to illustrate different kinds of theory being discovered and shared.

CEWT #5 asked this question: theory over practice or practice over theory? It's an age-old conundrum, represented in popular culture by memes like these that you would have seen as you avoid both theory and practice by grazing on social media when you should be working:
In theory, there is no difference between theory and practice. But, in practice, there is. (Wiki)
Theory is when you know everything but nothing works. Practice is when everything works but no one knows why. In our lab, theory and practice are combined: nothing works and no one knows why. (Crazy Proverbs)
In the way that all communities look for another to kick against (an outgroup to their ingroup), it’s not hard to find instances of theorists saying that practitioners don’t know why they’re doing what they do, and practitioners saying theorists couldn’t do it if they had to. But let’s not be drawn into this kind of bickering. Instead, let’s step back and try to draw out some distinction between these terms, between theory and practice.
A theory seeks to explain observations. In science it tends to be interpreted as being backed up by a weight of evidence, but in casual conversation this isn’t the case.

Practice also has a couple of primary senses, but both of them are strongly about doing, about activity: either repeatedly exercising a skill, or applying some idea.
So theory is some kind of thing, a thing that is produced, whereas practice is an activity, perhaps an engine for production. That already suggests an intriguing possible relationship, one where practice might drive the generation of theory. But how else might practice relate to theory? You don’t have to look hard to find suggestions and we’ve picked out just three that piqued our interest when we were researching our CEWT talk.
Pete Walen, a tester, writing in Considering Conferences: A Reflection, argues for a causal relationship, that theory only has value if it changes practice.

W. Edwards Deming, in The New Economics, wants theory to invoke practices that will confirm it or contradict it, and either way, increase the sum total of theory.

Steve Klabnik, a well-known thinker and contributor to the Ruby and Rust communities, thinks that the key is finding people who can explain the value of theory to practitioners and get practitioners to point out its flaws to the theorists.
Our intuition is that there’s truth in all of these perspectives, and we want to capture it in a model.  Assuming that theorists and practitioners can be convinced of the value of each other’s contributions, the primary aspects are:
  • Theory should guide practice.
  • Practice should help to refine theory.
And if that looks like a loop to you, it does to us too. But where should we enter it? Jack Cohen and Graham Medley provide a suggestion in their book, Stop Working and Start Thinking:
This cannot be said too often or emphasised too much. Ignorance, recognised, is the most valuable starting place
So we imagine some kind of process for accomplishing a goal which goes like this:
  • Do I have theory that gives me what I need?
  • If yes, then stop.
  • If no, then practice to generate theory.
  • Repeat.
And that seems interesting, maybe even plausible, but perhaps lacking in some important respects. For example, we’ve talked about both theorists and practitioners but while the latter are represented in this pseudocode the former are not. Where might they fit?

Our proposal is that they are present. Our proposal is that what you might call theorists, in fact, are practitioners. We’ve said that theory is a thing and practice is an activity. Theorists think about things. Thinking about things is an activity. Ergo, theorists are practitioners! They might not be working with tangible objects like a plumber would, but then perhaps neither are software testers.

To us, the model looks less naive now. But there’s another wrinkle: practice can generate all manner of data, but we guess that only some of it will become theory. To accommodate that, we think of theory as something that doesn’t necessarily have explanatory power but is simply the data that we care to keep track of.

The loop now makes more intuitive sense. Here it is again, with a theory/data distinction:
  • Do I have theory that gives me what I need?
  • If yes, then stop.
  • If no, then practice to generate data.
  • Repeat.
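For the programmers among us, that pseudocode might be sketched a little more concretely like this (purely illustrative; the functions passed in are hypothetical placeholders, not any real system we're describing):

    def pursue(goal, theory, answers, practise, keep):
        # theory:   the data we currently care to keep track of (a list)
        # answers:  does the theory we hold give us what we need for the goal?
        # practise: do some work (thinking or doing) and return new data
        # keep:     decide which of the new data we care to track
        while not answers(theory, goal):
            data = practise(goal, theory)   # practice generates data
            theory.extend(keep(data))       # some of that data becomes theory
        return theory

    # Toy usage: the "practice" here is just rolling a die until a six turns up.
    import random
    print(pursue(6, [],
                 answers=lambda theory, goal: goal in theory,
                 practise=lambda goal, theory: [random.randint(1, 6)],
                 keep=lambda data: data))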

When starting a project we wonder whether we have data that answers whatever question the project is set up to answer. If not, then we set out to generate that data. This generation may be by manipulating the data we have (traditionally the realm of theorists) or manipulating some external system (traditionally practice).

It may be that both types of practice are required. Whichever it is, the theory that we have to start with, and ideas about how to get it, drive whatever practices are carried out. The data ultimately underpins everything. As Rikard Edgren warns in the Little Book of Testing Wisdom:
Software is messy. That's why empirical evidence is so valuable.
Let’s explore how we think this model suggests explanations for some phenomena that we see around testing practice and theory. Take a very simple case where a tester is exploring an application which takes letters as input. Perhaps she finds that:
  • entering data item A results in output X, B results in output Y, and C is disallowed,
  • the text box in the UI restricts entry to one letter at a time but there is an API through which bulk entry is possible.
The tester has asked a question — “what can this application do?” — and found answers, or data. The data that she tracks is her theory. The theory influences further interactions. In this case, we perceive two broad types of theory: behavioural (what the system does) and practical (how to exercise the system).  In a typical project the first of these is likely to get reported back to the team in general, but the latter is likely to remain local with either this tester or her peers when they share information that helps them to do their job.

We might consider these behavioural and practical flavours of theory to represent expertise and experience respectively. For us, both can usefully guide practice, even in the world where the theorist/practitioner distinction that we reject holds some sway:
It's possible to have degrees of [experience or expertise], or both, or to have neither in some area. Apart from its utility as a heuristic for remembering to look to test at different levels, it serves to remind me that the people on any project have different combinations of each and their mental models reflect that.
The manager knows there's a hammer that can be applied to the current problem: just how hard can it be? The implementer knows what to hit, and where, and just how hard it can be.

As testers we’ll naturally be alert to edge cases, the less usual scenarios. A typical edge would be the initial starting condition for a system. Let’s look at that now, and see how the model we’re building might handle it.

Imagine a tester joining a project that’s working on a product that she has never heard of in a market she didn’t know existed for users that she has no knowledge of nor immediate empathy for. She has no expertise in this particular system but she has experience of systems in general. Perhaps she decides that her mission is to make a mental model of the product and begins to explore it, using cues from other software systems she has known as an initial comparison point.

Her theory in this situation is biased to the practical. Or, perhaps more accurately in our conception of it, the subset of her theory that she chooses to use to guide her initial practice is biased towards experience. As she works she generates more data, and the pieces she cares to keep track of become new theory for her.

Another tester dropped into the same project at the same point might have chosen a different route, even with the same mission. He might have begun by reading all of the available project documentation, stories, bug reports, and support tickets, by talking to all of the team members, by reviewing the code and so on. This is also practice, to us, and again it is based on experience rather than product or project expertise (because this tester has none), and it also generates data of which some portion becomes theory for this tester.

In this example, two testers faced with the same situation chose two different approaches. We made it sound like conscious choices, but that needn’t be the case. We’re all human, so factors such as our biases, available time or other resources, pressure from others, incorrect analyses, inattention, preferences, and familiarity with tools can all impact on the choice too.

Again, in this example, although we made the scenario the same we didn’t restrict the personal theory that the testers carried with them into the project. This can, and will, influence actions alongside any shared theory in the project. Ease of sharing is an extremely positive aspect of theory. Often, in fact, theory can be shared in much less than the time taken to generate the data which led to the theory. In a software project, we might imagine a team of testers reporting their findings to a product owner who takes from it and works on it to generate her own data and hence theory.

This PO is a practitioner. She might not practice on a physical system but her interactions with data have the same potential for experience and expertise as any interaction with a physical system:
  • experience: she finds Excel a convenient way to combine the reports on this project given that they are generated as csv files from session-based test management notes by the team members.
  • expertise: she recognises different terminology in several reports as referring to the same kind of phenomenon and combines them, and rates that issue more important because it’s clearly affecting a significant portion of the product.
Her report is theory for her, and potentially for whoever she chooses to share it with. It might go back to the team, to other teams, to a project manager, senior management and so on. Within the team, her theory will guide action. For example, the testers might be asked to run new experiments or change the way they work, or the PO might decide that it's time to ship. Outside of the team, it may be added to the set of theory for someone else, or it might be ignored or lost, or read carefully without the importance being recognised.

Having access to data is not the same as understanding or exploiting data. A practitioner will not report all of the data they collect nor all of the theory they generate to the PO in this scenario. Even if trying to be exhaustive, they will naturally provide only an abstraction of their complete knowledge, and (if our experience is anything to go by) it will tend to be biased more towards the expertise, to how the system works. In another scenario, such as a community of practice or a peer conference, they might provide a different abstraction, perhaps biased more towards experience, how to work the system. We find David Kolb’s Learning Cycles interesting here.

Also interesting to us, in Why is data hard? Helen L Kupp identifies four levels of abstraction around data: infrastructure, metrics and dimensions, exploration and tools, and insights:
Having lots of data doesn’t make it immediately valuable. And ... not only is leveraging data and metrics well critical to effective scaling, but it is also that much harder because you are “building the plane as it is flying”.
At each level she describes how practitioners operate on the data to generate value, the theory. Interestingly, she also talks about how these layers interact with and feed back to each other:

This is an appealing potential elaboration on the distinction between theorists (nearer the top) and practitioners (towards the bottom) which also emphasises data as the key thing that binds the various parties, and the different ways in which those parties act on the data or in pursuit of theory.

A distinction that we haven’t yet covered is that between tacit and explicit theory. All of the examples we’ve given involve theory that’s out in the open, known about and deliberately shared or stored, explicit. But theory can also be tacit, that is, internalised and difficult to describe. Think how easy it is to ride a bike once you’ve got it, but how hard it would be to explain all of the things you do when riding, and why.

Tacit theory can be both advantageous and problematic. An expert practitioner may add enormous value to a project by virtue of being able to divine the causes of problems, find workarounds for them, and gauge the potential impacts on important scenarios. But they may find it hard to explain how they arrived at their conclusions and asking them to try might impact on the extent to which the theory remains tacit and hence directly and intuitively accessible to them.

This kind of tacit knowledge is learned over time, and with experience, and likely also with mistakes. There is some mythology around the status of mistakes with respect to learning. Scott Berkun says “you can only learn from a mistake after you admit you’ve made it” and it’s not uncommon to hear people claim “don’t worry, you learn more from your mistakes” when you’ve messed up.

We’re not sure we agree with this. Mistakes are actions, which may generate data, some of which may become theory. At the time a practitioner takes an action and has the chance to see that data they may not realise there was a mistake, but there is still learning to be had. Perhaps the insight here is that mistakes tend to generate how-to data (experience) because the results of a mistake are less likely to be shareable (so of less expertise benefit).

In the kinds of situations we find ourselves in day-to-day there are many actors operating in a given system, for instance testers, developers, technical authors, application specialists, Scrum masters, project managers. Each of them is using their practical skills to take some input, make some observations, create some data, and generate some theory.

There are no clear levels between which data and theory are passed. There are multiple types of theory. There is no place in which all data and theory are, or could be, shared (because a substantial portion of it is tacit). Beware, though, because as theory is generated further and further away from the point at which its underlying data was gathered, the risk of successive abstractions being too narrow increases. This can be seen in the classic reporting funnel where information gets less and less tied to reality the further up a reporting chain it goes.

Back to the original question, then. Is theory primary? Is practice primary? We think that data is primary and that both theory and practice serve to guide the generation and exploitation of data:
  • Practice generates data.
  • Data makes theory.
  • Theory guides practice.
  • Practice generates data
  • ...
The required data, rather than theory or practice, is the starting point, and also the ending point. Once the required data is in hand, the loop can stop. Which, we hope you’ll agree, is a neat, ahem, theory. But what practical value does it have? For us, there are a few useful pointers:

The theorist/practitioner distinction has no real validity. We’re all on Team Practice, although we may have different levels of interest, ability, and inclination in practising in particular ways.

You can consciously make yourself more widely useful by developing skills in practices that apply across contexts and having theory around those practices which is transferable. The flip side is that depth of theory in a particular area will likely be reduced.

Don’t just pick up your favourite tool and begin hacking away. Whatever your biases, you can start any task by asking what data you need, then making a conscious choice about how to get it.
Image: Helen Kupp

Wednesday, February 21, 2018

Cambridge Lean Coffee

This month's Lean Coffee was hosted by Roku. Here are some brief, aggregated comments and questions on the topics covered by the group I was in.

My favourite context-free test is ...

  • Turn everything up to 11. Put all of the settings on their highest levels and see what happens.
  • Power cycling rapidly and repeatedly.
  • Sympathetic testing. Just getting a view for what the product offers.
  • Simply trying to use the product.
  • Smoke test.
  • Try to do the opposite of what any documentation says.
  • Ask how the users will use it.
  • Ask what the customer wanted.
  • Find someone without prior experience of the product to look at it.

How do you enhance your personal development in a busy environment?

  • We are given time for personal development at work, but I end up doing work stuff instead.
  • The company empowers us but it's on us to use the time.
  • I don't want the others on my team to feel that I am slacking by taking the personal development time.
  • I might come in to work early to do some personal development but then find myself picking something up from the backlog.
  • I want to learn Python but, because I know I can do it, I'll always revert to bash for quick scripts.
  • I did a Masters degree and the structure, and deadlines, particularly exams, helped me to focus on it.
  • Book slots in the diary for it.
  • Make joint commitments with others because you'll keep them more reliably.
  • Our company puts tasks into backlogs for training, so it's explicit.
  • Can you find goals that can be done on every project, rather than needing time set aside?
  • We might like to do a hack day.
  • We had an internal team conference.

A development process with no mention of testing.

  • Our company has produced a process for architecting or rearchitecting software, something all teams must follow.
  • It says things like "talk to X before going too far down the road in such-and-such an area".
  • ... but it has no mention of testing in it. Should it have?
  • If it's about rearchitecting, perhaps there are already tests? 
  • Or you can use the original implementation as a reference.
  • It's on you to explain the value of testing.
  • Any change to the code base can yield bugs.
  • "We're just moving classes around" being safe is a dangerous assumption.
  • Where do you think the problem might be? For example, is it that there's no mention of testing in the document, or that you think that nothing you'd call testing will happen, or something else?

How do you avoid mini waterfalls in Agile?

  • There is a natural process: work has to be done and won't be ready until the sprint end. Then it's handed over and needs testing!
  • Are sprints a necessity?
  • No. Some teams work in continuous flow.
  • Could you try pairing, to remove or reduce the impact of the handover point?
  • Could you break work down into smaller chunks?
  • Could you have predetermined acceptance criteria? (And non-acceptance criteria?)
  • Could the stories be closed earlier (e.g. are the developers hanging on to them after finishing?)
  • Could there be too much work in progress, so stories progress slowly?
  • Can you sit with the developers? Proximity breaks down barriers.

What is a good job interview?

  • From whose perspective?
  • Do you ask interviewees to do a task, write tests, talk through their CV?
  • How do you react to tasks as an interviewee?
  • I've rejected roles because of how the company has come across to me.
  • I've accepted roles because the task was interesting and enjoyable.
  • Interviewee: show their best in whatever respects are important to them and for the role.
  • Interviewer: facilitated in a way that let the interviewee show their best in areas that are important to the company, got some sense of the interviewee as a person, and saw how they can think.
  • I think of interviews as auditions.

Tuesday, February 13, 2018

Talking the Fork

Four lightning talks at the Cambridge Tester meetup at Linguamatics last night, four topics apparently unrelated to one another, four slices through testing. Yet, as I write up my notes this morning I wonder whether there's a common thread ...

Samuel Lewis showed us the money. Or, at least, where the money is held before being dispensed through those holes in the wall. He included some fascinating background information about ATMs (and a scary security story) but the thrust of his talk was the risks and corresponding mitigation strategies in a massive project to migrate the ATMs for a big bank to a new application layer and OS (more scariness: many are still running Windows XP).

Much of the approach involved audit trails of various kinds, with the customer and other stakeholders sharing their road maps and getting a view of the test planning and strategy in return. I enjoyed that the customer themselves was considered a risk (because they had a reputation for changing their minds) and contingency was built in for that. Samuel described the approach as waterfall and spoke in praise of that kind of process for this kind of project (massive, regulated, traditionally-minded customer). I can accept that; I certainly don't have personal experience there to argue against it. But it was striking to me that one of the factors that contributed to the successful completion of the project was the personal relationship with a developer which led to the testers getting an unofficial early build to explore.

If you want to get your way, make your case fit the worldview of the person you need to convince. That's one of the three persuasiveness spanners (Robert Cialdini's principles of persuasion) in Sneha Bhat's toolbox. Another is to set up a context in which the other person feels some obligation to you: help them first and they'll likely help you back. The third she shared with us was to find "social proof", that is some evidence that someone else, someone respected, endorses the perspective you're proposing.

She touched a little on how persuasion might turn into coercion and gave us a useful acronym from Katrina Clokie for framing a conversation that's requesting something: SPIN. Identify the Situation and the Problem, explain the Implication and describe the Need you have to resolve it. I've heard the talk a couple of times now and, while everything I've said so far is useful, the phrase that sticks in my mind is that it's important to prepare, and then deliver the message with "passion, compassion, and purpose".

Andrew Fraser started his talk with the request to criticise it. I was already interested (a talk about testing in the abstract, with philosophical tendencies, wrestling with big-picture questions is my thing, and I don't care who knows it) but at that point I was hooked. As far as I understood it, Andrew's basic argument runs something like this: all metrics can be gamed; you can view tests as metrics; so tests can be gamed, i.e. developers will code to the tests; the conditions that the tests check for may not represent anything a customer cares about; ergo software that maximises conformity to the wrong thing is produced.

Phew. I can't pretend to agree, but I enjoyed it so much that afterwards I asked to be a reviewer of the bigger essay that this short piece was abstracted from. From my notes: so this is anti-TDD? so this is like over-fitting to a model? so all the tests need to be specified up front? but surely if you can "train" your developers (in some Skinnerian sense) to code in particular ways you can use it to the advantage of the product?

Finally I ran through an early version of The Anatomy of a Definition of Testing which I'll be delivering at  UKSTAR next month. It's a personal definition, one that helps me to do the work that I need to do in my context.

Four diverse talks then, but what thread did I divine running through them? Well perhaps it reflects something about me, about what I took from the talks, or about what I want to impose on them. It seems to me that people are at the heart of these stories: a personal relationship delivered the early build, a persuasion conversation involves human emotions on both sides, it's people that intuitively game metrics, and a personal definition is really only about the person. Jerry Weinberg was quoted during the questions for my talk and I doubt he'd be surprised to find this kind of theme in talks around software, his second law of consulting being "No matter how it looks at first, it's always a people problem."