Sunday, September 23, 2018

Testing vs Chicken


At CEWT #6 we were asking what constitutes good testing. Here are prettied-up versions of the notes I used for my talk.

One line of argument that I've heard in this context and others is that even though it's hard to say what good testing is, we know it when we see it. But there's an obvious potential chicken and egg problem with that.

I'm going to simplify things by assuming that the chicken came first. In my scenario, we'll assert that we can know what good testing would look like for a project which is very similar to projects we've done before, where the results of testing were praised by stakeholders. The problem I'll address is that we don't have anyone who can do that project, so we have to hire. The question is:
What can we do as recruiters, during recruitment, to understand the potential for a candidate to deliver that good testing?
I've been recruiting technical staff for around 15 years, predominantly testers in the last ten, and my approach has changed enormously in that time. Back in the early days, I would largely base my phone screen interviews around a chronological traversal of the candidate's CV, asking confirmatory questions. Checking the candidate, if you like.

These days, a CV to me is more of a filter, and a source of potential topics for exploration. I have also spent a lot of time thinking about how I want to advertise positions, and about the kinds of information I want to give and get from each stage of the process, and how I'll attempt to do that.

I have a simple model to help me. I call it the Egg of Testing Recruitment.


The yolk is the core testing stuff, crucial to our identified needs. The white is the other stuff, important to functioning in our context; it supports the yolk.

Some people will tell you that eggs are easy to cook. Some people also think that recruitment is straightforward: identify a need, describe it, find the person who best matches it, hire them, relax. But eggs don't always come out the way the chef intended.


And recruitment likewise. Here are a few reasons why:
  • multiple important factors
  • limited time and engagements
  • a space of reasonable outcomes
  • a dynamic feedback system
That last one is particularly interesting: as a recruiter, be aware that candidates will be looking to align themselves with your needs. If, for example, you do and say things that suggest you favour exploratory testing, then don't be surprised when answers emphasising exploratory testing skills start to come.

But recruitment is starting to sound a lot like testing: the extraction of relevant information from an arbitrarily complex system under various constraints. And, if it's testing, I'll want a mission. And if I had a mission I might phrase it something like this:


The kinds of materials you can usually expect in standard hiring rounds are:
  • a cover letter
  • a CV
  • a phone screen
  • a technical exercise
  • a face-to-face interview
And then there are others that are reasonably common these days, including:
  • social media
  • blog
  • personal website
  • open source projects
All of these hold data about the system under test ... erm, about the candidate. I know that some recruiters disregard the cover letter. I love a cover letter. First, it's more data. Second, it's an opportunity for the candidate to speak directly to me, in their own time, in their own words, about the fit of this role to them and them to this role.

When it comes to conversation and exercises, I use the Egg of Testing Recruitment to remind me of what I'm after.

The yolk: when I can interact with the candidate I tend to want to explore core skills that can only really be demonstrated interactively. I'll want to put the candidate in a position where they can't know everything, where there's ambiguity, and see how they deal with it.

Do they ask for clarification, do they tell me what their assumptions are, do they offer caveated answers, do they say "in this context ...", do they use safety language? In this respect I regard interviews as more like auditions -- asking the candidate to perform something like a testing task, and to explain their thought processes around it.


The white: I'll be looking for reporting, presentation, consistency and the like in the written material. I'll also be noting stuff that could be a way in to understanding other aspects: particular technical expertise that I can ask about, for example. I can't ask for demonstration of all skills, but I can ask behavioural questions such as "can you tell me about a time when someone doubted the value of testing or when someone asked you to justify your testing?"


In the real world, of course, the egg model looks very different.


The yolk and white cannot be separated so cleanly. But that's OK. In the interview, I can be testing both at once. For example, on any answer the candidate gives I can be looking for consistency. I can gauge the correctness or reasonableness or depth of a series of answers and use them as checks on the candidate's reliability of answering.

Having explored the candidate using conversation and exercises I need to evaluate them. A job advert that reflects what you actually want helps here. (It's worth remembering that when you're writing it.)

This evaluation is again like testing; you've stopped because you've spent the time that is available. Of course you could have done more. Of course you could have taken alternative routes. But you didn't and now you have to report: what you did, what you found, and the values and risks associated with that.

In your day job this probably goes to a stakeholder who ultimately makes a decision. In recruitment scenarios, you may well also be the stakeholder. But that shouldn't alter the way you go about your business, unless it makes you care even more than you would normally to do a good job.

I think there are three major points here. To put yourself in a position to recruit testers who can do the kind of good testing you're after:
  • understand your mission
  • treat interviews as auditions
  • explore the candidate
Here are my slides:

Saturday, September 15, 2018

Look at the Time


I'll be quick. In the kitchen. With tea brewing. Talking to a developer about new code. Exploring assumptions. An enjoyable conversation. A valuable five minutes.

A third party provides an object identifier to our code. We use it as a key because it's our understanding that the identifier is unique. We have never seen identifier collision.

Try again: we have never seen identifier collision at any given point in time.

Do we know that the third party will not recycle identifiers as the objects they represent are discarded? What would happen if they did?
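
We didn't write any code in the kitchen, but a minimal sketch (all names hypothetical) shows the shape of the risk: a cache keyed on the third party's identifier misbehaves as soon as identifiers are recycled.

```python
# A minimal sketch of the risk under discussion (all names hypothetical):
# we key a cache on a third-party identifier, assuming it is never reused.

cache = {}

def receive(identifier, obj):
    # First sight of an identifier: remember the object against it.
    # If the third party recycles identifiers, a new object can arrive
    # under an old key, and we'll silently keep serving stale state.
    if identifier not in cache:
        cache[identifier] = obj

receive("id-42", {"name": "original object"})

# The third party discards the original object and recycles its identifier ...
receive("id-42", {"name": "a completely different object"})

# ... and we still serve the stale entry.
print(cache["id-42"])  # {'name': 'original object'}
```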

No longer in the kitchen. Tea on desk.
Image: https://flic.kr/p/aCWUN5

Thursday, August 30, 2018

Boxing Clever


Meticulous research, tireless reiteration of core concepts, and passion for the topic. You didn't ask, but if you had done, that'd be what I'd say about the writing of Matthew Syed, based on You Are Awesome (reviewed here a few months back) and now also Black Box Thinking.

The basic thesis of the latter is captured nicely in a blog post of his from last year:
Black Box Thinking can be summarised in one, deceptively simple sentence: learning from mistakes. This is the methodology of science, which has changed the world precisely because it is constantly updating its theories in the light of their failures. In a complex world, failure is inevitable. The question is: do we learn, or do we conceal and self-justify?
Who wouldn't want to learn from their mistakes, you might ask? Lots of us, it turns out. The aviation industry tends to come out well in Syed's analysis. Accidents, mishaps, and near-misses are reviewed for ways in which future flights might be less likely to repeat them, and the knowledge is shared across the board. Blaming is minimised in order that all participants are encouraged to share their evidence and thoughts.

The medical and healthcare industries, and also politicians, tend not to do so well. In these areas, blame culture and a fear of reprisals are said to hinder the extent to which mistakes are admitted to, investigated, and subsequently mitigated.

Atul Gawande's The Checklist Manifesto makes similar points, and prescribes the use of checklists as one way to mitigate future risk. Syed spends a lot of time on the ways in which cultural changes in philosophy, mindset, and practice need to be made in order to get to a point where the risks are identified, accepted, and then provoke some kind of positive action.

There's so much material packed so densely into this book that I can't do it justice here. In lieu of that, here are some of the entwined key threads as I saw them:
  • We live and work in complex systems 
  • ... where failures will happen.
  • A blaming culture is likely to result in lower visibility of issues and more ass-covering 
  • ... whereas open cultures encourage and support self-reporting.
  • A "production" failure should be seen as a learning opportunity
  • ... and a chance to reduce the risk of future such failures.
  • Use "development" failure as a tool
  • ... particularly within an iterative development environment.
  • Expertise comes from practice and feedback
  • ... but a mixture of theory and practice helps avoid local maxima.
  • A fixed mindset is less likely to innovate
  • ... and broadening our outlook makes creative connections more likely.
  • On the whole, we prefer narrative over data 
  • ... and when beliefs and data disagree, we tend to deny the data.
  • Understanding what to measure and record is key
  • ... and sometimes it's sensible to experiment to understand what to measure.
The last point in this list gives the book its title — the black box recorder on an aeroplane is often crucial in understanding the circumstances that led to an incident — while the first point is hammered home repeatedly: there is often no single root cause for which an individual can clearly be held responsible.

This complexity is itself hinted at in the list: there are many variables at play, and they are interconnected. There is generally no silver bullet, no quick-fix, no one size to fit all. On this point, in a particularly nice meta twist, Syed notes that the approaches espoused for learning, say, how to build a product can also be used on the approaches themselves — in order to learn better how to build, perhaps we first need to learn better how to learn.

On learning then, three things that I'm taking away from this book.

I have historically been sceptical when I hear people blithely say that we learn more from failure than success. Out of context, I still don't believe that's necessarily a given, but I think my thinking here is now more nuanced.

First, using a generate-and-test approach in development, and treating each generation that doesn't improve our test metric as a failure, we might say that the volume of failure drives our learning more than the final success does. Syed gives the example of James Dyson, who made thousands of incrementally different prototype vacuum cleaners before arriving at his first production model. Thousands of failures, each of which helped to point the way to success.
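
To make that concrete, here's a toy generate-and-test loop in Python (my own illustration, not anything from the book): almost every generated variant fails to improve the metric, yet those failures collectively steer the search towards a good answer.

```python
import random

# A generic generate-and-test loop: mutate the current candidate and keep
# it only if it improves the metric. An illustrative sketch, not a model
# of any specific product's development.

def metric(x):
    # Toy objective: how close x is to an (unknown to the search) optimum.
    return -abs(x - 3.7)

best, failures = 0.0, 0
for _ in range(10_000):
    candidate = best + random.gauss(0, 0.1)  # generate a variant
    if metric(candidate) > metric(best):     # test it against the metric
        best = candidate                     # rare success: keep it
    else:
        failures += 1                        # common failure: move on

print(f"best = {best:.3f} after {failures} failed variants")
```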

Alternatively, I wonder whether it might mean that analysis of the differences between success and multiple failures allows us to understand the factors important to success in a way that simple (ahem!) success does not.

Also new to me, and hidden in a footnote (p. 172), there's an interesting term:
"Observational statistics" is a phrase that encompasses all the statistics drawn from looking at what happened. Randomised control trials are different because they encompass not merely what happened, but also construct a counterfactual for comparison.
That counterfactual is key; it helps to balance survivorship bias. A well-known example comes from the second world war: deciding where to add armour to planes based on where there are bullet holes in those that returned to base is to miss the massive value of the unobserved data. Those that got shot down and never made it back might well have been hit elsewhere. (For a brief summary see e.g. Mother Jones.)
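
A toy simulation (mine, purely illustrative) shows how the missing counterfactual distorts the picture: hits land uniformly across the plane, but because engine hits are fatal, the planes available for inspection show no engine damage at all.

```python
import random
from collections import Counter

# Toy survivorship-bias simulation (illustrative only). Each plane takes
# one hit in a random section; hits to the engine are fatal, so only
# planes hit elsewhere return to base to be inspected.

sections = ["engine", "fuselage", "wings", "tail"]
all_hits, observed_hits = Counter(), Counter()

random.seed(1)
for _ in range(10_000):
    hit = random.choice(sections)  # hits are actually uniform
    all_hits[hit] += 1
    if hit != "engine":            # engine hits never make it home
        observed_hits[hit] += 1

print("all hits:     ", dict(all_hits))
print("observed hits:", dict(observed_hits))  # no engine entry at all:
                                              # the missing counterfactual
```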

Another footnote (p. 220) raises an interesting potential tension that I realise I've been aware of but perhaps never surfaced before:
Getting the manufacturing process running seamlessly is often about ironing out unwanted deviations. It is about using process controls and the like to reduce variation. Creative change is often about experimentation: in other words, increasing variation.
Sensitivity to variability, to the unknown, should be adjusted consciously based on the context in which we are operating. In practice, it appears to me, we have a relatively fixed level of comfort, which can compromise our ability to operate in one or other of the scenarios that Syed identifies.

Black Box Thinking, despite the repetition due to the interconnectedness of the ideas it puts forward and despite its sardine-tin consistency, is a book worth persevering with. It's helped me to both learn and reflect on many concepts I've been thinking about for some time myself. Here are a few:
Image: Amazon

Thursday, August 16, 2018

Tufte: Visual Explanations


Last year I read a bunch of Edward Tufte books: The Visual Display of Quantitative Information, Envisioning Information, Visual Explanations: Images and Quantities, Evidence and Narrative, Beautiful Evidence, and The Cognitive Style of PowerPoint. I found them compelling and ended up writing You've Got To See This for the Gurock Blog. 

In the intervening year I've found ways to incorporate aspects of what I learned into my work: I've tried hard to remove the junk from my figures and charts; I've noted that when we're talking about how to talk about our data, something like small multiples can help us to visualise more of it more easily; I've encouraged members of my team to think about the difference between exploring data in a tool such as Excel, and presenting data in a chart produced by Excel.
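
As an aside, here's roughly what that looks like in practice: a minimal small-multiples sketch of my own (not an example from the books), using matplotlib to repeat the same simple chart once per series, with shared axes so the panels can be compared directly.

```python
import matplotlib.pyplot as plt
import numpy as np

# A small-multiples sketch (my own example, not from the books): the same
# kind of chart repeated per series, with shared axes so the eye can
# compare panels directly.

rng = np.random.default_rng(0)
series = {f"run {i}": np.cumsum(rng.normal(size=50)) for i in range(1, 7)}

fig, axes = plt.subplots(2, 3, sharex=True, sharey=True, figsize=(9, 5))
for ax, (name, values) in zip(axes.flat, series.items()):
    ax.plot(values, linewidth=1)
    ax.set_title(name, fontsize=9)

fig.suptitle("Small multiples: one panel per series, shared scales")
fig.tight_layout()
plt.show()
```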

After that experience, I thought it might be interesting to review the notes I took as I went through the books (which I did, and it was). Then I thought it might also be useful to share them (which I'm doing, and you can judge).

This short set of posts contains the quotes I took from each book, presented in the order that I happened to read them. Themes recur across the series, but the quotes don't necessarily reflect that; instead they show something of what was interesting to me in the context of what I'd already read, what I already knew, and what I was working on at the time.
All of the books are published by Graphics Press and available direct from the author at edwardtufte.com. Particular thanks go to Šime for the loans and the comments.

--00--

How are we to assess the integrity of visual evidence? What ethical standards are to be observed in the production of such images? (p. 25)

... the reason we seek causal explanations is in order to intervene, to govern the cause so as to govern the effect ... (p. 28)

... descriptive narration is not causal explanation; the passage of time [can be] a poor explanatory variable ... (p. 29)

The deep, fundamental question in statistical analysis is Compared with what? (p. 30)

Time-series are exquisitely sensitive to choice of intervals and end points. (p. 37)

Displays of evidence implicitly but powerfully define the scope of the relevant, as presented data are selected from a larger pool of material. Like magicians, chartmakers reveal what they choose to reveal. (p. 43)

When assessing evidence, it is helpful to see a full data matrix, all observations for all variables, those private numbers from which the public displays are constructed. No telling what will turn up. (p. 45)

... there are right ways and wrong ways to show data: there are displays that reveal the truth and displays that do not. (p. 45)

... lack of visual clarity in arranging evidence is a sign of a lack of intellectual clarity in reasoning about evidence (p. 48)

Informational displays should serve the analytical purpose at hand: if the substantive matter is a possible cause-effect relationship, then graphs should organize data so as to illuminate such a link. (p. 49)

In magical performances, knowledge about the revealed frontview (what appears to be done) fails to yield reliable knowledge about the concealed backview (what is actually done) — and it is the audience's misdirected assumption about such symmetric reliability that makes the magic. (p. 55)

[techniques of deception practised by magicians], when reversed, reinforce strategies of presentation used by good teachers. Your audience should know beforehand what you are going to do; then they can evaluate how your verbal and visual evidence supports your argument. (p. 68)

If a clear statement of the problem cannot be formulated, then that is a sure sign that the content of the presentation is deficient. (p. 68)

Relevant to nearly every display of data, the smallest effective difference is the Occam's razor ... of information design. (p. 71)

Congruity of structure across multiple images gives the eye a context for assessing data variation. (p. 82)

Multiple images reveal repetition and change, pattern and surprise — the defining elements in the idea of information. (p. 105)

Excellence in the display of information is a lot like clear thinking. (p. 141)
Image: Tufte

Tufte: The Cognitive Style of PowerPoint

Last year I read a bunch of Edward Tufte books: The Visual Display of Quantitative Information, Envisioning Information, Visual Explanations: Images and Quantities, Evidence and Narrative, Beautiful Evidence, and The Cognitive Style of PowerPoint. I found them compelling and ended up writing You've Got To See This for the Gurock Blog. 

In the intervening year I've found ways to incorporate aspects of what I learned into my work: I've tried hard to remove the junk from my figures and charts; I've noted that when we're talking about how to talk about our data, something like small multiples can help us to visualise more of it more easily; I've encouraged members of my team to think about the difference between exploring data in a tool such as Excel, and presenting data in a chart produced by Excel.

After that experience, I thought it might be interesting to review the notes I took as I went through the books (which I did, and it was). Then I thought it might also be useful to share them (which I'm doing, and you can judge).

This short set of posts contains the quotes I took from each book, presented in the order that I happened to read them. Themes recur across the series, but the quotes don't necessarily reflect that; instead they show something of what was interesting to me in the context of what I'd already read, what I already knew, and what I was working on at the time.
All of the books are published by Graphics Press and available direct from the author at edwardtufte.com. Particular thanks go to Šime for the loans and the comments.

--00--

Visual reasoning usually works more effectively when the relevant evidence is shown adjacent in space within our eyespan. (p. 5)

Many true statements are too long to fit on a PowerPoint slide, but this does not mean we should abbreviate the truth to make the words fit. It means we should find a better tool to make presentations. (p. 5)

How is it that each elaborate architecture of thought always fits exactly on one slide? (p. 12)

By using PP to report technical work, presenters quickly damage their credibility ... Both [reviews of NASA's investigations into Shuttle disasters] concluded that (1) PowerPoint is an inappropriate tool for engineering reports, presentations and documentation and (2) the technical report is superior to PP. (p. 14)

... the PowerPoint slide typically shows 40 words, which is about 8 seconds of silent reading material. (p. 15)

This poverty of content has several sources. The PP design style, which uses about 40% to 60% of the space available on a slide to show unique content, with remaining space devoted to Phluff, bullets, frames, and branding. The slide projection of text, which requires very large type so the audience can see the words. Most importantly, presenters who don't have all that much to say (p. 15)

Sometimes PowerPoint's low resolution is said to promote a clarity of reading and thinking. Yet in visual reasoning, arts, typography, cartography, even sculpture, the quantity of detail is an issue completely separate from the difficulty of reading ... meaning and reasoning are relentlessly contextual. Less is a bore. (p. 16)

To make smarter presentations, try smarter tools. (p. 28)

PowerPoint promotes a cognitive style that disrupts and trivialises evidence. (p. 30)

Preparing a technical report requires deeper intellectual work than simply compiling a list of bullets on slides. Writing sentences forces presenters to be smarter. And presentations based on sentences makes consumers smarter as well. (p. 30)

Our evidence concerning PP's performance is relevant only to serious presentations, where the audience (1) needs to understand something, (2) to assess the credibility of the presenter. (p. 31)

Consumers of presentations might well be skeptical of speakers who rely on PowerPoint's cognitive style. It is possible that these speakers are not evidence-oriented, and are serving up some PP Phluff to mask their lousy content ... (p. 31)
Image: Tufte

Tufte: The Visual Display of Quantitative Information


Last year I read a bunch of Edward Tufte books: The Visual Display of Quantitative Information, Envisioning Information, Visual Explanations: Images and Quantities, Evidence and Narrative, Beautiful Evidence, and The Cognitive Style of PowerPoint. I found them compelling and ended up writing You've Got To See This for the Gurock Blog. 

In the intervening year I've found ways to incorporate aspects of what I learned into my work: I've tried hard to remove the junk from my figures and charts; I've noted that when we're talking about how to talk about our data, something like small multiples can help us to visualise more of it more easily; I've encouraged members of my team to think about the difference between exploring data in a tool such as Excel, and presenting data in a chart produced by Excel.

After that experience, I thought it might be interesting to review the notes I took as I went through the books (which I did, and it was). Then I thought it might also be useful to share them (which I'm doing, and you can judge).

This short set of posts contains the quotes I took from each book, presented in the order that I happened to read them. Themes recur across the series, but the quotes don't necessarily reflect that; instead they show something of what was interesting to me in the context of what I'd already read, what I already knew, and what I was working on at the time.
All of the books are published by Graphics Press and available direct from the author at edwardtufte.com. Particular thanks go to Šime for the loans and the comments.

--00--

For Playfair, graphics were preferable to tables because graphics showed the shape of the data in a comparative perspective. (p. 32)

... small non-comparative, highly labeled data sets usually belong in tables. (p. 33)

... the relational graphic — in its barest form, the scatterplot and its variants — is the greatest of all graphical designs ... It confronts causal theories that X causes Y with empirical evidence as to the actual relationship between X and Y (p. 47)

Graphical excellence consists of complex ideas communicated with clarity, precision, and efficiency. (p. 51)

Particularly disheartening is the securely established finding that the reported perception of something as clear and simple as line length depends on the context and what other people have already said about the lines. (p. 56)

... given the perceptual difficulties, the best we can hope for is some uniformity in graphics (if not in the perceivers) and some assurance that two perceivers have a fair chance of getting the numbers right. (p. 56)

Deception results from the incorrect extrapolation of visual expectations generated at one place on the graphic to other places. (p. 60)

Show data variation, not design variation. (p. 61)

Graphics must not quote data out of context. (p. 74)

If the statistics are boring then you've got the wrong numbers. Finding the right numbers requires as much specialized skill — statistical skill — and hard work as creating a beautiful design or covering a complex news story. (p. 80)

Occasionally artfulness of design makes a graphic worthy of the Museum of Modern Art, but essentially statistical graphs are instruments to help people reason about quantitative information. (p. 91)

Above all else show the data. (p. 92)

The best designs ... are intriguing and curiosity-provoking, drawing the viewer into the wonder of the data, sometimes by narrative power, sometimes by immense details, and sometimes by elegant presentation of simple but interesting data. (p. 121)

John Tukey wrote: "If we are going to make a mark, it may as well be a meaningful one. The simplest — and most useful — meaningful mark is a digit" (p. 140)

Small multiples resemble the frames of a movie: a series of graphics showing the same combination of variables, indexed by changes in another variable. (p. 168)


Small multiples are inherently multivariate, like nearly all interesting problems and solutions in data analysis. (p. 169)

Tables are clearly the best way to show exact numerical values, although the entries can be arranged in semi-graphical form. (p. 178)

Given their low data-density and failure to order numbers along a visual dimension, pie charts should never be used. (p. 178)

Explanations that give access to the richness of the data make graphics more attractive to the viewer. (p. 180)

Words and pictures belong together. Viewers need the help that words can provide. (p. 180)

Thus, for graphics in exploratory data analysis, words should tell the viewer how to read the design ... and not what to read in terms of content. (p. 182)

What is to be sought in designs for the display of information is the clear portrayal of complexity. Not the complication of the simple; rather the task of the designer is to give visual access to the subtle and difficult — that is, the revelation of the complex (p. 191)

Tufte: Envisioning Information


Last year I read a bunch of Edward Tufte books: The Visual Display of Quantitative Information, Envisioning Information, Visual Explanations: Images and Quantities, Evidence and Narrative, Beautiful Evidence, and The Cognitive Style of PowerPoint. I found them compelling and ended up writing You've Got To See This for the Gurock Blog. 

In the intervening year I've found ways to incorporate aspects of what I learned into my work: I've tried hard to remove the junk from my figures and charts; I've noted that when we're talking about how to talk about our data, something like small multiples can help us to visualise more of it more easily; I've encouraged members of my team to think about the difference between exploring data in a tool such as Excel, and presenting data in a chart produced by Excel.

After that experience, I thought it might be interesting to review the notes I took as I went through the books (which I did, and it was). Then I thought it might also be useful to share them (which I'm doing, and you can judge).

This short set of posts contains the quotes I took from each book, presented in the order that I happened to read them. Themes recur across the series, but the quotes don't necessarily reflect that; instead they show something of what was interesting to me in the context of what I'd already read, what I already knew, and what I was working on at the time.

All of the books are published by Graphics Press and available direct from the author at edwardtufte.com. Particular thanks go to Šime for the loans and the comments.

--00--

Emaciated data-thin designs ... provoke suspicion — and rightfully so — about the quality of measurement and analysis. (p. 32)

Small multiples, whether tabular or pictorial, move to the heart of visual reasoning ... Their multiplied smallness enforces local comparisons within our eyespan. (p. 33)

We envision information in order to reason about, communicate, document, and preserve that knowledge ... (p. 33)

Standards of excellence for information design are set by high quality maps, with diverse, bountiful detail, several layers of close reading combined with an overview, and rigorous data from engineering surveys. (p. 35)

Simplicity of reading derives from the context of detailed and complex information, properly arranged. A most unconventional design strategy is revealed: to clarify, add detail. (p. 37)

If the visual task is contrast, comparison, and choice — as it so often is — then the more relevant information within eyespan the better. (p. 50)

Simpleness is another aesthetic preference, not an information display strategy, not a guide to clarity. What we seek instead is a rich texture of data, a comparative context, an understanding of complexity revealed with an economy of means. (p. 51)


One line plus one line results in many meanings — Josef Albers. (p. 61; image above)

The noise in 1 + 1 = 3 is directly proportional to the contrast in value (light/dark) between figure and ground. (p. 62)

Careful visual editing diminishes 1 + 1 = 3 clutter. These are not trivial cosmetic matters, for signal enhancement through noise reduction can reduce viewer fatigue as well as improve accuracy of readings from a computer interface, a flight-control display, or a medical instrument. (p. 62)

The arrangement of many computer interfaces is similarly overwrought. (p. 64)

Information consists of differences that make a difference. (p. 65)

At the heart of quantitative reasoning is a single question: Compared to what? (p. 67)

Comparisons must be enforced within the scope of the eyespan, a fundamental point occasionally forgotten in practice. (p. 76)


The Swiss maps are excellent because they are governed by good ideas and executed with superb craft. Ideas not only guide work, but also help defend our designs (by providing reasons for choices) against arbitrary taste preferences. (p. 82; similar image above)

Noise is costly, since computer displays are low-resolution devices, working at extremely thin data densities ... at every screen are two powerful information-processing capabilities, human and computer. Yet all communication between the two must pass through the low-resolution, narrow-band video display terminal, which chokes off fast, precise and complex communication. (p. 89)
Images: Tufte, Archemind, Bücher