Thursday, September 12, 2019

You Can Tidy the Data

This week Sime suggested Tidy Data by Hadley Wickham for the Test team book club:
A huge amount of effort is spent cleaning data to get it ready for analysis, but there has been little research on how to make data cleaning as easy and effective as possible. This paper tackles a small, but important, component of data cleaning: data tidying. Tidy datasets are easy to manipulate, model and visualize, and have a specific structure: each variable is a column, each observation is a row, and each type of observational unit is a table. 
Messy data needn't be bad data, but it might not be in a format that makes it easy to process. Many tables used for data presentation will contain implicit variables, such as person or result in the table here:

If you've ever generated, aggregated, or inherited data of any scale for analysis you're almost certainly already familiar with the basic ideas. You've probably also done informally (with much cursing, copy-pasting, and burnt fingers) what this paper formalises as a small set of transformation patterns that, applied appropriately, will make messy data tidier.

The table below tidies the table above to have one row per experimental observation, one data value per cell, and all variables explicitly named:

In our discussion we noted some interesting overlap with the observability white paper we read recently. Although the term tidy data wasn't used, these pieces of advice suggest that its authors are familiar with the idea:
  • Exploring your systems and looking for common characteristics requires support for high-cardinality fields as a first-order group-by entity.
  • Add units to field names, not values (such as parsing_duration_µs or file_size_gb).
  • Generate one event per service/hop/query/etc.
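For a flavour of that advice, here's a hedged sketch of what one wide event for a single query might look like; every field name and value here is invented for illustration:

```python
import json

# One event per query, with units encoded in the field names rather
# than in the values. All fields are hypothetical.
event = {
    "service": "parser",
    "query_id": "q-0001",
    "parsing_duration_us": 5400,   # microseconds, named in the field
    "file_size_gb": 1.2,           # gigabytes, named in the field
    "user_id": "u-42",             # high-cardinality, still a first-class field
}

print(json.dumps(event))
```

Seen through the tidy data lens, each event is one observation and each field is one explicitly named variable.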
Wickham's paper is short and readable so I won't summarise it here, but I will note that the operations have snappy names (melting, splitting, casting) and examples that clearly illustrate them, their application, and their composition.
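To give a flavour of what melting does, here's a minimal sketch in plain Python, after the style of the paper's person/treatment example; the helper function and data values are mine, not the paper's:

```python
# A messy presentation table: one row per person, one column per
# treatment. The implicit variables are treatment and result.
messy = [
    {"person": "John Smith", "treatment_a": None, "treatment_b": 2},
    {"person": "Jane Doe", "treatment_a": 16, "treatment_b": 11},
    {"person": "Mary Johnson", "treatment_a": 3, "treatment_b": 1},
]

def melt(rows, id_var, var_name, value_name):
    """Emit one output row per (id, variable) observation: tidy data."""
    tidy = []
    for row in rows:
        for col, value in row.items():
            if col != id_var:
                tidy.append({id_var: row[id_var], var_name: col, value_name: value})
    return tidy

tidy = melt(messy, "person", "treatment", "result")
# tidy[0] == {"person": "John Smith", "treatment": "treatment_a", "result": None}
```

This toy version just shows the shape of the transformation; Wickham's own tooling in R does melting and casting for real.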

I'll also mention that, once again, I'm struck by how useful it is to name a thing and take it out of the realms of tacit, experiential knowledge and into the world of explicit, inspectable knowledge and hence shared value.

Images: Tidy Data, AbeBooks

Sunday, September 1, 2019

Lego My Ego

The Line. As a model there's not much simpler than a single horizontal line but, for Jim Dethmer, Diana Chapman, and Kaley Klemp, it's sufficient in any quest to become a more conscious leader: at any given moment, a person is either above the line and conscious, or below it and unconscious.

In their book, The 15 Commitments of Conscious Leadership, they elaborate. Being above the line means being open, curious, and committed to learning while being below it means being closed, defensive, and committed to being right. To operate above the line is to have a By Me state of mind (to take responsibility for being in any situation, to let go of blame) while below it is To Me (to believe that external factors caused the situation, to have a "victim consciousness"). Above the line leads to healthy and trusting relationships while below the line leads to toxic and fear-based relationships.

Shane Parrish, interviewing Dethmer on the Knowledge Project podcast, suggested his own snappy summary of the opposition: above the line is about outcomes while below it is about ego. I think this captures the essential idea of the book well and fits with my personal trajectory, albeit one that I feel has been underway for a long, and very slow, time.

My own experience has led me to a place where I try to recall Jerry Weinberg's quote "things are the way they are because they got that way" when encountering a situation, to encourage me to see it as the state that just is irrespective of how it was arrived at. Despite not always managing to either recall it or act on it I still do my best to be motivated by achieving congruence, to resist the temptation to make comparisons, and, particularly, to avoid judging others.

I additionally find humility in this definition of an idiot from Bob Marshall: "Anyone who is just trying to meet their needs, in the best way they know how, where their way makes little or no sense to us." But it's not easy to let go of ego and I fail frequently.

Recognising that I've slipped below the line on a particular occasion is a positive according to Dethmer and his partners (Kindle location 248-249, 250-251):
We suggest that the first mark of conscious leaders is self-awareness and the ability to tell themselves the truth.

Distortion and denial are cornerstone traits of unconscious leaders.

After the core concepts have been introduced, the thrust of the book is strategies for dissolving the ego to reach a dispassionate view of the context, the other participants, and yourself. It presents 15 commitments that signify consciousness and, if I could boil it down crudely to a single sentence, it's about finding ways to frame situations as learning opportunities, or cultivating behaviours that naturally lead to those kinds of framings.

Note that commitment is a loaded term here. It's not a statement of intent, but rather the result of behaviour. The commitment is satisfied by being above the line with respect to it, not by saying you will attempt to be so.

Interestingly, despite apparently aligning with my own motivations, I had a strong negative reaction to aspects of the book. There's a thread of spirituality running through it that I find hard to accept. To pick just one example:
Again, if the universe is benevolent, always organizing for the highest good, then other people are part of this collective support for your personal growth. (2905-2906)
I have a similar issue with energy:
Energy flow is our natural state, but when it’s blocked or interrupted, the life force so essential to great leadership is dampened, and effectiveness wanes immediately and drastically. (1688-1690)
Fortunately, accepting the premise of a benevolent universe or some other higher agency (they also mention Source, Allah, God, Love, Jesus, Presence, and The Tao) is unnecessary for practical purposes.

On that practical side, anecdotes are offered to show how actions can lead to desirable outcomes. Again, my intuition is that above the line is a better place to be than below it but, with an analytical background, I found myself yearning for stronger data.

Despite these and other misgivings, I persevered with the book and still find the central messages compelling. Another quote towards the end seemed to address this point:
When we own our resistance, we see that we simply need more motivation: more vision or dissatisfaction. This is not a problem. It is just what is so in this moment. (3390-3392)
Image: Mercury Rev at Discogs

Sunday, August 18, 2019

First AST the Post

A few months ago I was asked if I'd consider standing for the board of the Association for Software Testing. After giving it a lot of thought, and talking to some people, I decided that I probably did have something to offer, even if it's only the perspective of a relative outsider.

Last week, at CAST 2019, the voting results were announced ... and I'm in!

For the record, then, here are my answers to the election questionnaire.

Q1: Why are you an AST Member?

When I first became a tester I sought out resources to help me learn what testing was. The first book I bought was Lessons Learned in Software Testing and I found that the way it described testing was the way I naturally wanted to test. I researched the authors and their other work which led me to Rapid Software Testing, joining AST, and completing the BBST Foundations course.

I remain a member of AST because of its integrity, its dedication to the testing craft, and its commitment to a strong code of ethical conduct for testing.

The other people I know who are members of AST are, like me, passionate about testing. Sadly, not everyone I know who is passionate about testing thinks the AST is relevant to them.

Q2: How do you intend to promote diversity within the AST? How could AST promote diversity, of all kinds, within our own organization and within the wider testing and technology communities?

The AST’s code of ethics expects its members to respect diversity of cultures and not to discriminate on the basis of race, sex, religion, age, disability, national origin and so on. From the outside, taking the speakers at CAST 2019 as an example, the organisation appears to be demonstrating that it doesn’t believe older white men are the only people who have something useful to say about testing. 
I’d try to promote diversity in and by the AST by encouraging it to seek out people who do not already engage with it and find ways to help them to engage. This might be by making AST relevant to them, by moving AST geographically closer to them, by making AST financially accessible to them, or something else.

Q3: What do you think the AST board has historically done well, and what do you think needs to change?

With only a small number of volunteer staff, I think the board has done well to keep the organisation running, grow and extend its conferences and webinars, and maintain a high level of commitment to testing as a craft and high-quality testing education. 
A significant challenge for AST is that it can seem dry, academic, and a bit dated. Worse, for me, it’s not at all obvious what AST thinks the point of being a member is.

Q4: If you are elected to serve on the board, what is your vision for the future of AST and what do you hope to accomplish as part of the board?

I’d like to see the passion and dedication that the AST has for testing be more visible and be expressed in more accessible ways. Working out how it wants to coexist with other organisations in the testing space is key. 
Simon Sinek suggests that projects should start with the why and I’ve found that to be good advice in software and other projects. I’d encourage AST to understand why it exists before it starts to make any changes. 
Part of that reflection must include an acknowledgement of the current context where there are test conferences popping up all over the place and groups such as the Ministry of Testing doing a brilliant job of promoting testing.

Q5: Many people come to be AST Board of Directors candidates through a long history of community involvement ... Please describe any current initiatives you participate in that might affect your ability to serve on the AST board, and serve the AST membership

Once or twice a year I help to organise CEWT, the Cambridge Exploratory Workshop on Testing, a small peer conference for the testing community in Cambridge, UK. I regularly attend and support the Cambridge Tester Meetup and its sister Testing Clinic, and I blog regularly at Hiccupps
I don’t believe that these activities will get in the way of AST business.

Q6: What is your vision for the future of AST’s training program?

When I talk to people about the AST’s training program they refer to its depth, breadth, and quality … and also to it being a really significant commitment of both time and effort. My own experience of BBST left me with a similar perspective. I’d add that having a cohort studying the same content together over a constrained time, with dedicated and knowledgeable instructors, is a great motivation to learn and an aid to learning well. It makes BBST stand out from other online learning. 
It isn’t cheap to create training this valued and valuable — and as important to the reputation of AST — so alterations or extensions would need to be carefully considered. I’d want to understand where the AST thinks its niche is (see previous answer) and then align the training program with that.

Q7: (Optional) Would you like to provide a short (250-400 word) introduction to go on your candidate page?

James is one of the founders of Linguamatics, the world leader in innovative natural language-based text mining, and over the years he’s had many roles in the company including web site admin, tech support, and dev manager. He is currently the test manager, a position in which he strives to provide an environment where his testers have an opportunity to do their best work. 
He organises CEWT, the Cambridge Exploratory Workshop on Testing, which has covered topics such as testing ideas, when testing went wrong, why we test, and the values of both testing practice and testing theory. He’s active in the Cambridge testing community, regularly attending, presenting, and promoting its activities.
At EuroSTAR 2015, he won the best paper prize for his talk and essay Your Testing is a Joke, on the relationship between testing and humour. He’s also spoken at UKSTAR 2017, where he described his personal definition of testing and how he arrived at it, and Softtest 2018, about the two-way benefits of testers getting involved in customer support, a topic also covered in his Ministry of Testing ebook, When Support Calls
He’s on Twitter as @qahiccupps and blogs at Hiccupps.
Image: Maria Kedemo

Sunday, August 4, 2019

Licensed to Coach

Last week, unexpectedly, I became a Licensed Scrum Master. I knew I was attending a course run by Scrum Inc with a group of my colleagues, but I had no idea that there was a short online multiple choice exam and a certification for me at the end.

It's easy to be sceptical and sniffy about the industry around Scrum, but I try to go into training events in a positive frame of mind, looking to participate, open to having my views changed. I'm not sure that there was a seismic perspective shift for me on this occasion, but I still thought it'd be interesting to run a personal retrospective.

Some background first: I've never worked on a Scrum team, although I manage testers on them. I'm very interested in ways that software can be developed and tested, and one of my favourite books in that space is Extreme Programming Adventures in C# by Ron Jeffries, a great practical example of iteration, reflection, and honesty.

A disclaimer too: without much visibility of the remit of the training, and knowing that the level of Scrum experience of the attendees ranged from 10 years to none in the group I was in, my comments here are knowingly, selfishly, biased to my own preferences.


  • Participants were split into "Scrum teams" for the class, with Product Owner and Scrum Master roles being nominated. This was neat, both to build up some team spirit and reflect some of the material being taught.
  • The Scrum Inc people I dealt with were good: knowledgeable, open, seeking feedback and acting on it. 
  • From what they said and the evidence I saw, Scrum Inc eat their own dog food. In particular, both trainers were able to give nice examples of their own teams' values, off the cuff.
  • The course material reiterated that Scrum exposes problems for the team and organisation rather than providing some kind of magic wand to remove them.
  • Teaching sessions were organised as "sprints": a Kanban board for sections of the course, with a burndown chart showing progress, velocity and so on. I thought this was a gimmick to begin with but came to think I'd have liked to have seen it extended.
  • The content included extremely light touches on lots of interesting (non-Scrum) stuff such as Shuhari, Weinberg's Systems Thinking (although only for a comment on context-switching costs), Simon Sinek's Golden Circle, Value Stream Mapping and The Theory of Constraints. Perhaps it'd have been nice to have a categorised list of further reading in the course notes.


  • Statistics motivating the use of Scrum (or sometimes simply "Agile") were glossed over too quickly and with too little definition for my taste. See also The Leprechauns of Software Engineering.
  • The practical exercises weren't geared to explore the course content in much depth. This is understandable; time is limited. But I found it unsatisfactory and the type of exercises — sometimes with a competitive element — were distractions from the key points.
  • The why (Scrum's pillars and values) got lost in the what (the mechanics of the ceremonies and the artifacts). I fear that this risks promoting a kind of cargo cult Scrum.
  • Similarly, the how (what the Scrum Guide mandates for Scrum) wasn't heavily distinguished from another what (potentially useful practice to implement Scrum) either. Story points, burndown charts, and Kanban boards might be helpful, but they're not fundamental. It's possible to do Scrum without them.
  • I would have liked to have seen more on the roles. The Scrum Master should coach the development team, the Product Owner, and the company, but we didn't get to that beyond a bullet point.
  • What the business commits to the team was also thinly covered. The focus of the training material was heavily biased to what the team does and what it commits to deliver to the business.


  • Relate all teaching directly back to the Scrum pillars (transparency, inspection, adaptation) and values (commitment, courage, focus, openness, respect) and explain how the activities serve them.
  • Cover the human side of this kind of work. An enormous factor in the success of a team will be the interpersonal relationships. There definitely isn't time to coach the participants in (say) assertiveness and feedback, but some acknowledgement of their importance would make the content stronger.
  • Replace the piecemeal exercises with a themed set that runs for the duration of the course, based on some kind of development project. Use it to explore the ceremonies, roles, and artifacts more thoroughly. Admittedly, this would still be artificial but the context built up over successive exercises can mitigate that to some extent.
  • Provide more coaching in the roles. It was great to have Scrum teams for the duration of the training, but the PO was reduced to housekeeping in several of the exercises.
  • Provide more demonstrations of actual practices, such as the burndown charts for sessions. I'd have liked to have seen the trainers explicitly facilitate a course retrospective, for example, to show how experienced practitioners go about their work.

The picture at the top is how the Scrum team I was in approached the challenge of explaining Scrum in 60 seconds using a deck of cards showing the roles, events, and artifacts. Shorn of the details of particular practices, the core iteration and reflection is clear to see. So clear, in fact, that we explained it in less than 30 seconds.

And that's one of the things I'll take away: for sure, Scrum comes with a lot of baggage (not least that it's often seen as unthinking management's silver bullet) but at heart, as the Scrum Guide says, it is transparent and iterative and (my emphasis) ...
... not a process, technique, or definitive method. Rather it is a framework [that] makes clear the relative efficiency of your product management and work techniques so that you can continuously improve the product, the team, and the working environment.
But for it to have a reasonable chance of succeeding I think all concerned are going to need to understand and buy into that.

Monday, July 29, 2019

Hey Little Hen

How about if it wasn't a Venn diagram, but instead it was a When diagram? Would we know when things were going to get done if we had one of them?

I'm sure I'm not the first person to make that phonetic connection, but perhaps I'm the first with sufficiently little shame to mention it publicly on Twitter:
If I asked you to produce a "When diagram" for the project you're running, what would you think I was after? What kind of picture/chart/diagram would you draw for me?
A handful of kind folk responded, each with something different:
  • @joelmonte: When, in the project perspective, sounds like a milestones diagram.  It can be interconnected, and showing the dependencies of events...  But this is only my sunday-morning-wild-guess.  Do we also have "Why", "How" and most importantly "Who is to blame" diagrams?
  • @always_fearful: Gant chart
  • @hairyhatfield: A venn diagram with sets 'sooner'  'later' 'essential' 'nonessential'
  • @SheyMouse: A scrum master used to use a burn up chart snapshot to show likelihood of success. There were three projections "sooner" if velocity increased, "current" rate, "later" if velocity decreased. C-level understood this. Quite effective in explaining scope creep too.
  • @zaphodikus: A line graph of time versus likelihood that I have a real live customer and that they will be so happy they pay a million dollars for the output

I thought it'd be fun to sketch how I thought their ideas might look, so I imagined a little project with a handful of features and dependencies and drew this:

We're probably all familiar with the Gantt chart to some extent and those of us who've had the misfortune to have to maintain one for a project of any complexity or a changing context probably hate them with a passion. 

The Gantt chart's focus on visualising dependencies is something that I like a lot and I've sometimes put effort into a Kanban-plus-dependencies visualisation. The milestones diagram here feels somewhat like that, although a forecast rather than a current status. This may be due to my interpretation of the milestones as lanes rather than dates.

I find the burn up chart a compelling representation of progress through a task list, with the added benefit of showing the changing size of the task list. The projections, based on rate of completion of tasks so far, are interesting and a useful way to represent the uncertainty around estimated completion dates.
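Those projections can be sketched with simple arithmetic: divide the remaining work by an assumed velocity. This is only a minimal illustration; all the numbers below are invented:

```python
# Hypothetical burn-up projections: when might the work finish
# at different velocities? All numbers are invented.
scope_tasks = 120        # current total size of the task list
done_tasks = 45          # tasks completed so far
current_velocity = 5.0   # tasks completed per week, observed so far

remaining = scope_tasks - done_tasks

projections = {
    "sooner (velocity +20%)": remaining / (current_velocity * 1.2),
    "current rate": remaining / current_velocity,
    "later (velocity -20%)": remaining / (current_velocity * 0.8),
}

for label, weeks in projections.items():
    print(f"{label}: ~{weeks:.1f} weeks")
```

The spread between the sooner and later lines is, in effect, an error bar around the estimated completion date.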

I interpreted the time vs likelihood chart as representing potential value creation, with some line at which sufficient potential value has been implemented to take some action, such as deploying or moving out of beta. On the assumption that tasks are prioritised based on business value this feels like it would track the burn up. 

The When diagram that's most like a Venn diagram is interesting to me because, the way I drew it, there are two pairs of partitioned spaces which make it essentially into quadrants. If the assumption I made here — that nothing can be both sooner and later, and nothing can be both essential and non-essential — isn't right, then a more traditional-looking Venn diagram might be generated. I feel that there's some connection to the important/urgent distinction trying to get out of this picture too.

OK, that's me done, but why did I bother? Because ideas are interesting to me in their own right, and because a trigger like two words that sound the same can provoke a thought I hadn't had before, one that might lead to some insight I haven't had before. That's something I covet.

In this exercise, the things I liked thinking about the most were the projected Kanban lanes, the notion of pooling tasks into sooner and later visually, and the graphical representation of error bars around estimates.

Thank you to everyone who responded. I really appreciate it.

Wednesday, July 24, 2019

Ideas and Learning

The Cambridge Tester Meetup last night had talks from Jamie Doyle and Samuel Lewis. I took the opportunity to practise my sketchnoting again.

Jamie is a business owner and an ex-tester and test manager. He described how, despite his background in testing, he has still kicked off product development based on sketchy 3am "great ideas" in teams without any testers.

Having lost some money building the wrong thing, he's now an advocate of shifting testing left. He recommended that the C-level get testers in to provide risk assessments of ideas before they're committed-to, and that testers look to make contacts on that side of the business and get themselves in a position to be asked to help.

He also shared some of the approaches and questions he used redoing the problematic project with a friendly tester from the bottom up. Interestingly for me, this exercise sounded like the kind of thing a business analyst might do. I see a lot of crossover between the test and BA roles but Jamie appeared to draw a really strong line between them, based on his experience: testers to review ideas, and BAs to break down requirements.

Samuel's talk was about the framework he has built at DisplayLink for helping new starters feel less anxious, more productive, and happier in their early days at the company. He talked essentially about two things.

First, the way the framework was developed: by talking to people about what they thought was important to learn, finding ways to document that (with learning objectives for each piece), lots of review, and then finding a tool to manage the pieces.

Second, the way the framework is applied and maintained: by giving each starter a copy of a Trello board containing the intro tasks, by support from a buddy, in 1-1s, in regular conversation in stand ups, and by feeding back their experience into the process for the next person along.

Monday, July 22, 2019

I Guess

I really enjoyed providing pre-production comments on Rich Rogers' book on quality, Changing Times, so when the opportunity to do the same for George Dinwiddie came up recently, I took it.

Why? Oh, a handful of reasons, including:
  • I'm here for the testing and reviewing feels a lot like testing. (My definition of testing: the pursuit of relevant incongruity.)
  • There's the interesting intellectual challenge of finding a way to provide the kind of review being requested effectively and efficiently.
  • There's the interesting social challenge of delivering my thoughts in a way that conveys them respectfully, despite sometimes being critical.

George's book is called Software Estimation Without Guessing and I knew up front that there would be two rounds of review for it. The first was on a version with a couple of chapters still to be written, the second with all content present but further editing still required.

The publisher, The Pragmatic Bookshelf, provided clear guidelines on the kind of review they wanted and, in particular, they asked for grammar and typos to be excluded. Initially I felt that this might be a mistake — I could see them, why not point them out and reduce the chance they'd be missed later — but increasingly came to think that it was a good call. By ignoring them, I didn't break my flow to annotate stuff that is probably bread and butter to professional editors, instead delivering what value I could by commenting on the content based on my domain knowledge.

Pragmatic also provided a list of questions, which I decided to treat as an aid to reflection, and so this is how I approached the initial tranche of feedback:
  • I read from the beginning to the end without skipping forwards at all.
  • Along the way, I annotated the manuscript directly with suggestions, criticisms, and things I was wondering about. I hoped that understanding a reader's expectation at different points might be interesting to George in terms of structuring.
  • I didn't read in blocks of less than a chapter.
  • After each chapter, I re-read Pragmatic's questions and, in a separate text file, added fieldstones for each of them, and also for an "Other" category. These comments tended to be at a higher level than those made directly into the draft.
  • When I had read all of the material, I sorted and edited down my answers to the questions.
  • Finally, I reviewed my comments on the manuscript. To make it clear which comments were my first reactions and which had been made with the benefit of hindsight, I flagged this later round with "(2)".

Using Pragmatic's questions to structure my big-picture thoughts was useful even though talking to the editor later I found that they're a reasonably standard set. I treated them essentially as a checklist:
  • Who is the audience for this book? Is the book’s tone appropriate for that audience?
  • Is the book well-organized? Is the material presented in a reasonable order, and does it flow well from one topic to the next? Does the table of contents provide a useful guide?
  • Is the book correct? Are there any technical details that are in error, misleading, or perhaps recently superseded?
  • Is the book engaging? Do you want to keep reading it?
  • Is the book complete? Are there any missing topics, or extraneous topics that should not have been included?
  • Is the book consistent? Is the level of detail consistent and appropriate? How about the audience being addressed?
  • Would you recommend this book to others? Why or why not?

The second round manuscript provided the material that was absent in the first draft, but didn't change the other content. Again, I thought about how to approach it to compromise reasonably between the time I could put in, the coverage I was being asked to provide, and the depth I could go to. This is what I did:
  • I read through only the new chapters, annotating as I went.
  • Again, I read in chapter-length blocks.
  • I read through my answers to the publisher questions from last time.
  • Where it seemed appropriate, I provided new answers in which I was prepared to emphasise, downgrade, add, or remove my originals.
  • I reviewed my comments on the new material looking for inconsistencies between it and earlier comments.

Retrospecting on my methodology, I notice how natural I find this kind of cycle: dive in, record findings and feelings, reflect, choose a level of review and go again. I see it in my day job too. The publisher checklist gave me a structure in this case, but where I don't have one I'll usually distil it from the notes I've recorded.

The human element feels crucial to tasks like this. Beyond some Twitter chat, reading his blog, and watching a recent webinar I don't know George. However I've been on the receiving end of many reviews and know well my sense of indignation and self-righteous rightness when confronted with some thoughtful analysis that contradicts a line I've taken. (This is only amplified when changing my line would involve a lot of work.)

I try to bear this in mind when providing requested feedback, but it can sometimes be hard to balance humility and candour. When in doubt, for this task, I've tried to err on the side of my truth, hoping that'll be more useful to hear, even if it's dismissed.

And that prompts one final thought. Although I doubt that he ever had any other intention, I found myself feeling the need to say this to George about my comments: "Ignore anything you want. It's your book!" That's not to belittle my effort, but to reflect the mindset with which I tried to approach the review. It's only my opinion, and mentally tagging what I've said with "I guess" can help to remember that, as it's read but also as it's written.