Saturday, June 23, 2012

Geek and Ye Shall Find


If you aren't using automation to get you to a place where you can efficiently apply intelligence, you're almost certainly wasting time and effort. Michael Bolton put it this way on Twitter last week:
 I like ... using automation as a taxi to drop me off at the site of the real work.
When we moved offices a couple of months ago, Marketing updated the contact details in their collateral and then requested that we review the changed PDFs. Simple approaches to this task would include reading all of the documents from top to bottom, skimming them looking for addresses to check or using a PDF reader's search functionality in each document to look for likely mistakes. All of these would have been slow and prone to error, with no good way to make them more efficient.

What we actually did was automate, using command line tools to do the heavy lifting. First, we transformed the PDF files into a text format in which they could easily be searched as a set:
 $ find . -name '*.pdf' -exec pdftotext {} \;
Then we identified search terms that would signal the presence of outdated contact details. Some examples include the old postcode, building and phone number:
 $ grep "0WS" *.txt
 $ grep "St.*John" *.txt
 $ grep 421360 *.txt
It's important to understand the limitations of your tools and techniques. In this case we didn't search for long strings like "St. John's Innovation Centre" which could have been broken over a line, or which would have missed cases where "Johns" had no apostrophe. We could have added more complexity to account for that, but given the task it didn't seem worth the effort.
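
For what it's worth, here is a rough sketch of what that extra complexity might look like. We didn't run this at the time. The function name find_old_building is made up for illustration, and it assumes pdftotext emitted a plain ASCII apostrophe (a typographic one would need adding to the pattern). It flattens each file onto a single line so that a phrase broken over a line is rejoined, then allows for the optional full stop and apostrophe:

```shell
# Sketch only: find_old_building is a hypothetical helper, not the original method.
# Prints the name of each file that mentions the old building, even when the
# phrase is split across lines or written without the apostrophe.
find_old_building() {
    for f in "$@"; do
        # Turn newlines into spaces and squeeze repeated spaces so that
        # "St. John's\nInnovation" becomes "St. John's Innovation".
        if tr '\n' ' ' < "$f" | tr -s ' ' | grep -Eq "St\.? John'?s Innovation"; then
            printf '%s\n' "$f"
        fi
    done
}
```

Calling it as find_old_building *.txt lists the files worth opening. The trade-off is that you lose the matching line itself, which is one reason the simple grep was good enough for us.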

This approach took minutes to implement and turned up a bunch of errors including this one, a combination of new and old postcode:
Linguamatics 324 Cambridge Innovation Centre Cowley Cambridge CB4 0WG 0WS
Arguably this is only semi-automation because we drove each step by hand. Even if so, this just shows another strength of automation - you can use it for investigation, for exploration. If we think of a new search term we can try it cheaply and get data back on its effectiveness immediately. Because it's cheap we can try many search terms and, once we've got a good set, we can put them all into a short script and re-use it in the next round of testing. The Linux history command is a handy way to review what you've tried.
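
A short script of that kind might look something like the sketch below. The three patterns are the ones shown earlier; the function name check_outdated and the convention of returning non-zero when anything is found are assumptions for illustration:

```shell
# Sketch only: check_outdated is a hypothetical name.
# Runs each known "outdated detail" pattern over the given text files,
# prints every hit as file:line:text, and returns 1 if anything matched.
check_outdated() {
    found=0
    for term in "0WS" "St.*John" "421360"; do
        # -H always prints the file name; -n adds the line number.
        if grep -Hn -- "$term" "$@"; then
            found=1
        fi
    done
    return "$found"
}
```

Run it as check_outdated *.txt after the pdftotext step. Adding a new search term is then a one-word change to the list.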

In fact, we'd previously set up a simple script to chain together tests like this as a gate on generating our user documentation. Our source is HTML and, before it's built into the product, the doc team run a script that checks for well-formedness, broken links, spelling, a bunch of common typos and so on.
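
The real script isn't reproduced here, but the chaining idea can be sketched roughly as follows. Everything in this block is a stand-in: run_gate, check_not_empty and check_typos are hypothetical names, and the typo list is invented for the example. Each check takes a file and returns non-zero when it finds a problem:

```shell
# Sketch only: a minimal gate that runs several checks against one file.
# Usage: run_gate FILE CHECK...
# Prints PASS or FAIL per check and returns 1 if any check failed.
run_gate() {
    file=$1
    shift
    failed=0
    for check in "$@"; do
        if "$check" "$file"; then
            printf 'PASS %s\n' "$check"
        else
            printf 'FAIL %s\n' "$check"
            failed=1
        fi
    done
    return "$failed"
}

# Hypothetical checks standing in for the real ones.
check_not_empty() { [ -s "$1" ]; }
check_typos() { ! grep -Eqiw 'teh|recieve|seperate' "$1"; }
```

Because every check runs even after one fails, a single pass reports all the problems rather than just the first.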

This is cheap and cheerful automation that can become part of your regular tool arsenal, freeing you up to think about how to apply your testing brain, not how to keep track of the donkey work you've done in case you need to redo or change it.
Image: http://flic.kr/p/7FDvu2

Wednesday, June 20, 2012

A Glass of Weinberg

So the Dev Manager finally started his own blog. Obviously, I was more than happy to pour feedback on it but more than disappointed that there's no public bug tracker, so I had to content myself with exaggerated outrage and inflated prioritisations via email. Which just isn't the same.

But, joking aside, we should offer up a toast to him for writing positively about Weinberg's Perfect Software in one of his first posts even if he misinterpreted my face when I passed him the book as hopeful. In fact it was merely reflecting my ennui, despair, frustration, coffee intake and thoughts of an impending holiday.

Weinberg wants to know about how his writing has affected people's lives and I'm anxiously awaiting developments here too.
Image: http://flic.kr/p/9V2bKU

Saturday, June 9, 2012

Never Thought

Maybe I'm just an oversensitive, subtle and thoughtful kind of guy, or attuned to ambiguity, or unempathic, or unimaginative, or a git, or plain old-fashioned thick, but being precise and clear with language is a skill that I value highly and strive to achieve.

With questions, I find that I'm unlikely to assume the context if it's not already present or given. So, for example, I probably would not have a simple answer to "should the product do X?" Are we discussing a version of the product in the field or the product in development? Is it my personal opinion you're after? My user-head opinion? If so what kind of user? Perhaps it's what the spec says or said once upon a time? Maybe X is a reasonable result under certain circumstances but not others? Are you implicitly telling me that the software does do X and you're not sure whether it's right? Or that you think it might be intended to do X, and it doesn't? Or asking whether we need X in the product? Maybe you're engaged in a thought experiment and have a new set of potential applications for the product in which X is the jumping off point?

Statements are no easier. If I were to hear "I'm not sure about feature Y" I'd want to understand whether the uncertainty is a measure of the confidence of the speaker in the work they've done (perhaps they aren't sure they tested sufficiently broadly or with sensible test data, or at the right scale) or in Y itself (perhaps the AUT is clearly unstable in some respect) or both or something else altogether.

I've got particular bêtes noires ("seems to work", gah!) and recently one showed up a couple of times in the space of a few days from different people, but fortunately not anyone at work. "The application will never Z" is how it went, in the context of a discussion on the kinds of testing that might be necessary for an application they had no experience of using, with the coda that we thus didn't need to test for or around Z.

Never? If you mean that you think Z is unlikely given all the information that you have, say so. If you mean that the requirements don't mention Z, then say so. If you mean that Z is not a valid input or output, then say so. If you mean that an action required to achieve Z is forbidden by the environment in which the application is (intended to be) deployed, say so (but let's ask whether it'll ever be deployed otherwise). If you mean that you've spoken to the developer and he says that the code path that represents the Z scenario rejects the input with sensible error handling, say so. If you mean that you have been testing this application and haven't observed Z, say so (but let's discuss the coverage). If you mean that you can't think of a setup in which Z could be observed, say so. If you don't understand why I might be banging on about provoking the product to produce a Z (as precisely as I can) don't try to cut the conversation off by saying it'll never happen and instead please just say so.
Image: http://flic.kr/p/cagK3S

Monday, June 4, 2012

Slight of Hand


As the wartime slogan had it, careless talk costs much time in reimplementation and testing and ultimately late delivery of software that contains less than you wanted at a quality level of just-about-bearable and which will be followed immediately by a patch release, no two, no three patch, no four patch releases.

It's incumbent on everyone in a development project to share knowledge efficiently by getting key points across economically. If you're not comprehensive, coherent, correct and concise, you risk other people missing your point because they never heard about it, couldn't follow it, didn't believe in it or lost interest in it, and that leads inexorably to extra costs.

In that spirit, the meat in this post is that you should endeavour to be as open as possible, full in description and slight in length. When these are in conflict, strive to remove the need for fullness. If you must be full, then structure your content accordingly.

Software teams are often dislocated in place but even if not they will be dislocated in time for some of the time. Documentation (including wikis, bug reports etc) provides a form of collaboration and knowledge sharing between the author(s) and the reader(s). When the readers are also authors it's a more involved collaboration and when a document is the primary mode of communication too there's greater need for completeness, efficiency and clarity. As with all collaboration there is a cost and, because there are usually fewer authors than readers, it's generally cheaper for the writer to pay up front.

Here are some of the things I try to keep in mind when I'm writing for work.

Dialogue vs monologue. In a conversation the listener can interrupt but on the page it's up to the writer to provide context. If there's insufficient context your readers won't know what you're talking about and it will cost them time and effort to divine it - and they might not get it right.

If you say the same things in multiple places you create a maintenance headache or a breeding ground for inconsistency. As the Dev team almost have it, Don't Rewrite Yourself. Instead cross-reference or provide hyperlinks when possible to give context.

Avoid anaphora with no obvious antecedents in the context. So "in the meeting it was decided that ..." might be OK for you and the attendees today but what about someone else next year? If it's not important that it was a meeting, don't say so. In any case, give the important information which is, e.g. "The CTO said today that ..." Likewise, "elsewhere", "above" or "below" should be replaced with a link to the place. Documents are dynamic so don't just assume the layout or structure will persist.

You can avoid the need to explain terms if you just don't introduce them. Use the standard name for the thing you're referencing and do so consistently. Try to avoid even changing its spelling or capitalisation because, to readers, differences raise questions. Don't create implicit definitions either - when you need a new name for something then be explicit, once, clearly.

If you're digressing, and the digression is useful but not relevant, move it to somewhere else and reference it.

Group related points together. They provide and reinforce context.

Don't describe something when you can easily show it and don't overshow just because you can. If you have a log or other trace then use it sparingly. Show the key lines with timestamps and full error messages and provide the rest in a link.

Screenshots are good. When they're good screenshots. A verbal description that leaves an exercise for the reader is wasteful of a reader's time. If you have many readers, it's wasteful many times. A tight text and a screenshot with a red circle round the problem is cheap and useful.

Justify your conclusions, but as briefly as possible. Resist the temptation to do no more than point the reader at a starting point from which they could reason to the conclusion themselves, although do point to that starting point too. Otherwise, you're selling yourself short and your reader long.

Don't play your cards close to your chest. You're in a team and what you think is irrelevant, impossible or ignorable may be none of those things to someone else on the team.

Write short, declarative sentences and short paragraphs. Don't be afraid to restructure and cut what you've written before you commit it.

Use indentation, font styles, bullets, headers and other formatting when they're useful for explanation or clarity. Avoid them otherwise - they're just a visual distraction. Use tables or diagrams when they can compress or simplify lists or prose. But apply the same principles to them as you would to writing - be precise and concise.

Sometimes you do need to write more, but in that case, structure so that the key stuff is first, like a journalist would. Key points up front; context and detail later. Try to avoid putting your interpretation in front of the facts. For example, start with the problem not with a question about how to implement one possible solution.

None of this applies to blog posts, of course.
Image: http://flic.kr/p/5fT3A8