Tuesday, July 31, 2012

Testing in My Sleep

I think I read it in something Oliver Burkeman wrote recently but the idea is all over the place: counting backwards in threes can help you to fall asleep when you've got stuff on your mind and can't settle down.

The idea is that it's a sufficiently complex activity that you have to focus on it rather than all the other things crowding your consciousness and keeping you awake, but it doesn't have a high enough cognitive load to keep you awake itself.

So the other night, after one of my daughters had woken me up at 2am to tell me she loved me (thanks, but...), I tried it:
500, 497, 494 ... all going well ... 491, 488 ... counting down ... 485 ... a second voice: how can we check we're on track? ... 482 ... how about using 470? ... 479 ... 10 by 3 is 30 ... 476 ... and 30 off 500 gives 470 as a simple verification ... 473 ... Hmm, 473 less three, yes! ... 470 ... That's a pass. We can check again at another 30 off 470, that'll be 440 ... erm, 467? ... then 410, 380, 350, you not asleep yet, James?
Image: http://flic.kr/p/83AsRH

Monday, July 30, 2012

Testing Generally


I sometimes consciously split the functionality I'm testing into two parts: general, which is behaviour that is the same, or similar, regardless of where it appears, how it is invoked and so on; and specific, which is behaviour that differs according to function, context, time, data types etc.

I'll tend to do this more on larger projects when the areas are new to me, or to the product, or if they're complex, or I think the test framework will be complex, or the specific is heavily dependent for its delivery on the general, or perhaps when the specific details are certain to change but the general will be stable.  

I'll be looking to implement automation that concentrates first on general functionality and self-consistency and that will serve as a backstop when I move on to the more specific material. 

To speed things up, to get wider coverage easily, and to avoid dependencies, I'll try to avoid crafting new test data by looking for data already in the company that can be reused. Static dumps from live servers can be good, but dynamically changing internal landfill instances are gold dust because they'll be running the latest Dev build and generating new data all the time.

Take the example of a server which exposes an API using HTTP. The API gives clients access to resources (by URLs) and actions on those resources (e.g. searching across back-end data sources).  My functionality breakdown might include the following:
  • general: each resource is exposed to a client as a custom data structure, but some properties will be shared across resources, e.g. "children" always represents sub-resources whose URLs can be derived from the resource itself.
    An interesting subset of general functionality is that based on standards. In this case, the HTTP standard for client-server communications is well-defined and independent of your product (although your product may only implement parts of it and there are areas in which there is leeway for client and server to choose an action).
  • specific: any functions on the resources that are outside of HTTP are specific. For example, the query parameters on URLs will have a specific meaning to this server.
So how might I set up general testing, using pre-existing data here?

There's a huge space of potential tests to do with conformance to HTTP RFCs. As these tests are, for the most part, independent of the data in your system, you can implement them without worrying about what data you have (for example, if you request a resource that's not there, the system should respond with a 404).

A particular general test might request the children of a collection resource (effectively a folder) and then request each of them in turn. If they all exist, it confirms a degree of consistency between the back-end data, its presentation in the API, and the client-side view of it. Conversely, requesting a resource that you know should not exist (e.g. http://myserver/collection/thiswasnotachild) can confirm error behaviour. Note that you cannot confirm this way that all of the children that should be there are present, without extra knowledge of whatever backs the server.

A subclass of specific tests is close to general: system metadata. That is, a set of attributes of the product that are true regardless of the data that's in the system. In the server example, perhaps there is a finite set of resource types that the server will enumerate. You can cheaply check that the server's list agrees with a list in your test suite without knowing what data is stored for any of those types.
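
A minimal sketch of that comparison in Python follows. The type names are invented for the example, and how the server's list is obtained is left out because it depends on the product.

```python
from typing import Dict, FrozenSet, Iterable

# Hypothetical list kept in the test suite; real names would come from the product.
EXPECTED_TYPES: FrozenSet[str] = frozenset({"collection", "document", "search"})


def compare_resource_types(
    reported: Iterable[str],
    expected: FrozenSet[str] = EXPECTED_TYPES,
) -> Dict[str, FrozenSet[str]]:
    """Report types the server omits and types the suite does not know about."""
    reported_set = frozenset(reported)
    return {
        "missing": expected - reported_set,
        "unexpected": reported_set - expected,
    }
```

Reporting both directions is deliberate: an unexpected type is often just a new feature that the suite hasn't caught up with, while a missing one is more likely to be a regression.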


If there is a lot of data in your test systems, randomising access to it lets you trade run-time of a given invocation of the suite against cumulative coverage over time, because different sets of data will be visited on each run. You can implement a cache of what's been touched in recent runs and avoid it later, although I have found this not worth the hassle. On a landfill server, data can change under your feet, which adds another dimension to the testing. And note that it can be productive to run the suite against servers without any data in their back-end stores at all.

These kinds of suites can also be parameterised. For example, we could ask the randomisation to run tests for a certain period of time, to a certain depth or breadth, for a certain number of data items or some other limit or search strategy. In an automated GUI test suite we're building at the moment, we're playing with parameters representing user behaviours such as "fast" vs "slow", "keyboard" vs "mouse" and so on for different invocations of the suite - running the same tests in different ways.
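
One way the randomisation and its limits could be sketched in Python is shown below. The function name and parameters (max_items, max_seconds, seed) are invented for the illustration, and only the item-count and time limits are covered, not depth, breadth or other search strategies.

```python
import random
import time
from typing import Callable, Iterator, List, Optional


def sample_items(
    items: List[str],
    max_items: Optional[int] = None,
    max_seconds: Optional[float] = None,
    seed: Optional[int] = None,
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[str]:
    """Yield items in random order until an item or time limit is reached."""
    # Logging the seed for each run lets a failing run be replayed in the same order.
    rng = random.Random(seed)
    order = list(items)
    rng.shuffle(order)

    deadline = None if max_seconds is None else clock() + max_seconds
    for count, item in enumerate(order):
        if max_items is not None and count >= max_items:
            return
        if deadline is not None and clock() >= deadline:
            return
        yield item
```

A nightly run might pass a generous max_seconds while a pre-commit run passes a small max_items, so the same tests are used at different costs.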

So why might this kind of testing be interesting?
  • It puts you in the product (or in the technologies on which the product depends) immediately, learning about both, getting background for the specific testing and testing to a level that is practical and sensible at any given time.
  • You quickly flesh out the basic structure of your test harness, learn what kinds of utility functions you'll need and the like. This can be invaluable when you're ready to extend to specific tests because you've got the infrastructure in place already.  I try to partition the two sets of tests so that I can run them separately.
  • You end up testing against all sorts of malformed data (intermediate formats, buggy data, crafted data from the dev team, antique data from previous releases...) and learn a lot about how the application copes with them.
  • Consistency is a testing watchword (see e.g. FEW HICCUPPS) and time spent understanding the baseline level of consistency of a feature or product is seldom wasted. General testing is a lot about consistency.
  • When you're ready to, and if it makes sense to, you can extend to creating data as well. If I do this, I make a point of cleaning up test suite data at close.
It's clear that this approach has limitations. In particular, although it's data-driven, it's driven by the data that is present and by a one-sided view of that data. If it passes, it will tell you that no inconsistencies were found in the data and functionality touched by a particular run, but no more.

Despite this, it can be very productive and later become a regression test that extends as the data you point it at evolves. There's usually suitable data lying in your dev and test environments that belongs to you, is otherwise redundant, and that you can get extra value from.
Image: http://flic.kr/p/55ryMX

Tuesday, July 17, 2012

A Clavicle Education


Co-location is intrinsic to some software development and it can also have social benefits, build an esprit de corps and smooth out the kinds of communication issues that time zones and typing often cause. But, for me, there's another softer reason why co-location is advantageous - I learn stuff in passing from the natural interactions I have in the course of a working day. 

When I go to a colleague's desk to ask about some functionality, and they pull up source files for inspection, I'm looking at the text, but I'm also interested in the editor they're using, the powerful ways it lets them search/replace and the fact that it has a plug-in for fancy diffing that I wasn't aware of and that I can use myself next time.

Sitting with a developer as they write code is a welcome insight into the mindset of someone who really knows the nuts, bolts, screws, rivets, nails and other fixings when my skills, relatively speaking, extend to being able to tell a bradawl from a bread knife. And it's often an interesting view inside a piece of the product too, with architectural details prompting thoughts on potential test vectors.

While one of the team is looking for patterns in a huge number of log files, I might see how a chain of a few simple Linux utilities can become an exploratory testing probe.

Workmates with expertise in our application and the domains it's used in dispense nuggets of useful information too - perhaps just by having their browser open on a site when I go over to talk to them I discover a whole new resource of testing material. Or when they work the product I see new paths through it that I can try to exploit next time I'm in that area.

Even the way that different staff interact with their computers is a constant source of new end-user behaviours. A hundred tabs in one browser? Navigating the Windows desktop and its applications by keyboard only? Numerous applications open for hours even though they're finished with, and the machine is running like a dog? Constantly shuffling windows to and from the centre of the desktop to reflect the user's focus? They're all out there, and plenty more.

Standing at the shoulders of workmates, you should always keep testing antennae out. When you're learning one thing you'll often find out something else just as useful for free.
Image: http://flic.kr/p/4oGgfp

Wednesday, July 11, 2012

Mock the Afflicted


The concept of test doubles is well-established in unit testing with mocking probably the most familiar. The idea is that you fake enough of an API to permit unit tests to run against it. You can control the way that the mock API responds, tailor your test coverage and avoid external executables, dependencies on other code and so on to run your tests.
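
For anyone who hasn't seen one, here is a small sketch of a mock using Python's standard unittest.mock module. The describe_status function and the client's get_status method are invented for the example and don't come from any real API.

```python
from unittest import mock


def describe_status(client, url):
    """Code under test: turns a status code from an API client into a message."""
    status = client.get_status(url)
    return "ok" if status == 200 else f"failed with {status}"


# The mock stands in for the real client, so no server or network is needed,
# and the test decides exactly how the faked API responds.
fake_client = mock.Mock()
fake_client.get_status.return_value = 503

message = describe_status(fake_client, "http://myserver/collection")

# The mock also records how it was called, which can be asserted on.
fake_client.get_status.assert_called_once_with("http://myserver/collection")
```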

It can be useful to use similar concepts at the component level too. For example:
  • for diagnosis and investigation of misbehaviour in complex systems you can replace components in a workflow. I like this especially when the problem behaviour is hard to reproduce naturally. Part of our core product is a server that calls out to other software for some tasks and returns data to a client. Replacing a server-side executable with something that generates a specific output (e.g. returning an error response to the server, or bad data to the client) or behaviour (e.g. taking so long to respond that the server is forced to time out) can be extremely helpful in identifying a cause or prompting an effect.
  • similarly, you can capture input data for a component by replacing it with a script that simply dumps its arguments to some convenient location. This is particularly useful when you have no doc and/or the product logging doesn't give you enough information, or you don't trust it.
  • in regression tests it can be useful to isolate individual executables and have them talk to a shim rather than another real component. I'd almost always also run the real components in tandem too, for the potential interaction side-effects. Once you have the framework, multiple variant runs can be cheap.
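
As a sketch only, a stand-in for a server-side executable covering the first two cases above might look like this in Python. The STUB_* environment variable names are invented for the illustration. A real replacement would also need a small entry point, noted in the final comment, and would have to match whatever calling convention the server expects.

```python
import json
import sys
import time


def run(argv, env, out=sys.stdout):
    """Record how the stand-in was called, then behave as configured."""
    # Dump the arguments somewhere convenient, one JSON object per call.
    dump_path = env.get("STUB_DUMP_FILE")
    if dump_path:
        with open(dump_path, "a", encoding="utf-8") as dump:
            dump.write(json.dumps({"argv": list(argv)}) + "\n")

    # Optionally take long enough that the caller is forced to time out.
    time.sleep(float(env.get("STUB_DELAY_SECONDS", "0")))

    # Optionally emit canned (perhaps deliberately bad) output.
    out.write(env.get("STUB_OUTPUT", ""))

    # Optionally report failure to the caller through the exit code.
    return int(env.get("STUB_EXIT_CODE", "0"))


# Entry point for a real stand-in script (requires "import os"):
#     sys.exit(run(sys.argv, os.environ))
```

Driving the behaviour from the environment means the same stand-in can be reused for an error response, bad data or a timeout without editing it between runs.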
Image: FreeDigitalPhotos.net

Wednesday, July 4, 2012

The Elephant in the Fume

I read my daughter a story last night, about an elephant that thinks she's a mouse. It seems reasonable: her big book of knowledge says mice can be grey with big ears and skinny tails. Note to new testers: specifications are seldom sufficient. Think broadly.

She moves in with a mouse family but it doesn't go well. Luckily Granny Mouse has the wisdom of age and experience, works out what's gone wrong and takes Nelly to the zoo to be with her own kind. Unfortunately, one of the mice then reads Nelly's book and gets the idea he's an elephant. Note to old-hand testers: spread your insight through the team.

A few times recently I've found myself talking to less-experienced colleagues and realising that I've segued from the mouse to the elephant, assuming they were following. They weren't, and the discussion got confused. We started off on a small, acute, practical problem of right now and I expanded out to the big, theoretical potential solution of the future - but staying within the necessary constraints shared by the two.

Big picture thinking is a valid way to help reason through to an answer for today, but it's unfair and unhelpful of me not to be clear, especially because it used to leave me fuming when I was on the receiving end. Note to self: remember that mice can turn into elephants, but please say when the trunk is being stuck on.
Image: amazon.co.uk