This week I was working with a developer to make a change to a legacy codebase in an area neither of us is very familiar with. The need is easy to describe in general terms, and some occurrences of the behaviour we want to alter are common, straightforward to identify in use, and clear in the code.
Unfortunately, the logic in the application is complex, the data used is domain-specialised, and the behaviour we are interested in can occur in extremely specific combinations that are hard for a layperson to predict. I had no confidence that the cases we knew about were all of the cases.
My colleague did a round of work and asked me to take a look. I exercised the application in a few ways while inspecting its logs in real time so that I could see the effect of the changes (or the lack of one) immediately. This turned up some rarer examples which had not been covered in the code. I added a couple of tests to characterise them and he identified another code change.
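To give a flavour of what I mean by characterising, here is a minimal sketch in Python. The function and the data are hypothetical stand-ins, not the real legacy code; the point is just to pin down behaviour observed in the logs so that later changes can't silently disturb it.

```python
# A minimal characterisation-test sketch. normalise_record is a tiny stand-in
# for the real legacy function, included only so the example runs; the input
# shape mimics the kind of request seen in the verbose logs.
import unittest


def normalise_record(record):
    # Stand-in for the legacy behaviour being characterised.
    handled_by = "new_branch" if "legacy-path" in record.get("flags", []) else "default"
    return {**record, "handled_by": handled_by}


class CharacteriseRareCases(unittest.TestCase):
    def test_rare_combination_seen_in_logs(self):
        record = {"type": "special", "region": "EU", "flags": ["legacy-path"]}
        # Assert the behaviour we actually observed, whether or not it is
        # the behaviour we ultimately want.
        self.assertEqual(normalise_record(record)["handled_by"], "new_branch")


if __name__ == "__main__":
    unittest.main()
```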
Next, rather than continue exercising the product looking for cases in increasingly hard-to-find corners, I turned to code analysis. I searched the source for lines that were similar to the changes we'd made up to that point and reviewed them. Some looked like they might be close enough to provoke the behaviour we were looking for.
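A rough sketch of that kind of scan, again in Python; in practice grep or an IDE search does the same job. The pattern and the source root are placeholders, not the real names.

```python
# Sketch of a source scan: list lines that look similar to the code we had
# already changed. PATTERN and the "src" root are placeholders.
import re
from pathlib import Path

PATTERN = re.compile(r"apply_special_handling|legacy_branch")  # illustrative names

for path in Path("src").rglob("*.py"):
    for lineno, line in enumerate(path.read_text(errors="ignore").splitlines(), start=1):
        if PATTERN.search(line):
            print(f"{path}:{lineno}: {line.strip()}")
```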
However, even with this information, finding the right combination of inputs to provoke traversal of a specific codepath was tricky, so I went to our production data in Honeycomb and searched for evidence of those functions in the traces.
As expected, such requests were rare but they did exist, and I could take them and run them locally, which showed us that we had a new class of changes to make rather than a single instance.
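As a sketch of that last step: pick out the spans for the suspect functions from a local export of the trace data, then replay the captured requests against a locally running instance. The export file, field names, and URL below are illustrative assumptions, not Honeycomb's actual schema or API.

```python
# Sketch: filter exported spans for the suspect functions, then replay the
# captured requests locally. spans_export.jsonl, the field names, and the
# local URL are assumptions made for illustration.
import json

import requests

SUSPECT_FUNCTIONS = {"apply_special_handling", "legacy_branch"}  # illustrative

with open("spans_export.jsonl") as f:
    spans = [json.loads(line) for line in f]

for span in spans:
    if span.get("name") in SUSPECT_FUNCTIONS and span.get("request_body"):
        resp = requests.post("http://localhost:8080/api/submit", json=span["request_body"])
        print(span.get("trace_id"), resp.status_code)
```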
--00--
Why do I mention all this? Because it illustrates something valuable about testing: taking multiple perspectives is likely to yield more insight than one. I could have simply confirmed the first round of implementation: put in the data we know about, get the result we expect, tick and move the ticket to Done.
What I try to do instead is ask how we could tell whether there is something more to look for, and how to do that cost-effectively: how can we give ourselves a chance of knowing the unknowns?
It is often productive to ask what sources of data we have and how we might inspect them cheaply. In this example there were three sources:
- application logs (at verbose level)
- code
- production telemetry
I happened to use them in that order this time, but on another occasion I might do it differently. To extract value cheaply you'll want to be familiar with the relevant tools; in this case that meant tools for watching and filtering verbose logs, searching the source, and querying traces in Honeycomb.
I have incrementally built my knowledge of these tools over the years by trying to use them in situations like the one I've described here. I find it helpful to have a specific question to ask, to motivate my learning. I am generally excited rather than demotivated by the need to use a new tool, because I know that it opens up potential benefits when I have questions in future.
I can go top-down, from input data to application output; bottom-up, from points in the code to data; or loop back, when one insight makes it clear that an earlier analysis was incomplete, or when I'm trying to find a way to frame a question to a tool so that it will give the desired answer.
It's important to have a sense of when to stop. What is a proportionate level of effort for this task? How far are you down a rabbit hole? Have you done enough ... for now?
This will not be a linear process and there is rarely a "correct" approach. It's a distasteful analogy, for sure, but in testing there is pretty much always another way to skin the proverbial cat.
Image: https://flic.kr/p/auUid8