I was listening to the Radiolab podcast Stochasticity yesterday as I walked to the shop. The presenters were talking to two women, both named Laura Buxton, who became friends years ago because one of them released a balloon with her name and address attached from her back garden and the other found it 150 miles away in her back garden.
After Laura Two got in touch with Laura One they discovered other incredible similarities: they were both tall for their age, had brown hair and blue eyes, and both had a black Labrador, a grey rabbit, and a guinea pig with orange markings. They ended up in the local newspaper, and on national and even international television, talking about how glad they were that fate had brought them together.
If that sounds astonishing, how about this? As I strolled along the river and the podcast turned to the question of the probability of Laura-like events, David Spiegelhalter, a renowned statistician, jogged past me.
What are the chances?
The podcast uses a golf analogy to get an intuitive idea of what is happening in these kinds of situations: when the ball is hit from the tee, the probability of it landing on any particular blade of grass is very low, but the probability of it landing on some blade of grass is very high.
When you are the blade of grass that the ball landed on, you are more likely to prefer the OMG-fate-chose-me! interpretation, but to observers the it-was-coincidence-that-it-was-you big-picture reading is usually more appropriate.
There's an additional layer here too: the human factor. We like the story of the two Lauras, but we need to be aware that (a) the storyteller's bias selects and blurs the details to make the story more compelling, and (b) we have a listener's bias that inclines us to buy it.
For example, the Radiolab report is coy about the differences between the two girls, about the fact that the balloon was not found by the second Laura herself, and about their ages. It turns out that one was "nearly 10" while the other was actually 10.
This kind of insight can inform our testing in interesting ways, for example:
- state a mission and/or hypothesis before running an experiment, to avoid cherry-picking from the data afterwards something that appears unusual or merely confirms our biases
- when an apparently surprising result presents itself, step back to ask whether it's the blade of grass or a blade of grass
- think about whether to generate larger data sets, e.g. run the experiment 10, 100, or 1000 times, to help identify true outliers more easily (see the sketch after this list)
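
To make the blade-of-grass point, and the value of repeated runs, concrete, here's a minimal sketch in Python. The experiment, its 1% "coincidence" probability, and the repetition counts are all invented for illustration: the chance that any particular trial shows the coincidence is small, but across enough trials one is almost certain to turn up somewhere, and repeating the whole experiment 10, 100, or 1000 times makes it easier to judge whether a surprising result is a true outlier or just the expected background rate.

```python
import random

def run_experiment(num_trials=1000, coincidence_probability=0.01):
    """One 'experiment': count how many of num_trials show an
    apparently rare coincidence (hypothetical 1% chance per trial)."""
    return sum(random.random() < coincidence_probability for _ in range(num_trials))

def repeat(times):
    """Repeat the experiment many times and report how often at least
    one coincidence turned up somewhere, plus the average count."""
    results = [run_experiment() for _ in range(times)]
    with_coincidence = sum(1 for r in results if r > 0)
    print(f"{times:>5} repetitions: a coincidence appeared somewhere in "
          f"{with_coincidence}/{times} of them "
          f"(average {sum(results) / times:.1f} coincidences per repetition)")

if __name__ == "__main__":
    random.seed(42)  # fixed seed so the sketch is repeatable
    for times in (10, 100, 1000):
        repeat(times)
```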
Note 1: although I hadn't heard it before I started searching today, it turns out that the Laura Buxton story is reasonably widely known. Here's a handful of other references: 1, 2, 3, 4.
Note 2: I think it was David Spiegelhalter. It was someone who looked like him, anyway.
Image: Champions