The Dev Manager lent me the book he's been reading recently, Michael C Feathers' Working Effectively with Legacy Code. The detail is mostly too fine for my coding knowledge and needs but I had a few ideas reinforced and new ones to take away:
I don't know whether I do accept it - the existence of unit tests is not the same as the existence of useful unit tests or sufficient unit tests to make edits as safe as Feathers is trying to - but the insight it gives is interesting. In the world AFK we should be asking ourselves who tests the testers? In parallel, in the codebase, we should be asking who tests the test code? This is not only a problem for testers; Dev need to think about what kinds of coding policies and so on they need for their own tests.
The view I've taken to date is that for our test code we'll trade off the expense and effort of wrapping it in unit test against the benefits that we can accrue by doing something else - such as investigating the application under test - instead.
That's not to say we don't care about the code. We do and, although we're by no means software engineers, we try to do a decent job: all our code is under version control; we write our own library code to share across suites; we regularly rewrite or even replace components; we report bugs in our test code in the same bug tracking system as our product, and we verify our fixes, and so on.
Testing the test code is important and one of my earliest posts was on that topic. On occasion I've also talked about using (and used) regression test suite output as a kind of harness for the test code itself. Imagine that you have a test suite that you want to refactor. You run it and all of the tests (or checks) pass. Now, hold the product executables constant and change the test code. If, after your changes, the tests still pass you've got confidence that your changes are good. The suite is a self-testing magical gem and you are a coding colossus!
Kind of. What you've actually got is some evidence that you probably haven't ballsed things up totally. Imagine that your edit simply changed every test call of the kind
$actual=create_X(a,b,c);
assert($expected, $actual, "value of X is as expected")
to
$actual=create_X(a,b,c);
assert("pass", "value of X is as expected")
Now your test suite still passes, but performs no tests. Sure but that's unrealistic and you'd never do that kind of thing, would you? No, of course not. But might you occasionally replace a genuine check with a dummy one while debugging? Could you accidentally check that in? Hmm.
There are subtler ways to get to that position too. I came across test code this week that looked like it had lost some functionality in refactoring. This meant that a handful of checks were not being performed. The code looked like this:
if (X & Y) {
if (Z) {
do_one_thing()
}
else {
do_another()
}
}
The key problem here was that if X and Y were not true there was no else clause in which some appropriate action could be carried out. Consider that there was an earlier edit which removed a final
else {
do_another()
}
and altered the logic in the nested conditional with the intention that the single call to do_another()would deal with all relevant cases. Despite the error in the logic, running the suite either side of the edit would show no change to the test suite's pass/fail status but some tests would not have been run any longer.
So will I be mandating unit test coverage of all test code? Not at this stage. I think we have decent enough practices on a small enough test codebase that changes core logic sufficiently infrequently that we can continue as we are for now. Will I continue to think carefully about the potential legacy of that decision. Oh yes.
Image: http://flic.kr/p/5kREwF
- For Feathers, legacy code is any code without (unit) tests. That's right: code written today can be legacy code.
- Most software changes seek to hold some or all existing functionality constant and having (good, unit) tests can give a developer reassurance they have achieved that.
- When code has no tests, or has tests which have dependencies, the degree of reassurance drops and it's natural for a developer to want to avoid touching existing code so as to mitigate the risk of breaking something. This can lead to worse code.
- Making code testable involves breaking dependencies so that small atomic elements of the logic and structure are can be addressed individually. Feathers doesn't deny the utility of testing at a larger scale, but strongly promotes the necessity of testing at the micro scale.
- The book has a nice line in terminology: Edit and Pray vs Cover and Modify as the two basic methods of changing code made me chuckle. The notion of a seam is interesting too. It's a way that the behaviour of a piece of code can be changed without changing the code itself, for example by overriding a method on a class.
In Feather's description it's a way to circumvent an unwanted dependency at test time, but for me it provokes thoughts of the unintended consequences that come from a change in one part of the codebase: A seam as a map of the areas in the codebase in which any element has an effect, like a seam of coal spreading underground.
I don't know whether I do accept it - the existence of unit tests is not the same as the existence of useful unit tests or sufficient unit tests to make edits as safe as Feathers is trying to - but the insight it gives is interesting. In the world AFK we should be asking ourselves who tests the testers? In parallel, in the codebase, we should be asking who tests the test code? This is not only a problem for testers; Dev need to think about what kinds of coding policies and so on they need for their own tests.
The view I've taken to date is that for our test code we'll trade off the expense and effort of wrapping it in unit test against the benefits that we can accrue by doing something else - such as investigating the application under test - instead.
That's not to say we don't care about the code. We do and, although we're by no means software engineers, we try to do a decent job: all our code is under version control; we write our own library code to share across suites; we regularly rewrite or even replace components; we report bugs in our test code in the same bug tracking system as our product, and we verify our fixes, and so on.
Testing the test code is important and one of my earliest posts was on that topic. On occasion I've also talked about using (and used) regression test suite output as a kind of harness for the test code itself. Imagine that you have a test suite that you want to refactor. You run it and all of the tests (or checks) pass. Now, hold the product executables constant and change the test code. If, after your changes, the tests still pass you've got confidence that your changes are good. The suite is a self-testing magical gem and you are a coding colossus!
Kind of. What you've actually got is some evidence that you probably haven't ballsed things up totally. Imagine that your edit simply changed every test call of the kind
$actual=create_X(a,b,c);
assert($expected, $actual, "value of X is as expected")
to
$actual=create_X(a,b,c);
assert("pass", "value of X is as expected")
Now your test suite still passes, but performs no tests. Sure but that's unrealistic and you'd never do that kind of thing, would you? No, of course not. But might you occasionally replace a genuine check with a dummy one while debugging? Could you accidentally check that in? Hmm.
There are subtler ways to get to that position too. I came across test code this week that looked like it had lost some functionality in refactoring. This meant that a handful of checks were not being performed. The code looked like this:
if (X & Y) {
if (Z) {
do_one_thing()
}
else {
do_another()
}
}
The key problem here was that if X and Y were not true there was no else clause in which some appropriate action could be carried out. Consider that there was an earlier edit which removed a final
else {
do_another()
}
and altered the logic in the nested conditional with the intention that the single call to do_another()would deal with all relevant cases. Despite the error in the logic, running the suite either side of the edit would show no change to the test suite's pass/fail status but some tests would not have been run any longer.
So will I be mandating unit test coverage of all test code? Not at this stage. I think we have decent enough practices on a small enough test codebase that changes core logic sufficiently infrequently that we can continue as we are for now. Will I continue to think carefully about the potential legacy of that decision. Oh yes.
Image: http://flic.kr/p/5kREwF
Comments
Post a Comment