Sunday, May 22, 2011

Debugging Software and The Scientific Method

I find it odd that debugging software, sorting out the errors in a program's source code, is simultaneously the most frustrating and exhilarating aspect of writing software. I've actually banged my head against a wall when trying to track down a bug. Conversely, the feeling of satisfaction when I finally pin a difficult bug down and dissect it is indescribable. Being able to point at a line of code and say "that's the culprit, and this is how we fix it" is a joy like no other. Almost as good as the first time you watch a program you designed and wrote run correctly.

I've described the process to various non-programmers in the past and the analogy I've always used is that of a detective - sleuthing out the underlying problems and pointing a finger at a particular area of code to say "book 'em Danno." And it's not a bad analogy, really. But I think both processes are actually better described as using the Scientific Method.

When trying to sort out what's misbehaving in a piece of software we have to:

Observe what the software is doing - gather as much data as possible on how it should behave given various inputs, and then observe how it actually behaves when given those inputs. How does the output differ from the expected output?

Hypothesize - in order for the program to behave the way it is, the code must be doing X instead of Y. At this point, for simpler bugs, we can often just look at the source code to see. For more complex or subtle bugs, though, we often need to go further.

Predict - If it is indeed doing X, then we should see a particular behaviour A when given some input not covered by our initial observations.

Test - try that input!

Collect data - Did it do A, as predicted? Or did it do something else? If it did something else, back to step 1! Going around this loop too many times is what leads to the frustration I mentioned...

However, if it did follow our prediction, we need to make a judgment call - is that enough evidence? What other possible hypotheses are out there? How can we test those, to see whether or not they hold? Building a hypothesis and a set of tests to falsify (or not) that hypothesis, while simultaneously falsifying the other likely hypotheses, is the art of debugging. And it's the core of what Science is all about - subjecting a hypothesis to rigorous testing and possible falsification, building up converging lines of evidence until it can be raised to a proper Theory.
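
To make that loop concrete, here's a minimal sketch in Python. The buggy average function and the inputs are entirely made up for illustration - the point is the shape of the cycle, not the particular bug.

    def average(values):
        # The suspect implementation under investigation.
        return sum(values) // len(values)   # floor division - the hidden bug

    # 1. Observe: average([1, 2]) returns 1, but we expected 1.5.
    observed = average([1, 2])
    expected = 1.5
    assert observed != expected  # confirms the misbehaviour

    # 2. Hypothesize: the code is flooring the result - doing X instead of Y.

    # 3. Predict: if so, average([3, 4]) should return 3 rather than 3.5,
    #    an input not in our initial observations.
    prediction = 3

    # 4. Test and collect data: try that input and compare.
    result = average([3, 4])
    if result == prediction:
        print("Prediction held - the floor-division hypothesis survives.")
    else:
        print("Prediction failed - back to step 1 with result:", result)

In practice the "test" is more often a breakpoint, a log statement, or a quick unit test than a standalone script, but the structure is the same: commit to a prediction before you run the code, so the result can actually falsify your hypothesis.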

Well-run software companies carry the process even further by adding peer review - all changes should be put before at least one of your peers to see if they can find faults in your solution, or new errors you might introduce with your change. It's not (usually) as widespread as publishing in a peer-reviewed scientific journal, but it can be just as rigorous. A good reviewer will ask penetrating questions about your understanding of the situation and propose alternative hypotheses as to what might be going on, to see how well your solution holds up.

Certainly, the analogy only goes so far - we're building software, which is an Engineering task. We're not adding to the sum of human understanding about the functioning of the universe. But the methodology is fundamentally the same and good engineers need to have a solid grasp on it.

2 comments:

Unknown said...

Of course, you can use certain languages/methodologies to prove formally that programs - or subsections thereof - WILL exhibit specific behaviour. The behaviour of any conventional program - certainly a single-threaded VN program - is knowable in principle, whereas the true nature of the universe is not. If inductive "knowledge" is the best you can achieve, falsificationism is your best and only friend. But most programming is amenable to deductive proofs.

Having said that, in PRACTICE we tend to code systems whose functioning is simply too complex to use formal proof as a viable programming methodology, so we end up treating the system as though it were something close to the messy "real world", where inductive methods, falsification and trial and error are the norm.

Greg said...

Of course you can attempt to prove how simple chunks of computation should behave, given specific conditions. But the day-to-day reality I've experienced is that it's just not worth the bother, since I'm not writing code for NASA or pacemakers.