Fog Creek Software
Discussion Board




Unit-testing

Has anyone around here used a unit-testing framework like NUnit (http://www.nunit.org/)? Are these frameworks any good, or just overhead?

BillyBill
Sunday, September 14, 2003

I'd guess that lots of people around here are using the xUnit frameworks. I'm using JUnit and it's really useful, particularly if you tell (N)Ant to run the unit tests when you build.
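
The xUnit pattern looks much the same in any of these frameworks. Here's a rough sketch of an NUnit-flavoured test in C# (the Account class is invented purely for illustration):

using NUnit.Framework;

// Trivial class under test - invented purely for illustration.
public class Account
{
    private int balance;

    public Account(int opening)
    {
        balance = opening;
    }

    public int Balance
    {
        get { return balance; }
    }

    public void TransferTo(Account destination, int amount)
    {
        balance -= amount;
        destination.balance += amount;
    }
}

[TestFixture]
public class AccountTests
{
    [Test]
    public void TransferMovesFunds()
    {
        Account source = new Account(100);
        Account destination = new Account(0);

        source.TransferTo(destination, 40);

        Assert.AreEqual(60, source.Balance);
        Assert.AreEqual(40, destination.Balance);
    }
}

(N)Ant can then run every fixture like this on each build, and fail the build whenever a test fails.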

John Topley (www.johntopley.com)
Sunday, September 14, 2003

How often and at what milestones do you guys perform unit-testing in a project life-cycle?

BillyBill
Sunday, September 14, 2003


Well, to get the full benefit of a unit test suite you kind of need to run it all the time.

You use the unit tests to support refactoring efforts. The unit tests tell you your recent changes haven't broken anything.

If you're using Test Driven Design, then the unit tests tell you when you're done.

You run the unit tests after you've integrated some new code in order to confirm that your additions haven't broken the product builds.

Basically, whenever you change something, run the unit tests. For that reason, the unit tests have to be quick. You lose a lot of the benefits of unit tests if you can't run them anytime, all the time.

anon
Sunday, September 14, 2003

Optimally, every build automatically runs all the tests.

The tie between "build & test" becomes ingrained in your brain. If the code compiles, it's tested. If the tests succeed, THEN you can interact with it and make sure that it's what you want. Until the tests succeed, you're just wasting time interacting.

Realistically, this may not be possible, as the tests take a long time to run. In this case, I would say that at the very least, after each build, you should be automatically running the largest realistic subset of tests you can. That may just be the tests for the class you're writing, or it may be the tests for the library.

Either way, every check-in to the source tree should trigger all the tests. You should have a continuous integration server which automatically does a clean build and test as soon as anybody checks anything in.

Brad Wilson (dotnetguy.techieswithcats.com)
Sunday, September 14, 2003

"You should have a continuous integration server which automatically does a clean build and test as soon as anybody checks anything in."

Better still, if the build or unit/regression testing fails, the check-in should be rolled back.

This way the controlled code base is always 'clean', at least to the limits of the test regime. In my experience, having the check-in break means that programmers are rather less inclined to what I like to call 'creeping refactoring': the pernicious habit of looking at a piece of code and thinking "that's awful, I'll just fix it". This is (1) a waste of time (provided the bad code did what it was supposed to do, and that's why we have unit testing), (2) can become a source of friction between staff and (3) has an awful habit of breaking stuff.

If a programmer doesn't like the implementation of something, he can always post a request to re-factor it as a separate work item, but I've found that letting people just re-write stuff willy-nilly is a grade A1 method for blowing schedules.

P.S. I thought it interesting that Oracle have just discovered distributed builds. I remember doing this on a workstation network with Apollo Computer's DSEE in about 1987 or '88. Plus ça change.

David Roper
Sunday, September 14, 2003

Well, if refactoring breaks stuff very often, it doesn't sound like the unit tests are providing sufficient coverage.  I would suggest using a code coverage tool in conjunction with the unit tests to remedy this situation.

Scot
Sunday, September 14, 2003


I use "Ok" in perl with Test::More.

IMHO, it's probably cut my debugging time by ~ 80%, and overall development time by 5%-10%.

There is no silver bullet, but I think TDD (Test-Driven Development) is a great "fire and motion" technique.

JMHO ...

Matt H.
Monday, September 15, 2003

Another perspective:
http://www.satisfice.com/articles/test_automation_snake_oil.pdf

Mark
----
Author of "Comprehensive VB .NET Debugging"
http://www.apress.com/book/bookDisplay.html?bID=128

Mark Pearce
Monday, September 15, 2003

Scot,

Absolutely, but the issue is principally a cultural one. First you've got to get unit testing accepted by the programming team, then you've got to make sure that the test coverage is complete - which in my mind means all the exposed interface and not just the part which is actually used within an application - and then you've got to ensure that the tests are always run.

All too often coders will consider (in their own minds) that checking in incomplete/buggy code is justified in order to allow them to proceed, or to pass a "quick fix - does this help?" version to QA. Embedding testing into the check-in process focuses attention on the fact that management consider the quality of the code base to be a high priority, and this in and of itself helps to maintain quality.

P.S. Do not get me into an argument about why coders should create interfaces that are more extensive than actually required. For some reason they just do, usually with the justification that "I was working on it and it crossed my mind that XXX might be useful sometime". My response is: when? Ensuring that all interfaces are fully tested, and stating categorically that untested interfaces are not to be used, makes it just that little bit harder for programmers and has amazing effects.

David Roper
Monday, September 15, 2003

Mark:

The article you linked to was criticizing record-and-playback automated GUI Testing.

Most of the posts here are about automated *UNIT* testing, JUnit-style - programmatically testing the code behind the GUI. In my experience, Test Driven Development VASTLY decreases the amount of time spent on "Integration, Debug and Fix" at the end of the project.

If you're pointing out that WinRunner is over-hyped - I agree.  But I don't think WinRunner type tools are what these folks are talking about.

Matt H.
Monday, September 15, 2003

"testing the code behind the GUI"

--> By this I mean the Business-Logic and Data tiers, not the GUI tier.  If you're MVC, think "Model and Controller, not View."
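
To make that concrete, here's the kind of test I mean - C# with NUnit, the invoice class invented for illustration. Note that no GUI code appears anywhere:

using NUnit.Framework;

// Business-logic tier - the Model, in MVC terms.
public class Invoice
{
    public decimal Subtotal;
    public decimal TaxRate;

    public decimal Total()
    {
        return Subtotal + (Subtotal * TaxRate);
    }
}

[TestFixture]
public class InvoiceTests
{
    [Test]
    public void TotalAppliesTaxRate()
    {
        Invoice invoice = new Invoice();
        invoice.Subtotal = 100m;
        invoice.TaxRate = 0.08m;

        Assert.AreEqual(108m, invoice.Total());
    }
}

You never open a form and never click a button, so the test runs in milliseconds and can't be broken by cosmetic GUI changes.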

Matt H.
Monday, September 15, 2003

Matt,

While the article was investigating one specific type of automated testing, many of its criticisms apply to automated testing in general. Many developers think of automated testing as a "silver bullet", and this article tries to debunk that point of view.

One of my own major criticisms of the way in which automated unit testing is used is that developers are often too lazy to step through their code in a source-level debugger. Their reasoning seems to be that if the unit test has passed, then the code is okay - white-box testing isn't required if the black-box testing worked.

No, no, and no again! Just because a unit test has passed doesn't mean that the code is okay. Diligent, worldly-wise and cynical developers know that they need to combine the use of multiple tools to remove bugs, but too many people use unit testing as a crutch to be lazy.

So my rant is not against unit testing per se, but against developers who treat it as a magic wand. Repeat after me...

"there is no silver bullet...there is no silver bullet..." :)

Mark

Mark Pearce
Monday, September 15, 2003

"Better still if the build or unit/regression testing fails, then the check-in should be rolled back."

Is the rollback done by the build tool? This sure got my attention, as it would save a lot of headaches later on.

Been on several teams where programmers indiscriminately check in code that breaks a build. I've been interviewing for senior level/tech lead positions and I'm hoping to implement some sort of daily build in my next position.

Slartibartfast
Monday, September 15, 2003

David, I hear you... it sounds like your situation is difficult... and that you already know the answers.  :-)

Scot
Monday, September 15, 2003

> In my experience having the check-in break means that programmers are rather less inclined to what I like to call 'creeping refactoring'; the pernicious habit of looking at a piece of code and thinking "that's awful, I'll just fix it".

I disagree -- I think you are thwarting a healthy impulse here.

I know there are edge cases: developers who simply *must* rewrite every bit of code they see; developers who are assigned to work on X, but start rewriting Y and Z instead. I'm not defending that.

But if you are working on code, and you have unit tests, and you see a way to make the code simpler and clearer, I think you should go for it.

> but too many people use unit testing as a crutch to be lazy.

Disagree again. If you're lazy cause you're sure sure sure that you have enough unit tests and all your unit tests pass, then relax: you've earned it.

As for sitting in front of a debugger -- pul-leeze! Most of the people who I've seen do that write code that's way too complicated in the first place. Again, it's OK if you use the debugger to learn about the code (especially legacy code) so you can write unit tests. But if the only way you can be sure that the code works is to step through it, I think you are a few steps behind to begin with.

Make your code simple and test it well, and you will be in the top 10% of the industry.

Portabella
Tuesday, September 16, 2003

Portabella,

>> As for sitting in front of a debugger -- pul-leeze! Most of the people who I've seen do that write code that's way too complicated in the first place. <<

I recently fired a developer working for my company. After missing one bug because he relied on a faulty unit test, he was warned to step through all new code - it's part of our process. Then he was caught again checking-in code without stepping through it - so he was given a second warning.

The final straw was when our hero made a modification and ran it through the unit tests successfully. Turns out that his modification was by-passed by an optimisation further up the call chain, and the correct test result was returned for a completely unrelated reason. This error was nearly released into production - it was only caught at the QA stage 'cos we got lucky.

When asked why he had ignored two explicit warnings about relying on unit tests exclusively, he claimed that he was a highly-skilled developer who didn't make mistakes often, that the requirement to step through code was silly, and that QA should take most of the responsibility for catching bugs. Because he was a contractor, we were able to "release" him immediately, without any notice.

Now of course this was partly a management mistake for hiring this idiot, and our hiring process will be changed in an attempt to detect this type of attitude. But it also reinforces my opinion that developers who rely on any single technique to test their code are intellectually lazy and often full of shit.

Mark
----
Author of "Comprehensive VB .NET Debugging"
http://www.apress.com/book/bookDisplay.html?bID=128

Mark Pearce
Tuesday, September 16, 2003

"and that QA should take most of the responsibility for catching bugs"

This seems to be a common attitude with developers from what I've seen. Unfortunately, it has the undesirable side effect of QA hating developers because "they're rubbish - all they ever do is release buggy code".

Better Than Being Unemployed...
Tuesday, September 16, 2003

> After missing one bug because he relied on a faulty unit test

Wow! So why don't your unit tests work?

It seems to me that whatever thought-process you're capturing when you step through the code with a debugger could just as well be captured by a unit test -- and then it would be repeatable.

I do omit certain categories of code which are very difficult to unit test (e.g., how to automate GUI tests is a perennial topic). I could totally understand a mandate to hand-test these.

Using debuggers has the interesting property of being totally invisible: when I hand you the code, how are you going to know if I ran it through a debugger or not? With unit tests, you know.

Since you're the author of a book on debuggers, I think you're hardly objective in this matter, and you're certainly not winning any points with phrases like "intellectually lazy and often full of shit". Fact is, it sounds to me like you've got a big chip on your shoulder, and are just looking for people to fire to reinforce your own beliefs. Enjoy your little tinpot dictatorship while it lasts, bro!

The best counterexample to your argument is people who ship high-quality code with comprehensive (and working!) unit tests. They're out there; I've worked with them.

Portabella
Tuesday, September 16, 2003

Portabella,

>> Wow! So why don't your unit tests work? <<

For the same reason as some of your unit tests don't work. Automated unit tests are written in code, and code has bugs. In this case, our hero wrote the buggy unit test and the buggy code.

>> It seems to me that whatever thought-process you're capturing when you step through the code with a debugger could just as well be captured by a unit test -- and then it would be repeatable. <<

Unit tests only catch certain classes of bugs - stepping through code tends to catch an entirely different set of bugs - other techniques catch other sets of bugs. To produce good code, a developer needs to use many techniques, and not just rely on automated tests to be some sort of magic bullet.

>> ...when I hand you the code, how are you going to know if I ran it through a debugger or not? <<

The way we found out about these three cases is by finding each bug, and realising that the developer in question would have found each bug if he had bothered to step through the code.

>> Since you're the author of a book on debuggers, I think you're hardly objective in this matter... Fact is, it sounds to me like you've got a big chip on your shoulder, and are just looking for people to fire to reinforce your own beliefs. <<

My book is on .NET debugging techniques in general, not on any specific debugger. It also majors on writing code that doesn't have bugs in the first place. Writing this book, along with co-authoring other books and lectures on this topic, has given me some objective (as well as subjective) insights into this subject.

Of course, I wouldn't presume to be on your level - I bow humbly before your superior expertise. Perhaps I could be privileged enough to glimpse some of your public writings in this area?

>> Enjoy your little tinpot dictatorship while it lasts, bro! <<

It's lasted 8 years. But I'll retire soon with my money and relinquish my dictatorship - no more will I be able to sack delinquent developers <sad sigh>.

Mark

Mark Pearce
Tuesday, September 16, 2003

> Automated unit tests are written in code, and code has bugs.

  But when you step through code, you never make mistakes? Sheesh! What kind of reasoning is that?

  When my unit tests break, I simply fix them. Then they work. Same with any other code.

> Unit tests only catch certain classes of bugs - stepping through code tends to catch an entirely different set of bugs

  You still haven't addressed my point. Let's accept, provisionally, the idea that debugging can show us new classes of errors. Then why not just write unit tests to show the errors and verify that they're fixed?
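
  Concretely, the workflow I mean goes like this (a C# sketch with invented names): the moment the debugger shows you the bad value, pin it down in a test before you fix it:

using NUnit.Framework;

// Invented class under test.
public class PriceCalculator
{
    public decimal DiscountFor(decimal orderTotal)
    {
        // The line the debugger session was scrutinising.
        return orderTotal >= 100m ? orderTotal * 0.05m : 0m;
    }
}

[TestFixture]
public class PriceCalculatorRegressionTests
{
    // Written the moment the debugger showed the bad value: it
    // fails before the fix, passes after, and keeps watch long
    // after the debugging session is forgotten.
    [Test]
    public void DiscountIsNeverNegative()
    {
        PriceCalculator calc = new PriceCalculator();
        Assert.IsTrue(calc.DiscountFor(0m) >= 0m);
    }
}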

  We could also ask if marathon debugger sessions are as productive as, say, pair programming or code review. But I guess you'd rather just fire more developers to prove yourself, eh?

> The way we found out about these three cases is by finding each bug, and realising that the developer in question would have found each bug if he had bothered to step through the code

Right. This is the part where you're getting religious on me.

It seems to me that if "our hero" had written correct unit tests OR provably stepped through with a debugger but missed the error, he'd still be working for you. True or false?

"But he couldn't have missed it with a debugger" is just saying that you'll never make mistakes with a debugger. I can understand why you, the author of a book on debuggers would be pushing this, but it's essentially dogma, not fact.

> To produce good code, a developer needs to use many techniques

I do agree with you on this one, but I think you are utterly wrong to think that good code cannot be produced without a debugger.

> I bow humbly before your superior expertise

Don't make even more of an ass of yourself, please.

> But I'll retire soon with my money and relinquish my dictatorship

On your profits from the book? I wouldn't count on that.

And I'm willing to bet that the fire-'em-unless-they-use-a-debugger rule won't long outlast your regime. In fact, since religious dogma seems to often be replaced with its polar opposite, I see a bright future in unit testing for your company once your ass is outta there!

Portabella
Tuesday, September 16, 2003

Portabella,

>> But when you step through code, you never make mistakes? Sheesh! What kind of reasoning is that? <<

Read my post again. Every technique involves finding bugs and missing bugs. My point is that the more techniques you use, the more bugs you will find. I'm not a lone voice - several authorities (Steve McConnell, John Robbins, Steve Maguire, Jim McCarthy) agree with me - perhaps you know more than all of us put together?

>> Then why not just write unit tests to show the errors and verify that they're fixed? <<

Because unit tests can't find many types of bugs - for instance, the optimisation bug that was the final straw in my previous post.

>> We could also ask if marathon debugger sessions are as productive as, say, pair programming or code review. <<

We don't do pair programming, but we do design and code reviews. Once again, reviews find a different set of bugs to code stepping or unit tests.

>> It seems to me that if "our hero" had written correct unit tests OR provably stepped through with a debugger but missed the error, he'd still be working for you. True or false? <<

False. He was fired because he missed three bugs that would all have been found using a simple walkthrough, and because of his bad attitude towards QA, and because I thought he was a lazy developer, and because he tried to deny that the bugs were his fault.

>> "But he couldn't have missed it with a debugger" is just saying that you'll never make mistakes with a debugger. <<

No, it's not saying that. Anybody can make a mistake with a debugger - but it's an extra technique that would almost certainly have found these specific bugs.

>> On your profits from the book? I wouldn't count on that. <<

On the substantial profits from the company that I founded and built over the years. Why do you insist on making this personal?

And now I'm intrigued. How many books have you written? How many people do you employ?

>> In fact, since religious dogma seems to often be replaced with its polar opposite, I see a bright future in unit testing for your company once your ass is outta there! <<

Whoever pays the piper calls the tune. Why should I care when I'm alternating between the Caribbean, the Swiss Alps and the Rockies with the spoils :)

Mark

Mark Pearce
Tuesday, September 16, 2003

> My point is that the more techniques you use, the more bugs you will find.

I've never disputed that; I've only questioned the must-use-a-debugger-to-write-good-code nonsense.

> I'm not a lone voice - several authorities (Steve McConnell, John Robbins, Steve Maguire, Jim McCarthy) agree with me

Agree with the multiple techniques approach or this  debugger-or-die dogma?

Plenty of great code has been written *without* debuggers. Kernighan -- you've heard of him, perhaps? -- only uses print statements. The XP guys actively loathe debuggers. Do you think all of the code written before there were debuggers just sucks?

Myself, I'm more moderate: if you get great results with a debugger, then go for it! Just make sure you write your unit tests; they'll be around long after you've gone home for the night, and they'll show me the important things you saw in your debugging session.

> perhaps you know more than all of us put together?

Why do you insist on this kind of straw man horseshit?

> Because unit tests can't find many types of bugs - for instance, the optimisation bug that was the final straw in my previous post.

That's a pretty amazing statement to make. You have a program and you haven't a *clue* what it does at runtime because of the optimizer?

I'll put it to you bluntly: can you write a unit test to show what this piece of code does, regardless of optimizations, or not?

> False. He was fired because he missed three bugs that would all have been found using a simple walkthrough

You just contradicted yourself. If his unit tests had actually worked (ie, no bugs), he'd still be there, according to everything you've said.

> On the substantial profits from the company that I founded and built over the years. Why do you insist on making this personal?

Why do you insist on dragging out your company, your book, your retirement plans, etc etc? Or trying to win the argument by asserting that you've written more books than I have? Do you think that makes you right?

Can your arguments stand on their own, or do they all rely on some poor soul who you canned just to show how great debuggers are?

Portabella
Tuesday, September 16, 2003

Portabella,

>> Do you think all of the code written before there were debuggers just sucks? <<

Do you think all code written before there were unit tests just sucks?

>> Agree with the multiple techniques approach or this  debugger-or-die dogma? <<

Agree with both.

>> You just contradicted yourself. If his unit tests had actually worked (ie, no bugs), he'd still be there, according to everything you've said. <<

He'd still be here, at least for a while, only because I wouldn't have had the opportunity to dissect his attitude if he had caught those bugs.

>> I'll put it to you bluntly: can you write a unit test to show what this piece of code does, regardless of optimizations, or not? <<

I'll put it bluntly back - no, you can't. The feature required knowledge of multiple interrelated methods, and you wouldn't get the knowledge that a business rule optimisation skipped the new code without either stepping through the code or having an intimate understanding of the business.

>> Or trying to win the argument by asserting that you've written more books than I have? Do you think that makes you right? <<

It doesn't make me right. It just means that I've spent the majority of my computing life trying to understand software quality issues. And I built a successful company on that understanding.

Now, I'm still interested in what you've achieved in your career and in business with your understanding of software quality issues.

Because it's contributed bugger-all to your anger management and social skills :)

Mark

Mark Pearce
Tuesday, September 16, 2003

> Do you think all code written before there were unit tests just sucks?

Of course not.

I'm not the one pushing the debuggers-or-die approach; you are.

And you didn't answer my question.

>> Agree with the multiple techniques approach or this  debugger-or-die dogma? <<

> Agree with both.

Can you point me to the places where these authors say that without debuggers you can't write good code?

> without either stepping through the code or having an intimate understanding of the business

So if you had the "intimate understanding of the business", you could write a unit test for it?

> Now, I'm still interested in what you've achieved in your career and in business with your understanding of software quality issues.

No, you're interested in trying to evade those same issues.

If you've been in the industry long enough, you've seen plenty of people "succeed", ie make money, have successful companies, etc with ideas that are all but totally wrong, all kinds of cranky and weird processes, and so forth.

I give you plenty of credit for drive and ambition, but that success says nothing whatsoever about being right.

> Because it's contributed bugger-all to your anger management and social skills

You should talk! Go through and read your posts again -- they drip with ego and vanity. From bragging about your retirement to bogus appeals to authority, you haven't missed a trick.

Maybe you should stick to your little company where you can fire anyone who disagrees with you.

Portabella
Wednesday, September 17, 2003

Portabella,

Here are a few authority citations for you:

************
Steve McConnell: Code Complete
************
Chapter 4: Section 4.5: Check The Code Formally

"Once the routine compiles, put it into the debugger and step through each line of code.

"Make sure that each line executes as you expect it to. You can find many errors by following this simple practice."

Chapter 25: Unit Testing

"The process of stepping through a piece of code in a debugger and watching it work is enormously valuable.

"Walking through code in a debugger is in many respects the same process as having other
programmers step through your code in a review. Neither your peers nor the debugger has the same blind spots that you do. The additional benefit with a debugger is that it's less labor-intensive than a team review. Watching your code execute under a variety of input-data conditions is good assurance that you've implemented the code you intended to."

************
Steve McConnell: Software Project Survival Guide
************
Table 14.1: Recommended Integration Procedure

1 Developer tests a piece of code.
2 Developer unit tests the code.
3 Developer steps through every line of code, including all exception and error cases, in an interactive debugger.
4 Developer integrates this preliminary code with a private version of the main build.
5 Etc, etc.

************
John Robbins: Debugging Applications
************
Chapter 3: Debugging During Coding

"When developing new code, or updating existing code, I follow a standard pattern. I write a little bit of code, maybe a couple of functions or the initial parts of a complicated function and immediately test the code with the debugger. The primary reason is so I can evaluate general logic and flow."

************
Steve Maguire: Writing Solid Code
************
Chapter 4: Step Through Your Code

Yes, an entire chapter devoted to code stepping and how to do it effectively. Here's an excerpt from the chapter introduction:

"The best way to write bug-free code is to actively step though all new or modified code to watch it execute, and to verify that every instruction does exactly what you intended
it to do."

************

Now, put yourself in my position - who should I believe? On the one hand are the industry luminaries cited above, plus my own experience and research. On the other side is...you.

Hmmm - difficult decision.

Mark

Mark Pearce
Wednesday, September 17, 2003

Let's do a little reality check.

McConnell's book was written in 1993. The examples are in C. There were no standard unit-testing frameworks around at the time. So, while this was good advice when he gave it, the world has evolved since then.

Note also that the integration technique he recommends is hugely expensive in time; I'll return to this point later.

I also had a look at my copy of Writing Solid Code, also from 1993, and it seems clear that Maguire wasn't familiar with automated test frameworks. The "What about Sweeping Changes" box on page 78 gives this away, because the answer to his question "Can you make such sweeping changes without introducing any bugs?" is "Yes, if you have a comprehensive unit test suite". His advice -- to check every branch with a debugger -- is, I think, sub-optimal. It's just too easy to make a mistake, especially late at night.

The section from Robbins' book -- I haven't read it -- seems to be describing his own personal technique. Great, if it works for him! I suspect, but of course can't know, that he'd be totally down with TDD, changing his paragraph to read:

"When developing new code, or updating existing code, I follow a standard pattern. I write a little bit of code, then I write a unit test for it. The primary reason is so I can evaluate general logic and flow."

> On the other side is...you.

If you're wondering why I'm flaming you so relentlessly, the answer is the cheap rhetorical tricks like this one that you've employed over and over and over again. I already pointed out, several posts ago, "industry luminaries" who do not use debuggers as a technique. You read that post, because you responded in minute detail to it. So why do you keep pulling this bullshit?

The *reasons* that people prefer unit tests to debuggers are also not difficult to understand. I'll find a "luminary" to cite them if you wish, since you seem overawed by them, but I think *every* developer can understand that:

1. The debugging session is transitory, and, as I said before, invisible. The test code is permanent, or at least as permanent as any other code.

2. The unit tests can be made to run automatically, the debugging session cannot.

3. It is extremely difficult to use debuggers at all in some environments, eg, server-side applications (in production, in remote environments, etc). Indeed most of what you can do with debuggers that is difficult to do with unit tests can be done with proper logging.

Kernighan and Pike say the same thing in "The Practice of Programming":

"As personal choice, we tend not to use debuggers beyond getting a
stack trace or the value of a variable or two. One reason is that it
is easy to get lost in details of complicated data structures and
control flow; we find stepping through a program less productive
than thinking harder and adding output statements and self-checking
code at critical places. Clicking over statements takes longer than
scanning the output of judiciously-placed displays. It takes less
time to decide where to put print statements than to single-step to
the critical section of code, even assuming we know where that
is. More important, debugging statements stay with the program;
debugging sessions are transient."
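
In C# terms, the "judiciously-placed displays" they describe look something like this (a sketch; the allocation routine is invented):

using System.Diagnostics;

// Invented routine: spread a total across buckets, with any
// remainder going to the earliest buckets.
public class Allocator
{
    public int[] Spread(int total, int buckets)
    {
        int[] shares = new int[buckets];
        for (int i = 0; i < buckets; i++)
        {
            shares[i] = (total / buckets) + (i < (total % buckets) ? 1 : 0);

            // Scanning this output beats clicking through the loop
            // one iteration at a time - and it stays with the program.
            Trace.WriteLine("share[" + i + "] = " + shares[i]);
        }
        return shares;
    }
}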

Portabella
Thursday, September 18, 2003

Portabella,

>> If you're wondering why I'm flaming you so relentlessly... <<

I'm not wondering about this at all. I am fully aware that your lack of social skills is driving you to make 'ad hominem' attacks similar to this one :)

>> 1. The debugging session is transitory, and, as I said before, invisible. The test code is permanent, or at least as permanent as any other code. <<

You seem to be concocting some extraordinary strawman. I love unit tests - our current product has more than 1,800 unit and feature tests, and that number is growing every day. But in our company, unit tests are done *in addition* to code stepping.

Using both techniques finds more bugs than using either technique in isolation. Is this so hard for you to understand?

>> 2. The unit tests can be made to run automatically, the debugging session cannot. <<

See my previous point.

>> 3. It is extremely difficult to use debuggers at all in some environments, eg, server-side applications (in production, in remote environments, etc). <<

With .NET, it's very easy to debug server apps, remote apps, and production apps - I should know, because I've written a whole book on this exact subject.

And of course, the same argument can be made about unit tests when trying to test GUI-driven apps. The difficulty of applying a single technique in certain situations isn't a good reason to throw up your hands and admit defeat.

>> Kernighan and Pike say the same thing in "The Practice of Programming": <<

You complain about how outdated "Code Complete" is, and then quote a book that's even older!

Mark

Mark Pearce
Thursday, September 18, 2003

> 'ad hominem' attacks similar to this one :)

Uh, the ad-hominem-attack guy would be *you*.

I'll leave it to the readers of the thread (if there are any left at this point) to decide who's making the points better.

> Using both techniques finds more bugs than using either technique in isolation. Is this so hard for you to understand?

No indeed: in fact, I said so several times already in this thread! Why do you relentlessly misquote me?

> The difficulty of applying a single technique in certain situations isn't a good reason to throw up your hands and admit defeat.

No one's "admitting defeat".

I said several times that there are some problems that are difficult to unit test. This is, in fact, well known. But I don't doubt that, *if there were a reasonable way to unit test them*, people would do so, because of the benefits I mentioned.

What I am frankly dubious about are the supposedly wonderful benefits from using debuggers, wonderful to the extent that you can't write good code without them.

You've continued to assert that, dogmatically, apparently without understanding that the existence of good code written without debuggers directly contradicts your argument.

I have no objection to the story that goes: "I'm a guy who understands debuggers inside and out. I've used them effectively in lots of places. If you've never used them [and please note that I have], you may be surprised by what you can learn about the code with them." You'd do a lot better with this tack, instead of the "intellectually dishonest and full of shit" line (now there's an ad-hominem attack!) that you were pushing earlier.

> You complain about how outdated "Code Complete" is, and then quote a book that's even older!

There's a substantial difference: Kernighan and Pike were well aware of debuggers, but chose not to use them. Their reasons for doing so are still valid and understandable to this day. McConnell, Maguire et al. do not seem to have had unit test suites available (*). I admire both of these men, but they simply could not discuss what they did not have.

* The discussion in the book actually sounds like they did have unit tests, but at such a low level that they did not have any real confidence in them. Most likely, this means that there was no testing framework available, and so whether the test worked or not was very hit-or-miss. For example, a C programmer might write a dummy executable that called the function in question and printed out the expected output. While better than nothing, that's simply lightyears away from the results you can get with the various unit testing frameworks available today.

> our current product has more than 1,800 unit and feature tests, and that number is growing every day.

Good for you.  Sounds like you need to add at least one more though! (Refer to previous posts if you don't get this).

As you converge on unit testing nirvana, consider dropping the requirement to run everything through a debugger. You can do this in stages by dropping the places where you are confident you have enough coverage. Obviously, given the kinds of bugs you're seeing, and the cited code complexity, you're not quite there yet, and maybe you shouldn't give up your crutches just yet. But give it time, and it will come! :) And you can use the time you save for posts to Joel on Software!

Portabella
Thursday, September 18, 2003

>> What I am frankly dubious about are the supposedly wonderful benefits from using debuggers, wonderful to the extent that you can't write good code without them.

You've continued to assert that, dogmatically, ... <<

Show me anywhere in this thread where I've made this statement. Once again, for the slow of thinking, here's my viewpoint:

Unit tests are good. Code stepping is good. Both of these techniques combined are better than either in isolation.

>> " You'd do a lot better with this tack, instead of the "intellectually dishonest and full of shit" line (now there's an ad-hominem attack!) that you were pushing earlier. <<

It wasn't an ad-hominem attack because I was referring to a *class* of developers (ie, *not* you, although it might include you) - don't you understand the meaning of 'ad-hominem'?

>> As you converge on unit testing nirvana, consider dropping the requirement to run everything through a debugger. <<

Repeat after me...unit testing is not a silver bullet...unit testing is not a silver bullet...unit testing is not a silver bullet...in fact, there is no such thing as a silver bullet

>> Obviously, given the kinds of bugs you're seeing, and the cited code complexity, you're not quite there yet, and maybe you shouldn't give up your crutches just yet. <<

Our current QA stats show one bug per 1,600 lines of code reaching QA. What are your current project stats? Do you even measure your stats? Or isn't that a requirement for a wage-slave such as yourself?

Mark

Mark Pearce
Thursday, September 18, 2003

> It wasn't an ad-hominem attack

Maybe so... but it's still tremendously arrogant.

And this one clearly *is* both arrogant and ad-hominem:

> Or isn't that a requirement for a wage-slave such as yourself?

Rather than sticking to your case, you've tried to bait me throughout this entire "discussion".

> Unit tests are good. Code stepping is good. Both of these techniques combined are better than either in isolation.

You've watered this down a lot, now, to the point that one could hardly disagree.

But when you say "good", you're still omitting the *cost* of the technique. Clearly that's what Kernighan and Pike are objecting to.

I said before that I didn't object to folks running their code through debuggers, if they find it helpful. But I personally  find it far less helpful than unit testing, code review, and pair programming (and I'm far from alone in this belief).

I also find it tedious, and rather than thinking that I need to be more anal about it, I regard tedium as a legitimate danger signal. There's only so much tediousness that we can deal with.

I'll put it to you another way: if we have unit tests for code, and pair program it, so that at least two people think it's sensible, and code review it, what's the incremental benefit of running it through a debugger as well?

> Our current QA stats show one bug per 1,600 lines of code reaching QA.

Good for you!

But as you must know, the issue that I'm pushing towards is this mysterious unable-to-unit-test bug that you've told us about earlier. Why is it, O Testing Maven, that your so-comprehensive unit tests didn't catch this? If it's such a major issue for you, then why does it rest with individual contractors? If I were you, I'd be plenty concerned about that, and I'd be working to make it so that my contractors could unit test only their pieces, without worrying that they'll get sandbagged by the framework.

To sum up the thread, you *say* that what you're interested in is less bugs, but it looks instead like a soapbox to talk about how great your expertise with debuggers is, and to rag on those who ain't down with your religion.

You can impress me -- and maybe end this discussion -- by actually stating some cases where you can use debuggers to crack some issues that are tough with unit testing (if there's a straightforward way to unit test it, then I'll have to ask why I should do it two ways when one way will suffice).

*Don't* bother to refer me to your book. In fact, look at it the other way: if you can actually make some coherent points, you may actually get some sales. The way the discussion has gone so far, I think you've lost quite a few.

Portabella
Thursday, September 18, 2003

Portabella,

>> And this one clearly *is* both arrogant and ad-hominem: <<

Of course - I'm now responding in a similar vein. If you can't take a little bloody nose, maybe you ought to go back home and crawl under your bed. It's not safe out here. It's wondrous, with treasures to satiate desires both subtle and gross, but it's not for the timid.

>> You've watered this down a lot, now, to the point that one could hardly disagree. <<

Perhaps I've watered it down too much. So here's a more controversial addition to my stated viewpoint:

"Developers who use only a single testing technique are intellectually lazy and often full of shit."

Now of course, this is a personal thing. Just like your distaste for code stepping is a personal thing.

>> I also find it tedious, and rather than thinking that I need to be more anal about it, I regard tedium as a legitimate danger signal. There's only so much tediousness that we can deal with. <<

I find the coding of unit tests to be extremely tedious, but it doesn't stop me doing it.

>> if we have unit tests for code, and pair program it, so that at least two people think it's sensible, and code review it, what's the incremental benefit of running it through a debugger as well? <<

As I stated above, code stepping finds a different set of bugs to unit testing and reviews. It shows you what's *really* happening close-up and in your face - you can't hide behind idealised unit tests and cosy self-deception.

Perhaps some background will help. Before I became a developer, I was a professional chess master. Chess is a sport that exposes your mistakes ruthlessly. All of your expertise, imagination and knowledge is summarised in a single number. If you make more mistakes than your peers, that number drops. If you don't make as many mistakes as your peers, that number rises.

Developers don't have this immediate and ruthless feedback, so they tend to deceive themselves. They under-estimate the number of bugs they produce, and many of them form lazy habits because of this cushioning from reality. This is one major reason why the industry's bug rates are so high.

>> Why is it, O Testing Maven, that your so-comprehensive unit tests didn't catch this? <<

Because the unit tests verified that the code produced the correct result (which it did). The unit tests didn't show that the correct result was produced for the wrong reason (which it was).

>> To sum up the thread, you *say* that what you're interested in is less bugs, but it looks instead like a soapbox to talk about how great your expertise with debuggers is, and to rag on those who ain't down with your religion. <<

You regularly construct these silly strawmen. I've never claimed any great expertise with debuggers - if you want that, you need to talk to somebody like John Robbins.

>> You can impress me -- and maybe end this discussion -- by actually stating some cases where you can use debuggers to crack some issues that are tough with unit testing (if there's a straightforward way to unit test it, then I'll have to ask why I should do it two ways when one way will suffice). <<

Give me a few hours, and I'll extract the relevant code from our SCM tool and present it here.

Mark

Mark Pearce
Thursday, September 18, 2003

Portabella,

Here's the first of the examples that you requested. I'm assuming that you can read C#. If not, I can explain the example in pseudo-code instead.

This is somewhat modified from the original code to remove superfluous details, and I've wrapped it up as a function - call it HasAccess - so that the return value the unit tests check is explicit:

bool HasAccess()
{
    bool AccessGranted = true;

    try
    {
        // See if we have access to c:\test.txt
        new FileStream(@"c:\test.txt",
                       FileMode.Open,
                       FileAccess.Read).Close();
    }
    catch (SecurityException)
    {
        // access denied by CLR security
        AccessGranted = false;
    }
    catch (Exception)
    {
        // something else happened - the exception is swallowed
        // and AccessGranted is left as true
    }

    return AccessGranted;
}

The first unit test checked that the correct value was returned from this function if the CLR granted access to the test file. This test passed because no exception was thrown and the function returned true.

The second unit test checked that the correct value was returned from this function if the CLR didn't grant access to the test file. This test passed because a SecurityException was thrown and the function returned false.

Unfortunately, no unit test was written to check that the correct value was returned from this function if the CLR granted access, but a discretionary access control list (DACL) on the file didn't grant access.

In this case, a different exception (UnauthorizedAccessException) is thrown, and caught by the badly-coded final catch clause, which both catches System.Exception without re-throwing it (a definite no-no) and doesn't set the return value correctly.

If the developer had stepped through the code, he would almost certainly have noticed these two problems while following the two unit tests mentioned above. He could then have used this knowledge to fix the routine and create another unit test for this third situation.
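
For the record, the fixed routine would look something like this (a sketch - remember that HasAccess is just my name for it; the key fact is that a DACL denial surfaces as UnauthorizedAccessException rather than SecurityException):

bool HasAccess()
{
    try
    {
        // See if we have access to c:\test.txt
        new FileStream(@"c:\test.txt",
                       FileMode.Open,
                       FileAccess.Read).Close();
        return true;
    }
    catch (SecurityException)
    {
        // denied by CLR code-access security
        return false;
    }
    catch (UnauthorizedAccessException)
    {
        // denied by the file's DACL - the case the original
        // code silently got wrong
        return false;
    }
    // Anything else (missing file, sharing violation, ...) now
    // propagates to the caller instead of being swallowed.
}

The third unit test would then deny read access on the test file (via cacls.exe or Win32 interop, since there's no managed ACL API yet) and assert that HasAccess returns false.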

Mark

Mark Pearce
Thursday, September 18, 2003

Portabella,

Here's the second of the examples that you requested. I'm assuming that you can read VB .NET. If not, I can explain the example in pseudo-code instead.

Once again, this is somewhat modified from the original code to remove superfluous details. For the sake of this discussion, we have a reference type called Container and an associated subtype called ShipContainer.

Our hero decides to override the Equals method of ShipContainer to always return true if a ShipContainer is compared with a Container. So his new code for the ShipContainer type goes something like this:

Overloads Function Equals(ByVal AnyContainer As Container) As Boolean
    Return True
End Function

Let's ignore for the moment the follies of this procedure - pretend that you've never heard of the four major equality principles and the fact that this code violates at least two of them.

Our hero writes a unit test to instantiate a ShipContainer and compare it with a Container - sure enough, the unit test returns true and our hero is happy.

Elsewhere in the application is some perfectly legitimate code that says the following:

Dim MyContainer As Container
MyContainer = New ShipContainer

What happens if you compare MyContainer with a Container? Ooops - it returns false!

The explanation for this behaviour is subtle. The Equals member of ShipContainer as written above doesn't actually override the Equals member that Container inherits from Object, because Object's Equals takes an Object parameter - the new method merely overloads it. Overloads are bound at compile time against the declared type, which here is Container, so Object's reference-equality Equals gets called.

Because our hero omitted to write a unit test that declared an object as Container, but actually instantiated it as ShipContainer (inheritance rules dictate that a subtype can always be substituted for its supertype), he created an enormous bug. If he had actually stepped through his new code, it is quite likely that he would have realised the different possibilities that his code (and unit tests) didn't handle.
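
Now that the trap is understood, the missing unit test is easy to write. Here's a sketch - in C# rather than VB .NET purely for brevity; the same overload-versus-override trap exists in both languages:

using NUnit.Framework;

public class Container { }

public class ShipContainer : Container
{
    // An overload, NOT an override of Object.Equals(object).
    public bool Equals(Container anyContainer)
    {
        return true;
    }
}

[TestFixture]
public class ContainerEqualityTests
{
    [Test]
    public void EqualsThroughBaseTypedReference()
    {
        // Declared as the supertype, instantiated as the subtype.
        Container mine = new ShipContainer();
        Container other = new Container();

        // The compiler binds this call to Object.Equals(object) -
        // reference equality - because the declared type is
        // Container. With the ShipContainer code as written, this
        // assertion fails and exposes the bug.
        Assert.IsTrue(mine.Equals(other));
    }
}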

Mark

Mark Pearce
Thursday, September 18, 2003

Portabella,

My third example is a classic illustration of the dangers of relying on unit tests to avoid refactoring bugs. Again, I'm assuming that you can read VB .NET. If not, I can provide pseudocode instead.

Our intrepid hero finds the following procedure:

Function RandomCalculation(ByVal PurchaseAmount As Single) As Single
  If Not PurchaseAmount >= 0.0 Then
    Throw New ArgumentException("Purchase amount must be >= zero")
  Else
    Return PurchaseAmount * 1.08F
  End If
End Function

He decides to refactor this procedure because he's convinced that, for code readability, the nominal case should be on the main line rather than buried in an Else clause - I told you he was stupid. His refactored code looks like this:

Function RandomCalculation(ByVal PurchaseAmount As Single) As Single
  If PurchaseAmount < 0.0 Then
    Throw New ArgumentException("Purchase amount must be >= zero")
  End If
  Return PurchaseAmount * 1.08F
End Function

So now he runs the unit tests, which throw the full range of legitimate inputs at this procedure. Sure enough, all of the unit tests are successful. Our hero proclaims himself a Master of the Universe, and moves on to his next great challenge. Can you see his screw-up?

When this code reaches QA, one of the QA tests happens to trigger an unrelated bug which results in an invalid (according to the spec) argument being passed to this procedure. The invalid argument value is NaN - the acronym for Not A Number - generated by an earlier floating-point division of zero by zero.

NaN has its peculiar quirks - in particular, every ordered comparison involving NaN evaluates to False. So the first version of the above procedure throws an exception when passed an argument of NaN ("Not PurchaseAmount >= 0.0" becomes "Not False", which is True), but the second version doesn't ("PurchaseAmount < 0.0" is False, so the guard is skipped and NaN propagates silently into the return value).

Remember that NaN was not a valid argument according to the spec, so no unit test was written for it. But the first developer coded the procedure properly, and the refactoring broke this working code because of our hero's ignorance of NaN issues.

This bug probably wouldn't have been found by code stepping, but it's a wonderful demonstration of the dangers of relying on unit tests to avoid refactoring bugs.
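
Still, now that QA has taught us about NaN, the behaviour can at least be pinned down for the future. Here's a sketch of the regression test, in C# rather than VB .NET (same guard logic, invented class name):

using System;
using NUnit.Framework;

public class Pricing
{
    // Same logic as the original (pre-refactoring) VB version.
    public static float RandomCalculation(float purchaseAmount)
    {
        if (!(purchaseAmount >= 0.0F))
            throw new ArgumentException("Purchase amount must be >= zero");
        return purchaseAmount * 1.08F;
    }
}

[TestFixture]
public class PricingTests
{
    [Test]
    public void NaNIsRejected()
    {
        // NaN arises at runtime from, e.g., zero divided by zero.
        float nan = Single.NaN;
        try
        {
            Pricing.RandomCalculation(nan);
            Assert.Fail("NaN should have been rejected");
        }
        catch (ArgumentException)
        {
            // Expected: !(NaN >= 0) evaluates to true, so the
            // guard fires.
        }
    }
}

Run against the refactored version, this test fails - which is exactly the point.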
 
Mark

Mark Pearce
Friday, September 19, 2003

Mark, Portabella

Well, there is still somebody else reading this thread: while I’m not keen on the flaming there seems to be a sensible discussion going on underneath, so up on to the soapbox I go with a VERY LONG message....

I confess that to start with I would take Portabella’s stance on this one – using an automated unit test framework, along with writing the unit tests up-front with the interface, forces a developer to think hard about what a component really does; the tests then act as confirmation that the component’s implementation matches those expectations, and they allow refactoring / extension / behaviour modification of the component to be done safely in the future, blah, blah, blah … I’m sure both of you are quite familiar with all this.

I’ll go out on a limb and try to summarise what I think are the two viewpoints being taken here:-

Mark:

1. Unit tests are pieces of code – code has bugs – so unit tests cannot be trusted blindly to ensure the correct specification and testing of a component.

2. Unit tests may be incomplete – so there are holes in a component’s implementation that are not being tested, and act as a hiding place for bugs / unspecified additional behaviour of an interface.

Portabella:

1. Automated unit tests are cheap to repeat often, and unlike debugger sessions, they don’t get tired, fractious and prone to making mistakes. This makes them a better testing method.

2. Just looking at what some code does in the debugger doesn’t mean one has verified that it’s working – it just means the code executes from start to end. This is more or less equivalent to the old ‘smoke test’ idea, which, while good, is not as rigorous as the unit test concept. A unit test (written up-front) should act as a specification for what a component should actually do for its client code.

You two will put me right if I’ve erred, I’m certain.

I’m already convinced by Portabella’s viewpoint but I am concerned about Mark’s points (as we have the ‘unit tests are a panacea’ strategy at my client site).

So for point #1: unit tests are just code – so have them written by pair-programming or carry out a code review shortly after they are written – preferably before starting on component implementation. The idea is to avoid putting the entire codebase under scrutiny but to intensively subject the unit tests to manual inspection. In effect the code base is being separated into a trusted part – the tests - and a part subject to validation against the tests that goes pretty much unreviewed.

This isn’t perfect – there have been occasions (*very few*, but it does happen) where it’s taken a third reviewer to spot something that got past a pair programming session or a code review. Our attitude is, we’ll take this risk, we’re not in the business of writing provably correct software, we just want to develop efficiently and get a high level of trust in the product quality. I would also add that debugger walkthroughs are equally as prone to the human factor as unit test reviewing.

For point #2, that’s a lot trickier to deal with. I had a look at the code examples posted by Mark and thought, OK, this is the business of how much of a component’s testing should be direct by its own unit tests and how much should be done indirectly via unit tests of higher-level components that depend on the one in question.

For the examples given I would say that higher level unit tests would have caught the problems mentioned – although in the case of the access control component it would be an indicator that a unit test is missing and should have been written pronto.

For the other examples I would consider the pros and cons of doing more precondition checking at a lower-level against just letting the higher level unit tests detect the failure. Again, this happens to us all the time – but we go back and fill in those missing unit tests. We’re not pretending to be perfect, if things go wrong we go and fix them.

A partial solution to this is to profile the codebase while running the entire unit test suite – this identifies implementation code which is either untested (the unit tests aren’t comprehensive enough) or is superfluous (defensive code put in to avoid having to think too much about an algorithm, speculative code put in because ‘someone’s going to need this’). I’ll be honest and say we hardly ever do this – this is going to have to go up in priority.

But I’m still worried by what Mark wrote – I’ve had a recent experience of doing an interface plus lots of unit tests up-front, getting it all reviewed and sending the result out to another person to implement. The other developer implemented the component so that it completely satisfied the unit tests – but did the wrong thing. It turned out that one of the unit tests didn’t quite test all of the facets of the component’s behaviour – and this afforded enough leeway for behaviour that was correct according to the tests, but unacceptable.

Oh, and how did we verify this? Hmm, we walked through the code with a debugger.


BTW: Mark – 1800 unit and acceptance tests – we’ve only got about 400. Respect! :-)

Cheers,

Gerard

Gerard
Friday, September 19, 2003

> Of course - I'm now responding in a similar vein

Heh. You *started off* that way, and have continued it all the way through. Anyone who reads through the thread can see that.

> I find the coding of unit tests to be extremely tedious, but it doesn't stop me doing it.

Sure. I actually don't mind them so much, but I'm willing to grant you your preference. The difference, of course, is that it isn't tedious to actually *run* the tests once you've written them, whereas the debugger is tedious every time through.

> Because the unit tests verified that the code produced the correct result (which it did). The unit tests didn't show that the correct result was produced for the wrong reason (which it was).

You're (deliberately, I think) missing the point. Now that you know the problem, are you going to write a unit test which exposes it, or not? That's really the 640K question here, as well as advice that you're getting for free :)

As far as your examples go.... thanks for taking the time to post them (and I mean this sincerely). But since in your own words they rely on gross programming errors, they are a bit less than convincing (although I can see that they do bolster your argument as far as firing this fool goes, if it was indeed the same guy that did all of this).

In fact, I'd turn this around and ask if, as you say, this guy ignores or is ignorant of basic programming ideas, would stepping through the code actually help him? I think if you're fair, you'll agree that it might help a little, but not very much.

And, in fact, the second example suggests that your real problem is not the technique, but the person. You'd do well with a seasoned professional who wrote good code and used unit testing ruthlessly, regardless of whether he used a debugger or not.

> Developers don't have this immediate and ruthless feedback, so they tend to deceive themselves.

Indeed, that is a good point, and I agree.

But what I ask myself is how to make it *easy*, or at least as easy as possible, to undeceive myself and my team.

I congratulate you on your success in chess, but I submit that in many respects it is a poor metaphor for software development, especially if it involves an I-win-only-when-you-lose mentality. You may find yourself retired and vacationing on the beach, all alone.

Portabella
Friday, September 19, 2003

Portabella,

>> You *started off* that way, and have continued it all the way through. Anyone who reads through the thread can see that. <<

You're just wrong. I defy you to find an earlier personal flame than this one (made by you):

"Fact is, it sounds to me like you've got a big chip on your shoulder, and are just looking for people to fire to reinforce your own beliefs. Enjoy your little tinpot dictatorship while it lasts, bro!"

>> The difference, of course, is that it isn't tedious to actually *run* the tests once you've written them, whereas the debugger is tedious every time through. <<

I think it's a fallacy that test automation reduces human error by default. It reduces certain types of human error, but introduces other types. All automated test suites require human intervention, if only to diagnose the results and fix broken tests. It can also be surprisingly hard to make a complex test suite run without a hitch. Common culprits are changes to the software being tested, memory problems, file system problems, network glitches, and bugs in the test tool itself.

>> Now that you know the problem, are you going to write a unit test which exposes it, or not? <<

Of course - but the problem wouldn't have been found by using unit tests. While unit tests are great for finding implementation bugs, they're fairly useless for finding requirements bugs, design bugs, testing bugs and so on. For these, you need other techniques.

>> But since in your own words they rely on gross programming errors, they are a bit less than convincing <<

Only the first of the three problems is gross. The other two are actually quite subtle.

>> In fact, I'd turn this around and ask if, as you say, this guy ignores or is ignorant of basic programming ideas, would stepping through the code actually help him? <<

It would help to educate him. Of course, there are no guarantees, and no dev process can really compensate adequately for dodgy developers. But this is not a binary thing - there are degrees of self-help and success.

>> I congratulate you on your success in chess, but I submit that in many respects it is a poor metaphor for software development, especially if it involves an I-win-only-when-you-lose mentality. <<

I look on software development as "I-win-only-when-the-compiler-loses". So for me, it's a war I'm waging on bugs, and I'm looking for multiple weapons to help me win that war. If this was Quake, unit testing would be the BFG and code stepping would be the nailgun. Both weapons have their place in the game.

Mark

Mark Pearce
Friday, September 19, 2003

Gerard,

Thanks for your post! I'm preparing my reply now...

Mark

Mark Pearce
Friday, September 19, 2003

> You two will put me right if I’ve erred, I’m certain.

I think that's a fair summary.

> Our attitude is, we’ll take this risk, we’re not in the business of writing provably correct software, we just want to develop efficiently and get a high level of trust in the product quality.

That's also where I'm at.  And, as I've said, there are Human Factors issues as well.

> A partial solution to this is to profile the codebase while running the entire unit test suite

I'll play Devil's Advocate and point out that this doesn't really solve the problem. Even if you can show coverage, you cannot be sure you have tried all the possible relevant inputs.  You really have to think about it.
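
To make that concrete (a toy sketch in JUnit - average() is an invented example): the test below covers every line of the method, yet the overflow bug never shows up, because no test tries the inputs that trigger it.

    import junit.framework.TestCase;

    public class AverageTest extends TestCase {

        // Every line of average() is exercised by the test below...
        static int average(int a, int b) {
            return (a + b) / 2;  // ...but (a + b) overflows for large inputs
        }

        public void testAverage() {
            assertEquals(2, average(1, 3));
            // 100% line coverage - and yet average(2000000000, 2000000000)
            // silently returns a negative number. No test tries it.
        }
    }

Full line coverage, but the input space clearly isn't covered.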

> Oh, and how did we verify this? Hmm, we walked through the code with a debugger.

No worries here. It's certainly a valid technique, and I've made it a point already to fire up a debugger on my current code base.  My initial finding is that I can achieve much of the same value automatically with assertions, and these both remain with the code and are executed automatically.
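
As a minimal sketch of what I mean (Java 1.4 assertions, enabled with the -ea switch; Account is an invented class), the checks I'd otherwise make by eye in a debugger session live in the code and run every time:

    public class Account {

        private int balanceInCents;

        public void withdraw(int amountInCents) {
            // Checks I'd otherwise verify by stepping through:
            assert amountInCents > 0 : "withdrawal must be positive";
            assert amountInCents <= balanceInCents : "insufficient funds";

            balanceInCents -= amountInCents;

            // Postcondition - checked on every run, not just debug sessions:
            assert balanceInCents >= 0 : "balance went negative";
        }
    }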

This is discussed extensively on the c2 wiki here:

http://c2.com/cgi/wiki?UseAssertions

The essence of the thread (sans flaming) is captured on the ForgetTheDebugger node.

Portabella
Friday, September 19, 2003

M & P: it's a nice Friday evening in the UK and the prospect of a beer is looming large - so if you don't see replies to posts directed at me, don't take it as a lack of interest or trolling on my part.

I'll have a look again on Sunday or Monday evening if this thread is still going (if I can stay off JoS for that long!)

Have a good weekend.

Gerard
Friday, September 19, 2003

> an earlier personal flame

But saying "intellectually lazy and full of shit" is fine? Don't just respond, think about it.

> It reduces certain types of human error, but introduce other types.

Of course, but it's a *net* win, no? Else we could just chuck all of our tests and be better off.

> All automated test suites require human intervention...

Yadda yadda yadda, we all know this.

> they're fairly useless for finding requirements bugs, design bugs, testing bugs and so on.

I do not agree here, except in the very limited sense that *running the test suite alone* won't tell us those things.

In fact, if a class is difficult to write tests for, it is probably too complicated. If a framework is difficult to test, it will probably be a hotspot for bugs. If the test code is not simple and straightforward, then it probably has bugs. And if it is difficult to write tests for a set of requirements, then they probably have bugs.

Your point only applies if the unit tests are written by zombies.

>> Now that you know the problem, are you going to write a unit test which exposes it, or not? <<

> Of course

Then we agree on the most basic level.

> It would help to educate him.

I guess being fired is an education of sorts, but you honestly do not come off as very concerned at all about this guy's "education", and it is hypocritical to pretend otherwise.

Portabella
Friday, September 19, 2003

Portabella,

>> But saying "intellectually lazy and full of shit" is fine? <<

Yes, it's fine because it wasn't a personal flame - it referred to an entire class of developers. As far as I can tell, you aren't even in that class (using only a single testing technique).

So my point remains - the personal flames started with you, and I then responded in a similar manner. Neither of us lost our temper, but we both started sharpening our swords.

>> Of course, but it's a *net* win, no? <<

Yes, I agree that unit testing can be a net win if, and only if, it's done in a careful manner with due regard for the points raised in the paper to which I referred. Of course, code stepping can also be a net win.

>> I do not agree here, except in the very limited sense that *running the test suite alone* won't tell us those things. <<

Exactly my point - you need to use techniques such as code stepping and code reviews to get an angle on the many bugs that unit testing can't detect on its own.

>> I guess being fired is an education of sorts, but you honestly do not come off as very concerned at all about this guy's "education", and it is hypocritical to pretend otherwise. <<

Because he was a contractor, not a permie, I didn't have the slightest interest in his education.

Perhaps his fantastic technical "vision" got in the way. Maybe he didn't want to compromise his artistic sensitivity. Well, time to grow up! I don't give a shit - if he wants to take my money, he has to do what he's told.

Mark

Mark Pearce
Friday, September 19, 2003

Gerard,

From your very good summary of our viewpoints:

>> 2. Unit tests may be incomplete – so there are holes in a component’s implementation that are not being tested, and act as a hiding place for bugs / unspecified additional behaviour of an interface. <<

I would probably put this statement more strongly than you have done. You can *guarantee* that the unit tests will never find all of the bugs. Several studies have shown that, at least on larger projects, requirements and design bugs far outnumber implementation bugs, and unit tests can't easily find bugs in these two categories.

I would also say that it's not feasible to test many components completely by using unit tests. The dev tools nowadays allow teams to build extremely functional and complex apps, but the number of tests needed to verify these apps is staggering. We're running upwards of 1800 unit and feature tests, but I suspect that we're only testing something like 70% of our product's functionality in this way.

Our large body of tests has also shown us another problem. As the number of tests grows, our understanding of the test code and the test coverage has not kept pace. People move on and app features change rapidly, leaving us in a position where nobody dares to mess with the test code too much, but everyone keeps adding new tests.

We've also found a few cases of a developer interpreting a unit test as a special case in the code! Rather than handling the general situation, the temptation was to write edge-case code to pass the test, with the intention of returning to fix this later. Needless to say, the good intentions were never realised.
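
A caricature of what we found (names invented, but the shape is accurate):

    public class ShippingCalculator {

        private static final int COST_PER_ITEM = 250;  // cents

        // The unit test asserted shippingCost(10) == 0 for a free-shipping
        // promotion. Instead of implementing the promotion rules, the
        // developer matched the test's exact input:
        public int shippingCost(int itemCount) {
            if (itemCount == 10) {
                return 0;  // makes the test green; "fix it later"
            }
            return itemCount * COST_PER_ITEM;
        }
    }

The test suite stays green, and the bug ships.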

>> 1. Automated unit tests are cheap to repeat often, and unlike debugger sessions, they don’t get tired, fractious and prone to making mistakes. This makes them a better testing method. <<

I'm not convinced by this argument either. How can one measure the relative costs and benefits of two testing methods that mainly find different categories of bugs?

>> So for point #1: unit tests are just code – so have them written by pair-programming or carry out a code review shortly after they are written – preferably before starting on component implementation. <<

We've tried this, using code reviews. The problem is that it's impossible to review test code without also reviewing the app code that's being tested.

>> A partial solution to this is to profile the codebase while running the entire unit test suite – <<

A Google search will highlight the many problems that come up with code coverage tools - in particular, Brian Marick has written a good paper on this.

In summary, my viewpoint is that there is no silver bullet, and that a team should use multiple testing techniques in order to root out bugs wherever they reside. And specifically, Portabella's original claim that only dodgy developers need to step through their code with a debugger is just wrong.

Mark

Mark Pearce
Sunday, September 21, 2003

Portabella, Gerard,

Another problem with unit testing is that although, like other forms of testing, it succeeds in reducing your uncertainty about the bugginess of particular areas of your product, it is useless for doing any kind of risk-weighted uncertainty reduction.

Some areas in your product will be more risky than others, perhaps because they're used by more customers or because failures in that area would be particularly severe.

Failing to identify risky areas is a common testing mistake amongst developers, and using "blunderbuss" unit testing techniques can lead to misallocated testing effort. Developers are notorious for concentrating on the edge cases, and thereby wasting testing effort that would be better devoted to more risky areas.

Of course, other testing techniques are also exposed to this problem, but unit testing is particularly exposed because of its emphasis on testing all code regardless of its risk.

Mark

Mark Pearce
Sunday, September 21, 2003

Portabella wrote...

"I'll play Devil's Advocate and point out that this doesn't really solve the problem. Even if you can show coverage, you cannot be sure you have tried all the possible relevant inputs.  You really have to think about it."

I agree with you - in my mind there are two aspects: one is 'is there code that is completely untouched by the unit test suite', the other is 'do the unit tests *really* test the code that is covered'.

I think profiling would help address the first problem, and that pair-programming and/or code reviewing does help address the second (although this is not going to be a perfect technique - but what is?)

To muddy the waters a little, the second aspect can be refined a bit more:-

1. Do we know what the interface of a component means?

If it's a function, what are its preconditions and postconditions?

If it's a class, what are the preconditions and postconditions of its methods? Does the class have interesting instance-level or class-level invariants?

If we can't specify the postconditions of single method calls easily or cheaply, can we specify the postconditions of sequences of related method calls on an object?

If it's an assembly of closely related classes (iterators and containers, objects with mementos, subjects and observers, etc, etc), what are the postconditions of sequences of interactions between the relevant instances, and are there inter-object invariants?

2. Are we hammering the components enough in our tests? Even if one (or two) has thought through the interface very carefully and has got the full number of unit tests in mind to check every aspect of a component's behaviour required to fulfill a use case or system requirement's interaction diagram, it's still possible to have 'dilute' unit tests (I think this was your point about inputs).

It comes down to a statistical approach - you can't test the entire space of inputs, but you can go for obvious boundary cases at the interface level plus a large sample of more or less 'random' inputs. If I've tested the interaction between push and pop methods on a stack for sequences involving zero, one, two and three elements plus, say, 100 sequences of zero to a thousand elements, I'm *reasonably* confident the unit test is assiduous enough (there's a sketch of this kind of test below).

(No, I don't accept inductive reasoning on the implementation as sufficient in this case - unless there's no other alternative).

As soon as you start testing software that uses numerical methods or involves several threads, you really have to sock it to your components in the tests. From my own experience, it pays to keep increasing the number of test cases until you stop finding more bugs. Again, not a perfect strategy, but it will do.

Having some long unit tests goes against the gospel of XP, but out in the real world, who cares? ;-)
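
Here's the kind of thing I mean, as a rough sketch (JUnit-style; Stack is a hypothetical class under test with push, pop and size). Boundary cases aside, it fires seeded-random push/pop sequences at the stack and checks it against a trivially correct model after every operation:

    import java.util.ArrayList;
    import java.util.Random;
    import junit.framework.TestCase;

    public class StackHammerTest extends TestCase {

        public void testRandomPushPopSequences() {
            Random random = new Random(42);  // fixed seed, so failures reproduce

            for (int run = 0; run < 100; run++) {
                Stack stack = new Stack();          // class under test
                ArrayList model = new ArrayList();  // trivially correct model

                int operations = random.nextInt(1000);
                for (int i = 0; i < operations; i++) {
                    if (model.isEmpty() || random.nextBoolean()) {
                        Integer value = new Integer(random.nextInt());
                        stack.push(value);
                        model.add(value);
                    } else {
                        Object expected = model.remove(model.size() - 1);
                        assertEquals(expected, stack.pop());
                    }
                    // Invariant, checked after every single operation:
                    assertEquals(model.size(), stack.size());
                }
            }
        }
    }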

There's also the issue of having unit tests for higher-level components act as a safety net for missing unit tests on lower-level components, but I think I've done that to death now!


"My initial finding is that I can achieve much of the same value automatically with assertions...."

Yes, we do this too with noticeable benefit - but in the example given at the end of my post, this wouldn't have helped (although it *might* have been caught by a unit test for a higher-level component, or by an acceptance test).


"The essence of the thread (sans flaming) is captured on the ForgetTheDebugger node. "

Again, yes: I'm all for *attempting* to forget the debugger by using unit tests and DBC, and this is what we try to do - but watch out for situations like the one that bit me.


Cheers,

Gerard

Gerard
Sunday, September 21, 2003

Mark wrote...

"Several studies have shown that, at least on larger projects, requirements and design bugs far outnumber implementation bugs, and unit tests can't easily find bugs in these two categories."

OK - if I understand a 'requirement bug' to be an incorrect requirement, then yes - unit testing deals with a different aspect. If the customer wants a word-processor and we've delivered a fully unit-tested and debugged engine-management system, there's a problem.

As for 'design bugs', they're to be expected. I'm a great believer in doing 'small design up-front' - ie. upfront design in iterative cycles - as long as one is prepared to throw the design away in each iteration once interface and unit test development gets underway. Designs are great for sizing up future work, exploring risk, and priming the test-first development process, but once the latter is underway the insights gained 'down in the code' should take precedence over those nice UML diagrams.

You mentioned the issues stemming from the size of the codebase at your site (I estimate you have, say, 300 to 600 components in the codebase given 1800 tests). The only system of that scale I've worked on didn't have any unit tests, and all acceptance testing was done by hand, so I can't quote from my experience on that issue. I have been warned, though, thanks!


"I'm not convinced by this argument either. How can one measure the relative costs and benefits of two testing methods that mainly find different categories of bugs."

Just because it's less labour-intensive and the dreaded 'human factor' is reduced, that's all. That doesn't mean to say I don't use a debugger to find out why a unit test doesn't pass - I do, along with DBC, lots of trace statements, asking somebody else to have a look at what I've done, spinning Tibetan prayer wheels, you name it I'll use it :-)


"We've tried this, using code reviews. The problem is that it's impossible to review test code without also reviewing the app code that's being tested."

I have to disagree with you there: we are pretty comfortable with describing component behaviour solely via unit tests working through interfaces (ie. no white-box testing).


"A Google search will highlight the many problems that come up code coverage tools - in particular, Brian Marick has written a good paper on this."

Yes, I think I know the one you mean. If memory serves correctly, there was a point about poor coverage by unit tests also implying that the existing tests would probably be lacking, not just that extra unit tests would need writing. Message understood.

"In summary, my viewpoint is that there is no silber bullet, and that a team should use multiple testing techniques in order to root out bugs wherever they reside."

Get those Tibetan prayer wheels in right now!

I'm going to have to sign off from JoS for now, but send an E-mail if you want to prod me back into discussion. It's been a pleasure corresponding with you and Portabella.

Cheers,

Gerard

Gerard
Monday, September 22, 2003

> And specifically, Portabella's orginal claim that only dodgy developers need to step through their code with a debugger is just wrong.

I didn't say that, though.

I suggested that debuggers were popular with folks who like to write convoluted code, because they *need* a debugger to figure out what's going on.

I've personally seen that behavior several times.

This is perhaps the polar opposite from what you're suggesting: if there are people who "just think" about the code, there are others who start by just firing up the debugger and refusing to think about the code at all.

I said that some people get good results using a debugger, and I thoroughly support that. Others get good results without one, and I support that too.

Portabella
Monday, September 22, 2003

Portabella,

>> I didn't say that, though. <<

Hmm...it's possible that I misunderstood the meaning behind the following quote!?

"But if the only way you can be sure that the code works is to step through it, I think you are a few steps behind to begin with."

>> if there are people who "just think" about the code, there are others who start by just firing up the debugger and refusing to think about the code at all. <<

Yes, like you, I've seen both of these behaviours.

I think that we basically agree, but we just have different ways of working.

Mark

Mark Pearce
Tuesday, September 23, 2003

Gerard,

>> You mentioned the issues stemming from the size of the codebase at your site (I estimate you have, say 300 to 600 components in the codebase given 1800 tests). <<

It's rather more complex than that. Our product generates code automatically, creating the entire end-to-end core infrastructure of an application, from the data model and stored procedures all the way up to the MVC-based GUI.

So we don't just need to write unit tests for our code, we also need to write (and sometimes automatically generate) tests for validating the generated code.
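
As a very rough sketch of the idea (all the names here are invented for illustration - the real thing is rather more involved): for each data-access class the generator emits, it also emits a matching round-trip test.

    import junit.framework.TestCase;

    // Emitted alongside each generated data-access class; CustomerDao,
    // Customer and TestDatabase are invented names.
    public class GeneratedCustomerDaoTest extends TestCase {

        public void testInsertThenLoadRoundTrip() {
            CustomerDao dao = new CustomerDao(TestDatabase.connection());

            Customer original = new Customer("Acme Ltd");
            dao.insert(original);

            Customer loaded = dao.load(original.getId());
            assertEquals(original.getName(), loaded.getName());
        }
    }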

Hence the need for those Tibetan prayer wheels - they sure come in handy for those hairy moments when you get the first faintest tiniest inkling that something, somewhere, has gone terribly wrong... :)

Mark

PS Are you based anywhere near London?

Mark Pearce
Tuesday, September 23, 2003

> I think that we basically agree, but we just have different ways of working.

That sounds like a fine place to leave the thread too.

Good luck with your application! :)

Portabella
Tuesday, September 23, 2003
