Fog Creek Software
Discussion Board




The cost of poor engineering

A recent thread on the de Gaulle collapse became a debate on whether "engineering" of software made sense. Many agreed (or seemed to) that extensive efforts for high quality and reliability were certainly justified for life-critical systems. I agree. Most others seemed to believe that efforts to achieve high quality were not justified in other contexts.

What was lacking in the discussion was any mention of the cost of poor quality in terms of the cost of bug-fixing, the cost to a company's (or organization's) reputation with customers, etc. I'm not saying that these costs compare to the cost of someone's life, but are they not worthy of consideration nonetheless? After seeing many companies struggle to get out of a situation where new features cannot be added due to the instability of the existing code, I think they are.

I'm bothered by the arguments amongst software professionals that claim that users don't care about reliability, that high quality is virtually impossible in software without exorbitant cost. I have never seen any direct, conclusive evidence of either of these claims. It usually seems to me to be an excuse to not do the tedious part of engineering: things like unit tests and code inspections.

How do we balance the cost of quality engineering against the cost of poor quality?

Jeff Kotula
Tuesday, May 25, 2004

I think it's worth talking about the cost of replacement.  One place I worked had a homegrown pseudo-database system written in an archaic language on an outdated OS running on defunct hardware.  Every so often they did a study of how much it would cost to replace the system, and it gradually went from $5M to $10M to $50M.  They kept adding features to it, and never seemed to realize that eventually they were going to be forced to rip the whole thing out.  The continually rising costs should have made it obvious that the sooner they replaced it, the more money they would save in doing it, but they always decided not to because it would be too expensive...never realizing that it was in fact too expensive not to do right away.

No, I don't know what "finally" happened - I got caught in a wave of downsizing in an effort to save money.

anon
Tuesday, May 25, 2004

Jeff, that discussion talked about "what is" while you're concerned with "how we should act." I agree with the meat of your point, so I don't see my post contradicting yours.

It is hard to think of a response to this topic, since it's like there are some assumptions we need to unwind, and text is a slow medium for this; in person we can ask incremental questions.

I just get the sense that you've reduced my position to, "Oh you're not paying $5k/line? No unittest for you!"

Tayssir John Gabbour
Tuesday, May 25, 2004

Tayssir: I wasn't really paying attention to who was saying what in the other thread, so my comments weren't really directed at anyone in particular.

I do think, however, that I am talking about "what is". It seems to me that the point of engineering activities is to balance all the conflicting forces: cost, risk, complexity, maintainability, etc. in a way that maximizes productivity (i.e. the amount of effective effort per unit cost). Safety is one factor. The cost of managing bug-ridden releases is another. Losing face with customers is another.

So "what is" here is the fact that software businesses operate without regard to these other factors to their own detriment. I think it is incumbent on us as professionals to represent these costs fairly and as accurately as possible.

Jeff Kotula
Tuesday, May 25, 2004

The problem is that the analogy between software and construction is just bad.  I believe that engineers would prefer it if what they do could be more like developing software, and yet so many software developers try to build software like bridges.  The nice thing about software is that it is virtual, allowing us to continuously run all sorts of tests and make all sorts of adjustments to it at any time.  We can remold it through refactoring.  We can change how it is used and how it works internally.  We can make mistakes, get feedback, and improve on the design over time with very short feedback loops.  And during this whole process we can even get valuable use out of the product before it is "finished".

It took thousands of years of collapsing bridges before people started to figure out how to build bridges reliably.  And if the bridge did collapse, they would have to figure out why and try to build the whole thing from scratch.  When software crashes, we can diagnose the problem, and we generally don't need to throw everything out to fix it.  I believe this makes the techniques of designing good software quite different from those of good engineering.

If you look at many engineering disciplines today, I think you will see that they in fact are becoming *more* like software development.  That is why you see so much simulation and modeling before building the actual structures.  If you get a chance, watch the PBS documentary on NASA's Mars mission, particularly the part where they are designing the parachute.  They build it, run it through the wind tunnel, it breaks, over and over again, until finally it works.  They had to break the parachute a bunch of times in the wind tunnel before they figured out that their paper design was flawed.  Just like with software + unit tests, only far, far slower.

Oren Miller
Tuesday, May 25, 2004

That's a good point, Oren. I think the big difference is that NASA didn't test the parachute by sending the probe to Mars to see if it would work -- the equivalent of shipping unready software. And while it's true that bridge-building has been around for thousands of years, integrated circuit design and construction haven't.

But my point wasn't necessarily that software engineering and other types of engineering are equivalent in the act. More that the attitude of "engineering" can be shared and requires us to admit, analyze, and report the costs we see resulting from decisions (and working habits!) that compromise the quality of the software.

I think the cost of trying to retro-fit quality (in the broadest sense) into a software project dwarfs the cost of initial development in many cases.

(One difficulty here is that we all work in differing environments with different tolerances, procedures, and norms. So maybe the departmental behavior I've seen is only really relevant to the industries I've worked in. But I don't think so.)

Jeff Kotula
Tuesday, May 25, 2004

Jeff, the problem in your position is that it presumes engineers make the decisions about software. They don't.

An engineer who tries to guarantee "high quality" will be condemned as someone who can't meet deadlines and probably sacked. Read the job ads to see the emphasis on programmers able to meet deadlines or "prioritise their time" which is code for the same thing.

If engineers are really concerned about quality, they need to attack their own lack of power in the workplace, instead of attacking their fellow engineers.

Must be a Manager
Tuesday, May 25, 2004

"How do we balance the cost of quality engineering against the cost of poor quality?"

The invisible hand.

Rob VH
Tuesday, May 25, 2004

"I'm bothered by the arguments amongst software professionals that claim that users don't care about reliability, that high quality is virtually impossible in software without exorbitant cost. I have never seen any direct, conclusive evidence of either of these claims."

I used to think as you do.  But then I got out into the real world where there were clients with cash in hand.  People are running Windows 95 and rebooting 2 or 3 times a day; it annoys them but it doesn't cost them enough money to make them want to find something better (and learn it, and switch all their old documents, and ...).

For each program they use, I could probably find one that does almost exactly the same thing, costs less, and has fewer bugs.  Are they going to switch?  Doubtful.  Look at the programs that are dominant in the marketplace: just how many of them are dominant because of their low bug counts?

All else being equal, users would prefer a bug-free program to a buggy one, granted, but all else is not equal.  Writing a bug-free program means (if nothing else) more testing, which means it'll take longer and/or cost more money to write.

When you're building an airport, designing it so it doesn't crash is very important.  When you're building a program, crashing is a fairly minor issue -- there are much larger issues.

Wally
Thursday, May 27, 2004
