Fog Creek Software
Discussion Board

Software Metrics and Programmer Productivity

My company has recently been experimenting with using software metrics to measure and compare the productivity of programmers, and with using them as a performance appraisal tool.  This is being implemented on an experimental basis in some project groups.
Currently the main metric being used is 'number of bugs/KLOC', etc.

Personally, I think it's a bad idea to use software metrics like bugs/KLOC to measure programmer performance. Most of these measures actually promote bad programming practices.

For example, one easy way to reduce the bugs/KLOC count is to do a lot of 'cut/paste' coding. This inflates the KLOC count and hence reduces the bugs/KLOC ratio.  So a good programmer who opts for code reuse (using function calls) and refactoring will actually get a lower performance rating than a bad programmer doing cut/paste coding.
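A toy calculation (all numbers hypothetical, just to illustrate the ratio) shows how the metric rewards duplication:

```python
# Hypothetical numbers showing how duplicated code games the bugs/KLOC metric.
def bugs_per_kloc(bugs, loc):
    """Bugs per thousand lines of code."""
    return bugs / (loc / 1000)

# A careful programmer factors shared logic into functions: small code base.
careful = bugs_per_kloc(bugs=5, loc=2_000)     # 2.5 bugs/KLOC

# A cut/paste programmer ships the same 5 bugs spread over 10x the code.
cut_paste = bugs_per_kloc(bugs=5, loc=20_000)  # 0.25 bugs/KLOC

# The metric ranks the bloated code base as "better".
assert cut_paste < careful
```

Same five bugs either way; only the denominator changed.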

Does your company also use similar metrics to measure programmer performance?  I am interested in hearing any feedback, experiences, etc.

Nitin Bhide
Wednesday, May 22, 2002

KLOCs must be the worst possible way to measure productivity ever thought of.

Would you measure a chef on the volume of spaghetti used per day?

Matthew Lock
Wednesday, May 22, 2002

If you really want to know what you're dealing with, go and buy and read the Fenton & Pfleeger book _Software Metrics: A Rigorous & Practical Approach_. This book isn't all there is to productivity, but it gives a nice overview of measurement theory and measurement in practice (in software development). There are almost 60 pages of references in the annotated bibliography, too.

In my opinion it gives a nice and solid basis for further study of this subject. Programmer productivity is just one of the things that are _not_ easy to measure. There's definitely some relation between productivity and lines of code, but it's far from direct ...

Jarno Virtanen
Wednesday, May 22, 2002

As others have written far more eloquently before me, the best programmer is generally the one who takes the most code away, not the one who adds the most code. A measure of when a product is stabilizing is when it "turns the hump" from growing to shrinking code size.

Brad Wilson
Wednesday, May 22, 2002

Brad hit the nail on the head.  The best developers I know spend the bulk of their time analyzing the problem, and a small portion cranking out compact, clean code.

The other problem with metrics, perhaps just as significant, is that different projects have wildly differing levels of difficulty.  If my colleague spends two weeks writing a reusable, documented thread pooling class, and I spend two weeks dropping controls on forms, he may end up with 200 lines of code and 5 bugs, while I may end up with 2000 lines of code and no bugs.  Really, he's the hero and I'm average.  But how will metrics explain this?

I had a temp job in my teen years assembling computer cables.  The best workers got assigned the complex cables with more wires.  Consistently, these workers were laid off because their cables/day ratio was near the bottom percentile.  I left, but heard that they revamped their system later.  This was a bad example of metrics in action.

Again, the best developers solve problems by mentally refactoring code until it is simple and compact, using 3rd party tools and libraries whenever possible - decreasing LOC, but increasing productivity and reliability.  Code costs money, and (often) only a small portion of that cost comes up front.

Those who advocate metrics have their hearts in the right place.  But just as Mark Twain could say in 10 words what others might write volumes about, so it is with code.

Bill Carlson
Wednesday, May 22, 2002

Using metrics as a performance appraisal tool is probably the dumbest idea I've heard in years. It's *such* a bad idea that organizations that take metrics seriously frequently not only forbid the use of the metrics in appraisals, but make doing so an actionable offense.

I'm at work and don't have access to my home library (and so can't give any titles), but the literature is full of books on successful metrics programs. Most of them stress not to do this.

skautsch
Wednesday, May 22, 2002

One fact frequently overlooked is that productivity is usually measured in dollars (or euros, or whatever) per capita.  If Group A has 10 people and generates $100K in quarterly profits, and Group B has 100 people and generates $100K in profits, there's a direct measure.  If a quarter is too short (long term projects), then use an annual measure.  Using such measures also gets the programmers more involved in the actual marketing and sales aspect of the business, which is only a good thing in my view.  Engineers that refuse to understand marketing and view marketing as the enemy have no place in a commercial software operation, IMHO.  And yes, marketeers who refuse to understand engineering realities are equally useless.

The key here is not attempting to measure individual performance (which is simply impossible to do reliably) and not using short timescales, which don't accurately represent actual effort (everyone has non-productive days).  Measures like KLOC are obviously silly, and hours worked is equally useless, since studies have shown a high variability in actual ability between programmers, which a lesser programmer may compensate for by working longer hours.  If a guy can play Quake III online for seven hours a day and generate enough code in that last hour for the company to reach its revenue goals, who am I to complain?  Whereas I'd be equally happy with a guy that has to sweat for 10 hours every day to do the same job (equally, as in I wouldn't pay him any more than the first guy).  Measure useful output, not work.  KLOC is work.  Profits are useful output. 

Finally, if an individual programmer is less productive than their peers, it tends to be pretty obvious to everyone else in the group, and certainly should be to the manager (constant schedule slip at the same workload, or a far smaller workload to avoid schedule slip).  If all the programmers in a group "aren't productive enough", it's management's fault, not the developers', and that should be pretty obvious to management, or there's little hope for the company.

James Montebello
Wednesday, May 22, 2002

"If all the programmers in a group "aren't productive enough", it's management's fault, not the developers"

James... I am a little uneasy with this.  Don't you think that having an easily available scapegoat (such as management) will encourage "aren't productive enough" behavior?

The most productive groups I have worked in were with programmers who did not blame anyone but themselves for their own productivity.

And the measure was effect on profit, as you point out... not an unrelated substitute that doesn't correlate with it.

Joe AA.
Wednesday, May 22, 2002

A good developer will know when to blame management, when to blame his peers, and when to blame himself, or whatever the actual cause is. So when he blames management, you should probably believe him.

A poor developer will always blame management or his peers.  So you don't know when to believe him.  However, hiring poor developers is a sign of poor management.

This means that if your development staff says that productivity is poor due to management,  chances are that the managers are not allowing the staff to reach full productivity through some action (or inaction). 

This doesn't mean poor management, as they may be balancing many things, but the management structure in place should know of the issue, and not randomly fire developers, or randomly institute meaningless metrics.

Philip Rieck
Wednesday, May 22, 2002

> My company has recently experimenting about using software metrics to measure

Excellent!

> /compare productivity of programmers

Uh oh.

> and use it as performance appraisal tool.

Oh dear.

>Currently main metric being used is 'no of bugs/KLOC', etc.

Oops. Bad.

> Personally, I think its a bad idea to use the software metrics like bugs/KLOC to measure programmer performance. Most of the measures actually promote bad programming practices.

Metrics should not be used to compare programmer performance.
Metrics should be used to make an intelligent estimate of how long it will take to create a new system, given a set of baselined performers.

First of all, never use LOC. Use function points at least, which are decoupled from language choice and programming style. For a given programmer in a given language in a given problem domain, it is possible to calibrate FP against LOC.
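A minimal sketch of what that calibration looks like, with made-up historical figures (every number below is hypothetical): once you know your own LOC-per-FP and hours-per-FP ratios in a given language and domain, a new estimate is just scaling.

```python
# All numbers here are hypothetical, for illustration only.
# Calibration data: (function points, LOC, hours) from past projects,
# for one programmer in one language and problem domain.
history = [
    (120, 6_200, 300),
    (80, 4_100, 180),
    (200, 10_500, 520),
]

total_fp = sum(fp for fp, _, _ in history)
loc_per_fp = sum(loc for _, loc, _ in history) / total_fp     # LOC per function point
hours_per_fp = sum(hrs for _, _, hrs in history) / total_fp   # effort per function point

def estimate(function_points):
    """Predict code size and effort for a new system by straight scaling."""
    return {
        "loc": round(function_points * loc_per_fp),
        "hours": round(function_points * hours_per_fp),
    }

print(estimate(150))
```

Real FP counting has formal rules (weighting inputs, outputs, files, and so on); this sketch only shows the calibration step X. J. Scott describes.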

I have used the black art of metrics to predict how long it will take me to do something, and I have been totally flabbergasted to find I can do this pretty accurately. Before I started playing around with FP metrics, I hadn't the foggiest clue how long something would take, and believed it was impossible to even hazard a remote guess because I was 'an inventor' and 'this has never been done before.' I was wrong to think that an accurate estimate was not possible.

Metrics are great, but they can easily be misused if an ignorant company starts abusing them by doing stupid stuff like using them to measure programmers instead of what they are intended for and helpful at -- prediction.

Metrics are fun fun fun when used responsibly! It is a *blast* to predict something and get it spot dead on.

X. J. Scott
Wednesday, May 22, 2002

One of the big consulting firms was asked to assess a big computer company's support centre. They ranked all the support staff by the time taken to respond to problems, and then sacked those who took the longest.

Then disaster struck. It turned out there had been an informal culture in the support centre where the hardest calls were routed to the best engineers, who naturally took longer on their calls. When those guys went, the support centre's capability plunged.

Cap in hand, the dumb company tried to hire their guys back, but their spell in the market had shown them they were worth much more than they had previously been paid, and most stayed with their new companies. As you would.

Hugh Wells
Thursday, May 23, 2002

Okay, I don't want to turn this into a troll, but the post above raises a point I've occasionally wondered about. Specifically, you hear plenty of stories about (the) big consulting firms completely screwing things up and costing their clients big bucks in the process. The post above is a perfect example of this -- they went in, didn't take sufficient notice of the environment, and implemented changes indicative of a remarkable lack of foresight, with devastating consequences.

So why do the consulting companies continue to get hired? Do the consultants generally do a very good job, it's just that only the horror stories get passed around the net? Note, I'm not asking "why do people hire consultants", I'm asking "given the horror stories about consulting company X, why does it stay in business?"

Adrian

Adrian Gilby
Thursday, May 23, 2002

I remember being part of a metrics programme in 1991, a complete and utter waste of time, quickly forgotten and never applied.

Great for the consulting company that ran the programme, but of no benefit to anybody else.

The wheel, obviously, still turns.

Tony
Thursday, May 23, 2002

X.J. Scott wrote:

"Metrics are fun fun fun when used responsibly! It is a *blast* to predict something and get it spot dead on."

I'm kinda nit-picking here, so my apologies in advance, but if you make a perfect estimate (the median of the probability distribution) and base the project time target on it, there's a 50% chance that the project will be late.

(I'd guess this could be called trivia ..)

Jarno Virtanen
Thursday, May 23, 2002

>Measure useful output, not work.

agree. 100%. any other measurement is useless. i think we have discussed this before.

>KLOC is work. Profits are useful output.

sorry but i have to disagree with the last part of that statement. in organisations where profit is not the objective (emergency services and health services are a good example in civilised countries :-) how can you measure profit (should you? how much is a life worth?)?

measuring how much profit an individual contributes is also an incredibly hard thing to do - if i fsck off early every day have i contributed equally to the success of the project i am on? if i am producing my fair share then yes, by your own criteria, but how the hell are you going to know that i'm not just being lazy? if you ask my colleagues they can probably tell you, but you rely on them being truthful - if they don't like me for some reason, i can be "marked down", and vice versa. similarly there will always be one who reckons he did everything (he did, he produced more bugs than anyone else) and nobody else should get the credit (you know the type, the ones your boss makes into team leaders).

which leaves you with, as you said, "not attempting to measure individual performance". so you presumably have to pay everyone the same, muppets and superstars alike. this is a good way of ending up with only muppets, imo.

your idea also means that regardless of the development team getting the thing out of the door you rely on it being marketable to judge whether they are successful. in one sense i can see the point (otherwise you just end up with the company going bust), but if the idea is unmarketable i can still have done my job well, regardless of whether the product is profitable or not. i am not even going to start the argument about whether profit (which is just surplus cash given to people who have probably never put anything into the company in the first place) is useful or not.

sorry, but i think the measurement has to take place before "profit" even comes into the equation (no, i don't know where or how, sorry!).

nope
Thursday, May 23, 2002

<broken-record>
a line of code is neither work nor an asset; it's a liability. You've paid for it, and you keep paying for it in maintenance until you delete it. It's like owning office chairs and tables - you've paid for them, but if you don't put them to good use, you're better off throwing them out EVEN THOUGH you've already paid for them and they represent real capital - because they take up space, which costs money.
</broken-record>

What would you prefer - a clean, 100 line solution to a problem, or a convoluted, state-of-the-hype 1000 line one? I'd take the former every day, but unfortunately I meet more and more of the latter. Any measure of productivity I'm aware of would favor the 1000 line one, because on a work-per-line basis it looks like more productive work, and it actually constitutes more evidence of produced work. But it's "the wrong way" if there ever was one.

If anything, features/KLOC would be more the right measure (it measures insight/efficiency rather than the ability to produce code), but features - like bugs - are not quantifiable in any sense that would make this measurement useful.
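To make the 100-line-versus-1000-line comparison concrete (hypothetical numbers): a lines-written measure and a features/KLOC measure rank the two solutions in opposite orders.

```python
# Two hypothetical solutions delivering the same five features.
solutions = {
    "clean":      {"loc": 100,  "features": 5},
    "convoluted": {"loc": 1000, "features": 5},
}

for s in solutions.values():
    # A naive "lines written" productivity score vs. features per KLOC.
    s["lines_written"] = s["loc"]
    s["features_per_kloc"] = s["features"] * 1000 / s["loc"]

# Counting lines favors the bloated solution...
assert solutions["convoluted"]["lines_written"] > solutions["clean"]["lines_written"]
# ...while features/KLOC favors the clean one.
assert solutions["clean"]["features_per_kloc"] > solutions["convoluted"]["features_per_kloc"]
```

The rub, as the paragraph above says, is the numerator: "features" has no objective count, so the better-shaped ratio is no more measurable than the worse one.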

The notion that "any kind of measurement is useful" is true only among people who understand it, and unfortunately, most don't. AMD has to sell their 1.5GHz processor as ATHLON 1800 because consumers don't understand that clock rating is not the whole story. Don't - in general - attribute any fundamental understanding to a project manager if you don't attribute it to the general public.

Ori Berger
Thursday, May 23, 2002

I've never understood KLOC; I try to write the minimum amount of code to achieve something.  I don't see the point of writing the C++ (or, god help me, Perl) equivalent of War and Peace to process data.

Most of my time I spend thinking and looking at other people's code wondering how I can steal it or make use of it most effectively.

Simon Lucy
Thursday, May 23, 2002

">KLOC is work. Profits are useful output.

sorry but i have to disagree with the last part of that statement. in organisations where profit is not the objective (emergency services and health services are a good example in civilised countries :-) how can you measure profit (should you? how much is a life worth?)? measuring how much profit an individual contributes is also an incredibly hard thing to do -"

Profit was only given as one useful measure, and since the vast majority of software houses are for-profit, it's likely the most generally applicable.

I already stated that measuring an individual's effect is nearly impossible, not just difficult, and shouldn't be attempted.

At the end of the day, it doesn't matter how efficiently a group of developers can crank out code if the code doesn't produce anything worthwhile.  Code is a means, not an end.  However, if marketing does manage to provide a specification for a completely unmarketable product, and engineering produces a product meeting the specification in the agreed upon timeframe, there's another measure of output, even applicable to non-profits.  It's strictly a "good enough" or "not good enough" measure, but that's about the best you can do.

James Montebello
Thursday, May 23, 2002

Even altruistic organizations won't exist long without profit.  Pretending that something called a donation ain't income would be self-deceptive.  I would hope that efficient operation is important to these organizations - as opposed to justifying a waste of resources with the unanswered question of the worth of a life.  Which, by the way, has nothing to do with a profit measure.

Of course, cranking out code is never the ultimate goal.  I believe it was Weinberg that suggested our worth be measured by the productivity improvements our "product" provides to those that use it.

There seems to be a reluctance to measure individual productivity or contribution.  I really don't see the "badness" in doing something like that... however, I do understand you get what you measure, and so you better be darn sure what you measure is exactly what you want.

When I have had management positions, I wanted to be sure that any "bad apples" that decreased the overall team productivity were removed and as quickly as possible.  The team definitely "knows" without any basis in objective measurement exactly who these individuals are.

But things like "voting" or "peer review" do have limitations.  Just watch the so-called "reality" of Survivor, or a game of The Weakest Link, where the best get eliminated first.

Joe AA.
Thursday, May 23, 2002

>Profit was only given as one useful measure,

then perhaps you can explain where in your post you said this, and in future write more clearly.

>and since the vast majority of software houses are for-profit, it's likely the most generally applicable.

we are not discussing software houses, we are discussing software development

>I already stated that measuring an individual's effect is nearly impossible, not just difficult, and shouldn't be attempted.

as i quote in my post: "which leaves you with, as you said, "not attempting to measure individual performance". "

>At the end of the day, it doesn't matter how efficiently a group of developers can crank out code if the code doesn't produce anything worthwhile.

define worthwhile. worthwhile != profit.

nope
Friday, May 24, 2002

Adrian, I guess big consulting firms generally do reasonably good work, albeit with incredibly stupid work sometimes. Other factors are that they're external to the organisation and thus are supposed to give "objective" assessments. Also, they're often used to justify decisions that have already been made by the MD or the board.

But, yeh, sometimes they are really dumb. I have other stories too.

Hugh Wells
Friday, May 24, 2002

Sometimes, Hugh?  How about a story where they weren't really dumb... didn't define "objective" as whatever promotes their own crappy products, didn't cost more than they produced, really helped the company that hired them... etc.

Maybe I ask for too much.

Joe AA.
Friday, May 24, 2002

Joe AA, I must admit I would be scratching my head severely if we were talking about IT consulting firms. My generous comments were actually directed to management consulting firms.

Hugh Wells
Friday, May 24, 2002
