Fog Creek Software
Discussion Board

Better than KLOC


Remember back when everyone wanted to measure productivity by KLOC, and it was discredited and laughed at?

The problem was basically that measuring by KLOC encouraged inefficient coding practices and cutting/pasting instead of generalizing functions.

My idea: instead of KLOC, let's measure Compressed Bytes (CB). If you cut/paste 95% of a function, instead of an extra 0.5 KLOC you'll probably only add 1 to 3 Compressed Bytes.
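
Something like this would compute it - a rough, untested sketch; it assumes Compress::Zlib is installed, but any general-purpose compressor would do:

#!/usr/bin/perl
# Report the raw size and the "Compressed Bytes" (CB) metric for each
# source file named on the command line.
use strict;
use warnings;
use Compress::Zlib;

for my $file (@ARGV) {
    open my $fh, '<', $file or die "Can't open $file: $!";
    my $source = do { local $/; <$fh> };    # slurp the whole file
    close $fh;
    my $cb = length(compress($source));     # compressed size in bytes
    printf "%-30s %8d bytes  %8d CB\n", $file, length($source), $cb;
}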

Of course, this still isn't perfect.  All my functions return true/false, then the error message, then the result.  They also check parameters on the way in - this results in a lot of redundant code like this:

sub function_name
{
    my $blnOk      = 1;            # assume success
    my $strMessage = "Success";
    my $strData    = shift;

    # Validate parameters on the way in
    if (length($strData) < 1)
    {
        $blnOk      = 0;
        $strMessage = "Invalid parameters passed to function_name";
    }

    if ($blnOk)
    {
        #Do_stuff
    }

    return ($blnOk, $strMessage, $strData);
}

So my redundant code actually pushes quality up. My automated unit tests are also highly redundant.

But, still, if you need a single blunt yardstick, I would submit that compressed bytes beats the heck out of KLOC.

thoughts?

Matt H.
Tuesday, June 10, 2003

Wasn't there a comment in one of Tom Peters' books that if you can measure a job, you probably shouldn't be doing it?

Tony E
Tuesday, June 10, 2003

Today's assignment: try to optimize the code that checks for existing appointments in a specific date range.

We start with:

bool returnFlag = false

For dateToCheck = startDate To endDate
    SELECT appts = COUNT(*)
      FROM Appointments
     WHERE Appointments.StartDate < dateToCheck
       AND Appointments.EndDate > dateToCheck

    If appts > 0 Then
        returnFlag = true
    End If
Next dateToCheck

**************

I stare at the ceiling until just before lunch, then delete the block and replace it with:

returnFlag = ((SELECT COUNT(*)
                 FROM Appointments
                WHERE Appointments.StartDate < endDate
                  AND Appointments.EndDate > startDate) > 0)

(Two date ranges overlap exactly when each one starts before the other ends, so a single overlap test replaces the whole day-by-day loop.)

With proper indexing, execution time just dropped by 95%.

Do I get credited with negative work?

Philo

Philo
Tuesday, June 10, 2003


I'd start out by asking: why are you trying to measure productivity? What's the goal?

DingBat
Tuesday, June 10, 2003

I guess my question would be: why would anyone put an SQL statement in a loop? It probably means they didn't understand the problem to begin with.

Dave B.
Tuesday, June 10, 2003

I think he said he would delete the block.  So it's not in a loop.

In my experience, the best way to measure productivity is function points.

     
Tuesday, June 10, 2003

Dave B: Apparently Philo's team bought a giant Hitachi cluster so they can waste a few processes. Maybe they should have gone with a dynamic query and built the WHERE clauses on the spot? Or an IN clause? Or JOINs?

Li-fan Chen
Tuesday, June 10, 2003

Metrics are difficult things, whether they are used to measure tomatoes picked or code produced. When measuring code, what are we measuring: total volume, time to generate, execution time (versus what)? With tomatoes, are we measuring total volume, ripeness, freedom from disease?

With tomatoes we're looking for some composite of all these things, but there may be days when you're best off letting most of the tomatoes ripen on the vine until tomorrow, just as some days you might stare at Philo's code for seven hours and then realize the huge gains to be made by altering a scant number of lines.

So what is productivity in that situation? It's hard to gauge.

Lou
Tuesday, June 10, 2003

I think developers are, in general, threatened by metrics. This is probably partly because there aren't really any indisputably *good* metrics, but I think it is also because we get away with a lot of crap code, and metrics carry the threat of letting someone else zero in on it without first wading through a million lines.

That said, I believe that KLOC is quite a good metric if used properly: that is, to measure an entire department or organization, not individuals. Over time, with a larger sample population, variations in the KLOC count due to individual preferences, cut and paste, etc. will largely disappear in the overall trend of the data. Cross-correlate KLOC produced per developer per year with the number of bugs submitted/fixed and their severity, and I think you have a pretty good 10,000-foot view of productivity in the organization.
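
For instance, something like this (a toy sketch with made-up figures - the trend is what matters, not the absolute value):

use strict;
use warnings;

# Hypothetical departmental figures: KLOC produced and bugs logged per year
my %year = (
    2001 => [120, 300],
    2002 => [150, 310],
    2003 => [140, 450],
);

for my $y (sort keys %year) {
    my ($kloc, $bugs) = @{ $year{$y} };
    printf "%d: %.1f bugs/KLOC\n", $y, $bugs / $kloc;
}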

Jeff Kotula
Tuesday, June 10, 2003

Jeff - is it better to have more submitted bugs fixed, or to have fewer bugs submitted?

If the former, then I'll spend thirty minutes a day submitting bugs that I can fix tomorrow morning. If the latter, then I'll be spending a lot of time arguing with QA over what defines a "bug". If my Dept. Chairman is concerned with bug count, he'll just browbeat QA into not submitting bugs at all (or at least not via the bug tracking system).

And as for counting KLOC, you're still asserting that "more code is better" when that's not true - less code is better. In addition, there isn't a KLOC counting system invented that can't be gamed by programmers.

So, do you want productivity? Or do you want to spend all your time working on the productivity measurement system?

I think the only metrics management should be concerned with are projects delivered and customer satisfaction.

Philo

Philo
Tuesday, June 10, 2003

Getting back on-topic, I agree that compressed bytes seems like a better metric than KLOC.  I do like the idea.

However, I also agree that code quality is too subtle and complex to be accurately measurable by a simple automatic process.

Brent P. Newhall
Tuesday, June 10, 2003

The only thing you should be measuring is whether milestones are being achieved; that is the real mechanism for determining whether a project is making progress.

KLOC and man-hours are decent measures of expense, with obvious extrapolations to how much you spent and how much you have to maintain. They are reasonably useful in post-mortems, but meaningless otherwise.

Richard
Tuesday, June 10, 2003

Jeff Kotula <second post>:

Jeff, KLOC averaged over development teams and cross-referenced with bugs submitted/cleared is still a lousy measurement.

A dev department should be measured by what it produces: bringing it in on time and on budget, the quality of the final product, and whether it meets the spec and user criteria.

In many situations a developer will trade KLOC increases to meet deadlines. Under less pressure the same developer may build more elegant, KLOC-light structures.

Ultimately, metrics get used by the people with the least understanding of them: PHBs, downsizers and tornado managers. That's what inspires the fear.

Richard
Tuesday, June 10, 2003

> Getting back on-topic, I agree that compressed bytes seems like a better metric than KLOC.  I do like the idea.

It can be gamed (or skewed by differing styles) by programmers too: duplicated code, inlined functions, etc.

Using a better algorithm is also penalized:

- Consider business logic in a table vs. a million ifs scattered all over the place. The million ifs come out "better" by this measure (see the sketch below).

- If I rewrite a 30,000-line function to do the same work in 200 lines (and come out with much less final code and fewer bugs), am I doing negative work?
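
To make the first point concrete, here's a toy sketch (rates made up for illustration):

use strict;
use warnings;

my $state = 'NY';

# Version 1: the logic as a wall of ifs (imagine fifty of these)
my $rate;
if    ($state eq 'NY') { $rate = 0.08875 }
elsif ($state eq 'NJ') { $rate = 0.06625 }
elsif ($state eq 'CA') { $rate = 0.0725  }

# Version 2: the same logic as data
my %rate_for = (NY => 0.08875, NJ => 0.06625, CA => 0.0725);
$rate = $rate_for{$state};

# Version 2 is less code and easier to extend, yet by a KLOC (or even a
# compressed-bytes) yardstick, version 1 "produced" more.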

What about ASSERTs, debugging tests, etc.? If I write code in the debug build to check each object's state (code which doesn't appear in the final program, but which shakes out all the bugs), do I get credit?

Finally, one thing I'd also point out. I once went through a large Windows program with the aim of reducing the program size. I did not change any algorithms. I changed calling conventions, including on internal functions; I changed parameter types (e.g. const CString& to LPCSTR); and I changed a lot of string-manipulating calls to use fewer CString objects, instead reusing the next element of a large circular queue (each element a char *), and so on. The program was about 70% of the original size when done. Half the program was libraries, which I didn't touch, so in actual program code it was less than half the original size. The original program was not bad or bloated - it could easily have been 50% larger. So I'd guess that, in this case, the program could have varied in size by a factor of three on almost purely stylistic grounds.

S. Tanna
Tuesday, June 10, 2003

It's even worse than that. There has been a fair amount of research showing that *all* measurements result in "gaming" behavior if the measurement is not a true measurement of what you want and you do not have 100% supervision. It doesn't matter *what* kind of work it is. The problem is that you get *exactly* what you reward, no more, no less. If you measure KLOCs, you get KLOCs. If you measure satisfaction, you get satisfaction... or do you?

Turns out you can't really measure satisfaction...  only what people tell you about their satisfaction.  And maybe not even then...

Unfortunately, all the really important things in life and business are not measurable; "vital statistics" (such as heart rate or profitability) are important for monitoring, but you can't use them as "performance measurements" outside a narrow window of applicability.

Phillip J. Eby
Tuesday, June 10, 2003

"Unfortunately, all the really important things in life and business are not measurable; "vital statistics" (such as heart rate or profitability) are important for monitoring, but you can't use them as "performance measurements" outside a narrow window of applicability."

I'm thinking that the best way of handling it, which seems to be the Theory of Constraints sort of way, is to find a particular failure or success and then follow it backwards through time to locate its cause.

Or, similarly, follow an order (or some such) as it snakes its way through the organization.

In short, it seems the only real way to properly manage or measure something is simply to know how it works, and to try to find a way to make it work better - or at all.

It seems that targeted incentives and individualized measurements are looking a lot like Magic Beans.


Oh, one related story on this topic, from my father, who works at the US Postal Service (perhaps one of the worst-managed places around, or at least they seem to make a valiant effort at it). One supervisor was going around trying to get people to work faster, and went up to one older employee who was sticking letters and told him he needed to stick them faster. His response was "OK"... and he then proceeded to pick up handfuls of letters and, without reading any label on them at all, stick them into a random slot, before picking up another load and repeating the process.

"Do you want it done fast, or do you want it done right?"

Ooops.

Plutarck
Tuesday, June 10, 2003

Jeff K, here's an example where lines of code is not only useless as a metric, but actually dangerous. This is a real-life example.

A bank had commissioned the development of an application with about 30 reports (screens). It was developed by one of the consulting firms, using C++.

The programmers worked out how to write one report, then simply copied the code and modified it to create each new report. Measured by lines of code, they were doing a marvelous job.

The problem was that they started to find bits that needed to be changed. They would then go back through the 15, 20, whatever reports already written and hope to make all the changes correctly. This was never done thoroughly, so all the reports ended up containing different implementations, each buggy.

When the app was finished, not only was it a giant mess, but it also took far too long to fetch and display the data. The core display technology needed to be improved, but it was scattered through 30 different implementations and the consulting firm was not able to make the changes.

Not only did the bank waste a lot of money on that first, failed implementation, which showed up well on lines-of-code metrics, it also lost a market because it was too slow to provide new customer services.

echidna
Tuesday, June 10, 2003

S. Tanna: I agree that it can be gamed. I think we all agree on that. Is anyone arguing differently?

The question is, is compressed bytes a *better* measurement than KLOC?

Brent P. Newhall
Tuesday, June 10, 2003

I think they are both terrible measures.

You want (as goals) the functionality, the most maintainable code, the fewest bugs, and the best fit to the schedule and customer requirements, with the LEAST lines of code and the SMALLEST binary size.

Frankly, I think any correlation between KLOC or binary size and your progress, or whether you achieve your goals, is highly suspect.

It's as if a trucker were driving from New York to California and back regularly, and we tried to calculate how many trips he made by counting how many people he'd run over with his truck. There is a loose correlation (more miles = more likely to run someone over), but frankly you'd prefer that he ran over the fewest people possible.

S. Tanna
Tuesday, June 10, 2003

Missed my last sentence:

Measuring KLOC vs. binary size would be almost like asking whether counting run-over men or run-over women is the more accurate measure of the trucker's distance?

S. Tanna
Tuesday, June 10, 2003

Is getting shot in the head with a shotgun better than getting shot in the head with a pistol?

Either way you're dead.

Mister Fancypants
Tuesday, June 10, 2003

I like function points (fp).

And I like KLOC. KLOC lets me reliably predict the rate at which a developer can generate fps.

So, the customer wants system S. Let's see: given the rough requirements and sitting with them for an hour, it sounds like 760 function points.

Now Joe, Mary, and Sue will be working on this.

Joe produces 1000 LOC/month and his coding style is such that he writes 78 LOC per fp. So 13 fp/month.

Sue produces 580 LOC/month and her coding style is such that she writes 69 LOC per fp. So 8.5 fp/month.

Mary produces 1700 LOC/month and her coding style is such that she writes 90 LOC per fp. So 19 fp/month.

Looks like this team can output 40.5 fp/month. Adjust those figures with the team-size adjustment formula, and that becomes 38.
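
Or, since we're programmers, here's the back-of-the-envelope arithmetic in code (a trivial sketch; the adjusted figure of 38 comes from whatever team-size formula you trust):

use strict;
use warnings;

# LOC/month and LOC per function point for each developer (figures above)
my %loc_per_month = (Joe => 1000, Sue => 580, Mary => 1700);
my %loc_per_fp    = (Joe => 78,   Sue => 69,  Mary => 90);

my $team_rate = 0;
for my $dev (sort keys %loc_per_month) {
    my $fp_rate = $loc_per_month{$dev} / $loc_per_fp{$dev};
    printf "%-5s %5.1f fp/month\n", $dev, $fp_rate;
    $team_rate += $fp_rate;
}
printf "Team: %.1f fp/month raw\n", $team_rate;    # about 40, give or take rounding

my $adjusted = 38;    # after the team-size adjustment
printf "760 fp / %d fp/month = %.0f months\n", $adjusted, 760 / $adjusted;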

So we can have the system for you in 20 months, for this price, and I can give you a fixed price on it too if you like. What did our competitors at Code And Fix Ltd quote you? ... Yeah, that doesn't surprise me. They tend to overrun their estimates by 250% in time and 300% in cost; here are some figures showing that. Take your time there, gentle customer, there's no hurry. We've got this all under control.

Dennis Atkins
Tuesday, June 10, 2003

In my experience, the mechanics of setting a fixed price (and the reasons not to) have a lot more to do with drifting client requirements than with the measurable productivity of the developers working on the project.

Fixed pricing is bad*.


* Unless you get unalterable requirements written in stone and a legal agreement stating that YOUR interpretation of the requirements is always legally the correct one.  Which you will never get.

Mister Fancypants
Tuesday, June 10, 2003


Fixed Pricing can be awesome.

1) You need accurate history.
2) You need the spec nailed down COLD.

Preferably, we're talking about a short project that implements an industry standard:

"Take this XML file and make it conform to the EDI X-12 blah blah blah spec."

Then you do functional decomposition.

Then you do risk-adjusted estimates.

Then you compare numbers and do some risk-management schtuff.

http://www.csis.gvsu.edu/~heusserm/articles/RiskManagement.ppt

and

http://www.csis.gvsu.edu/~heusserm/articles/RiskHandout.zip


Then you sell it.

Of course, YMMV.  If your requirements are even the slightest bit fuzzy, fixed price might be a problem ...
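
By "risk-adjusted estimates" above, I mean weighting each decomposed task by how fuzzy it is. One common flavor - just an illustration, not necessarily what's in those slides - is the PERT-style expected value:

use strict;
use warnings;

# Best-case, most-likely, and worst-case days per task (numbers made up)
my @tasks = (
    [1, 2,  5],    # parse the incoming XML
    [2, 4, 10],    # map fields onto the X-12 segments
    [1, 3,  8],    # validation and error reporting
);

my $total = 0;
for my $t (@tasks) {
    my ($best, $likely, $worst) = @$t;
    $total += ($best + 4 * $likely + $worst) / 6;    # PERT expected value
}
printf "Risk-adjusted estimate: %.1f days\n", $total;    # 10.5 days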


good luck!

Matt H.
Tuesday, June 10, 2003

I concede some of the points that were raised, particularly with respect to gaming behavior and the fact that KLOC has only limited meaning. (I will say that any particular metric has to assume that deliberate efforts to thwart it are not happening, or *no* metric would be valuable in any industry. Even milestones as a measurement can be spoofed or rendered meaningless. No doubt about it, they're tricky...)

The situation I'm thinking about is this: try to convince a department that the state of their code base is such that significant rework (if not a rewrite) is cost-justified. To show a decline in departmental productivity, I can't point to projects delivered on time, because only very loose schedules are used and they are almost never hit. So how does one illustrate the problem with *some* objective data, in a way convincing enough to justify the rework effort?

The only way I could come up with was to mine the code repository and bug-tracking system for relevant (albeit not perfect) data. Going down this road, you are left with few options for the specific metrics gathered.

Other suggestions?

Jeff Kotula
Wednesday, June 11, 2003

Jeff,

SEI has some suggestions:

http://www.sei.cmu.edu/str/descriptions/mos.html#769585

Cyclomatic complexity, Halstead complexity, function point analysis. I haven't used any of them myself, but I would be very interested to hear from someone who has. I'd also be interested to hear whether there are any other algorithms out there.
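
For what it's worth, the cyclomatic number is easy to approximate by hand: decision points plus one. Here's a crude regex-counting sketch (a toy, not a real parser - real tools build a control-flow graph):

use strict;
use warnings;

local $/;        # slurp mode
my $src = <>;    # read the source file named on the command line

# Count branch keywords and short-circuit operators as decision points.
# This will miscount ?s inside strings and regexes - hence "crude".
my $decisions = () = $src =~ /\b(?:if|elsif|while|for|foreach|unless|until)\b|&&|\|\||\?/g;

print "Approximate cyclomatic complexity: ", $decisions + 1, "\n";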

A practical suggestion might be to keep the exact code-analysis algorithm secret from the developers, to keep them from "gaming" it.

jbr
Wednesday, June 11, 2003

Jeff Kotula:
"...Try to convince adepartment that the state of their code base is such that significant rework (if not rewrite) is cost-justified. ..."

You are saying that as if it were a foregone conclusion. Reworking a codebase - refactoring in particular - is a useful tool. However, you are assuming that there is something inherently wrong with the code, and you are trying to back that up with highly flawed metrics rather than serious analysis.

"... To show a decline in departmental productivity ..."

Stop right there. You are looking for metrics or statistics that back up your assumption, and you are closed to the possibility that your subjective observations may be attributable to another cause.

"... I can't point to projects delivered on time because only very loose schedules are used and almost never hit. ..."

Well, it looks like your organisation has a serious problem with project estimation. You are in good company: very few organisations do scheduling well.

"... So how does one illustrate this with *some* objective data in a way that is convincing enough that the rework effort is justified? ..."

I am not certain you realise the significance of what you are saying here, but a truly objective thinker would bust your ears for it. Your post comes across as follows: "I have made an assumption about our company; I want data to back it up, to give it scientific credibility - even data that is highly dubious - and I am not interested in data that shows otherwise." This kind of thinking epitomises software metriphobia.

If you are interested in determining the state of your code, you need to look for different things. There are signs of a code base going out of control that are typically well known to developers: mutually exclusive fixes, where a fix for one thing breaks another and vice versa; hack strata, where hacks have been applied on top of each other in ways that ramp up the cost of incremental change; increases in unreproducible bugs; dependency breaks, where features that weren't being worked on break.

What you are interested in measuring is whether the cost of change at iteration n+1 is increasing due to crufty code, as opposed to just more challenging feature additions.

You need real data. You need proper estimation, so you can track schedule slippage properly. You need complete disclosure (an open bug policy) so you have a good picture of the state of your system. You need communication in the team. And you need to feed the end of one iteration into the start of the next, refactoring the thorny issues as you find them.

Don't fall into the trap of "any data will do". Metrics are just statistics at the end of the day; you can use them to say anything.

Richard
Wednesday, June 11, 2003

Richard, all points well taken. I actually began the study more innocently than my posting made it seem. The situation was that many developers were asserting that the codebase was difficult to work with. Many others asserted otherwise and said the burden of proof was on the first group, of whom I was one.

I didn't set out to collect only data that would confirm my intuition, but rather to collect what could be collected simply and brought to bear on the problem. As it turned out, it did support the hypothesis (or rather, countered the anti-hypothesis that productivity was just fine).

No gaming behavior was at work here, because this was the first application of such metrics in this shop. And if KLOC means *nothing*, then why is so much of our time spent creating and working with the thing it measures? I fully concede that there are other relevant metrics too, but it is an "and", not an "or".

Jeff Kotula
Wednesday, June 11, 2003

Jeff K:
I addressed your points as you stated them. Your counter-post fleshed out the real source of information: your developers were finding the code hard to work with. Ultimately, the collective subjective opinion of the development team beats metrics hands down. The opinions of these people are far superior to anything like metrics. Metrics, or any kind of statistics, can only show that things have gone pear-shaped AFTER the fact. Your development team are shouting early warnings here, "canary in the coal mine" style. They are giving warning of an impending schedule crash in subsequent iterations.

Likely the next iteration is going to ramp up completely in time and bug pressure. If you are responsible for this team and need to prove this fact to your superiors and other stakeholders, you can't do it with objective data. You have to MAKE STUFF UP. Gamism of the highest order.

Ultimately you are working from a greater state of informedness than any statistics can capture. You are just trying to show it to the less informed, to get approval for re-engineering/refactoring initiatives.

You don't have to be too creative. Get your team together to identify the project's weak points. Get them to review all of the serious bugs, or serious bugs in the making. Turn all of them into time-to-fix issues with horrible schedules (>1 week per bug). Plot the ugly bugs (long time to fix) over time or over iterations. There are going to be lots of them towards the end; the graph will look nice and scary.

Also identify the really ugly parts of the application - ugly meaning they carry a scalar schedule multiplier for further features (like x3) due to code cruftiness.

Ultimately you need to demonstrate how your re-engineering plan will remove schedule multipliers like these. And you also really need to know how to re-engineer your application; it's a tougher job than it sounds.

Richard
Monday, June 16, 2003
