Fog Creek Software
Discussion Board




Measures of Code Quality

Does anyone work at a place where they measure code quality using some algorithm? Have you found it a useful predictor of future maintenance costs and/or defect-prone areas? Did you think the cost of the tool (either implementing or purchasing) warranted the benefits?

jbr
Thursday, June 05, 2003

I would imagine the best algorithm is one created in your mind. Code "quality" is such a subjective term....

Who's to say that the 20 yr veteran's 400 line routines perform worse than the 2 yr veteran's 20 line routine? Perhaps the 20 line version references some external objects which actually cause more of a performance hit...

Ed
Thursday, June 05, 2003

Algorithm that measures code quality?

If you believe there's something like that, then I have a Brooklyn Bridge to sell you.

Seriously now, quality is subjective and not really measurable. That does not stop people from measuring "bugs found per line of code", or various other ridiculous measurements.

Bugs found per line of code combines code quality and QA quality into a single number with no scale. You can improve it no end by, e.g., firing your QA team. Similar actions will improve any other code quality measure while usually decreasing actual code quality (in a subjective but generally agreed sense).

Also, take a look at the K code at [ http://nsl.com ]; I think any algorithm will rank it "low quality", but if you spend the time to grok it (and it takes a lot of time to grok!), you'll probably reach the conclusion that it ranks among the highest quality code you've seen - in terms of robustness, compactness, and feature set per source line; readability joins the list only after you're a fluent K programmer.

He-who-would-not-be-named
Thursday, June 05, 2003

Actually, I was looking at SEI's "Maintainability Index Technique for Measuring Program Maintainability" recently, and was wondering if anyone actually uses it, or something like it, other than IBM or the DoD.

jbr
Thursday, June 05, 2003

Link: http://www.sei.cmu.edu/str/descriptions/mitmpm.html#78991
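For reference, the technique on that page boils down to a polynomial over three per-module averages: Halstead Volume, cyclomatic complexity, and lines of code. A minimal sketch in C using the published three-metric coefficients (the input averages fed in below are made-up numbers):

#include <math.h>
#include <stdio.h>

/* Three-metric Maintainability Index as described on the SEI page.
   Inputs are averages per module across the code base. */
double maintainability_index(double ave_halstead_volume,
                             double ave_cyclomatic_complexity,
                             double ave_loc)
{
    return 171.0
         - 5.2  * log(ave_halstead_volume)
         - 0.23 * ave_cyclomatic_complexity
         - 16.2 * log(ave_loc);
}

int main(void)
{
    /* hypothetical averages for a mid-sized C code base */
    printf("MI = %.1f\n", maintainability_index(1500.0, 12.0, 80.0));
    return 0;
}

Higher is better; 85 is the threshold usually quoted for "highly maintainable". Note the calibration behind those coefficients was done on someone else's C and Pascal code, so treat the absolute numbers with suspicion.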

jbr
Thursday, June 05, 2003

> Bugs found per line of code

This is why I write my programs like this :-)  Fewer bugs per line, and I produce way more lines of code per day than other programmers :-)

#include <stdio.h>
int
main
    (
      int
          argc,
                char *
                    argv[]
        )
{
printf(
        "Hello World\n"
          ) ;
}



Seriously... 
- isn't code quality 100% subjective?  Witness the numerous discussions here about what is good code/practise. 
- QA has already been addressed by somebody else. 
- And often whether you choose to say a defect is in one function or another (e.g. two functions calling each other or working on the same data) is a decision a programmer makes rather than an objective fact.  In other words, I'm not sure it's realistic (rather than just one particular interpretation) to assign many defects to a particular piece of code.

S. Tanna
Thursday, June 05, 2003

I rather like this measure for code complexity:

http://sern.ucalgary.ca/courses/cpsc/451/W98/Complexity.html#RTFToC10

I think complexity is a good sign of poor code, and this neat piece of graph theory enables you to get a good measure of complexity.
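The neat part is that you don't need to build the graph explicitly: for a single-entry, single-exit routine, V(G) = E - N + 2 works out to the number of decision points plus one. A crude sketch in C that approximates it by keyword counting (a real tool parses the code; this will miscount keywords inside strings, comments, and identifiers):

#include <stdio.h>
#include <string.h>

/* count non-overlapping occurrences of needle in haystack */
static int count_occurrences(const char *haystack, const char *needle)
{
    int n = 0;
    for (const char *p = haystack; (p = strstr(p, needle)) != NULL; p += strlen(needle))
        n++;
    return n;
}

/* cyclomatic complexity ~= decision points + 1 */
int estimate_complexity(const char *src)
{
    static const char *decisions[] = { "if", "for", "while", "case", "&&", "||" };
    int v = 1;  /* a straight-line routine has exactly one path */
    for (size_t i = 0; i < sizeof decisions / sizeof decisions[0]; i++)
        v += count_occurrences(src, decisions[i]);
    return v;
}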

Ged Byrne
Thursday, June 05, 2003

Ged,

Have you actually done a Cyclomatic Complexity analysis (or other measure) on a piece of code (or code base as a whole)?

Was it worth it?

jbr
Thursday, June 05, 2003

S. Tanna, the defects/loc metric definitely depends on how you count lines of code. Generally, you'd only want to use it as a personal tool to see if you're getting better from project to project. If you really want to compare code written by different people you might want to adjust for different styles. One simple way of doing it is to only count open braces and semicolons (this works fairly well for Java and C/C++). 
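A counter like that is only a few lines. A minimal sketch, naive about braces and semicolons inside strings and comments (a real tool would skip those):

#include <stdio.h>

int main(int argc, char *argv[])
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s file.c\n", argv[0]);
        return 1;
    }
    FILE *f = fopen(argv[1], "r");
    if (f == NULL) {
        perror(argv[1]);
        return 1;
    }
    long loc = 0;
    int c;
    while ((c = fgetc(f)) != EOF)
        if (c == '{' || c == ';')  /* count "logical lines" only */
            loc++;
    fclose(f);
    printf("%ld logical lines\n", loc);
    return 0;
}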

Also, it's usually easy to assign defects to a particular piece of code, especially if your architecture is fairly modular. This is a great way to see which parts of the application are causing lots of problems. In my experience, most apps have just a few components that account for most of the bugs. You can often improve the whole app by rewriting one bothersome module.

Of course, both of these can be manipulated by someone with perverse goals or in environments that focus on "blame casting" rather than improving product quality.

igor
Thursday, June 05, 2003

I was joking about the lines of code part, and just trying to demonstrate that it is easily manipulable, intentionally or just because of different styles. Braces and, to a lesser extent, semicolons are poor measures just because of different coding styles.

for ( int x = 0 ; x < 10 ; x++ )
  for ( int y = 0 ; y < 10 ; y++ )
    for ( int z = 0 ; z < 10 ; z++ )
      if ( array[x][y][z] == somevalue ) callfunction() ;

vs

for ( int x = 0 ; x < 10 ; x++ )
{
  for ( int y = 0 ; y < 10 ; y++ )
  {
    for ( int z = 0 ; z < 10 ; z++ )
    {
      if ( array[x][y][z] == somevalue )
      {
        callfunction() ;
      }
    } // for z
  } // for y
} // for x

I personally consider the 2nd example, _because_ it contains the braces, better, as this makes it easier to edit and read.


As for choosing where to place a bug: I disagree.

If I call a function in another module and it doesn't work, it can fail for many reasons, including passing bad parameters or making assumptions that the other module doesn't handle. It is often a matter of interpretation whether you consider this a bug in the caller or the callee. Even if the other module has a well-defined interface and assumptions, it is still arguable whether that interface and those assumptions are defined correctly (in accordance with actual use as opposed to anticipated use), or whether my code is incorrect because I didn't work within the assumptions.

S. Tanna
Thursday, June 05, 2003

It's too subjective to be useful, I agree, yadda yadda.  The only general measure of quality that I've actually seen in real life was in the early stages of a new product that we shipped.

Our program breaks down into roughly 4 major sections, each of which is most often handled by one particular programmer.  In the first two months after shipping, customers were complaining that 3 of the 4 sections were much too buggy, and only one section was up to par.

Little did the customers know that they were in fact measuring the quality of code created by each of our programmers. Since I was the one with the good section, I was rather pleased by this.  Granted, it took me 3 years of creating excellent code before anyone could really say it was excellent...

I guess I don't have a specific point here, but this is the only real-life example I've seen of measuring general code quality across the board.

1/4 Ain't Bad
Thursday, June 05, 2003

S. Tanna, I have no problem with how you code but do your fellow coders not object to the strange way it looks? I guess you can get a code formatter and get any standard you want...

There are so many tips for writing clean code that I am not going to regurgitate them.

For me:

1. Normally when code gets really convoluted, add a little more white space. I usually sanity check how I code when I realize there may be another way to code something convoluted so that it is just as fast, but much easier to read. Everyday stuff stays succinct. Don't be terse unless it's the norm expected by other coders.

2. I do expanded Hungarian ({m_|g_}{bool|int|str|arr|bit|err|obj}MainDescriptor_SpecificDescriptor) in the VB world for variables. I don't force all data functions into one module or class. When I have time I subclass or create base objects to depend on. Otherwise I just write more specific classes to work independently of base classes. No subclassing.

3. In VB I use _ and & _ to try to ensure coders don't have to see one big line of embedded SQL, or anything else where clarity could be a great help. I do the same when possible with other languages. I use blank lines to paragraph intent.
Some people hate that, and find it clearer if you just code the way Programming Pearls would do it.
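(The nearest C equivalent of & _, for what it's worth: adjacent string literals are concatenated by the compiler, so a long SQL statement can be laid out one clause per line. The table and column names here are invented for illustration.)

const char *query =
    "SELECT o.id, o.total, c.name "
    "FROM   orders o "
    "JOIN   customers c ON c.id = o.customer_id "
    "WHERE  o.created > ? "
    "ORDER  BY o.created DESC";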

-- David

Li-fan Chen
Thursday, June 05, 2003

David I was joking.

I was merely trying to point out that measuring code quantity (e.g. lines or amount of code) on a purely objective basis is hard.  If quantity (lines) is hard to agree how to measure, and quality (defects) is at least partially subjective, then measuring quality per quantity (defects per line) is very hard.

S. Tanna
Thursday, June 05, 2003

LMAO.. I should read the thread.

Li-fan Chen
Thursday, June 05, 2003

I was thinking of code quality as a measure to predict future defect rates, not as a post-mortem analysis of actual defects found (though that would be useful to calibrate your predicted rates).

I believe McCabe Code Analysis is a commercially available tool that does this. I was wondering if anyone has actually used it, or something similar, and if they thought it added any value.

jbr
Thursday, June 05, 2003

There was a Slashdot article posted a few months ago that suggested a relationship between defects and the number of variables that are set aside in memory (Dimmed) but never set, and also the number of variables that are set but whose value, once set, is never read.

The hypothesis suggested that these were areas of the code that had been modified in haste and were therefore likely nesting grounds for bugs.

It offers a metric that is a bit more objective than LOC or defects per routine.
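Both patterns are things a compiler can flag today, so the metric is cheap to collect. For instance (the flag names here are gcc's; both warnings are part of -Wall in recent versions):

/* compile with:  gcc -Wall -c example.c */
int example(void)
{
    int dimmed_never_set;     /* declared but never set: -Wunused-variable */
    int set_never_read = 42;  /* set but never read: -Wunused-but-set-variable */
    return 0;
}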

Ran Whittle
Thursday, June 05, 2003

Code/Porn similarities: To paraphrase the courts, I don't know how to define quality code, but I know it when I see it.

old_timer
Thursday, June 05, 2003

I've used McCabe.  That's as much as I'd wish to say about it.

Simon Lucy
Thursday, June 05, 2003

Ged - doesn't cyclomatic analysis automatically indicate why goto's are evil?

Philo

Philo
Thursday, June 05, 2003

I've been an architect/developer for most of the last 20 years. Like many in this economy I've been sentenced to unemployment or something very nearly as bad - in my case I'm running production support for a large-scale EAI group.

Now, I've always been sympathetic to supportability, and as a developer I've been successful at putting in place some very robust, low-maintenance systems, but my current work leads me to conclude there are only two true measures of code quality: the number of problem tickets per week and the number of hours per week support staff are called in after-hours to fix the application. Other measures - maintainability, efficiency, defects per KLOC, etc. - are secondary. From my current vantage point, I'm surprised at how much focus the actual act of writing code consumes in the system development/IT management process.

To put it another way, algorithms, particularly error-handling algorithms, are far more important than whether something has good style and a nice object model in J2EE or whether it's some ugly, unfashionable piece of code. If the code meets functional requirements, has predictable capacity requirements, its failure modes fit business workarounds/compensating controls, and the system doesn't fall over very often, then it's good code. Otherwise it is exorbitantly expensive (a polite way of saying it sucks). It just doesn't really matter much if the code takes 20Mb of memory and 1,000 LOC when someone else could make it do the same in 500k and 10 LOC. I know someone is going to speak up for maintainability; I agree that's important too, but not nearly as important as I would have thought, say, a year ago.

FWIW, where I work is full of quite good developers. We all bitch and moan constantly about being stuck in support, yet there is a consensus that the experience will make us much better developers in the future.

Disclaimer: this is for mission-critical, enterprise corporate type software. I have no experience with shrink-wrap, embedded or other types.

Jim S.
Thursday, June 05, 2003

The book _Code Complete_ describes many (quantifiable) aspects of code quality, and is worth its price if you're writing software. Being a book, it's an algorithm, not an automated tool. It seems to me a more sophisticated algorithm than the "Maintainability Index Technique for Measuring Program Maintainability" which you referenced.

Christopher Wells
Thursday, June 05, 2003

You can't measure the Tao ;-)

--------------------------------------------------------------
http://www.hsoi.com/hsoishop/workshop/programmingtao.html

Something mysterious is formed, born in the silent void.  Waiting alone and unmoving, it is at once still and yet in constant motion.  It is the source of all programs.  I do not know its name, so I will call it the Tao of Programming.

If the Tao is great, then the operating system is great.  If the operating system is great, then the compiler is great.  If the compiler is great, then the application is great.  The user is pleased and there is harmony in the world.

The Tao of Programming flows far away and returns on the wind of morning.

Michael Moser
Friday, June 06, 2003

What I was trying to say was: if you need to measure code quality, then something is probably very wrong in the way you do things.

It's the wrong question.

Michael Moser
Friday, June 06, 2003

The right question is - is your customer pleased?
Another thing that is hard to quantify.

Michael Moser
Friday, June 06, 2003

Jim S. - I agree that the real measure of code quality is the cost and effort of post-release fixes.  But I think the original poster is concerned with how to predict those future costs based on a present-time evaluation of code quality.  In other words, how to look at a body of code and have an idea of how many problem tickets and how much after-hours work will be required to support the code in the future.

T. Norman
Friday, June 06, 2003

T.,
Bingo.

Moser,

I don't buy that whole Tao shtuff. It's called Computer "Science" or "Engineering", not Computer "Art" or "Interpretive Dance". By definition you should be able to apply the scientific process to it.

And BTW, no, our customers are not happy with code quality. Hence this discussion.

jbr
Friday, June 06, 2003

Am I correct in understanding that the original poster wants to predict the number of problems that a given piece of code will have in the future?  How is this not trying to predict the future?

Brent P. Newhall
Friday, June 06, 2003

Brent,

People predict the future all the time... and when your sample base is large enough to give you good statistics, you can make a pretty good living at it (just ask the life insurance companies).

It works when you're not concerned with predicting for the individual but for the group (and the group has to be large).

To continue the analogy, life insurance costs smokers $X more monthly because smokers (as a group) have a higher chance of croaking. Smokers can take the preventative action of becoming non-smokers and save themselves that recurring $X (although it might cost them a one-time $A in nicotine patches, etc).

Similarly with code: if a particular section of code has a score of Y on some rating, and you can equate that to a potential cost of $Z (by calculating the expected cost of defect correction from the probability of a defect occurring given Y), then you can determine if it's worth spending $B to reduce that risk.

So you're not trying to predict whether bar() has a defect, but rather whether it's worth it to refactor library foo.
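The arithmetic behind that decision is nothing fancier than an expected-value calculation. A toy sketch, with every number invented:

#include <stdio.h>

int main(void)
{
    double p_defect        = 0.30;    /* chance foo ships a defect, given its score Y */
    double cost_per_defect = 5000.0;  /* average field-fix cost, the $Z above */
    double refactor_cost   = 1000.0;  /* cost to refactor now, the $B above */

    double expected_loss = p_defect * cost_per_defect;
    printf("expected loss $%.0f vs refactor cost $%.0f -> %s\n",
           expected_loss, refactor_cost,
           refactor_cost < expected_loss ? "refactor" : "leave it alone");
    return 0;
}

The hard part, of course, is getting a defensible probability-given-Y in the first place, which is where calibrating against your own defect history comes in.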

jbr
Friday, June 06, 2003

Brent,

Having just re-read your post, I'm not sure now what you were trying to imply... I thought it was something like "predicting the future is inherently fuzzy like interpretive dance, and thus can't have rules/logic applied to it."

I apologize if I interpreted it wrong.

jbr
Friday, June 06, 2003

>I don't buy that whole Tao shtuff. It's called
>Computer "Science" or "Engineering", not Computer "Art"
>or "Interpretive Dance". By definition you shouble be able
>to apply the scientific process to it.

I guess code quality cannot be measured by one or several numbers.

Code can be very different, so a metric A for embedded C code will be useless for GUI code in C++. Then there is the stuff that lies between different layers - lots of error handling and unexpected cases - which needs a different metric again.

If you want to be scientific, then you will first have to classify the type of code you are talking about.

For a complex project you will spend a lot of time classifying each function, then applying different cross metrics to each of them.

If a system is very complex (like the economy, or software systems in general) then simple numbers just don't do it.

Good luck; I would rather do something useful.

Michael Moser
Friday, June 06, 2003

Moser,

Good points. You don't want to try comparing apples and oranges. The particular domain I'm working in is a large, multi-processor/platform embedded C project, with a (very) little C++ here and there, so any analysis would have to be for just one or the other (unless you can derive a correlative factor...).

From the rest of your comments, I take it then that you've never done such an analysis, nor do you see any benefits of doing so.

jbr
Friday, June 06, 2003

jbr: That's sort of what I wanted to discuss.  The issue I see is the same one that Michael Moser addresses.

Yes, one can predict the future if one has a very large, stable basis for prediction.  I don't know of any code or situation that offers such a stable base.

Brent P. Newhall
Friday, June 06, 2003

I find that John Lakos ("Large Scale C++ Program Design") has a metric he calls CCD (Cumulative component dependency). This is basically the sum of all link-time dependencies in the project.

I believe this correlates well with, say, the time required to fix a bug because a module with many linking dependencies has many files whose data or code may have some bearing on the bug.

(I realize that the CCD can be artificially reduced by combining all of the modules in a project into one gigantic module, but let's assume for the moment that we are not quite that stupid)
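For a small dependency graph the number is easy to compute: take the transitive closure of the "depends on" relation, then for each component count the components needed to link it (its transitive dependencies plus itself) and sum. A sketch with an invented four-component graph:

#include <stdio.h>

#define N 4  /* number of components */

int main(void)
{
    /* dep[i][j] = 1 if component i depends directly on component j */
    int dep[N][N] = {
        {0,1,1,0},  /* 0 -> 1, 2 */
        {0,0,0,1},  /* 1 -> 3    */
        {0,0,0,1},  /* 2 -> 3    */
        {0,0,0,0},  /* 3 -> nothing */
    };

    /* Warshall's algorithm: transitive closure of the dependency relation */
    for (int k = 0; k < N; k++)
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                if (dep[i][k] && dep[k][j])
                    dep[i][j] = 1;

    int ccd = 0;
    for (int i = 0; i < N; i++) {
        int needed = 1;                 /* the component itself */
        for (int j = 0; j < N; j++)
            needed += dep[i][j];
        ccd += needed;
    }
    printf("CCD = %d\n", ccd);          /* 4 + 2 + 2 + 1 = 9 for this graph */
    return 0;
}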

Devil's Advocate
Friday, June 06, 2003

>I find that John Lakos ("Large Scale C++ Program
>Design") has a metric he calls CCD (Cumulative
>component dependency). This is basically the sum of all
>link-time dependencies in the project.

That says nothing about the stability of the interface. You might link to a function that works and has not been changed in years - those dependencies should not contribute to the statistic (or should contribute less).
Now that again is a subjective quantity.

Ah! I want to introduce my own sociological measure of code quality (has nothing to do with the code ;-)


WH := cumulative working hours / man years / cat years

DT := diversion time (time spent on WEB/EMAIL/PHONE/EXTENDED COFFEE BREAKS)

QualityOfCode := (WH - DT) / WH

That would measure the sense of purpose of your organisation (believe me, I have seen large projects).

With a high score on this metric, I would regard any reference to 'code quality' as an expression of internal department politics (also seen in large-scale projects).

Politics defined: some dude wants to be department head and invents another metric that makes the present one (or the past one) look bad.

Michael Moser
Friday, June 06, 2003
