Fog Creek Software
Discussion Board

McCabe Complexity Analysis

It appears my previous post sparked some interest!

I think it's interesting that I didn't mention the nation, but lots of people got defensive about it.  For the record, there are a lot of developing nations churning out programmers.

While I can't post the code for various reasons, I have thought of another option.  There is a company called McCabe that does complexity analysis - examining code for bad smells like functions with too many variables, loops nested too deeply, knotted loops, variables declared but never used, functions too long to manage, and so on.

McCabe stats are empirically grounded - that is, worst practices in code are scored by how many defects they generate per 1,000 appearances.  ("We saw this technique used 1,000 times and 500 defects appeared; that's pretty bad.")  There's more to it than that, because maintainability is considered too, but you get the point.

Anyone could run some US code and some offshore code through a McCabe complexity analyzer for an objective comparison.  That would make a really interesting Ph.D. thesis ...

thoughts?

regards,

www.xndev.com (Formerly Matt H.)
Monday, July 19, 2004

Thoughts - yeah, you should have had some before you posted.

How are you going to get a statistically significant set of groups to compare?

How are you going to ensure that the only difference between the two groups is of nationality?

How are you going to allow for the effect of specifications?

Why are you bothering with any of this rubbish anyway?

Stephen Jones
Monday, July 19, 2004

This could work, I think, because "code quality" is easier to measure than adherence to specs.

If this were actually done, and my hat's off to you if you can cut through the politics, I think the important thing is measuring only "bad" or "good." It won't mean anything if team1 scores 234.39 and team2 scores 246.05. But there is a cutoff level where code is just unacceptable, and a machine can detect that.

One problem, though: concepts don't decompose into programs the same way in every language, so an entire codebase might use languages this tool doesn't understand. We see people branching out into SQL, XSLT, Jython, Ruby - a Cambrian explosion of languages.

And do I trust the authors of this tool to actually know what good code looks like? Someone mentioned Brainbench recently, and people seemed to think it was astonishingly bad at testing humans. Will these tests be the same way?

And highly optimized code is sometimes ugly. There are techniques around that, like "telescoping programming languages," but only one language I know of supports it; most don't.

These caveats aren't meant to argue against the point, just to raise issues that would need dealing with, because I think the fundamental point is important. I just read a paper on this recently; it's an old topic.

Tayssir John Gabbour
Monday, July 19, 2004

>How are you going to get a statistically
>significant set of groups to compare?

Call the Software Engineering Institute and ask for a list of public CMM companies with offshore operations? Call each of those companies? 

I'm thinking the results of the study would drive companies to create tentative relationships with sourcing providers, measure the code that is produced, and only create a permanent relationship if the code meets standards - then monitor performance.

>How are you going to ensure that the only
>difference between the two groups is of nationality?

It's not about nationality.  It's about choosing a good provider.  I was offended because companies were paying a low hourly rate for code that was horrid.  I wanted to prove it objectively.

>How are you going to allow for the effect of specifications?

I'm not.  McCabe measures pure software engineering ability.  If a company can't develop _anything_, good specs might mean it's -possible- for them to produce something that works.  Those kinds of risks are rarely a good bet.

>Why are you bothering with any of this rubbish anyway?

I studied cyclomatic complexity and essential complexity in graduate school.  I've seen samples of the code.  I've actually had someone from a developing nation email me because he wants more objective measurement, because of the companies that give outsourcing a bad name.

Thanks for the input, Stephen!

www.xndev.com (Formerly Matt H.)
Monday, July 19, 2004

It's an old topic. McCabe's first paper was in 1976, I believe. His first metric, the cyclomatic measure, basically counts the number of linearly independent paths through a program. The most common metric used in American business is "function points" (which I think are OK for estimating time to write the code and pretty much nothing else). But this is a good example of "ivory tower research" being 5-20 years ahead of industry.
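
For concreteness, here's a crude sketch of the counting, assuming Python source and the simplified "decisions + 1" rule (a real McCabe tool builds the control-flow graph and computes E - N + 2P; the shortcut agrees with it for structured code):

import ast

# Branch-introducing node types.  Deliberately crude: comprehensions
# and and/or short-circuits also add paths but are ignored here.
DECISIONS = (ast.If, ast.For, ast.While, ast.IfExp, ast.ExceptHandler)

def cyclomatic_complexity(source):
    tree = ast.parse(source)
    return 1 + sum(isinstance(node, DECISIONS) for node in ast.walk(tree))

example = '''
def classify(n):
    if n < 0:
        return "negative"
    elif n == 0:
        return "zero"
    return "positive"
'''

print(cyclomatic_complexity(example))  # 3: two branch points, three paths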

Metrics can help cut down on bugs.  I had a professor whose own research showed that some metrics could identify which modules were more likely to be buggy (or more difficult to maintain) than others - which lets your company spend a lot more time checking the modules most likely to be buggy. He claimed that some companies in the avionics industry were getting defect rates five orders of magnitude lower than commercial software because of a number of practices, metrics among them. You cannot get to CMM5 without using metrics.

Beware of using metrics to decide whether your programmers are productive. If you use "lines of code" to decide whether they are good, then you'd better expect a lot of comments and whitespace padding things out. If you beat them up because ModuleX has a cyclomatic number of 20 (meaning 20 linearly independent paths - something as simple as roughly 20 sequential if/then/else statements, through which there can be up to 2^20 distinct execution paths), expect your programmers to break modules into tiny pieces so that each one has a much smaller number. "Ha ha ha, his code sux because his metrics are bad! los3r!" is a bad thing to say or do: slap yourself if you say it.
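
To make that gaming concrete, here's a hypothetical sketch (invented names, not anyone's real code). Four decisions in one function score 4 + 1 = 5; scatter them into one-line helpers and a naive counter that only looks at if/for/while statements scores every function a 1, even though the program still makes exactly the same four decisions:

# One module, four decisions: cyclomatic number = 4 + 1 = 5.
def validate(order):
    if order.total < 0:
        return False
    if order.customer is None:
        return False
    if not order.items:
        return False
    if not order.paid:
        return False
    return True

# The same logic, "broken up to make the numbers look good."  No decision
# was removed; it just moved out of the metric's sight (a counter that
# ignores short-circuit and/or sees no branches at all down here).
def total_ok(o):    return o.total >= 0
def customer_ok(o): return o.customer is not None
def items_ok(o):    return bool(o.items)
def paid_ok(o):     return o.paid

def validate_gamed(order):
    return (total_ok(order) and customer_ok(order)
            and items_ok(order) and paid_ok(order))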

What is a good use for metrics?
To decide if something is too complex or potentially too buggy to use. If your math shows that "modules with these numbers are 100x more likely to have bugs than modules with those numbers," it behooves you to devote a lot more time and effort to making sure the potentially buggy modules actually aren't.

Place to start:
http://www.amazon.com/exec/obidos/tg/detail/-/0534954251/

Peter
Monday, July 19, 2004

If you want to test specific code from specific providers then great, go ahead.

I was under the impression that you wanted to try and compare programmers from two countries.

Stephen Jones
Monday, July 19, 2004

We used McCabe where I worked about 10 years ago. My impression was that it was worthless because all it did was count paths; if you broke one big function into three smaller ones you had better metrics, but I didn't see where it was any less complex. 

At the time the best metric I found was the number of commented out debug print statements; more debug usually indicated more bugs still present.
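
That one is trivially automatable, for what it's worth. A sketch, assuming the debug idiom is a commented-out print/printf; the regex is a guess you would tune to your own codebase:

import re
import sys

# Lines that look like a commented-out debug print:
#   // printf(...)    /* printf(...)    # print(...)
DEAD_DEBUG = re.compile(r'^\s*(//|/\*|#)\s*(printf?|fprintf)\s*\(')

def count_dead_debug(path):
    with open(path, errors="replace") as f:
        return sum(1 for line in f if DEAD_DEBUG.search(line))

for path in sys.argv[1:]:
    print(f"{count_dead_debug(path):5}  {path}")

Run it over a source tree and, by Tom's rule, the files at the top of the sorted output deserve the hardest look.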

Tom H
Monday, July 19, 2004

"Call the Software Engineering Institute and ask for a list of public CMM companies with offshore operations? Call each of those companies? "


That's one of the problems with CMM.  It's not a lsit publicly available.

There have been many articles about companies that claim a CMM level when they don't have jack, or that achieve it for one subgroup on one project and then claim it for the whole organization.

KC
Monday, July 19, 2004

Tom, there are also some measures of things like "coupling." If you had been using those numbers as well as the cyclomatic number, you would have found that breaking the modules up did not remove the complexity of the code, just moved it around. It sounds like the programmers cottoned on to what was being measured and moved the complexity elsewhere "to make their numbers look good." The third paragraph in my post above talked about that.
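
A rough sketch of one such number (fan-out: how many distinct functions each function calls), assuming Python source; real coupling metrics are subtler than this, but even a crude count jumps when complexity has merely been shuffled into helpers:

import ast

def fan_out(source):
    # Map each function to the number of distinct plain-name calls it
    # makes -- a crude efferent-coupling figure.  Crude on purpose:
    # method calls are skipped and nested defs are double-counted.
    tree = ast.parse(source)
    result = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            calls = {c.func.id for c in ast.walk(node)
                     if isinstance(c, ast.Call)
                     and isinstance(c.func, ast.Name)}
            result[node.name] = len(calls)
    return result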

Basing your whole system on one number is like deciding whether to buy a house based only on the number of bathrooms.

Claiming CMM5 for the whole company should be a major red flag. It takes a long time to get any developer up to speed on a company's CMM5 system: plan on almost a year. More than a few auditors have had their certifications pulled for wrongly certifying places as CMM5 or ISO900x compliant. At least with ISO, you can find out who was certified by those auditors. This sort of behavior is why Arthur Andersen folded after the Enron debacle.

The next version of the CMM is supposed to include publicly verifiable certifications.

Peter
Monday, July 19, 2004

"If you had been using those numbers as well as the cyclomatic #, you would find that the breaking the modules up did not remove the complexity"

I agree.

My recollection is fading after ten years, but I only remember seeing the one number. If it was too high we were told by QA to reduce it. So we did, and QA was happy. No doubt either the tool or QA departments have gotten better since then.

Tom H
Monday, July 19, 2004

Actually, the response to a single module's cyclomatic complexity being too high IS to refactor it into multiple modules.

The idea here is that if a single module has to make too many decisions, it becomes VERY difficult to test all the paths through it, or even to keep straight in your head what each path means.

On the other hand, if you refactor it into three modules, each can have a more limited purpose (and a lower complexity number).  Now you can verify the 'mother' module (which is simpler), as well as each of the new 'child' modules, much more easily.
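
A hypothetical sketch of that mother/child shape (invented names): the mother keeps only the dispatch decision, and each child owns one piece of logic small enough to test exhaustively on its own.

# Mother: just dispatches.  Low complexity, trivial to cover with tests.
def process(event):
    if event.kind == "order":
        return process_order(event)
    if event.kind == "refund":
        return process_refund(event)
    return 0  # unknown kinds: log-and-ignore policy, say

# Children: each holds only its own branching, so a handful of focused
# unit tests covers all of its paths.
def process_order(event):
    if not event.items:
        raise ValueError("empty order")
    return sum(item.price for item in event.items)

def process_refund(event):
    if event.amount <= 0:
        raise ValueError("bad refund amount")
    return -event.amount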

That's the theory, anyway.  In practice I have found there can be one or two modules in an application that serve as 'transaction centers' -- there is a lot of deciding going on there, based on several factors, and re-factoring that module is NOT going to make the application any simpler or more clear.  In those cases I usually ask for a waiver to let the high number stand, just for that one module, in the service of clarity.

I have seen others try to make a single 'main' function do the entire functionality of the program (1,000 line 'main'?  What were you thinking?) and McCabe would point that out as A Bad Thing.

AllanL5
Monday, July 19, 2004
