Fog Creek Software
Discussion Board




"Good Code" - What is it?

What do you look for when you look at a chunk of code and evalutate its "goodness"?  A few of my favorites:

- High "signal to noise" ratio.  What percentage of lines "do something" - vs. housekeeping, function definitions, comments, esoteric error handling, etc.  All of these things are "good" by themselves, but it's still far too easy to bloat source code.  One of my current projects has numerous 10,000 line source files that just don't seem to do much.  Also, call me old fashioned, but occasionally I prefer a 100 line function to twenty 5 line functions.

- Minimal algorithmic complexity.  Was an "advanced" algorithm used when a simple, straightforward one would do?  It leaves the reader to wonder "why?", when in fact there may not be a reason.

- Lack of "cuteness".  I'll admit, I'm a sucker for A^=B^=A^=B, but not for reasons other than sadistic amusement.  Same with unneeded use of C++ templates.

- Use of most primitive technology required.  I'm sure we've all seen code that uses a "technology" such as COM, XML, etc. that left you scratching your head and saying "why didn't you just link this statically?" or "why not just pass a reference to a class, instead of XML?"

- Appropriate optimization.  Aside from fundamental architectural optimization, most code shouldn't be optimized until it's a proven need.  If a list will only ever have <100 items and is accessed infrequently, is a hash table really necessary?  Also appreciated is a small "//Performance optimization" comment that shows that the developer knows why they're doing what they're doing.

- Architectural simplicity.  Only classes that need (or expect to need) remoting capability should have remoting code.  Often, only mission critical classes need to handle all possible errors.  Only time critical classes need performance optimization.  Only certain components need thread synchronization.  It's usually a bad idea to write a lot of code that "we might need later".

- Specificity (avoiding over-generalization).  I think Joel likes to call these people "architecture astronauts".  One needs to be realistic about code reuse.  It's rare.  If your team assessment is that it's unlikely that your code will be reused, just code to the specific case at hand.

- When vs. How comments.  Knowing how to use a class: useful.  Knowing when to use a class: priceless.  Microsoft could learn this one in their documentation.

- Appropriate quality bar.  For example, in decending order of quality need: Nuclear missile launch control, 24/7 medical equiptment (pacemakers, etc.), operating system kernals, operating system accessories, horizontal market software, vertical market software, internal use, one-time-use developer utilities.  It's frustrating to see code that missed the target bar - up or down.  Did someone spend months writing multi-server transaction rollback code for an internal $3M a year billing system?  Gawd, I hope not.  Likewise, I hope the NORAD people do more error checking than our internal CRM product.

- 100% reachable code.  I'm sure I'm not the only one to puzzle over a function and later find that it isn't referenced by any code paths.

- Lack of cut-n-paste.  It's easy to rely on code downloaded from the internet or brought in from another app.  Often this is good.  But did the developer understand what they code really does?  Is the code using APIs that create system dependencies?  What is the quality of the code?  Does the importer have knowledge of the cargo?

- Lack of redundency.  Five developers, five separate routines that parse the same configuration file.  This is very ugly, yet seems to be common.  Do people look at each other's code?

- Visual clarity.  Is the code easy to read?  Were 8 line if statements split up into more readable logic?  Did the person indent consistently?  Does it look like they care?  Are there "why" vs. "how" comments?

There's a theme here.  I don't like code.  Code is an evil that is necessary to achieve business results.  There shouldn't be a lot of code where a little will do.  Code costs money, both now and later.  Complexity costs money, both now and later.  It's a one way trip, kind of the reverse principal of a haircut - you can grow it, but never cut it.  There's a popular theory that says that the cost of systems goes up at roughly the square of their complexity.  Make sure you're getting bang for your buck.

Thoughts?  What do you like to see in code?  Do you "like" code as an end unto itself?

Bill Carlson
Monday, October 21, 2002

I'm very fond of Martin Fowler's _Refactoring_ book. It includes a section by Kent Beck on code "smells": ways to recognize bad code, and what's wrong with it. Many of the things you mentioned are in there.

Chris Tavares
Monday, October 21, 2002

A guy I used to work with had a good strategy for taking over other people's code. He would go through it deleting everything he couldn't understand. Then he would write new code until the applicatoin worked again.

(Note I've somewhat simplified his process).

Anyway, there are a fwe things that indicate "bad code" to me:

- Lots of commented out code.
Commenting code for debugging is fine. However, once a piece of code has been determined to be redundant, it should be removed from the source file. Permanently. It's fine to keep a library of routines you "might use later", just don't keep them in your source code.

- Unclear function purpose.
A function should perform one task and one task only. Functions which perform a random list of tasks make the project confusing and make it very difficult for a new developer to determine what a function does. Sometimes functions seem to perform a list of unrelated tasks, however as long as the seemingly unrelated tasks have been broken down into individual functions, it can often be found that the group of function calls combine to form a higher-level task, which is fine.
The name of a function should accurately describe the task it performs. If you have difficulty thinking of a descriptive name for a function, there's a good chance it's breaking this rule.

- Poor or nonexistant commenting
The purpose of comments is to explain what a piece of code does at a higher level. It is not for a discussing the merits of different algorithms. Nor is it for explaining the code for a non-programmer, or someone unfamiliar with the language the code is written in. As a general rule I like about one comment line per three to five lines of code. Comment lines should describe the purpose fo a group of lines. Brief comments at the end of a line can also be helpful.

- Unclear or inconsistent function and variable naming.
There should be s single naming standard used by all developers within a project. This should be agreed in writing before the project starts. If you use hungarian naming, make sure you all work from the same list of prefixes, and that you agree modifiers for globals, arrays, etc. If you use abbreviations in names, make sure you agree on the abberviations you may use.

- Poor indentation.
Code should be indented in a way that makes sense for the language you are using. Some languages have several different conventions, so you should make sure your team all use the same convention and tab size. If you are an IDE which auto-indents, make sure that either all developers use the same IDE, or that they all support consistent standards.

- Repeated code.
It is suprising how often the same routine can crop up in a piece of code. It is perhaps even more surprising how many programmers will keep repeating the same piece of code, rather than encapsulating it in a function.

I'm sure there are dozens of others, but these spring to mind.

James

James Shields
Monday, October 21, 2002

Read "Code Complete" (MSPress,Steve C McConnell
).  Be happy.

jon Kenoyer
Monday, October 21, 2002

James, your ex-colleague sounds like an idiot. Throwing away working code to replace it with code that does the same thing is crazy! imho.

John C
Monday, October 21, 2002

Unless there's a clear advantage to doing it - which isn't terribly often in my book...

John C
Monday, October 21, 2002

Deep breath. Okay, what I meant to say was that refactoring code is cool, but throwing it away because you "can't understand" it seems rather mad to me.

John C
Monday, October 21, 2002

All that matters to me is that it seems like the coder thought about what the code exactly did before tossing it to screen.  Usually functional description like (taken from some elisp code I wrote):
regexp: line -> {t, f}

For me, the win is that it's hard to make those big functions that do 100 things, and at a glance I can see what I want to handle nulls and other corner cases.

anon
Monday, October 21, 2002

I've just been assigned to add some new stuff to another guys code. Here are a few thoughts....

- Switch-based state machines are bad.
- 1000 line (no kidding) functions are bad.
- Variables 'hijacked' for another meaning are bad. This one indicates the number of lights on, or if it's set to -2 it means the 'OK' button has been pressed!?!
- Classes with unclear and bloated remits are bad.
- Lots of commented out code is bad, lots of code removed via '#if 0' is worse (since the editor won't highlight it).
- Public member variable access is bad.

Mr Jack
Monday, October 21, 2002

I agree with John.  Tinkering with production code when you don't know what it does and what it's supposed to do is a recipe for disaster.  It's been mentioned before, but I believe a better way to understand code is to write test cases and assertions for it.  After the code (and it's intent) is fully understood, then you can begin refactoring if it still makes sense to do so.

Brian
Monday, October 21, 2002

Oops, I should have included a smiley... :-)

I did say I was exagerating...

There was a bit more to his methodology, and I don't think he ever really deleted code without knowing what he was deleting... but he was a seat-of-the-pants coder!

James

James Shields
Monday, October 21, 2002

Good code:

1. Minimization of state variables:
a. Minimization of globals.
b. Minimization of class data.

2. Documentation of state variables.
a. What gets set?
b. Why does it get set?
c. How does it get set?
d. Who relies on their state?
e. A state diagram for the machine that is decribed by state variables.

This is a requirement for writing reentrant code, and in my experience, produces the most readable and maintainable code.  IMO, all good coding practices follow from this.  Everything else, to me, is a "don't care".

Nat Ersoz
Monday, October 21, 2002

I take issue here:

>> Often, only mission critical classes need to handle all possible errors. 

All possible errors need handled in any released product code.  Some errors should not be possible, and therefore should not be handled at all.  Case in point is handling the unexpected passing of a NULL pointer to an internal function (non-public interface).  The NULL pointer should be handled at instantiation or the earliest known opportunity for failure.

All error return codes must be documented - and of course that would mean exceptions thrown as well.

Nat Ersoz
Monday, October 21, 2002


I have this problem with state variables. See, in _every_ place I've worked I get assigned to "fix" something that exists in some production code. Every time, when something was failing, some sort of state variable was involved.

I really _hate_ state variables. Specially when the guy who created the program wasn't kind enough to provide constants or some sort of documentation, or use some dumb name like "struct_def".

For the record, every one of the programs I've fixed was written not by programmers, but sysadmins who knew how to code.

Leonardo Herrera
Monday, October 21, 2002

[Variables 'hijacked' for another meaning are bad. This one indicates the number of lights on, or if it's set to -2 it means the 'OK' button has been pressed!?!
]

Oh how I hate this. I once had a long discussion with 3 other developers on why we should not use a null value coming from a database to determine value. They basically wanted to have 3 values - 1,0, and 'null'. I thought that this third state should have been captured by another variable. Of course these are the same coders whose code was littered with statements like: if (ret = 1){...}. Ok so wtf is 1? It's simple things like this that make code unmaintainable but yet some ignore it anyway. After working on their VB super class (2000+ lines, does everything, no unit tests, 20+ ADO recordsets initialized in the constructor, etc). I finally got so fed up that I packed up all my shit and walked out. I've never done that before but I had to get out of there before they realized I was on to their fraudulence and they tried to have me killed ;-)

trollbooth
Monday, October 21, 2002

Numbers in code are just plain evil.  I once spent a week trying to sort out some code with 26 references to the number "20" - meaning at least four different things that I discovered, and possibly some I never did.  Ouch.

And yes, that code was written by a cowboy coder type too, with a sysadmin like job - though over time it seems the developer / software engineer types who work around them descend to their level.

Mikayla
Monday, October 21, 2002

Disgusting?  I'll show you disgusting...  We get this non-working code from our partners all the time.  And it sucks.  They are a hardware manufacturer, that rather than document the device, decide that they can write code.  The sad part is that the hardware is very elegant, once you reverse engineer the software.  So they've got brilliant hardware engineers, some awful people writing software, and some biz dev/managers that don't know the difference..

Oh the pain...

Changing the names of things so as not to violate NDA's...

void SetRegistry(IGLOB* glob, REGISTRY* pRegistry)
{
    Cglob *this = (Cglob*) glob;
    this->pRegistry = pRegistry;
    this->privateThing = pRegistry->setting1?
        undocumentedConst1:
        undocumentedConst2;
}

BOOL arbitraryInitFunction( IGLOB *glob )
{
    Cglob *this = (Cglob*)glob;
    // set registers based on this->privateThings
}

There are 58 different "privateThings" which are class based (in a perverse sort of way) state variables.

There are over 90 different "elements in the "Registry", and since the registry exists as both a private thing which gets pointed to by Cglob and can be modified by passing in a REGISTRY blob, it convolves the problem.

Elements in the registry are often undocumented bit fields which apply to (undocumented) register bits.  The only way to uncover the meaning of what they mean is through experimentation and observed behavior.

This constitutes a COM-like "class" of settings and interfaces (there are 150 public methods on this class), which get aggregated into a super-class of crap containing several of these misbehaved children.

I need therapy.

name withheld to protect the guilty
Monday, October 21, 2002

Yeh, "switch state machines".

Reminds me of another point, write code so you can debug it easily.

Thus, minimize ?: usage and avoid "switch state machines".

njkayaker
Monday, October 21, 2002

One of the worst problems is having member variables of complex classes that are non-private. One of the first things I do when I take over code, is go through, changing the access specifiers for public or protected data members to private, see what doesn't compile, and provide appropriate accessor functions.

One important thing, in C and C++, is const correctness.

void function(const T* ptr); tells me instantly that the function does not take ownership of, or modify ptr. Also, using consistent rules to indicate custody-transfer is important. It's great when code uses

auto_ptr<T> function();

instead of

//the pointer returned should be owned and freed by the caller
T* function();

Just a Programmer
Monday, October 21, 2002

I just got asked last week to fix an application (VB6) that was written by our sysadmin, basically it was written as a service to run and night and report to him all the naughty people who had gone home without checking in source code, and who did not log out of their PC's.

It used, I kid you not, DAO, RDO and ADO,  he was using a DAO connection object for "SELECT" statements and a RDO connection object for "INSERTS", the ADO connection object was used in a foul looking "data-shaping" selection.
I think some of the examples he read may have been taken a little too literally.

The other developers spent a lot of time today just pissing themselves.

Oh, and the sweetener was that every single variable was a global, he had a global.bas with 227 global variables in it!

The best one was an integer named "GLOB_Unspecified_BAS_Module_Counter", I had to go to the toilet when I read that. Does that even compile?

Ah, the joy, it bought a tear to the eye of us all.

Alberto
Tuesday, October 22, 2002

I once had a chap seconded to work with me, who came from a government agency. He told me about a previous assignment. He had been given a choice of two possible assignments, and had a chance to look at the code before he decided.

He made his choice after finding the following in one of the first files he looked at

#define FUDGE 7

regards,
steve

stephen hill
Tuesday, October 22, 2002

Actually, I have one case that I seem to come across where the rewrite has been successful for me ... changing requirements.

A few times now I've seen wild looking code that needed to do quite a bit.  However, three years down the pipe the set of what the code needed to do narrowed dramatically to something 'describeable' in two sentences or less.

You can call it deleting the code if you like, a favourite refactoring.  But in every case it was faster and simpler to throw out the code and start with code set to the new requirements.

Round hole, square peg.

John
Tuesday, October 22, 2002

4 the [-1, 0 and 'null'] problem. so you left because you wanted to store the non-value in a diff variable? i think the other 3 was lucky that you left. of course i do not know the full story but man, these are values, and each value _means_ something different. what if you have 4 values, eg. two type of nulls, like 'null' because data was not provided or 'null' because some other reason. of course dbs don't support different type of nulls :( - that's why people use values to indicate something.

anyway, i'm maybe wrong.
cheers

n/a
Saturday, October 26, 2002

one example I saw. get a random number between 1-3. I put pseudo-code here only:

while (1)
{
  x = rand() * 10
  if (x < 3) {
    break;
  }
}

w. testing it took the guy 6 hrs based on his daily timesheet entries.

.
Saturday, October 26, 2002

It was mentioned that one way of spotting bad code was lots of commented out code.

I don't quite agree with that. I have encountered lots of scientific software that by now is well over 40 years old, and is likely to survive another 40 without doubt. (One just doesn't rewrite hundreds of thousands of lines of Fortran...)

Generally they are filled with commented out algorithm approaches that in the end turned out either slower or plainly faulty.

Of course one could just delete these failed attempts...and thus ensuring that they will be tried again, if not next year, surely the year after. Isn't it then better to actually have the code in there loudly proclaiming "Don't do this, we tried, it doesn't work!"

Of course in an ideal world the modules would have supplementary documentation which clearly describes the problem, and all failed approaches, so no-one needs to try again...but I'm expecting to wait for a lot longer than 40 years before I see that happen.

Johan Borg
Monday, October 28, 2002

One problem with tearing out working code and rewriting it which no one seems to have mentioned is that you would, no doubt, regress it in the process.  The new code might very well work on your development system, but unless you have thorough regression tests you'd find that the new code would have old bugs.

I heard variables with widely divergent meanings (positive or zero it's the cardinality of the set, but negative means the printer is offline) called polymorphic values, or something like that.  A similar problem is when a local variable changes its role in the middle of a function -- first it's the number of widgets, but later it's the x offset of the left edge of the last widget that was layed out.  Call them overloaded local variables.

If your editor doesn't highlight code commented out with #if 0 / #endif, get a better editor.  Vim's syntax highlighting colors such code the same color as comments.  Not that I'm recommending #if 0, you understand....

On the subject of finite state machines implemented with switch, I have to disagree with earlier writers.  So long as the states are documented, and the function consists of just the state machine, with minimal setup (in other words, the state machine isn't buried in the midst of an already unwieldy function), a switch is a perfectly good way to write a state machine, and shouldn't be harder to debug than any other implementation.

You could break it up into a whole raft of separate functions, but you'd wind up with something equivalent to what you started with, and have to pass the state variables around.  You could use a method class to avoid that, but again you just end up with a synonym for what you started with, like using two different notations for the same mathematical concept.

David Conrad
Tuesday, October 29, 2002

*  Recent Topics

*  Fog Creek Home