Fog Creek Software
Discussion Board




Hungarian Notation

After reading some of the topics below about code commenting and documentation, I started to wonder if anybody uses hungarian notation anymore.

I, personally, have always hated HN. I think it makes code extremely difficult to read. I much prefer to just camel-case my object names (e.g., thisIsTheNameOfAnObject).

But I know there are some people who prefer to tack little type identifiers onto their object names. I've especially seen a lot of things that look like "ptrMyPointer", which I'm also not crazy about. This is akin, in my book, to calling something "strTableName" or "intCounter".

Object types, as far as I'm concerned, don't belong in the identifiers of objects.

Benji Smith
Friday, March 21, 2003

Hungarian is dying. Thankfully. The Microsoft .NET naming conventions explicitly say NOT to use Hungarian.

It might have made sense originally, but it was never used correctly, and required too much of a maintenance burden. Good riddance.

Chris Tavares
Friday, March 21, 2003

That being said, I do have one situation where I use hungarian.

In C++, particularly in COM code, I'll often have the same logical data stored in different data types. Strings are the big offenders here. So I'll end up with variables bstrName, lpszName, and csName, for "Name as a BSTR", "Name as a zero-terminated string", or "Name as a CString".
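
(Illustration, not part of the original post.) A minimal portable sketch of that pattern. BSTR and CString are Windows-specific, so std::wstring and std::string stand in for them here; the NameViews struct, the makeNameViews helper, and the variable names are all hypothetical.

```cpp
#include <cassert>
#include <string>

// Hypothetical holder for one logical value ("the name") kept in three
// representations at once. The prefixes tell them apart at a glance.
struct NameViews {
    const char*  szName;    // zero-terminated C string (borrowed, not owned)
    std::string  strName;   // stand-in for a CString-like owning string
    std::wstring wstrName;  // stand-in for a BSTR-like wide string
};

// Build all three views from one zero-terminated string.
// Assumes plain ASCII input, so widening each char is a valid conversion.
inline NameViews makeNameViews(const char* szName) {
    assert(szName != nullptr);
    NameViews views;
    views.szName = szName;
    views.strName = szName;
    views.wstrName.assign(views.strName.begin(), views.strName.end());
    return views;
}
```

With three variables that all mean "the name", the prefix is the only part of the identifier that differs, which is the one case where it carries information the rest of the name cannot.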

Chris Tavares
Friday, March 21, 2003

I think people who think Hungarian makes code "hard to read" have never learned to read Hungarian notation. Heck, I think Greek is hard to read. But if you actually take the time to learn it, it makes code that much MORE readable. You know so much more about what a variable is by looking at it.

I still use hungarian -- in C code it's essential, in C++ code it's helpful, we even have a standardized hungarian notation for SQL here at Fog Creek and it's extremely helpful.

Joel Spolsky
Friday, March 21, 2003

I still stick with Hungarian, since it works.

Prakash S
Friday, March 21, 2003

The problem with HN is that the type of a variable can change as the system evolves.

wParam is a case in point.  It isn't a WORD any more, is it?  False information is worse than no information...
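
(Illustration, not part of the original post.) A small portable sketch of the same hazard: the prefix records the type an identifier had when it was named, and nothing updates it when the underlying typedef changes. Word16, WParamLike, and both functions are hypothetical stand-ins rather than the real Windows headers; in current Windows headers WPARAM is a pointer-sized unsigned integer, not a 16-bit WORD.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins, not the real Windows typedefs.
using Word16     = std::uint16_t;   // what the "w" prefix originally promised
using WParamLike = std::uintptr_t;  // pointer-sized, like the modern WPARAM

// Returns true only if the "w" in wParam still describes the storage size.
inline bool wordPrefixStillAccurate() {
    return sizeof(WParamLike) == sizeof(Word16);
}

// A "word" parameter that carries a whole pointer, which a real 16-bit
// WORD could not do on a 32-bit or 64-bit platform.
inline WParamLike packPointer(const void* p) {
    WParamLike wParam = reinterpret_cast<WParamLike>(p);
    assert(reinterpret_cast<const void*>(wParam) == p);  // round-trips intact
    return wParam;
}
```

On a typical 32-bit or 64-bit build wordPrefixStillAccurate() returns false, yet every caller still reads and writes "wParam".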

Keith Weiner
Friday, March 21, 2003

Joel and others,

What is it about C code that makes HN especially helpful in making the code more readable?

Would code in Java or VB or C# (or any other language, for that matter) benefit from an HN-style coding convention? Or is there something peculiar about C/C++ that makes it more essential for you to have type information in your identifiers?

Also, why is it that HN is falling out of fashion so quickly if it helps so much?

(I really am curious.)

Benji Smith
Friday, March 21, 2003

I've grown to like hungarian notation, if only because associating variable type and scope information in a variable name makes code easier to read and understand. 

Furthermore, I've never found changing variable names along with variable types time consuming at all (just search and replace). 

I suppose it could be irritating if you have code where a global var's name collides with that of a class member or local var, or some similar situation.  But that never really happens if you're using Hungarian properly in the first place.

I've never really understood why everyone hates hungarian so much or why they think it actually makes code harder to read.  Sounds like more of a cosmetic preference to me.

I do concede it requires additional discipline on the part of programmers and there's nothing that enforces it besides programmer discipline ... perhaps that is good enough reason not to use it in some environments.  However, my experience has been that programmers not familiar with it first ask questions about it, and then adopt it.  I suppose I'M SPREADING THE DISEASE.  BWAHAHAHA!!!

Nick B.
Friday, March 21, 2003

I'd actually go one step farther than Nick B. did -- if you're changing your variable types, especially between similar types (different string implementations, for instance, or the like), I find it incredibly useful to also change the name of the variable. Thus, I have to go look at every instance where it is used and make sure that things are really doing what I want.

Of course, using search and replace obviates this, so YMMV.

Steven C.
Friday, March 21, 2003

Everything gets called oWhatever or objWhatever in the end.

optimistic coder
Friday, March 21, 2003

Yeah, but at least you know it's an object of type class CWhatever.  Or do you? ;)

Nick B.
Friday, March 21, 2003

99% of the Linux kernel is written in straight C (with a small amount of assembler for supporting the dozen or so CPU platforms it runs on).  It is some of the most readable code I've ever come across, and there are no Hungarian notations floating around.

I loathe HN, and am glad to see it end of life.  Not that I even encounter it in my daily work, but just to know that it will be out of its misery soon - that's a good thing.

Deconstructing HN:  "lpsz"
LONG:  yes, even as late as 1998, MSFT was still playing with LONG.  Very quaint.
POINT to STRING:  Dammit, it's a pointer to char.
ZERO terminated string.  Aren't C "strings" by default zero terminated? 

Why don't we add 'r' in there while we're at it - so we know it can be "randomly accessed".  And 'x', EVERYTHING needs an X these days.  Let's add x to lpszr to formally bring it into the new Millennium.  Hmmm, what has 'X'?  How about xenophobic?  There ya go, the X for xenophobic (sort of gives you that 'closed source' warm fuzzy feeling).

xlpszrMyString[] = "Hello World";

Oh the gems...  Forget HN, how 'bout these jewels:

typedef void* LPVOID;  // FIXME: WTF?

What is so difficult about void*?  Would it be so bad to not have to invoke the tagger just to know that you're dealing with a built in type?  Kill it.  Unplug the life support and say a prayer.

Nat Ersoz
Friday, March 21, 2003

Geez, it's as bad as "comment masturbation".

(Likely the best term I've come across lately, thanks!)

Nat Ersoz
Friday, March 21, 2003

Quite Nick. Heaven forbid we need two instances of the same class, that'd just get confusing.

optimistic coder
Friday, March 21, 2003

Well, there are some reasonable things like denoting member variables (like adding my or m_ or something like that to the front) and other "special case" situations where namespace collisions are inconvenient and obnoxious.

The LPVOID thing is somebody trying to turn C into their own language.  Sheesh.

flamebait sr.
Friday, March 21, 2003

One of the great design points of perl is that all variables start with a symbol, a kind of hungarian notation I suppose.

Scalars start with $mystring, arrays @myarray hashes: %myhash, which makes reading code really easy once you're used to it.

Matthew Lock
Friday, March 21, 2003

I've grown to like 'simplified' hungarian.
So in a managed environment, sFoo means string. With plain C (or C++) szFoo means a zero-terminated string; wzFoo means a Unicode string.
But long or far, no. (Though I've seen win16 code where wz didn't mean unicode because it was #defined re-used code from people who really meant OLESTR).

Once you've simplified something to its essence (i or c for an int or counter), you certainly will have to do more than just search/replace if you change the type of a variable. And if it's just something dumb like a size (short/long), the simplified name doesn't change.

mb
Friday, March 21, 2003

Basically, if you make good design choices, like scoping variables appropriately, there is no need for HN.

Allofasudden, I'm doing some math...

y = alpha * sin( omega * t  );

becomes:

dY = dAlpha * sin( dOmega *dT );

Well, of course, d means double - unless you're into math, then d connotes delta - and it reads a whole new way.

And then there is the sin() function, it returns double - so make it dSin().  Well, what the hell, it takes double too.  Make it dSind().

And back to strings, familiar old strcat() now comes out lpszStrcat() - yet why stop there, it takes 2 strings, the second one const:

lpszStrcatlpszlpcsz();

Geez, looks like the Bronx cheer.  Wipe off the laptop screen when you're done with that, wouldja?

And don't even try for those Windows GUI functions that take about 17 parameters...  Too much fun.

Nat Ersoz
Friday, March 21, 2003

Does anyone use both?

By this, I mean do you differentiate between private variables etc(1) and public, e.g. parameter names?

Some places I've worked use HN for *everything*; their declarations look something like:

Dim lobjMyObject as company_objComponent.clsClassName

They also like branding things, which makes things even more confusing, especially if it's done to every single publicly accessible object/name.

(This is not *my* idea, btw, and I also hate prefixing local scope variables - the first 'L' in lobjMy...).

Is there a "sensible usage guide" anywhere - preferably a heavy one that I can use for other purposes?

(1) Etc means function names to indicate the return type, constants ( I have seen the hateful mstrCONSTANT_NAME used by people)

Justin
Saturday, March 22, 2003

As a humble beginner VBA coder may I say I find it very useful to distinguish between control names:  cboFirstName as opposed to lblFirstName

Same goes for remembering what's a string, what's an integer and what's a boolean.

Stephen Jones
Saturday, March 22, 2003

Speaking as someone who's still stuck, for the most part, back in VB6, I like HN.  I've started learning C# and there's a lot to like about it, but it seems utterly ridiculous to call a control okButton, cancelButton or calculateTheSubtotalForMyUnderpantsPurchaseButton (hey, if you guys can come up with ridiculous examples of HN abuse...).  If you're going to tack "Button" on the back of every button and "Combo" on the back of every combo box, why not use a Hungarian "cmd" wart on the beginning and take advantage of Intellisense to speed up your typing ever so slightly?

That being said, I have started moving away from using HN in parameter declarations.  The API calls don't include them, so putting them in my own functions gets a bit distracting.

Sam Gray
Saturday, March 22, 2003

Good short article:

http://ootips.org/hungarian-notation.html

Key bad points:

1) Often wrong or misleading (e.g. wParam in Win32).

2) Ambiguous (is b a boolean or a byte, is f a flag or a float?).

3) No standard (lpsz vs. lpstr vs. lpcstr vs. lpc vs. psz etc., dw vs. ul, vs n).

4) Effort to maintain.

5) Does not take account of C's automatic promotion of types.

6) Prevents your code from being portable between 16/32/64 bit architectures.

7) Offers no automated type checking beyond what the compiler already does.

8) Lots of finger typing.

9) Encourages unnecessarily verbose variable names.

10) Conflicts with OOP principles.
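
(Illustration, not part of the original post or the linked article.) Point 5 is easy to check in code. In this minimal sketch, two operands carrying a "b" (byte) prefix produce an int as soon as they are added, so the prefix says nothing about the type of the expression. The function and variable names are hypothetical, and the sketch assumes a mainstream platform with 8-bit bytes where int is wider than unsigned char.

```cpp
#include <cassert>
#include <type_traits>

// Adding two "byte" variables: both operands are promoted to int before
// the addition happens, so the result is an int, not a byte.
inline int addBytes(unsigned char bLow, unsigned char bHigh) {
    static_assert(std::is_same<decltype(bLow + bHigh), int>::value,
                  "unsigned char operands are promoted to int");
    return bLow + bHigh;  // no wrap-around at 255: the sum is an int
}

// Storing the sum back into a byte truncates it to the low 8 bits.
inline unsigned char addBytesTruncated(unsigned char bLow, unsigned char bHigh) {
    return static_cast<unsigned char>(bLow + bHigh);
}

// Small self-check showing both results side by side.
inline void demoPromotion() {
    assert(addBytes(200, 100) == 300);
    assert(addBytesTruncated(200, 100) == 44);  // 300 mod 256
}
```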

Longer discussion on kuro5hin.org:

http://www.kuro5hin.org/story/2002/4/12/153445/601

Tom Payne
Saturday, March 22, 2003

Nat & Tom:

I agree wholeheartedly. All this goes to show that tracking object types is something that we should leave to the compiler. For my pennyworth, just name things clearly, concisely and consistently.

David Roper
Saturday, March 22, 2003

I would agree, Sam, however most VB etc programmers would *prefix* the control name with a mnemonic denoting the type, e.g. btnOK and btnCancel.

Of course, even with HN, code can be hard to read.

The following sample (apologies for text wrapping) illustrates, in my opinion, the need for coherent, readable variable names. Yes, I can read it...but the coder really isn't making it easy for me.

The argument for mnemonics is that they require less effort to type and are less prone to spelling errors. The arguments were made before 'option explicit' and 'auto complete' were features that were commonly available. In fact, I'm not sure even if they are available. I primarily use VB, Interdev and lately .Net; I have no idea if these functions are available in other development environments


#include "sy.h"
extern int *rgwDic;
extern int bsyMac;
struct SY *PsySz(char sz[])
    {
    char *pch;
    int cch;
    struct SY *psy, *PsyCreate();
    int *pbsy;
    int cwSz;
    unsigned wHash=0;
    pch=sz;
    while (*pch!=0)
      wHash=(wHash<<5)+(wHash>>11)+*pch++;
    cch=pch-sz;
    pbsy=&rgbsyHash[(wHash&077777)%cwHash];
    for (; *pbsy!=0; pbsy = &psy->bsyNext)
      {
      char *szSy;
      szSy= (psy=(struct SY*)&rgwDic[*pbsy])->sz;
      pch=sz;
      while (*pch==*szSy++)
          {
          if (*pch++==0)
            return (psy);
          }
      }
    cwSz=0;
    if (cch>=2)
      cwSz=(cch-2)/sizeof(int)+1;
    *pbsy=(int *)(psy=PsyCreate(cwSY+cwSz))-rgwDic;
    Zero((int *)psy,cwSY);
    bltbyte(sz, psy->sz, cch+1);
    return(psy);
    }

Sample extracted from MSDN copy of Charles Simonyi's "explication of the Hungarian notation identifier naming convention".

Justin
Saturday, March 22, 2003

Sorry, that should have read "the argument for short variable names" - mnemonics are supposed to be short. Doh.

Justin
Saturday, March 22, 2003

I started my programming career in VB3. The best practice back then was 3-letter-prefixed HN. Later, in the VB5 or early VB6 era, I bought a book using a single-letter-prefixed HN. I liked that better, but remember a rather heated discussion with my coworker on which HN was best. The discussion ended up with him using 3-HN, me using 1-HN.

Then I became familiar with Agile Programming and read all the good opinions on why HN is bad, and why energy should be spent on selecting better names in the first place. It's been years since I used HN, and I have learned it's all about habit.

I have also learned to appreciate code which expresses intention. So I think HN is about implementation. Ditching HN freed my mind to concentrate more on the solution, which I think is better.

Many of my classes start out as primitive datatypes, mostly strings. Sometimes they deserve to be strings, but other times they evolve with behaviour or more complex data representations.

Thomas Eyde
Saturday, March 22, 2003

"One of the great design points of perl is that all variables start with a symbol, a kind of hungarian notation I suppose."

"Scalars start with $mystring, arrays @myarray hashes: %myhash, which makes reading code really easy once you're used to it."

No, all variables start with a symbol that indicates HOW YOU WANT TO USE THE VARIABLE.  They're typecasts.  Every Perl statement is littered with typecasts.  You have to typecast both sides of a simple assignment like "$foo = $bar".

And this casting is useless in distinguishing simple scalars like "3" from references to "real" data structures (of arbitrary complexity).  The "$" symbol LIES about what the variable holds.

rwh
Saturday, March 22, 2003

I used to use hard-core HN, but now I mostly just use the following prefixes:

p = pointer
n = count/number of
sz = null-terminated char string
wsz = null-terminated WCHAR string
is = boolean
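
(Illustration, not part of the original post.) For readers unfamiliar with these prefixes, a short sketch of how declarations using them might look. All identifiers are hypothetical, and wchar_t stands in for the Windows WCHAR type.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <cwchar>

// Count the characters in a zero-terminated string that equal target.
inline std::size_t countChar(const char* szText, char target) {
    assert(szText != nullptr);
    std::size_t nMatches = 0;                                 // n  = count/number of
    for (const char* pCur = szText; *pCur != '\0'; ++pCur) {  // p  = pointer
        if (*pCur == target) ++nMatches;
    }
    return nMatches;
}

// Two strings with different element types; sz and wsz tell them apart.
inline bool sameLength(const char* szName, const wchar_t* wszName) {
    assert(szName != nullptr && wszName != nullptr);
    const bool isEqual = std::strlen(szName) == std::wcslen(wszName);  // is = boolean
    return isEqual;
}
```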

runtime
Saturday, March 22, 2003

For Hungarian notation to work, there must be an explicit company standard. Then it only takes an hour to read the standards document, and it's always clear what everything means.

Overall, Hungarian notation added value when I started programming C in 1995 without an IDE, but it isn't worthwhile writing Java today with Eclipse. Here are some reasons for the change:

1)  An IDE instantly tells you the type of any variable, so you don't need to search for the variable declaration to get that information.
2)  Better programming practices, such as short functions and clear variable names, make Hungarian less necessary. Those practices are more widespread now than they were before.
3)  In C, not knowing a variable type can easily shoot you in the foot, such as when you call scanf. In Java, the compiler detects almost all type errors.
4)  Other Java coders bitch endlessly when they encounter Hungarian notation.
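
(Illustration, not part of the original post.) Point 3 can be sketched as follows. The scanf family trusts the format string, so the conversion specifier has to match the pointed-to type exactly, and the language does not require the compiler to check it. The function and variable names are hypothetical; std::sscanf is used instead of scanf so the example needs no console input.

```cpp
#include <cassert>
#include <cstdio>

// Parse a decimal count from text. "%d" requires an int*, so nCount must
// really be an int. If nCount were later changed to short or long without
// updating the format string, the call would be undefined behavior.
inline bool parseCount(const char* szInput, int* pnCount) {
    assert(szInput != nullptr && pnCount != nullptr);
    int nCount = 0;
    if (std::sscanf(szInput, "%d", &nCount) != 1) {
        return false;  // no integer at the start of the input
    }
    *pnCount = nCount;
    return true;
}

// The matching specifier for a long is "%ld". The prefix on the variable
// is a reminder, but nothing forces it to stay in sync with the format.
inline bool parseTotal(const char* szInput, long* plTotal) {
    assert(szInput != nullptr && plTotal != nullptr);
    return std::sscanf(szInput, "%ld", plTotal) == 1;
}
```

This is the kind of call site where a type prefix earns its keep in C: the type has to be restated by hand in the format string, and a mismatch fails at run time rather than at compile time.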

A few months ago, my employer implemented a no-Hungarian policy. It only took me a few days to adjust, once I wrote the logic to convert the existing code.

Julian
Saturday, March 22, 2003

Any explicit convention (such as Hungarian) is very helpful when coding in a team, and for new developers to get acquainted with a large code base.

In OO, a notation which identifies the scope of a variable helps clear up what a piece of code is doing, instead of having to navigate back and forth through the source to look at where it was declared.

In a dynamically typed language such as Python, class attributes whose names begin with two underscores are automatically name-mangled, which makes them effectively private.

For individual developers, the benefits are still there. I'm not the type who remembers everything I have coded months afterwards, and the improved readability helps.

For competent programmers, learning a notation is not much compared to picking a language. Hungarian notation throws off people a bit if they are learning C at the same time.

Chui Tey
Sunday, March 23, 2003

From Nat: "ZERO terminated string.  Aren't C 'strings' by default zero terminated?"

I'm with you on that one.  I never could figure it out. Is there such a thing as a non-zero terminated string in C?

Nick
Sunday, March 23, 2003

Well, it IS technically possible to implement a pascal style string using char arrays in C.

Why you'd want to is a completely different question.
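
(Illustration, not part of the original post.) For the curious, a minimal sketch of what that might look like. PascalString and its helpers are hypothetical names. The first byte of the array holds the length and no terminator is needed, which is exactly the case the "sz" prefix was meant to rule out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Length-prefixed ("Pascal-style") string in a plain byte array:
// data[0] is the length (0..255), data[1..length] are the characters.
struct PascalString {
    unsigned char data[256];
};

// Convert a zero-terminated string; input longer than 255 is truncated.
inline PascalString toPascal(const char* szSource) {
    assert(szSource != nullptr);
    PascalString pas{};
    std::size_t cch = std::strlen(szSource);
    if (cch > 255) cch = 255;
    pas.data[0] = static_cast<unsigned char>(cch);
    std::memcpy(pas.data + 1, szSource, cch);
    return pas;
}

// The length is read directly from the first byte, with no scan for '\0'.
inline std::size_t pascalLength(const PascalString& pas) {
    return pas.data[0];
}

// Copy back out into a std::string for printing or comparison.
inline std::string fromPascal(const PascalString& pas) {
    return std::string(reinterpret_cast<const char*>(pas.data + 1),
                       pascalLength(pas));
}
```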

Steve C.
Monday, March 24, 2003

What do HN users do in languages that allow you to create new types (i.e. classes)? Do you simply use some generic "obj" prefix for anything that is a reference? Doesn't this destroy one of the purposes of the notation: to indicate type?

I've also noticed that in languages where HN is not used (Java) there are still some conventions to indicate scope. In Java code, I've seen use of a leading underscore to indicate that a variable is an instance variable on the object in question, e.g. this._collection. Of course this is a bit redundant, because the use of "this" tells you that you're accessing an instance variable. I guess that if you use the underscore then it will prevent aliasing problems if you forget to use "this".
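
(Illustration, not part of the original post.) The aliasing problem mentioned above can be sketched in C++ terms; the thread's example is Java, and the class names here are hypothetical. When a parameter has the same name as a member, forgetting "this" assigns the parameter to itself, whereas a scope prefix such as m_ or a leading underscore removes the name clash.

```cpp
#include <cassert>

// Without a prefix the parameter shadows the member, so "this->" is
// required. Writing "limit = limit;" here would compile but would only
// assign the parameter to itself and leave the member untouched.
class PlainNames {
public:
    void setLimit(int limit) { this->limit = limit; }
    int getLimit() const { return limit; }
private:
    int limit = 0;
};

// With a scope prefix the two names cannot collide, so omitting "this->"
// is harmless.
class PrefixedNames {
public:
    void setLimit(int limit) { m_limit = limit; }
    int getLimit() const { return m_limit; }
private:
    int m_limit = 0;
};

// Both setters store the value; only the first depends on remembering "this->".
inline void demoScopePrefix() {
    PlainNames plain;
    plain.setLimit(5);
    PrefixedNames prefixed;
    prefixed.setLimit(5);
    assert(plain.getLimit() == 5 && prefixed.getLimit() == 5);
}
```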

The only other scopes are global and local/parameter, and if you design well then you shouldn't have globals (just Singletons, hahahahaha), so there's no need for any other scope indicator conventions.

Regarding access labels (private, protected, public), I've never felt the need for notation to indicate access, mainly because I've always used getters and setters (even for access to ancestor data).

I'm firmly in the "HNs are comments and comments lie" camp. I think HN is no use in modern strongly-typed languages (i.e. not C).

Alistair Bayley
Monday, March 24, 2003

I am really mixed up.

GUI Controls get control-type prefixes e.g. "btnWhatever" and "txtWhatever" and "chkWhatever".  Combo-boxes are a bit of a mismatch, "cmb" or "cbo"?  I picked this up from VB IIRC, but I apply it to C++ and Delphi and Java etc.

Variables get a scope prefix often - iWhatever for member variables, aWhatever for params etc.  Think I got this from Symbian.  But actually I sometimes use _whatever for member variables.

Beyond that, not much type info in the variable names.

Nice
Monday, March 24, 2003

Steve C,

Apparently Microsoft used Pascal style strings in Excel.

http://www.joelonsoftware.com/articles/fog0000000319.html

Ged Byrne
Monday, March 24, 2003

I can deal with most Hungarian conventions, and the lack of consistency never bothered me as long as it is somewhat intuitive.  But wherever I seem to go, the one convention that annoys me most is using Hungarian for the name of a VB standard module.  Why?  To my mind, a module of string functions should be "StringFunctions", not "modStringFunctions" or "basStringFunctions".

Has there ever been a situation where someone was heard to say, "Thank God we use Hungarian for the name of the modules."

Ran Whittle
Monday, March 24, 2003

Ran, I have a tendency to do exactly that, and you're right; it's extremely silly.  It's probably related to my having come to VB programming by way of Access 97, where it was advised to distinguish between tables and queries (although this is of dubious utility) and got taken waaay too far.  (=  'mod' could also, theoretically, be used to distinguish a standard module from a class module, but now I'm probably being overly generous.

Sam Gray
Monday, March 24, 2003

A better place to put String functions, Ran, would be under String itself, but of course VB6/VBA don't have static methods.  In Java you could group static and nonstatic methods all under class Whatever, but in VB6/VBA you have to put static methods in their own module.  So one potential naming convention is to have the class Whatever and the module modWhatever; makes it easy to find your static methods.

(Or you could just move to VB.NET where you can use Shared to denote static methods.)
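
(Illustration, not part of the original post.) A rough sketch of the two layouts being contrasted, written in C++ because VB6 cannot express the first one. Temperature, fromFahrenheit, and the modTemperature namespace are hypothetical names, and a namespace is only a loose analogue of a VB standard module.

```cpp
#include <cassert>

// Layout 1: static and non-static methods grouped under one class,
// as Java allows (and VB.NET with Shared).
class Temperature {
public:
    explicit Temperature(double celsius) : m_celsius(celsius) {}
    double celsius() const { return m_celsius; }              // instance method
    static Temperature fromFahrenheit(double fahrenheit) {    // static method
        return Temperature((fahrenheit - 32.0) * 5.0 / 9.0);
    }
private:
    double m_celsius;
};

// Layout 2: the VB6-style workaround. Functions that need no instance live
// in a separate module whose name mirrors the class, so they are easy to find.
namespace modTemperature {
inline Temperature fromFahrenheit(double fahrenheit) {
    return Temperature((fahrenheit - 32.0) * 5.0 / 9.0);
}
}  // namespace modTemperature

// Both layouts produce the same object; only the call site differs.
inline void demoStaticLayouts() {
    assert(Temperature::fromFahrenheit(212.0).celsius() == 100.0);
    assert(modTemperature::fromFahrenheit(32.0).celsius() == 0.0);
}
```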

Kyralessa
Tuesday, March 25, 2003

Have you ever had to do this in your VB work?  Can you give me a brief example?  (I am not a Java guy)

Mixing static and non-static methods would seem to create a (possibly) unclear dependency between the class with the static method and the calling class, but in Java the advantage would seem to be that the static method prevents you from needing a third class.  Since VB is forcing you to create a third structure for the method anyway, how does the naming similarity between the class and the module help?  Wouldn't a neutral name for the module be better?

An ocean class and a mirror class would both have a reflect light method, but the dependency is not obvious.  If VB is forcing me to create a third module to contain the reflection method wouldn't that module be better named Optics or something?

Interesting point, but I am having trouble envisioning this.  What am I not getting?

Ran Whittle
Wednesday, March 26, 2003

Well, actually, any public procedure in a standard module is available throughout the project without having to prefix it with the module name; so the way you name your standard modules is more of an organizational issue than anything else.  I guess as much as anything, using "mod" helps make sure you don't have a module named the same as a reserved word or a procedure, and it reminds you that you're dealing with a standard module so you don't go trying to instantiate it.

It also makes it easy when I'm looking for a procedure in my library; I type RLLib.mod and Autocomplete goes to all the modules, I pick the right one (RLLib.modDocumentation, say) and then when I hit the dot again I get just documentation-oriented procedures instead of having to hunt through every procedure in the library.  Without a prefix like this, your standard modules are scattered through the Autocomplete list and you have to search for them.  So basically the "mod" thing is just a time-saver.

Kyralessa
Wednesday, March 26, 2003

This is what I use in my C++ code:

SomeType SomeClass::SomePublicFnMember( int _firstParam, char const* _secondParam, SomeType& _thirdParam ... )
{
SomeType someLocalVar1 = m_memberVar1;
SomeType* someLocalVar2 = m_memberVar2;
SomeType& someLocalVar3 = _thirdParam;
SomeType& someLocalVar4 = somePrivateFnCall( _firstParam, _secondParam );

// just use HN where type DOES really matter
PoolOfNumbers numberPool;
int iResult = numberPool.GetResultAsInteger();
float fResult = numberPool.GetResultAsFloat();
char const* szResult = numberPool.GetResultAsString();

// though you could get rid of HN in this case as well
int intResult = numberPool.GetResultAsInteger();
float floatResult = numberPool.GetResultAsFloat();
char const* stringResult = numberPool.GetResultAsString();

// You get my point?

// About 'p' prefixes for pointers:
// What good is prepending a 'p' to a pointer's var name?
// What kind of 'useful' information would it give to you
// at a glance? Just to know if a '.' or a '->' operator
// should be placed when referencing any of its members?
// If you declared at the beginning of a function:
Client* pMyClient = getClient();

// and 30 or 40 lines later you had to use that client
// again, you could end up like this: "hmm, was it pMyClient
// or MyClient? was it a pointer or a reference or what???"
// (yes, you guessed: prepending a 'p' did not help, and
// made the code a little bit more cryptic).

}

Also, I use s_ for static vars and would use g_ if I ever happen to deal with globals (I doubt it)

As you can see, I don't use hungarian notation at all, just some scope context prefixes.

All I can say is, after being used to HN for well over 9 years at my company, I've ended up realizing that it makes no sense at all, except in a very few cases, where a var and its type are so tightly linked that you couldn't figure out its meaning without using some kind of HN (see example above). On top of that, my code is much more readable now, or at least, that's what it looks like to me.

I know for a fact (it happened to me) that quitting HN can be a really painful change for a programmer who has been used to it for many years, but it's worth the effort.


My 2 cents. Thanks.

Trapazza
Wednesday, March 31, 2004
