Fog Creek Software
Discussion Board




C/C++ programming question

I've been developing in Java for 5+ years now and somehow I've found myself in an introductory C/C++ class (working on my Masters).  Last week the professor began covering pointers (you should see the blank stares from about half the class), and he spent half the class time on iterating through arrays with pointers instead of subscripts.  Good so far.

Assuming "a" is a pointer to an array, he explained this code:

for (int i = 0; i < 6; i++)
  for (int j = 0; j < 6; j++)
  {
    cout << *a;  // display value
    a++;  // move pointer
  }

No problem...  But then he spent twenty minutes explaining how it would be better to use this:

for (int i = 0; i < 6; i++)
  for (int j = 0; j < 6; j++)
    cout << *a++;  // display value and move pointer

We then had a long discussion about how...
  *a++ != *(a++) != (*a)++
I'm pretty sure he lost two-thirds of the class.


My question is this: Would you consider it common practice to use "*a++"?  Is this something so common to C\C++ programs that it should be taught in an intro class?

As for myself, I always code defensively.  I'm less likely to make mistakes if my code is broken into discrete, readable steps (see the first example).  On the other hand, this is probably a gatekeeper course in which students must sink or swim.


BTW- I am not criticizing the technique.  I am merely questioning its use in an intro-level class.

Russell Thackston
Thursday, November 20, 2003
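
For anyone who wants to try the two loops from the question, here is a minimal sketch. It assumes "a" points at the first of 36 contiguous ints; the function names and the 0..35 test data are made up for illustration. Both walks visit the same elements in the same order, so any preference between them is about style.

```cpp
#include <sstream>
#include <string>

// Walks 6 x 6 elements through a pointer, one step per statement.
std::string walk_two_statements(const int* a) {
    std::ostringstream out;
    for (int i = 0; i < 6; i++)
        for (int j = 0; j < 6; j++) {
            out << *a << ' ';  // read the current element
            a++;               // then move the pointer
        }
    return out.str();
}

// Same walk using the combined *a++ idiom.
std::string walk_idiom(const int* a) {
    std::ostringstream out;
    for (int i = 0; i < 6; i++)
        for (int j = 0; j < 6; j++)
            out << *a++ << ' ';  // read the current element, then move the pointer
    return out.str();
}

// Hypothetical helper: fills 36 ints with 0..35 and compares the two walks.
bool walks_agree() {
    int cells[36];
    for (int n = 0; n < 36; n++)
        cells[n] = n;
    return walk_two_statements(cells) == walk_idiom(cells);
}
```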

> But then he spent twenty minutes explaining how it would be better to use this:

What were his arguments for why it was 'better'?

Len Holgate (www.lenholgate.com)
Thursday, November 20, 2003

You know, he never really gave any reasons!

Shorter code, I assume...  Maybe *he* thinks it follows a standard.

Russell Thackston
Thursday, November 20, 2003

Like most programmers of terse languages, C/C++ programmers sometimes get a little too in love with clever, concise idioms that work the way an expert would expect, forgetting that not everyone's an expert.

Stick to your defensive coding, and don't worry about the prof: he's just showing off.  Just because you can write code a certain way doesn't mean it's a good idea.

Although the line about *a++ != *(a++) != (*a)++ is one you should bother to figure out and commit to memory, since it's a compact expression of scoping and precedence in operators.  Understanding the example will go a long way towards learning C++.  Walk through it a few times and you'll see what I mean.

Justin Johnson
Thursday, November 20, 2003

I think he's dead wrong.

There are plenty of things you "can" do in C/C++ that you still shouldn't do simply because they make the code more confusing without any positive benefit other than showing that the programmer knows C/C++'s order of evaluation (and whoop-de-do for him).

  Leave the single-line coding gymnastics to the Perl freaks...  In C/C++, the compiler (assuming it isn't broken) is going to generate the same code for either of those cases, so there is no reason to not use the form that is more easily parsable by humans.

Mister Fancypants
Thursday, November 20, 2003

Defensive coding is good, but sometimes it can be used as a form of mental laziness: it saves investigating how the compiler and runtime operate, for instance. In this case C/C++ has a very clear order of operator precedence and associativity that dictates exactly how such expressions will behave.  Adding parentheses overrides the order of precedence, which is why your examples are functionally different.

I remember one co-worker who insisted upon making it a "standard" that all smart-pointers were assigned nil at the end of each function to "avoid memory leaks"...yet in doing so they were intentionally deciding to treat smart pointers as "magical" rather than understand how they work and why the nil assignment was just a wasted operation.

Dennis Forbes
Thursday, November 20, 2003

Agreed with Justin above, that C/C++ programmers have a culture of "economy of expression".

In addition, I would say it's appropriate to teach this stuff so you're not lost when you get exposed to it later on.

Look at the source for strcpy for example:

void strcpy (char *s, char *t)
{
    while (*s++ = *t++)
        ;
}

It's just a common practice in the language, even if it's not the most readable.

Speaking of obfuscation, they have contests for it...


Thursday, November 20, 2003

I don't know whether it's better, but it's certainly an idiom.

Any C/C++ programmer who's been doing it for a while instantly recognizes it and knows what it means.


Thursday, November 20, 2003

You also might want to take into account what the compiler is going to do with it. On most of today's processors, with today's optimizers, I suspect you'll get no difference, but in a non-optimized case, the *a++ might compile down into one instruction easier than *a; a++; would.

Even if you don't use this idiom, if you plan to program in C/C++, you BETTER know what it means. Otherwise you're toast.

Oh yea, and if you are doing C++ you also ought to know when to prefer ++i instead of i++.

Michael Kohne
Thursday, November 20, 2003
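
The ++i versus i++ remark above is usually about class types such as iterators, where postfix has to hand back the old value. A rough sketch, using a made-up Counter type that counts its own copies (for plain ints a compiler will normally treat the two forms the same when the result is unused):

```cpp
// Hypothetical counter type that records how many copies are made.
struct Counter {
    int value = 0;
    static int copies;

    Counter() = default;
    Counter(const Counter& other) : value(other.value) { ++copies; }
    Counter& operator=(const Counter&) = default;

    // Prefix: increment in place and return a reference, no copy needed.
    Counter& operator++() { ++value; return *this; }

    // Postfix: must return the old value, so it copies before incrementing.
    Counter operator++(int) { Counter old(*this); ++value; return old; }
};
int Counter::copies = 0;

int copies_made_by_prefix() {
    Counter::copies = 0;
    Counter c;
    ++c;
    return Counter::copies;
}

int copies_made_by_postfix() {
    Counter::copies = 0;
    Counter c;
    c++;
    return Counter::copies;  // at least one; the exact count depends on copy elision
}
```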

I write my code so that only one logical operation happens per line of code. I am a bear of little mind, so I want my code to proceed one simple step at a time (within reason). If some code is too clever, refactor it into a new function and give it a descriptive function name.

Brian Kernighan once said:

"Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it."

runtime
Thursday, November 20, 2003

<i>
the *a++ might compile down into one instruction easier than *a; a++; would.
</i>

It won't compile down to one instruction in either case, with any compiler, on any CPU.

Perhaps the convoluted case obscures this fact, thus yet another reason why it shouldn't be used.

Mister Fancypants
Thursday, November 20, 2003

Doh, I forgot, no HTML here.  The stuff between </i> is a quote.

Mister Fancypants
Thursday, November 20, 2003

The other posters beat me to the word idiom.

You'll see *x++ a *lot* in C and C++ code. You need to know what it means. Don't use it if you don't want but there is no doubt that you will see code that uses it.

Likewise,

x = x+1;
and
x++;

do pretty much the same thing.
In the old days, compilers would certainly use an increment operator to do this and sometimes such code resulted in better performance. Modern compilers on the other hand will theoretically optimize it all and such shortcuts are not necessary.

Me, I use *x++ and y++ and ?: all the time when it's appropriate.

Dennis Atkins
Thursday, November 20, 2003

I used to be against this sort of thing, but once you get used to it, it's second nature.  Just write like 5 loops a day with this style, and you'll get it.  The trick is to know the idiom well enough that you will not write off-by-one bugs and you can spot off-by-one bugs from a mile away.

As others mentioned, if you're doing any significant C programming, you'll have to know it just to read other people's code.

Roose
Thursday, November 20, 2003

Oh and it's not better, so the instructor was wrong there. But it's definitely appropriate for an intro class.

Dennis Atkins
Thursday, November 20, 2003

Regarding pointers, I always set pointers to nil when deallocating them, just as I always assert that pointers are non-nil when there is no conceivable way that they could be nil. The purpose of this is so that when that contract is broken by some other coder or occasionally myself, I am alerted to the problem right where it happens rather than wonder why there is a memory leak or where exactly the rare slammer is coming from.

Dennis Atkins
Thursday, November 20, 2003
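
A small sketch of the habit described above, for raw pointers allocated with malloc. The helper names release and read_value are invented for this example, and this is only one way to arrange it:

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical helper: frees *p and resets it so a stale pointer is detectable.
void release(int** p) {
    assert(p != nullptr);   // the caller must pass a valid address
    std::free(*p);          // calling free on a null pointer is a harmless no-op
    *p = nullptr;           // a later read_value(p) now fails its assert
}

// States the "never null" contract even where null should be impossible.
int read_value(const int* p) {
    assert(p != nullptr);
    return *p;
}

// Allocates, reads, releases, and reports whether the pointer was reset.
bool demo_release() {
    int* p = static_cast<int*>(std::malloc(sizeof(int)));
    if (p == nullptr) return false;
    *p = 42;
    int seen = read_value(p);
    release(&p);
    return seen == 42 && p == nullptr;
}
```

Note that assert is compiled out when NDEBUG is defined, so the checks only fire in debug builds.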

"It won't compile down to one instruction in either case, with any compiler, on any CPU."

Mister Fancypants,

Your astonishing level of ignorance regarding processors and compilers is showing.

Dennis Atkins
Thursday, November 20, 2003

"*a++ != *(a++) != (*a)++"

I'm amazed that I got all the way through this message, and am the first one to realize that the first half of this statement is wrong.

*a++ is equivalent to *(a++).

Brad Wilson (dotnetguy.techieswithcats.com)
Thursday, November 20, 2003

"*a++ is equivalent to *(a++). "

Oh and this is why I hate C++...

The dereference operator and ++ have the same precedence so I assume that in the first case a is dereferenced and the result is incremented.  In the second case, a is incremented and the result is dereferenced.  Thus they are different.

However, correct me if I'm wrong!

Almost Anonymous
Thursday, November 20, 2003

Brad,

Actually the value dereferenced differs.

Dennis Atkins
Thursday, November 20, 2003

To the OP,

A good rule in C is "When in doubt, parenthesize."
C's precedence rules are not entirely rational and even Dennis Ritchie now says that he got it wrong. Thus, intuition is not your guide. Parenthesize whenever possible, possibly excepting the *x++ idiom, where parens would be used to call attention to something being done messed up.

Instead of *(x++), one should definitely use *++x which does == *(x++).

Dennis Atkins
Thursday, November 20, 2003
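
One place where "when in doubt, parenthesize" pays off is mixing & with ==, since == binds tighter than &. A sketch with a made-up flag constant (compilers typically warn about the first function, which is the point):

```cpp
// Hypothetical flag value used only for this illustration.
const unsigned kReadOnly = 0x4;

// Looks like "is the read-only bit clear?" but == binds tighter than &,
// so this is really flags & (kReadOnly == 0), i.e. flags & 0, which is always 0.
bool is_writable_unparenthesized(unsigned flags) {
    return flags & kReadOnly == 0;
}

// Parentheses make the intended grouping explicit.
bool is_writable(unsigned flags) {
    return (flags & kReadOnly) == 0;
}
```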

"*a++ != *(a++) != (*a)++"

"*a++ is the same as *(a++)"

Uh, beg to differ.

Let's assume a points to memory location 100, the basic type is one memory cell wide, and the contents of location 100 on are {1,3,...}.

*a++: Dereference, increment pointer. a is 101, result is 1

*(a++): Increment pointer, dereference. a is 101, result is 3

(*a)++: Dereference, increment result. a is 100, result is 2.

Understanding the difference is elemental to understanding C/C++. If you don't get it, those are not your languages. Seriously.

I'm amazed that it's a problem for people who actually /do/ try to stay educated (viz. reading JoS).  Makes me curious about hard-to-get idioms in Java, Ruby, Python, etc. that seem totally obvious to the experienced practitioner.

Groby
Thursday, November 20, 2003

heh.  strcpy. 

compare the "production" version from say, NetBSD (or FreeBSD or OpenBSD or GNU) to the job-interview-favorite from K&R mentioned above sometime.

http://cvsweb.netbsd.org/bsdweb.cgi/~checkout~/src/lib/libc/string/strcpy.c?rev=1.11&content-type=text/

It's generally pretty instructive to check out the BSD versions of every example in K&R that's an oversimplified version of a Unix utility or C library function.  I've been working through K&R2 like finger exercises lately, making it more interesting by being severe about checking for every error that  a library function can return.  Not that I think that's necessarily how One Should Code, it's just so that I remember in the future that I've made a design decision every time I choose to ignore a return value.

And to the original poster, yes, incrementing pointers in assignments is a pretty common C and C++ idiom. 

Brent
Thursday, November 20, 2003

whoops, this link should work:

http://cvsweb.netbsd.org/bsdweb.cgi/~checkout~/src/lib/libc/string/strcpy.c

Brent
Thursday, November 20, 2003

"I'm amazed that I got all the way through this message, and am the first one to realize that the first half of this statement is wrong."

No, you're just the first one to mention it. :-) I haven't had much time with C++ over the past year, so I wasn't confident in my own observation, though a follow-up test in C++ proved it.

    int *deref, val1[2], val2;
    val1[0]=90210;
    val1[1]=55274;
    deref= &val1[0];
    val2 = *deref++;
    printf("*deref++ = %d\n\r",val2);
    deref= &val1[0];
    val2 = *(deref++);
    printf("*(deref++) = %d\n\r",val2);


Yields

*deref++ = 90210
*(deref++) = 90210

(Which just goes to prove that sticking parentheses in defensively doesn't always do what you might expect if you don't know what to expect from the compiler)

Dennis Forbes
Thursday, November 20, 2003

Groby, you almost got it right.

*a++ and *(a++) are the same thing.

In both of them, first the postfix increment operator takes place, returning the value of 'a' BEFORE any such increment, to be used in the dereference. So, assuming we have:

int z[] = {5,6,7};
int *a=z,*b=z;
int foo;

foo = *a++; /* foo is now 5, but *a is 6 after this statement */
foo = *(b++); /* foo is still 5, and now *b is 6 */

Remember, the ++ operator returns the value of the variable before the increment takes place (although it may increment the variable at any point (doesn't matter in this case)).

So, we have *a++ is the same as *(a++) and *++a is the same as *(++a). This makes sense given the table of precedence given in K&R2 on page 53. The ++ operator has a higher precedence than the unary * does.

So, of course, (*a)++ is different from *a++. :)

Byron
Thursday, November 20, 2003
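
The four forms discussed so far can be checked side by side. This sketch reuses the {5, 6, 7} array from the post above; the Outcome struct and the function names are invented for the illustration.

```cpp
// Each function starts from z = {5, 6, 7} with a pointing at z[0].
struct Outcome {
    int value;   // what the expression evaluated to
    int moved;   // how many elements the pointer advanced
    int first;   // z[0] after the expression ran
};

Outcome plain_post_increment() {
    int z[] = {5, 6, 7};
    int* a = z;
    int v = *a++;      // groups as *(a++): yields z[0], pointer moves on
    return {v, static_cast<int>(a - z), z[0]};
}

Outcome parenthesized_post_increment() {
    int z[] = {5, 6, 7};
    int* a = z;
    int v = *(a++);    // same expression with the grouping spelled out
    return {v, static_cast<int>(a - z), z[0]};
}

Outcome increment_pointee() {
    int z[] = {5, 6, 7};
    int* a = z;
    int v = (*a)++;    // yields the old z[0], then z[0] becomes 6; pointer stays
    return {v, static_cast<int>(a - z), z[0]};
}

Outcome pre_increment() {
    int z[] = {5, 6, 7};
    int* a = z;
    int v = *++a;      // pointer moves first, then z[1] is read
    return {v, static_cast<int>(a - z), z[0]};
}
```

The first two give identical results (value 5, pointer moved by one, array untouched); only (*a)++ changes the array, and only *++a reads the second element.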

Next time you are interviewed by Joel and he asks for a function to reverse a string, whip out:

void strrev(char *start) {
    char *end = start + strlen(start) - 1;
    while(end > start)
        *start++ ^= *end ^= *start ^= *end--;
}

i like i
Thursday, November 20, 2003
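
A caution on the one-liner above: it reads and modifies start and end several times within a single expression, which as far as I can tell is undefined behavior in C and in C++ before C++17, and it also computes start - 1 for an empty string. A plainer sketch with a temporary, under the invented name reverse_in_place:

```cpp
#include <cstring>

// Reverses a NUL-terminated string in place, one swap per iteration.
void reverse_in_place(char* start) {
    std::size_t len = std::strlen(start);
    if (len < 2) return;              // nothing to swap; also avoids start - 1
    char* end = start + len - 1;
    while (start < end) {
        char tmp = *start;            // each object is modified once per statement
        *start = *end;
        *end = tmp;
        ++start;
        --end;
    }
}
```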

Ah shoot. Looks like I need to go back to school. I am such an idiot.
Good catch Brad.

Dennis Atkins
Thursday, November 20, 2003

I can't believe there are fundamental things about C that I am wrong about after programming in it for 20 years. I really need to switch over to Python exclusively. Or maybe I should start a shrimp farm...

Dennis Atkins
Thursday, November 20, 2003

Ah, my dear dear i like i, that is a beautiful solution there.

Dennis Atkins
Thursday, November 20, 2003

Pointer arithmetic is used in place of array references to speed up the code.  It can result in faster running programs depending upon the compiler, the machine, and sometimes the procedure in which it's used.

The other added benefit is that if you write your loop without an incremented variable [e.g., a while (*a++) instead of a for (i = 0; ...) loop], a register does not need to be used for the incremented variable. Freeing up a register can also speed up the code if another variable can now use that register instead of being stored on the stack or in the L1 cache.

So it only really makes a difference when you are trying to optimize the code.

Lastly, think of it not in terms of where computers and computing are now.  Think of it in the historical context, when memory and processor speed were more limited and tighter, faster code was essential.

Nick
Thursday, November 20, 2003
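
To compare the two styles on your own compiler, something like the following pair can be timed or inspected in the generated assembly. The function names are made up, and which version wins (if either) depends on the compiler and target, so treat this as a starting point for measurement only.

```cpp
#include <cstddef>

// Array-subscript version: uses an explicit index variable.
long sum_indexed(const int* data, std::size_t n) {
    long total = 0;
    for (std::size_t i = 0; i < n; ++i)
        total += data[i];
    return total;
}

// Pointer version: no index, just a moving pointer and an end marker.
long sum_pointer(const int* data, std::size_t n) {
    long total = 0;
    const int* end = data + n;
    while (data != end)
        total += *data++;
    return total;
}
```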

And let's not forget the "production" version of "Hello, world" from the GNU folks:

ftp://ftp.gnu.org/gnu/hello/hello-2.1.1.tar.gz

as
Thursday, November 20, 2003

As a long-time C/C++ programmer I never combine dereferencing and pointer math.

Every time I look at such code I go "huh" and have to remember how everything works.

son of parnas
Thursday, November 20, 2003

well it wasn't the fastest, but adding three statements just to have a temp in the swap takes up far too much time typing!

Short code isn't always fastest.

I always use *ptr++: stops them sacking me

i like i
Thursday, November 20, 2003

Wait till you start overloading those operators...

Nice
Thursday, November 20, 2003

I was wrong -- I had a sneaking suspicion I was wrong.  I basically did what any newbie C/C++ programmer would do and googled for operator precedence. 

I do a lot of C++ coding but I generally stay away from the prefix/postfix operators combined with pointers (or in the middle of complex expressions).  It's just too damn confusing -- I like code to tell me in no uncertain terms what it means.

Almost Anonymous
Thursday, November 20, 2003

as:

The GNU "Hello, world" is pretty entertaining, and I know I learned a couple things from it when I first stumbled across it.  It's really a pretty good example of how to put together a source package according to the GNU standards, which are also worth reading.

On the other hand, the mail reader is a little over the top.


son of parnas:

Yeah, I'm not a big fan of the idiom either, but it's pretty standard, and mastering it puts you in probably the 90th percentile or so of working programmers.

Brent
Thursday, November 20, 2003

"Leave the single-line coding gymnastics to the Perl freaks"

I think that wins the award for "JoS quote of the week!".

--vince

Vince
Thursday, November 20, 2003

Oh, and Nick:

I'm under the opposite impression but I haven't run any tests.  The story usually goes that if you're doing anything more complicated than running straight-through a null-terminated string,  using array notation gives the compiler a better chance to look for optimizations.  Again, I haven't run any tests nor have I looked at the internals of any modern compilers, so the story I'm repeating here could be dead wrong, or maybe hasn't been true for ten years or something.

I do personally prefer to use array notation whenever it makes sense for clarity's sake.  Premature optimization is the root of all evil, after all.

Brent
Thursday, November 20, 2003

As  a 'perl freak' I'm inclined to agree. Leave those single line code elements to us. Nothing makes us happier than writing an entire decompiler from a bash command line with perl -e. Unless, of course, we can somehow gate it into the processor manually with switches.

;-)

Dustin Alexander
Thursday, November 20, 2003

In my years as a C coder, I never used (a == b++) notation, and I usually used array notation instead of incrementing pointers when iterating. It just made the code a lot easier to read and write. And that stuff was never the bottleneck when I profiled the code.

In general pointers are very useful; a C class should focus on the situations where pointers are the best way to do things.

On the other hand, my looping code has lots of break and continue statements, which seem to confuse other coders. Every programmer has some idioms they prefer and some that they aren't comfortable with.

Julian
Thursday, November 20, 2003

Thanks Byron!

Lessons learned:

a) *Never* post code I haven't tested. The probability of it being wrong is close to 1.

b) I should get the heck out of C/C++. (I wish I could. I really do)

Groby
Thursday, November 20, 2003

> Would you consider it common practice to use "*a++"?

No, it's only used in school.

Textbooks say you can implement strcpy using something like "while (*a++ = *b++);" ... whereas a real-world library implementation of strcpy may be written in assembler, copy 32 bits aligned at a time (for run-time efficiency over readability).

For readability, I prefer your method, like "*a; ++a;"

Re operator precedence, if you or anyone you know might ever get it wrong, use the parentheses ... parentheses are cheap.

Incidentally, when I increment I use like "for (int i = 0; i != 6; ++i)" instead of like "for (int i = 0; i < 6; i++)".
 
> Is this something so common to C\C++ programs that it should be taught in an intro class?

It is taught in intro classes, and therefore you need to know it, because other programmers will be doing that and you need to be able to read what they've done.

Christopher Wells
Thursday, November 20, 2003

Brent,

I have run tests - just last week in fact. I had two functions that handle very large arrays to optimize. In one function, switching to pointers improved performance by 1.5X. In the other, it reduced performance by 3X. So, there's no hard and fast rule. It's just one option to try when you're getting down to brass tacks.

Usually, however, I ...

-- Don't use tricky pointer arithmetic constructs because 2 months later they confuse me just as much as the next guy who reads my code.
-- Avoid mixing left- and right-associative operators in a single line.
-- Use parentheses instead of memorizing precedence rules.

Nick
Friday, November 21, 2003

Try some regular expressions. Fun:

/^(\w+)*\s*(\w+)\s+((?:'.+'|\S)*)\s*(.*)\s+([X0-9*]+)\s*$/

I apparently wrote this a few months ago. Do I know what it does? Sure, because I wrote a comment on top of it that said something along the lines of: This ugly mother does ... Otherwise, I would have had to run through it by hand.

Moral of the story: Write it like that if you never need to touch it again. Otherwise, break it into pieces.

Dustin Alexander
Friday, November 21, 2003

Using pointer arithmetic when arrays will suffice is counterproductive in this day and age, not only for the people who will read the source, but for actual runtime efficiency.

Please read up on aliasing and how it relates to modern compiler optimization.

Mister Fancypants
Friday, November 21, 2003

Actually *p++ is better coding than *p; p++ because, while you're using the pointer, you are already thinking, okay, next I'll need to increment this thing.... oh, how convenient, I can say it right here, and not *remember* to say it later.

Alex
Saturday, November 22, 2003

If you use "while(*a++ = *b++);" as a strcpy, then you end up doing one more increment than is necessary.

Christopher Wells
Saturday, November 22, 2003
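
The "one more increment" point can be seen by counting how far the source pointer travels in the textbook loop. A sketch, with an invented function name; it assumes the destination buffer is large enough:

```cpp
#include <cstddef>

// Textbook copy loop; returns how far the source pointer advanced.
std::ptrdiff_t copy_and_count_advance(char* dst, const char* src) {
    const char* start = src;
    while ((*dst++ = *src++) != '\0')
        ;  // the assignment is the whole loop body
    return src - start;
}
```

For "abc" the result is 4: three characters plus the terminator, so both pointers finish one past the NUL they just copied.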

Ack!

Please you guys, look into assembly.
Many processors support register indirect addressing with postincrement, and many support register indirect addressing with predecrement. This is why we see *x++ and *--y. So "while (*a++ = *b++);" takes an awful lot fewer instructions than I think a lot of you are surmising.

On the 86 architecture, there's even a -single- instruction that will do the entire strcpy!

Dennis Atkins
Sunday, November 23, 2003

> Many processors support register indirect addressing with postincrement

Really? If you're thinking of the x86 "lodsb", I've never seen a compiler emit one of those.

> On the 86 architecture, there's even a -single- instruction that will do the entire strcpy!

Which one? "rep movsb" is two, and needs you to initialize [e]cx.

In answer to the OP's question, I commonly see preincrement (e.g. ++i) and rarely or never see post-increment nor decrement.

Also, using the MSVC compiler, "cout << *a; a++;" emits the same number of opcodes as "cout << *a++;" (so there's no objective reason to say that the latter is better ... it's just a matter of coding style).

Christopher Wells
Sunday, November 23, 2003

It is really useful for me.
Thanks a lot.


Thanks & regards,
Prasad ..

Prasad Parasnis
Monday, December 22, 2003

Friends,

Please correct me if I am wrong...

"In MSVC, a _cdecl function call causes the calling function
to push the called function's parameters onto the stack.
In a __stdcall function call, the called function pushes
the parameters onto the stack."

Am I right or wrong?

One other doubt:
int a = 10;
printf(" %d %d %d", a++,--a,++a);
The result is 10, 10, 11, isn't it?
I.e., the parameters are evaluated from right to left.

In that case, shouldn't the parameters be pushed onto
the stack from left to right?

Deep George Zachariah.

Deep George Zachariah
Wednesday, August 11, 2004
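
On the printf question: neither C nor C++ fixes the order in which function arguments are evaluated, and in C (and C++ before C++17) modifying a more than once inside one argument list is undefined behavior, so no particular output can be relied on regardless of calling convention. A sketch that makes the order explicit by giving each side effect its own statement (format_in_order is a made-up name):

```cpp
#include <cstdio>
#include <string>

// Formats three values derived from a, with each side effect in its own statement.
std::string format_in_order(int a) {
    int first = a++;    // takes the current value, then a goes up by one
    int second = --a;   // a goes back down, second is the new value
    int third = ++a;    // a goes up again, third is the new value
    char buf[64];
    std::snprintf(buf, sizeof buf, " %d %d %d", first, second, third);
    return buf;
}
```

Called with 10, this produces " 10 10 11", evaluated strictly left to right.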
