Fog Creek Software
Discussion Board




Faking pass-by-value for pointers in C/C++?

I've noticed that some libraries go to great lengths to ensure pass-by-value semantics for strings and other non-scalar pieces of data.

The common situation is this: you have a data structure, functions to manipulate it, and a function to "store" a string within it. Should that function just store a pointer to the specified string, or should it copy the entire string and save a pointer to the copy, ensuring that any modifications made later on to the supplied string won't affect the one "stored" in the data structure?

On the one hand, total encapsulation is very nice, but on the other hand, needless dynamic memory allocation is not.

Preference?

Fame and Fortune acquired via knock hockey.
Wednesday, April 07, 2004

"Preference?"

If you use something like the STL string class then you get the pass-by-value semantics without the needless dynamic memory allocation. 

STL strings are copy-on-write; so you can assign one string to another without incurring any penalties until you modify one of the strings (at which point, the copy is made).  In C++, it's pretty easy to roll your own copy-on-write system for any type of data.
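A minimal sketch of what a hand-rolled copy-on-write wrapper might look like, assuming C++11 and a single-threaded program. The class name CowString and its members are hypothetical, and this is not how any particular library implements its string:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical copy-on-write string. Copies share one buffer until a write.
// Not thread-safe: use_count() is only a reliable test in single-threaded code.
class CowString {
public:
    explicit CowString(const std::string& s)
        : data_(std::make_shared<std::string>(s)) {}

    // Reads never copy the buffer.
    char at(std::size_t i) const {
        assert(i < data_->size());
        return (*data_)[i];
    }
    const std::string& str() const { return *data_; }

    // Writes detach from any other owners first, then modify the private copy.
    void set(std::size_t i, char c) {
        assert(i < data_->size());
        detach();
        (*data_)[i] = c;
    }

    // True while both objects still point at the same underlying buffer.
    bool shares_buffer_with(const CowString& other) const {
        return data_ == other.data_;
    }

private:
    void detach() {
        if (data_.use_count() > 1) {
            data_ = std::make_shared<std::string>(*data_);  // the deferred copy
        }
    }

    std::shared_ptr<std::string> data_;
};
```

Copying a CowString only copies the shared pointer; the character data is duplicated the first time set() is called on an object whose buffer is still shared.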

Almost Anonymous
Thursday, April 08, 2004

Are you sure STL strings have to be copy-on-write? I know they often are, but is that actually mandated in the standard?

Anyway, I agree that an STL implementation is a good place to look to learn how the technique works.

Jon Hanna
Thursday, April 08, 2004

Nothing in the standard says they're copy on write.

In fact, most implementations are NOT copy on write, as it's a disaster for performance in a multi-threaded environment (all that locking going on for all your strings).

As I recall, it's also very hard to do a good copy-on-write implementation anyway, as you can store references to characters within the string (e.g., by using the [] operator). So you basically have to be very conservative about what you consider a write, which also destroys much of the benefit.

Sum Dum Gai
Thursday, April 08, 2004

The standard does not specify how a string class is to be implemented.

Mr Jack
Thursday, April 08, 2004

MFC's CString is copy-on-write, for what it is worth.

Keith Wright
Thursday, April 08, 2004

Immutable strings are a much better option.

For the original poster, "faking pass by value for pointers", the answer for C++ is to use a const reference and don't modify the thing. Who in their right mind uses C anymore? :-p

Brad Wilson (dotnetguy.techieswithcats.com)
Thursday, April 08, 2004

Also, auto_ptr is there to put a pointer on the stack, and clean it up if it goes out of scope, unless the pointer gets assigned to another object.  Terribly misunderstood object, it is, but it can be pretty useful.  You can allocate new memory and return it from a function, and not have to worry about whether someone saved the pointer or not.  Helps in making exception-safe code, too.
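For anyone reading this with a newer compiler: std::auto_ptr was deprecated in C++11 and removed in C++17, and std::unique_ptr fills the same role. A rough sketch of the pattern using unique_ptr instead of auto_ptr; Buffer, make_buffer, and consume are hypothetical names made up for the example:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical heap-allocated payload.
struct Buffer {
    std::string text;
};

// Allocates and returns ownership to the caller. If the caller ignores the
// result, or an exception unwinds the stack, the Buffer is still deleted.
std::unique_ptr<Buffer> make_buffer(const std::string& text) {
    std::unique_ptr<Buffer> p(new Buffer{text});
    return p;
}

// Takes ownership: the Buffer is deleted when this function returns.
std::size_t consume(std::unique_ptr<Buffer> p) {
    assert(p != nullptr);
    return p->text.size();
}
```

A caller hands ownership over with consume(std::move(b)); afterwards b is null, so there is no question about who is responsible for freeing the memory.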

Keith Wright
Thursday, April 08, 2004

Having the structure copy the string tends to be safer and more flexible.  It allows the user to pass in a const char* or even a literal constant string, without worrying about how the string is used.  The structure knows that it has allocated the string and that it is responsible for freeing the string.

When a structure is taking ownership of a pointer you give it, this needs to be made absolutely explicitly clear in the interface.  I'm sure this is the source of a lot of bugs and complications.

Both ways work; it depends on what you need.  For strings I generally copy them in my structure.  For pointers to other structures, I tend to just store the pointer (though obviously it depends on the use).
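To make the difference concrete, here is a small sketch of the two approaches. BorrowingRecord and OwningRecord are hypothetical types invented for illustration, not taken from any library:

```cpp
#include <cassert>
#include <string>

// Stores only the caller's pointer. No allocation, but the caller must keep
// the buffer alive, and later edits to it show through the record.
struct BorrowingRecord {
    const char* name;  // not owned
};

// Stores its own copy. One allocation (for longer strings), but the record
// is insulated from the caller and frees the copy itself.
struct OwningRecord {
    std::string name;  // owned; released by std::string's destructor

    void set_name(const char* s) {
        assert(s != nullptr);
        name = s;  // copies the characters out of the caller's buffer
    }
};
```

If the caller later changes or frees the original buffer, the BorrowingRecord sees the change (or dangles), while the OwningRecord keeps the value it was given.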

MikeMcNertney
Thursday, April 08, 2004

"The standard does not specify how a string class is to be implemented."

Good point.  I had to build my own copy-on-write string class (for a C++ platform w/o the STL) and I found most of the implementation details by looking up information on the STL string class.

Almost Anonymous
Thursday, April 08, 2004

The C++ standard does not require COW (Copy On Write) strings in any way.  In fact, from what I understand more and more STL implementations are moving AWAY from COW.

The benefits of COW are far smaller than is typically imagined.  For really good reading on the topic see Herb Sutter's "More Exceptional C++".  Appendices A and B have performance statistics for different implementations of string, and COW actually ends up hurting more than helping in a lot of cases (particularly when dealing with multithreaded programs, because the cost of locking outweighs the benefit of saving the copy).

Use "const std::string&" where you can to avoid needless copying.
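A quick sketch of the difference, with hypothetical function names (count_spaces and shout are made up for the example):

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Read-only access: taking const std::string& makes no copy on the call,
// and the compiler rejects any attempt to modify `text` in here.
std::size_t count_spaces(const std::string& text) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (text[i] == ' ') ++n;
    }
    return n;
}

// For contrast: taking the string by value gives the function its own
// private copy, which it is free to modify without affecting the caller.
std::string shout(std::string text) {
    assert(!text.empty());  // precondition assumed for this sketch
    text += '!';
    return text;
}
```

The by-value version is the one to reach for only when the function actually needs its own modifiable copy.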

Tito
Friday, April 09, 2004

It is a good idea to use GC and pass everything by pointer / reference like in Java. GC has better performance than COW, and it is helpful not only for strings. Current GC implementations impose only 2-20% total average overhead.

Piotr Kolaczkowski
Saturday, May 15, 2004
