Fog Creek Software
Discussion Board

Joel on Software

String value reference identity crisis

I understand the difference between a reference and a value type but for the life of me I can't figure out why a string (which is a reference type in .NET) behaves like a value type.

Why, when you copy a string, do you get another string instead of a pointer to the original? Why are strings immutable?

I guess it boils down to why didn't they make string a value type & get it over with? 

Just random musings on my lunch hour.  If anybody has the good comp sci explanation I'd love to hear it.

Tuesday, December 14, 2004

Check out Effective Java by Josh Bloch. Besides being incredibly well written and insightful, it will explain these concerns about things like immutability (and why it is basically desirable), and the weird things that can happen if you don't use it.

(Yeah, it's about Java, but as everyone knows, .NET shares more than a passing resemblance, and the String class is no exception).

On the other hand, my guess is that outside the world of programmers at Sun and MSFT who write libraries used by millions, not very many developers pay a whole lot of attention to these issues, at least not to the extent that Bloch wants you to.

Tuesday, December 14, 2004

String is not a value type because then you'd have to copy all the characters in a string whenever you pass it as an argument, return it from a method, etc.

But don't let this issue confuse you. String is a real, honest-to-Gates reference type -- except that the referenced data is immutable. (Use StringBuilder if you ever need a mutable string.)
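
To make the "reference type with immutable data" point concrete, here is a minimal sketch in C#. It assumes C# 9 or later (top-level statements), and the variable names are made up for illustration. It shows that assignment copies the reference, that "changing" a string really builds a new one, and that StringBuilder is mutated in place.

```csharp
using System;
using System.Text;

// Assigning a string copies the reference, not the characters.
string a = "hello";
string b = a;
bool sameObject = object.ReferenceEquals(a, b);

// "Changing" b builds a new string and rebinds b; a still refers to the old object.
b = b + " world";
bool stillSame = object.ReferenceEquals(a, b);

// StringBuilder is the mutable counterpart: a change is visible through every reference.
StringBuilder sb1 = new StringBuilder("hello");
StringBuilder sb2 = sb1;
sb2.Append(" world");

Console.WriteLine(sameObject);      // True
Console.WriteLine(stillSame);       // False
Console.WriteLine(a);               // hello
Console.WriteLine(sb1.ToString());  // hello world
```

So the copy only looks like a value copy: nothing can alter the shared object, so nobody can tell that it is shared.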

Well, and why is the data immutable? Because that allows "string interning" -- all interned strings with identical contents reference the same entry in a global string table maintained by the runtime system. And *this* allows extremely fast comparisons and switch/case on interned strings, because equality then comes down to comparing references. (String literals are interned automatically. Strings built at run time must be interned explicitly using String.Intern.)
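
The interning behavior described above can be sketched roughly like this. It assumes C# 9 or later (top-level statements), and the variable names are hypothetical. Two strings are built at run time so the result does not depend on how literals happen to be interned.

```csharp
using System;

// Two strings with identical contents, each built at run time as a separate object.
char[] letters = { 'f', 'o', 'g' };
string first = new string(letters);
string second = new string(letters);

bool equalContents = first == second;                     // == compares the characters
bool sameBefore = object.ReferenceEquals(first, second);  // two distinct objects

// String.Intern returns the single pooled instance for the given contents.
string firstInterned = string.Intern(first);
string secondInterned = string.Intern(second);
bool sameAfter = object.ReferenceEquals(firstInterned, secondInterned);

Console.WriteLine(equalContents);  // True
Console.WriteLine(sameBefore);     // False
Console.WriteLine(sameAfter);      // True
```

Once both strings are interned, an equality check between them can be a plain reference comparison. That is only safe because nobody can modify the shared entry.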

Also, .NET doesn't have the C++ const qualifier for parameters, so immutable strings eliminate a host of possible errors where a method modifies a string that it shouldn't touch.
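
As a rough illustration of that last point, the sketch below contrasts a string parameter with a char[] parameter. It assumes C# 9 or later (top-level statements); Shout and Scribble are hypothetical helpers invented for this example, not library methods.

```csharp
using System;

// Hypothetical helper: reassigning the parameter only rebinds the local copy of the reference.
static string Shout(string text)
{
    text = text.ToUpperInvariant() + "!";   // builds a new string; the caller's object is untouched
    return text;
}

// Hypothetical helper: an array parameter offers no such protection.
static void Scribble(char[] buffer)
{
    buffer[0] = '?';                        // writes straight into the caller's array
}

string greeting = "hello";
string shouted = Shout(greeting);

char[] raw = { 'h', 'i' };
Scribble(raw);

Console.WriteLine(greeting);         // hello
Console.WriteLine(shouted);          // HELLO!
Console.WriteLine(new string(raw));  // ?i
```

The caller of Shout never has to wonder whether its string survived the call. The caller of Scribble does.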

Chris Nahr
Wednesday, December 15, 2004
