Fog Creek Software
Discussion Board




managing with C++

A while back nathan started a topic, "strings in C++", that morphed into a discussion of the merits of Boost and the STL.  Joel's recent article about Microsoft losing the API wars talked about the virtues of managed code, including why automatic memory management is good (roughly, because it translates into greater coder productivity).

I'd like to pick up somewhere between these with a new topic:  why can't we use C++ _as_ _if_ it were managed by using Boost and the STL and never, never using pointers, the new and delete operators, and even many native types if necessary?  This would amount to insisting on pure value-semantics implementations all of the time and trying to stay away from reference semantics and indirection.  It could even be taken to the point of using the envelope-letter idiom all the time and never even passing things by const reference:  envelope-letter objects can be passed cheaply by value when done right.

Whenever needed, we go "downstairs" into what for these purposes would be the language "basement" and create new classes obeying whatever rules and doing whatever clever storage management necessary to work properly on the ground floor (anyone care to undertake an appropriate extension of the obvious metaphorical extension:  "going upstairs"?  I think it ends up writing to a subset of C++ that looks a lot like managed code).  If needed, we provide them as templates, and if necessary we do any of the nifty tricks (like partial specialization) needed to make things efficient or compact.

The idea would be to exploit the intersection between object-oriented programming and generic programming into which C++ has grown and is still growing.  An immense benefit is that things that often end up hidden behind the veil of the run-time in managed languages (garbage collection and memory management in general, for example) are just sitting around coded in C++.  You can change them if you wish... they're not platform specific (OK, you're going to hit me about GUIs...anyone know of a library that fits into this world-view and lets you easily create GUIs?  Via the web or otherwise, I don't care).

Managed code is really just a lifestyle choice... don't tell me you just can't stop yourselves from using those cool and shiny storage allocators; I'm not buying it... we're not children, after all.  Managed code is not a mechanism decision, people:  it's a policy decision.

Letting someone else make your policy decisions is giving in to someone else's, you guessed it, "fire and motion" (copyright Spolsky, some year...).

Seriously, why *not*?  I think I've given the reasons why, explicitly and implicitly.

Thomas E. Kammeyer
Thursday, June 24, 2004

Can Boost garbage-collect objects which have circular references to each other?

Christopher Wells
Thursday, June 24, 2004

Sup, fool.

If managed code isn't at the base of the language, sooner or later you'll find yourself using APIs that were designed by people who don't share your views on memory simplicity.  Once your code is infected with one such API, you're fucked, you always have to think about memory management. 

The only way to avoid this eventuality is to write all of your own code from scratch and never use other people's APIs, and if you're willing to commit to that you might as well just switch languages to C# or Java or D anyway.

Mr. Fancypants
Thursday, June 24, 2004

'Sup, my homie.

"Can Boost garbage-collect objects which have circular references to each other? "

Sort of, but you have to be really careful and know when to use weak vs strong references for your smart pointers.  So you're right back to having to care about the low level details, so the point of your question is valid. 
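
To make the weak-vs-strong point concrete, here is a minimal sketch. It assumes std::shared_ptr and std::weak_ptr, the standard-library descendants of Boost's smart pointers; the Node type and both function names are made up for illustration.

```cpp
#include <memory>

// Hypothetical node type: it can point at a peer either strongly or weakly.
struct Node {
    std::shared_ptr<Node> strong_peer;  // owning link: keeps the peer alive
    std::weak_ptr<Node> weak_peer;      // non-owning link: does not
};

// Two nodes that own each other: their counts never drop to zero on their own.
bool strong_cycle_outlives_owners() {
    std::weak_ptr<Node> watch;
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->strong_peer = b;
        b->strong_peer = a;
        watch = a;
    }  // a and b are gone, but each node still holds the other
    bool still_alive = !watch.expired();
    // Break the cycle by hand so this demo does not itself leak.
    if (auto a = watch.lock()) a->strong_peer.reset();
    return still_alive;
}

// Same shape, but the back link is weak: everything is freed at scope exit.
bool weak_back_link_is_freed() {
    std::weak_ptr<Node> watch;
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->strong_peer = b;
        b->weak_peer = a;
        watch = a;
    }
    return watch.expired();
}
```

Both functions return true: the strong cycle survives its owners until somebody breaks it by hand, while the version with a weak back link cleans up by itself. Deciding which link should be the weak one is exactly the low-level detail you still have to think about.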

Mr. Fancypants
Thursday, June 24, 2004

Christopher:  good question.

I'm not sure about Boost but I'm sure you can write reference counting that will make this work.  Just not with references.  You get to do this kind of thing:
  template<typename L> class envelope {
      struct flap {
          L *ltr;     // the shared "letter" (the real object)
          int refs;   // number of envelopes sharing this flap
      } *f;

      // ...private members to manage the flap and letter...

  public:

      // ...public members to convert to an L& or const L& and other public management fns...

  };

...which I've got part-way fleshed out already from days gone by (it's basically stalled).

The idea is to try to get to something like...
    envelope<string> es;

    es = "hi";

That is, you use the variable es like it's a string... you *only* pass it and work on it by value.
If you want to update it with a function, maybe you do this:  es = fn(es);
In fact, you'd probably keep explicit use of references to some extent.

If you're careful and you stay on strict value semantics then there aren't references per se and this suffices.  Note that with the envelope-letter idiom you get copy on demand... so the only real need for references would come from mutual update needs... and maybe something like Boost's shared pointers helps here... basically, you end up casting about for a managed pointer type (so to speak).
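
One way the envelope idea could be fleshed out is sketched below. This is an illustrative guess, not the stalled code mentioned above: it stores the letter by value inside the flap rather than through an L*, it is not thread-safe, and read(), write(), shares_with() and copy_on_write_demo() are names invented for the sketch.

```cpp
#include <string>
#include <utility>

// A minimal copy-on-demand "envelope" around a value of type L (the "letter").
template <typename L>
class envelope {
    struct flap {
        L ltr;     // the letter (held by value here, an assumption of this sketch)
        int refs;  // how many envelopes share this flap
    };
    flap* f;

    // Give this envelope a private flap before a write.
    void detach() {
        if (f->refs > 1) {
            flap* mine = new flap{f->ltr, 1};
            --f->refs;
            f = mine;
        }
    }

public:
    envelope() : f(new flap{L(), 1}) {}
    envelope(const L& value) : f(new flap{value, 1}) {}
    envelope(const envelope& other) : f(other.f) { ++f->refs; }  // cheap copy
    envelope& operator=(envelope other) {  // copy-and-swap
        std::swap(f, other.f);
        return *this;
    }
    envelope& operator=(const L& value) { return *this = envelope(value); }
    ~envelope() { if (--f->refs == 0) delete f; }

    const L& read() const { return f->ltr; }       // never copies
    L& write() { detach(); return f->ltr; }        // copies only if shared
    bool shares_with(const envelope& other) const { return f == other.f; }
};

// Copies share storage until one of them is written to.
bool copy_on_write_demo() {
    envelope<std::string> a;
    a = "hi";
    envelope<std::string> b = a;          // bumps a counter, no string copy
    bool shared_before = a.shares_with(b);
    b.write() += " there";                // b detaches; a is untouched
    return shared_before && !a.shares_with(b)
        && a.read() == "hi" && b.read() == "hi there";
}
```

The new and delete calls live entirely in the "basement" class; code upstairs only ever copies and assigns envelopes by value.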

The point is that you could make what essentially passes for a managed code run-time
all within C++.  The extent to which it already exists, I don't know (anyone?).
It seems like it would be an immensely powerful environment when you needed it to be
and an immensely easy-to-use environment the rest of the time... if people with deep
knowledge of the language controlled themselves around the source...

This feels like less than a whole answer... can anyone elaborate?

Thomas E. Kammeyer
Thursday, June 24, 2004

Mr. Fancypants:  with all due respect:  bull.

Most C# and Java programmers are already leaning heavily on unmanaged code via one of a few
interfaces... or... umm... didn't JNI and IJW happen?

My point is this:  those mechanisms are going to incur the same liabilities w.r.t. caring about
reference types and similar issues, and in the bargain (fun!) you get to deal with cross-paradigm
programming issues.

Thomas E. Kammeyer
Thursday, June 24, 2004

"Most C# and Java programmers are already leaning heavily on unmanaged code via one of a few
interfaces... or... umm... didn't JNI and IJW happen?"

'Sup, fool.

Some, maybe, but "most"?  You have sources to cite for that?  The vast majority of people I know that use C# and Java are NOT using JNI or IJW.

So step off.

Mr. Fancypants
Thursday, June 24, 2004

foshizzle

a
Thursday, June 24, 2004

I see what you are getting at. If you designed a C++ system from scratch, you might be able to avoid explicit memory management entirely.

I think this would work fine, but only if you strictly enforced a "one-owner" system. i.e. as long as one object is always designated as the "owner" of another object, it will work fine. The C++ language idioms like resource-acquisition-is-initialization are really well-suited for this. It gets problematic when more than one object needs to share ownership of something. You'll need reference counts or manual garbage collection. I claim that in the many-owner case, you'll end up writing your own garbage collector, or using reference-counted smart pointers everywhere, which is essentially the equivalent of a garbage collector.
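
A minimal sketch of the one-owner case, assuming a newer toolchain than this thread had available: std::unique_ptr and std::make_unique stand in for whatever single-owner handle you prefer, and Car, Engine, Wheel and the function names are hypothetical.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Wheel {
    int diameter_cm;
};

struct Engine {
    std::string model;
};

// A Car is the single owner of its parts; no explicit new or delete appears.
struct Car {
    std::vector<Wheel> wheels;       // owned by value
    std::unique_ptr<Engine> engine;  // owned through a single-owner handle
};

Car build_car() {
    Car c;
    c.wheels.assign(4, Wheel{40});
    c.engine = std::make_unique<Engine>(Engine{"V6"});
    return c;  // ownership moves out to the caller
}

// After a move there is still exactly one owner of the engine.
bool single_owner_demo() {
    Car first = build_car();
    Car second = std::move(first);
    return first.engine == nullptr
        && second.engine != nullptr
        && second.engine->model == "V6"
        && second.wheels.size() == 4;
}
```

Destructors release everything when the owning Car goes out of scope. The trouble described above starts only when two objects both need to keep the same Engine alive.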

Dan Maas
Thursday, June 24, 2004

Most C++ programmers get along just fine by making sure everything is owned by something.  Some quite high percentage of tasks work very well in that way.  Eventually I realized I'm not attempting certain solutions because I want to make sure my code doesn't have any messy ownership problems.

But what is really annoying me about C++ is writing so many damn constructors, assignment operators, clone, swap, and other assorted glue code.

Keith Wright
Thursday, June 24, 2004

Dan and Keith:  I'm not thinking you'd keep on going along and writing lots of glue code, allocation and deallocation code, and the like in C++.  What I'm proposing is that a tight, small, templated set of reference-counting classes be used to do the management on enough basic objects that everything else can be made from them (at least in excess of 99% of the time) *without* appeal to writing, e.g., an override of the default assignment for a class.  Then you *just* use that base of libraries.  This is what you do with most managed code systems.  For example, if you really need to in Java you can use JNI to get down and dirty (and no, Mr. F:  I don't have hard data; and I'll admit that my purported observation is a bit dusty given the last few years' hardware improvements...).

I think most of the glue is basically done in Boost and the STL and needs to be "gussied-up" to make it yield the kind of productivity improvements claimed by Joel for VB.  OK, and enhanced a lot, too.  But my point is that you don't have to build your code up from scratch to do these tasks... specifically not... you'd actually just use "self-managing" objects instead.  Code can be rolled into this "idiom suite"; I've done some of it, though nothing that really fits the bill for what I'm trying to suggest here.

As far as co-ownership, ownership cycles, and others... I think you can do it with something like the envelope class I presented above... kind of like Microsoft's MFC CString implementation, to take an example.  But there's got to be at least one difference:  you probably need explicit control of the copy-on-demand mechanism that underlies such classes (you need to defeat it on writes to the object for a given variable... that gets you your mutual ownership, ownership cycles, whatever you want, really).  You never have to declare a C++ reference or pointer on this system of thinking... which is what I mean by pure value semantics.  Note that my envelope<string> above has a feature in common with CString:  sizeof on it returns nothing more than the size of a pointer.

Whether I could make good on making it elegant enough to make coding entirely within the bounds of the "self-managed" ("automanaged?") objects feel easy and swift is, I admit, open.  And yes, there's a performance hit, just like C# and Java... but coding the underlying runtime mechanisms directly in C++ lays bare the source of that performance hit.  I admit that this makes us want to tinker, violate the abstraction, and second-guess the framework... and that's an issue, but I think it's a good one.  Here's why:

I still claim that the possibility of automanaged code at least strongly suggests that whether to use "managed" code is a policy and style question, a "lifestyle choice"... not a mechanism question.  And I still maintain that it's a bad idea to let Microsoft feed us something that creates what to me looks like an arbitrary division between the "upstairs" code that we generally write and the "downstairs" code that handles management issues like (to take my main example here) memory management.  It forces us to turn a policy question into a mechanism question on their terms and engage in a conflation that is legendarily a bad idea:  now that's F&M!

By way of full disclosure:  I keep harping on memory management... how about if someone who knows a good deal about C#, Java, or another relevant language points to aspects of being "managed", compiled to byte codes, incrementally compiled (like SML), or just plain interpreted that maybe can't be picked up in C++ this way... I'm selling, to a certain extent, syntactic sugar... what do people think are the limits on how sweet it can be made?

Thomas E. Kammeyer
Friday, June 25, 2004

C++ offers libraries such as the Boost pointer libraries and the STL, which if understood properly often negate the need to do anything apart from allocation, although learning C++ template syntax is not easy.

When you start getting objects with complex copy constructors etc, maybe the level of abstraction is wrong.

As usual, it's all about trade-offs:  almost any application can be coded by value, though it will be slow.  But C++ offers the scope to mix by-value semantics with smart pointers, so that the programmer can make a pragmatic choice.

In the end that's what differentiates C++ from Java and C#, choices.

Craig
Friday, June 25, 2004

This is why...
http://www.joelonsoftware.com/articles/LeakyAbstractions.html

Coward I am!
Friday, June 25, 2004


Craig:  so I guess I'm asking how far the limit can be pushed towards
making C++ act more like "safe" or "managed" code and even have
it look that way with all of the right libraries up front...

Coward I Am!:  yes, yes, I read the stuff about leaky abstractions... you think
just because you enshrine an abstraction in a run-time it won't leak?
It's just harder to deal with the leakage across programming paradigms or
languages.  I'd rather look at some stuff in C++ (the same language I'm
already working in) than learn all about JNI.  One thing I read
last night w.r.t. C# is the way to mark code as "unsafe" and then just
use pointers where you need.  This is a lot like what I had in mind and
it's already in C#.  I don't know much about it.

What about this:  maybe this discussion suggests a healthier way to *teach*
C++ to beginners:  start having them use it through available libraries in parallel
with Java and C# and try to use them the same way. THEN teach folks
about JNI and about pointers and about the various ways code can get
grittier and less managed.

This is *not* how I would've recommended teaching these things 10 years ago,
or even 5 years ago.  Now, I think I have some sympathy for the argument that
it's possible to be very productive in C# and Java and I think maybe we should
teach people productivity first and then, as it were, turn the pinball machine around
and teach them how to *make* one.

Ultimately, a computer science curriculum would cover the same ground.  But would it do it in a more practical-minded way?

Better yet, if we could make working within a few libraries and a syntactic subset of C++ "feel" more like Java or C#, it seems to me we'd find it easier to move between those languages as developers and choose the best tools for the job.

A long time ago in A.I. there were three major knowledge representation frameworks (frames, semantic nets, logical notations of various particular kinds) and people to some degree measured themselves by how fluidly they could move between them in their heads, because it meant they were concentrating on the ideas underlying the notations and not tied to one notation.  Without holding forth about whether that was good or bad, it does seem like a nice sign-post here:  the more we can move fluidly between languages the better.

And, of course, I like C++ for the fluidity of motion between the "upstairs" and "downstairs" region... though I think most people could be using the same "downstairs" code... in fact... that code "in the basement" as I put it in my original post, would make an interesting open source project.

Thomas E. Kammeyer
Friday, June 25, 2004

My point is that even when you work your ass off to make the stuff work, you need to know the details of how it works to debug it.

My experience with smart pointers in COM is just that. You need to refer to the source of the smart pointer implementation from time to time, just to figure out if your latest assignment operation did an addref to your pointer and the target smart pointer released its previous pointer.
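
For readers who have not been bitten by this, here is a toy sketch of what such an assignment does under the hood. Counted and counted_ptr are invented for illustration and are not the API of any real COM smart pointer class.

```cpp
// Hypothetical COM-like object that counts its own references.
struct Counted {
    int refs = 0;
    void add_ref() { ++refs; }
    void release() { --refs; }  // a real COM object would destroy itself at zero
};

// Toy smart pointer: every copy takes a reference, every drop releases one.
class counted_ptr {
    Counted* p;

public:
    explicit counted_ptr(Counted* raw = nullptr) : p(raw) {
        if (p) p->add_ref();
    }
    counted_ptr(const counted_ptr& o) : p(o.p) {
        if (p) p->add_ref();
    }
    counted_ptr& operator=(const counted_ptr& o) {
        if (o.p) o.p->add_ref();  // take the new reference first (self-assignment safe)
        if (p) p->release();      // then drop the previous one
        p = o.p;
        return *this;
    }
    ~counted_ptr() {
        if (p) p->release();
    }
    Counted* get() const { return p; }
};

// One assignment changes the counts of two different objects.
bool assignment_demo() {
    Counted x, y;
    counted_ptr px(&x), py(&y);  // x.refs == 1, y.refs == 1
    px = py;                     // x.refs == 0, y.refs == 2
    return x.refs == 0 && y.refs == 2 && px.get() == &y;
}
```

Nothing at the call site of px = py hints that two counters moved, which is why you end up reading the smart pointer's source.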

Doing managed code with no pure pointers makes life easier, but it also makes some of the bugs extremely hard to track down because of the law of leaky abstractions, and that makes the programmer's life very hard.

Coward I am!
Friday, June 25, 2004


Good point, Coward I am!.

I didn't even address debugging!  Even if I can make the experience of programming in C++ "feel" a lot like C# or Java it's possible that some of the oddities of reference counting will leak through in debugging.

But I claim that this is nothing new, really:  abstractions like the ones we're discussing have always leaked in this way.

But in my case, I think this is an advantage... the way down into the details is not opaque and you're not traversing a bunch of ghostly glue code (like stack traces through COM references).  It can all be in the same high-level language and all compiled from static libraries into the same exe if you want.  This should make debugging easier in a system with a smooth, source-level transition to the managing runtime mechanism.

Thomas E. Kammeyer
Friday, June 25, 2004

But you still need knowledge of the runtime system.

Coward I am!
Friday, June 25, 2004

You need runtime system knowledge in any event... so it's not a point of division between my proposal and existing managed/interpreted/virtual-machine systems.

I still think having more of it in static libs will make things easier.
Though I should 'fess up at least this far:  sometimes debugging through all those layers of STL is just downright nasty...

If STL+Boost+some niceties can be made so that descending that far is as rarely necessary as similar maneuvers are in C# then it's all the same.

Thomas E. Kammeyer
Friday, June 25, 2004

I must confess that I'm playing the devil's advocate here. I actually like the idea.

Coward I am!
Friday, June 25, 2004

"What about this:  maybe this discussion suggests a healthier way to *teach* C++ to beginners:  start having them use it through available libraries ... THEN teach folks ... about pointers and about the various ways code can get grittier and less managed."

More up-to-date introductory C++ books do exactly this, Accelerated C++ being the foremost example I can think of.

I aim for this and achieve it fairly regularly.  But, it does have cracks along the lines of dealing with 3rd party libraries.  Most people drop to pointers as the default.

C++ Wannabe
Friday, June 25, 2004

With regard to how often Java developers use native, unmanaged code....

JNI is hardly the best or only example of how a typical Java programmer might touch native modules.  Almost all of the IO, NIO, zip, and AWT calls that actually do something (open a file, write to a file, get entries from a zip file, blit images to a canvas, etc.) are delegated to native code.

Chas Emerick
Saturday, June 26, 2004


C++ Wannabe:  I agree about dealing with 3rd-party libraries... one abiding frustration I have is dealing with Win32 functions to which I'd like to pass the storage behind a std::string.  Or at least avoid a copy on the return.  But we could get this kind of thing by writing a collection of allocators to bridge the gap that would hide the pointers... it still leaks but you don't have to do your own management.
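
A simpler bridge than custom allocators can be sketched if you assume a compiler that guarantees std::string storage is contiguous and writable through &s[0], which was not promised by the standard when this thread was written. c_style_get_name is a made-up stand-in for a buffer-filling C API, not a real Win32 function.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for a C-style API: writes a NUL-terminated name into
// the caller's buffer and returns the number of characters written
// (excluding the terminator), or 0 if the buffer is too small.
std::size_t c_style_get_name(char* buffer, std::size_t capacity) {
    const char* name = "fog";
    std::size_t n = std::strlen(name);
    if (n + 1 > capacity) return 0;
    std::memcpy(buffer, name, n + 1);
    return n;
}

// Wrapper that hides the raw pointer: callers only ever see a std::string.
std::string get_name() {
    std::string result(64, '\0');  // room for the C API to write into
    std::size_t n = c_style_get_name(&result[0], result.size());
    result.resize(n);              // trim to what was actually written
    return result;
}
```

The pointer still exists, but it is confined to one small "basement" function, and the string manages its own storage.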

Chas:  thanks for the mention of other native touchpoints from Java... I confess to *not* being a Java programmer, so that list is particularly interesting to me.
But, um, if so much native access happens, what happens in turn to the purported advantages of Java?  It seems like my proposal re: C++ would have similar issues to any arising from the way you would answer that question, though at least retargeting the "native bridges" would occur in C++, which would be the same language as the one "upstairs" developers used.

I guess I'm arriving at thinking that doing the native/non-native maintenance in one language might have some nice advantages.

Thomas E. Kammeyer
Monday, June 28, 2004

Oh why don't the lot of you just stop using C++ already? it sucks wrinkled poo-holes.

bleep
Tuesday, June 29, 2004
