Fog Creek Software
Discussion Board




Garbage Collection Mechanism Motivations

I made a recent post to my blog on the motivations for garbage collection mechanisms at http://www.heron-centric.com/2004/08/garbage-collection-motivation.html . I listed the following motivations:

- to facilitate the "last one out turns off the lights" design pattern
- to detach the programmer from the concern of when to destroy an object
- to avoid the common programmer error of accidentally orphaning memory
- to avoid the common programmer error of accessing invalid memory blocks

Do people here generally agree with these motivations? Am I missing any?

Christopher Diggins
Monday, August 16, 2004

You went to the trouble of creating a blog just to give yourself more credibility when you ask people to do your homework for you?  Kudos!

muppet
Monday, August 16, 2004

I don't use garbage collection precisely because it does NOT help to implement the "last one out turns off the lights" pattern. Precise resource management becomes a disaster: all objects get an indefinite lifetime, you cannot guarantee something is destroyed, and god forbid you try to use destructors for anything other than releasing memory, like closing a port; you can't predict what will happen.

I use reference counting when I need managed memory.

.
Monday, August 16, 2004

Reference counting is for wimps.

muppet
Monday, August 16, 2004

No, I disagree.  You are wrong on all but one point.

Seen your frog
Monday, August 16, 2004

muppet, where's your blog? We would really love to know who you are.


Monday, August 16, 2004

I agree with all the motivations, especially the first one. However, you can really only accomplish the first one with reference counting (which I believe is your implementation). Unfortunately, reference counting is very slow when combined with multithreading (which is why Java and C# don't use it). How do you plan to resolve that? Or do you not plan to resolve that?

I like your design of Heron, although I think the entire metaprogramming syntax goes against the KISS principle. It's essentially a language-in-a-language -- certainly that's more complicated!

Almost Anonymous
Monday, August 16, 2004

this is so pathetic (from the heron site). sometimes i'm so sad living in this world:
++++++++++
If your organization is interested in learning more on Heron Christopher Diggins is prepared to give powerpoint presentations on the Heron programming language customized to your specific needs and level of technical expertise.
++++++++++
Heron on your next project
Take advantage of the fact that I am currently offering free support and consulting with regards to Heron, contact Christopher Diggins.
++++++++++


Monday, August 16, 2004

Holy crap...  give the guy a break!

Almost Anonymous
Monday, August 16, 2004

Hey, Chris, I'm back for more trolling and general-purpose arch-nemesis behaviour. ;)

The one interesting point about GC the way that Java and the CLR implement it is that it's actually pretty poor for proper "last one out turns off the lights" behaviour, because destruction isn't ensured. Ruby does it better and, in general, is the only system I've ever seen actually think about such things.

Also, the big advantage of copying GC systems is that all of your data structures are in a nice heap and are contiguous. You just lop off a chunk at the end of the heap every time somebody allocates memory. This is the main reason behind the GC-fan argument that they can manage memory better than you can.

I think you are giving even bright programmers too much credit for #2.  In a truly huge system, you aren't going to be able to understand the memory flow.

The danger of coming from a primarily C++ background (said as somebody who has) is that you tend to want a non-GC language and you tend to think that reference counting is "good enough" for those little cases where it's harder to manage memory. But some of the more advanced GC algorithms will have fewer special cases than reference counting and be faster to boot.

Have you given much thought to the flame I posted to the last Heron-related thread?  I really think that Heron could be an awesome language, with respect to memory management, if you were to allow the user to control memory management via metaprogramming.  And it would be a great tool for GC research, to boot.

Flamebait Sr.
Monday, August 16, 2004

W.r.t. "Seen your frog": I don't know what you are responding to. Please be more precise.

W.r.t. Almost Anonymous on metaprogramming: the next version will remove a good number of the restraints. The language-within-a-language design is in fact simpler than C++ metaprogramming. For instance, in C++ a constant is sometimes available at compile time (like a constant int) and other times it isn't (like a constant float). It is simple from a grammar standpoint but not from a semantic standpoint. My goal is to simplify the semantics rather than the grammar.

W.r.t. Almost Anonymous on the disadvantages of reference counting: I agree with you. The multithreaded issue is very sticky and I have no solution at this point.

W.r.t. dot: I agree with your point about the "last one out" design pattern. It still seems to be, for other programmers, a raison d'etre for GC which is brought up over and over again.

Christopher Diggins
Monday, August 16, 2004

To Flamebait Sr,

"I think you are giving even bright programmers too much credit for #2.  In a truly huge system, you aren't going to be able to understand the memory flow."

Without meaning any disrespect, this embodies a defeatist and sophomoric attitude towards software development. An organization which takes this attitude towards software would (hopefully!) not be writing software for mission-critical tasks like financial transactions, aircraft control, defence, medical applications, industrial robotics, etc.

I only just read your last post on Heron, and it is quite interesting. Your suggestion of two kinds of weak pointers is something that I will consider carefully. The more sophisticated metaprogramming that you are encouraging is not a bad idea, but it is not appropriate for the kinds of development I want to see being done with Heron. I think I will let Perl, Lisp, Haskell, Ruby, etc. reign in this respect.

Christopher Diggins
Monday, August 16, 2004

Referring to motivation #2 as a "red herring" is ridiculous.  You might as well say "knowing which registers a subroutine uses is an integral part of fully understanding the behavioral characteristics of the software."  If you're programming in assembler, it is.  The point of a high-level language is to abstract you from such details.  When to free a chunk of memory is the same sort of detail.

rob mayoff
Monday, August 16, 2004

Your comparison is false. Knowing the lifespan of objects within your program is not equivalent to knowing about register usage within subroutines. Object lifespan impacts the memory usage requirements of software in a significant and observable manner. If a programmer is "abstracted away from such details" then they cannot possibly know whether their software, at any given point during its execution, might require an unreasonable amount of system resources.

When you say "such details" you are lumping together what group of software development concerns?

Christopher Diggins
Monday, August 16, 2004

Flamebait Sr. may be particularly interested in the latest post http://www.heron-centric.com/2004/08/more-on-heron-references.html because I think our viewpoints are starting to converge with regards to Heron references. Correct me if I am wrong.

Christopher Diggins
Monday, August 16, 2004

It is possible to reason about and/or observe the resource requirements of a program, regardless of whether the program uses GC.

Knowing about object lifespan and knowing about register allocation are not equivalent, but they are comparable in that they represent points along the spectrum (or lattice) of abstraction.

Your claim that "[k]nowing when an object is freed or deleted is an integral part of fully understanding the behavioural characteristics of the software"  is vague: what does "fully understanding" mean?  What are the "behavioural characteristics"?  It also implies that "fully understanding" is important, but perhaps it's not.  I'll suppose that "fully understanding the behavioural characteristics" means knowing that the program's output is correct for its input, and knowing tight upper and lower bounds for the program's resource usage.

Yet if I can write a program with a reasonable certainty of correctness, and determine even loose upper bounds on the characteristics of the program (e.g. memory, CPU, and I/O usage as a function of the input), and those bounds are within the design requirements, then I have a satisfactory, but perhaps not "full" understanding of the "behavioral characteristics".

One could argue that it's harder to be certain of correctness when using GC, or that it's harder to reason about resource requirements, but my experience is that it's easier to write correct programs with GC and no harder to reason about resource requirements.

rob mayoff
Monday, August 16, 2004

What I said was that "to detach the programmer from the concern of when to destroy an object" is a "red herring". I did not critique GC at all, but rather this particular desire on the part of some programmers to disregard the lifespan of objects. Knowing the lifespan of an object is an integral part of knowing the resource usage, which can affect program behaviour in many ways, such as by causing programs to go into near-infinite loops or causing exceptions to be thrown. I will concede that in trivial software, which makes up the majority of software that is written, these are not significant practical concerns.

Christopher Diggins
Monday, August 16, 2004

See, Chris, taking useful things away from programmers "because they might abuse it" or "because they should know better" is the road to Pascal, and I think we both know where that ends up. ;)

I think you need to better understand metaprogramming before you start saying that it's not appropriate for the sort of programming you want people to do.

Any programming language designed for anything other than dashing out quick scripts (i.e. Perl) necessarily needs to be prepared to be used to develop *large* systems.  Do you wonder why C++ is so popular and so featureful?  Because Bjarne spent most of his time building large systems and designed the features of C++ around that.

Metaprogramming really starts to shine with large pieces of software.  This is why Paul Graham spends so much time blathering about Lisp and how it's his secret weapon.  You've opened the door by making a nicer syntax for template metaprogramming, but there's more to be had.

And, sure, your average user isn't going to craft code that uses it. But think about the cool stunts that Boost has done and remember that if there were more places to plug in, Boost would be ten times cooler and involve even more pleasant syntax.

Similarly, GC also becomes very important in large pieces of software.  It is often at least part of the 50% of common lisp that Phil Greenspun's law talks about.  The important thing to remember is that, like many of the features of C that Pascal deliberately doesn't have, just because a feature is sometimes misused, it doesn't mean that a feature is always misused. 

There are very real cases where the usage path of an individual "message" object is so convoluted that you won't be able to adequately track its lifetime, say in a complicated message-passing architecture. And sure, you might be able to understand the entire code path, but the second somebody changes something, you start leaking memory. Or you can create multiple copies of the same object, which, no matter how you dice it, means that you are losing performance. So in the end, even though you may be paying a little bit of a performance penalty, it's a worthwhile abstraction.

I am not a defeatist.  I am a pragmatist.  A defeatist would force the users to use GC always.  As a pragmatist, I suggest you provide a structure so that GC fits nicely into the language without unpleasant hacks so that a programmer can make an intelligent choice.

I mean, really, if it's "ok" to use reference counting a la the Boost library in mission critical production code, why is it not OK to use something more sophisticated?  You have to understand that I am a c++ junkie when I'm not singing the praises of Ruby and tend to believe that most of the time, I can manage memory just as well as GC can.  I just believe that there are cases where I might as well trust in the Church of GC.

Flamebait Sr.
Monday, August 16, 2004

The old Lisp machines had "areas" where one could use different GC strategies. The most common use was protecting an area from GC completely, so the collector didn't waste time there and you could use manual (de)allocation.

Would be nice to see again. I don't know whether it benefits much from hardware support or if it's fine on general hardware.

Tayssir John Gabbour
Monday, August 16, 2004

Incidentally, I think you'll want to look at the Jones/Lins GC book. (If you haven't yet.) It has more motivations than the ones you listed.

I think it's best to look at GC as "proving when an object can no longer be used by the program." Computers are meant to automate things, so why not use the computer to automate something? ;)

Tayssir John Gabbour
Monday, August 16, 2004
