Fog Creek Software
Discussion Board




The Law of Leaky Abstractions

Excellent article Joel.

Totally agree with you. (I better if I have to justify that BS & MS in computer engineering :-))

One needs strong fundamentals if one is to build upon them and learn the 3-tier architecture, XML, SOAP, or anything else for that matter.

Prakash S
Tuesday, November 12, 2002

btw, Joel misspelled the word "exciting" as "exiting".  :-)

 Z M 
Tuesday, November 12, 2002

It's that time of the night :-)

Prakash S
Tuesday, November 12, 2002

Ahem...

TCP is built on IP, but UDP is the unreliable *transport* protocol used to send data across the beloved internet (just ask Al, he'd know).

IP is the network layer protocol, upon which UDP and TCP are built.  IP itself, by itself, is never sent.

More corrections to follow...

Nat Ersoz
Tuesday, November 12, 2002

Great article!  I agree with most of Joel's points.  However:

- Joel (like myself) works on vertical market software that has a wide variety of setup options and deployment scenarios.  This requires more sophisticated developers and development techniques.  Needing to go "outside the box" is more likely than on an internal project.  It's tough to know which features to perfect, as the usage will vary dramatically between customers.

- Modern platforms are more forgiving.  C++ is an accident waiting to happen.  I'm not sure you can be a competent C++ programmer without understanding memory layout.  C#/Java help dramatically here.

- Modern OO platforms (.NET CLR, for example) have better functional coverage.  The tasks that most people do right now can be had right out of the box.  This wasn't the case with MFC/Win32/WinINet/Shell32/etc.  We'll see if these platforms scale well to the API needs of the next few years.

- Very often, it's okay for large chunks of an app to be slow and unresponsive.  The fast, critical parts can be done by a "bare metal" developer.  For example, a given database might have 5 tables with heavy volume, but 20 tables that are populated via one-time setup.  Screw up the critical tables and you're hosed.  The rest?  Drag and drop.

- We all love to rail on low-rung VB developers, but I for one am glad they're around.  Provided they don't overstep their expertise, they'll do projects that would bore me silly.  As long as they know when to "cry uncle", I'm okay with it.

On the other hand:

- 3-tier development is hard.  Network throughput and latency over the internet are 3 orders of magnitude slower than a LAN.  How can you write a responsive app without counting bytes over the wire?

- User expectations are set by horizontal market software.  How many times have you heard "Microsoft Word does it".  This causes vertical market developers to push the envelope.

- Data manipulation abstractions are notorious for killing performance (XML, .NET DataSets).  If you've got a lot of rows, it's back to counting bytes.

Bottom line: I've been burned more by developers that didn't understand cost/benefit analysis than I have by those that didn't know what was going on under the covers.  There's a place for the technically mediocre developer who understands the problem to be solved and doesn't want to do anything fancy.

Bill Carlson
Tuesday, November 12, 2002

The article is quite correct; however, I don't agree with the idea I read between the lines. For me, as a software architect of larger embedded systems, I think abstraction is great. It is wrong to think that an abstraction is a complete, unambiguous description (this is where I agree with Joel); however, I think abstraction is what helps a great deal in understanding complex systems and also in solving real problems. It is, however, only the top of the design space and many design / implementation / optimisation choices can be made.

As for TCP: TCP is not reliable in the sense that it can guarantee the arrival of every packet. It does, however, try very hard. And if it fails, errors are almost guaranteed to be detected. Almost, because it relies on underlying protocols to detect bit errors in packet data.

More important: for most people, TCP is just what makes their remote applications work reliably without bothering about retransmits etc. In that, I think the inventors of TCP/IP have done a great job. The abstraction works 99% of the time.

For C++ I agree. That is one of many reasons I don't think C++ is a great language (but I respect other people who disagree with me... don't want to start another 'language war'. )

Adriaan van den Brand
Tuesday, November 12, 2002

one huge problem with this article is that IP is not a "method of transmitting data." You can't "send a bunch of messages with IP." IP specifies things like packet structure and addressing schemes, not how things get from point A to point B.

MCSE certified?
Tuesday, November 12, 2002

Nat,

Make sure you read the article well first. It says:

"By comparison, there is another method of transmitting data called IP which is unreliable."

Which is entirely correct.

You say:

"TCP is built on IP, but UDP is the unreliable *transport* protocol used to send data across the beloved internet (just ask Al, he'd know)."

Which is also correct. However, it is not the point Joel was trying to make. His point was that you can build reliable services on unreliable services if you want to create an abstraction that hides the shortcomings of some tool, IP in this case.

"IP is the network layer protocol, upon which UDP and TCP are built. IP itself, by itself, is never sent."

True and false. The latter depends entirely on the abstraction level you're working on. If you're a TCP stack programmer, you send IP all of the time.
And there are some other protocols beside TCP and UDP. Anyone who has good reason to send IP packets directly, can do so. You would if neither TCP or UDP serves your purpose. Now that may not be common, no doubt, but not impossible whatsoever.

Anyway, I digress. Your correction of Joel's article was inappropriate not because you are wrong with what you say, but because Joel did not say or mean what you corrected.

Erik
Tuesday, November 12, 2002

"If you're a TCP stack programmer, you send IP all of the time."

Actually, it's a little more complicated than that, and beside the point anyway.

Erik
Tuesday, November 12, 2002

If Joel wrote what he did on a basic networking exam, it would be marked incorrect. There is a problem when the largest analogy in an article is NOT CORRECT.

MCSE certified?
Tuesday, November 12, 2002

Please be more specific. Which part, sentence, word or whatever was incorrect? In what context? The context of the article? Or some other context?

Erik
Tuesday, November 12, 2002

Or better yet, if you don't like it, write a better article.

I have lots of little disagreements with the article, but the idea is pretty sound and it's engaging.

Random Nitpick: the 419'ers are from Nigeria, which is in West Africa.

mb
Tuesday, November 12, 2002

For me, the best example of the dangers of abstraction is the London Underground map.

The map is a marvel of abstraction.  It makes planning routes on the Underground easy.  However, there is a trade-off.

http://www.afn.org/~alplatt/tube.html

Take, for example, travelling from Charing Cross to Embankment.  The map shows two routes that can be taken: either the Bakerloo or Northern line.  Just one stop either way.

Problem is that Charing Cross station is just a 1 minute walk from Embankment.  From street level, it takes longer to walk down to the platform at Charing Cross than it does to walk round the corner to Embankment.

However, time has shown that the value of a simplified map far outweighs these problems.

Ged Byrne
Tuesday, November 12, 2002

The Underground map problem, where the abstraction doesn't include scale, isn't in practice a problem, since the map doesn't pretend to give information about the time or length of journey but does succeed in providing a route map.

Other knowledge of the geography of London, combined with the abstraction of the map produces a third model, held entirely inside the head.  That third model lets you choose the best method of getting where you need.

And if you're a tourist or visitor and don't have that geographical knowledge then you'll still mostly get there.  The leakage in the Tube is more about points failures at Queensway.

Simon Lucy
Tuesday, November 12, 2002

There is a solution to the limitation of abstractions, which is to use multiple views. Each view has a different abstraction to the system. And each abstraction is clear.

The London Underground map is perfectly clear for determining the route from station to station (which was revolutionary and I think very successful). It does not include scale. Had it included that (as ordinary maps do), it would not have been clear, and for sure it would not be possible to print the map in a business card format.

Sometimes it is possible to have more detailed levels of description. E.g. you could model the underground map as a mindmap, where each station provides information on which streets are closely located to it.

But then: maybe the important lesson is that when you use an abstraction, you must be aware of its limitations and know what information has been left out (e.g. scale for the London Underground map).

But then:

Adriaan van den Brand
Tuesday, November 12, 2002

Am I missing something: why would anyone want to concatenate two literal strings?

Surely "foo" + "bar" could be rewritten as simply "foobar" ?

Don't get me wrong, I'm not trying to defend C++ in any way; we Delphi programmers have a wonderful string datatype built into the compiler :)

Gus
Tuesday, November 12, 2002

"Am I missing something: why would anyone want to concatenate two literal strings?"

It is like with trains. Why not simply have one long carriage instead of several shorter, connected ones. It is for those situations that you don't want them connected. You can't simply divide the large carriage into several smaller without a substantial amount of work. Several smaller ones you can simply disconnect and re-arrange.

Same with "foo" + "bar" versus "foobar". At least in C++. Delphi I can't comment on.

Erik
Tuesday, November 12, 2002

surely the underground example says it all - don't look at the abstraction as a true simplification.

evaluate it, use it, and then TEST IT.  surely, being software professionals, we all test?  right?

Baz
Tuesday, November 12, 2002

Also, a frequent use of "foo" + "bar" is when a string goes beyond 70 or 80 chars in length. In a lot of languages, you can solve that problem by splitting the string into smaller chunks and then using + to concat them back together.

so you get s="fasf...."+"asdfsd..."+"asdfasdf..." with each new string on a separate line.

Sanjay Sheth
Tuesday, November 12, 2002

Maybe the TCP/IP example was not the best (because TCP abstracts the IP packet transmission and not IP bla bla ...).

While reading the article, I felt the thread coming...

But this example is simple and accurate to define abstraction.

What frustrated me a lot is the lack of a solution.

How can we handle the leak ?
How can we minimize its impact ?

Maybe some advice or best practices would be welcome here.

Ralph Chaléon
Tuesday, November 12, 2002

"btw, Joel misspelled the word "exciting" as "exiting".  :-)"

No, he didn't. Read again.

"Also, a frequent use of "foo" + "bar" is when a string goes beyond 70 or 80 chars in length."

In C you can just write

char *ThisIsAVeryLongString = "This is a very "
    "long string. "
    "No need for '+' here.";

Leonardo Herrera
Tuesday, November 12, 2002

.. And in C++ you can write string("foo") + string("baz") if you really want to do it at runtime.

Sergey Petrunia
Tuesday, November 12, 2002

It's an interesting effect that strikes anyone, not just programmers.  Asimov once described it as a fractal view of knowledge, that looking at the details of any implementation of abstraction will lead you into more and more details, which add up to something that you can't ignore.

But it just means you have to be mature in building abstractions.  That is probably the definition of "good taste" in programming.

The scary thing is that politicians know this well -- they appeal to our patriotism, and build many flawed models of how the world really works.

Tj
Tuesday, November 12, 2002

Funny how most of the discussion here seems to revolve around misinterpretations or misunderstandings.

For instance: "Maybe the TCP/IP example was not the best (because TCP abstracts the IP packet transmission and not IP bla bla ...)."

The example said exactly that.
To quote the article:
"there is another method of transmitting data called IP which is unreliable"
and:
"TCP is built on top of IP. In other words, TCP is obliged to somehow send data reliably using only an unreliable tool."
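
For anyone who wants to see the idea in that quote in code, here is a rough sketch: a toy link that loses packets, and a sender that keeps retrying on top of it. The names `LossyLink` and `send_reliably` are made up for illustration, the drop pattern is deliberately predictable, and the bool return value stands in for an acknowledgement. Real TCP uses sequence numbers, acks and timeouts, so treat this as an analogy and not as how a TCP stack is written.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy unreliable link (hypothetical): silently drops every Nth packet.
class LossyLink {
public:
    explicit LossyLink(std::size_t drop_every) : drop_every_(drop_every) {}

    // Tries to deliver one packet into the receiver's inbox.
    // Returns true if it arrived; this plays the role of an ack.
    bool deliver(const std::string& packet, std::vector<std::string>& inbox) {
        ++attempts_;
        if (drop_every_ != 0 && attempts_ % drop_every_ == 0) {
            return false;  // packet lost on the wire
        }
        inbox.push_back(packet);
        return true;
    }

    std::size_t attempts() const { return attempts_; }

private:
    std::size_t drop_every_;
    std::size_t attempts_ = 0;
};

// "Reliable" sending built on the lossy link: resend each chunk until it
// arrives, giving up after max_tries. Returns false if a chunk never arrived.
bool send_reliably(LossyLink& link,
                   const std::vector<std::string>& chunks,
                   std::vector<std::string>& inbox,
                   std::size_t max_tries) {
    for (const std::string& chunk : chunks) {
        bool delivered = false;
        for (std::size_t attempt = 0; attempt < max_tries && !delivered; ++attempt) {
            delivered = link.deliver(chunk, inbox);
        }
        if (!delivered) {
            return false;  // the abstraction leaks: the link is effectively down
        }
    }
    return true;
}
```

With `LossyLink(2)`, three chunks arrive in order but cost five transmissions, so the caller sees "reliable" delivery that is slower than expected. With `LossyLink(1)` nothing ever arrives and `send_reliably` has to report failure, which is the leak the article describes.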

If there is one argument that would at least be technically valid - but still irrelevant in the context of the article - it is that the TCP does not rely on the IP. It was developed in the context of, and most frequently used on top of, but not at all dependent on. Any other datagram service would do, in fact, any other protocol would do, as long as it is able to carry and deliver TCP-type data packets (which means it must at least understand IP-style addressing).

Erik
Tuesday, November 12, 2002

"All non-trivial abstractions, to some degree, are leaky."

Post your proof.


Tuesday, November 12, 2002

The proof is in the dictionary.

"The act of considering something as a general quality or characteristic, apart from concrete realities, specific objects, or actual instances."

Whenever concrete realities or specific objects come into play, the abstraction becomes invalid. Whenever you go up in the level of abstraction, you leave out detail. But the detail is still there and can't always be ignored. Even if it appears to serve no purpose whatsoever.

Here's a silly example:
People have appendices. You can ignore that most of the time, safely abstract the biological body as a person. And even though you can safely remove an appendix, you can't always ignore it while it is still there. If you did, you would die of appendicitis.

OK, any way to get even more off topic?

Erik
Tuesday, November 12, 2002

A hammer is a very simple and powerful tool. If you are too careless, it may hurt you terribly. That does not mean that a hammer is not the best tool for driving a nail into a piece of wood.

Same with abstractions. They're great for a given purpose, but not for every purpose. An abstraction which is not 'leaky' is no (real world) abstraction. Abstracting is the art of leaving out details which are irrelevant for that view (or at that time).

Adriaan van den Brand
Tuesday, November 12, 2002

The thing I found most interesting about Joel's article is the sort of hinted allusion it makes to  the simple machines we learned about in grade school - the lever, the inclined plane, the wheel, etc.  They're a tradeoff.  They save time at the expense of requiring more brain cells.

Actually, I suppose abstractions are more like plumbing, cars, refrigerators, etc., except that currently, all the pipes and cars and refrigerators are massively customized, which means if they break, you, the owner, must fix them, which means you need a lot more skill than your typical consumer.  Every programmer has to be a jack of all trades to an extent.

Eventually there'll be so many layers of abstraction that the saved time no longer makes up for the required knowledge, and there will be a kind of equilibrium.  It won't be a nice one; we'll feel stuck, spending half our time getting real work done and the other half fixing what went wrong in all those abstraction layers.  This is affecting some people more than others.  I, for example, haven't had to worry about network protocols very much lately.  I've been able to stick to end-user apps for the most part.

To some extent, the brain-drag imposed by all the abstractions is alleviated by the same two things which make a society full of 20th-century-plus inventions viable: specialization, and standardized parts.  The computer industry is catching up to older industries in these regards.

Plumbing is understood well enough to support an industry of plumbers.  A plumber can enter a home with plumbing he's never seen before, and still have a good chance of fixing it.  On the computer side, a network admin still needs to spend some time getting to know a company's existing network if he is to be able to manage it.  Fortunately, various factors can ease this job - the overarching TCP/IP standard, the shrinking number of router vendors, etc.

The parts aren't as standardized as toilet floats and fan belts, but they're getting there.  RAM chips, routers, virtual memory, JavaScript - all of these artifacts and more are at various levels of standardization (and conformance), and (hopefully) are headed in that direction.  They won't completely stabilize until we finally break Moore's Law and hit harder limits in CPU speed and transistor size, so that the hardware will stop shifting under us.

Abstractions are leaky because the industry hasn't matured yet.  Eventually we'll be able to stop putting up so many buildings, and devote more time to shoring up the existing ones.

Paul Brinkley
Tuesday, November 12, 2002

Since the embezzling emails come from Nigeria, I presume that Joel meant "West Africans" and not the poor Kenyans in East Africa.

OutofAfrica
Tuesday, November 12, 2002

I agree in the sense that abstractions are "leaky" if

"leaky" = hiding specifics; reducing the original.

BUT that's the WHOLE point of abstractions!  Abstractions help us understand large issues in a coherent way.  They help us build bigger and better things.  I can use Windows to access my files and folders (which are abstracted from the OS, which are abstracted from the hard disk drivers, which are abstracted from the ones and zeros on the platter) assuming that Windows doesn't crash on me.  Imagine if you had to worry about all those details in order to read your email or view a webpage.

Joel said that "Abstractions fail."  Abstractions don't fail, our IMPLEMENTATION OF THE ABSTRACTION fails.  It's not the abstraction concept of our hard disk that is failing but Windows or the driver or the mechanical arm of the hard drive that is failing.

But even if the implementation of the abstraction does fail sometimes, the question is why does everyone need to know about it?  An example is cars.  How many of us know what's going on "underneath the hood" when we turn on the engine?  Or the individual mechanisms of the car and how they all work together in order to get us to work every day?  I'm sure that we rely on the car even though we don't know the specifics inside.  However, it's good to have knowledge of these things when our cars break down so we can fix them instead of bringing them to a mechanic.  BUT that doesn't mean all of us should know about it.  Why take weeks or months of my life learning all about the car to fix a $1000 repair job when I can make $10,000 in the same time period?  All we need to know is how to perform basic maintenance and let the mechanic do the rest.

The bottom line is that not everyone needs to know about the specifics of everything.  And if we did, it would take away from time spent doing "bigger and better" things.

Tuan Pham
Tuesday, November 12, 2002

Excellent article Joel!  You said it much more eloquently & succinctly than I could.  I had started writing an article entitled "Musings on Abstracted Thought" but you saved me the trouble of finishing it.  I have noticed for some time the abstractions of HLL's keeping people from understanding what is really going on.  That's why I DON'T want calculators in the classroom until the basis for their use is completely understood.

Greg Kellerman
Tuesday, November 12, 2002

<paul>
Abstractions are leaky because the industry hasn't matured yet.  Eventually we'll be able to stop putting up so many buildings, and devote more time to shoring up the existing ones.
</paul>

I think we won't live long enough to see this happen. As far as networking is concerned, we won't achieve worldwide reliability that easily.

This is also true in 'more mature fields'. You always find projects that show a leak in abstraction (See the Architecture abstraction of Physics in bridge building).

You can repel the problem far enough to feel safe, but it will never vanish, it's like using a float to avoid integer size problems.
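
To make the float comparison concrete, here is a small sketch. It assumes `float` is the usual 32-bit IEEE 754 type, which has a 24-bit significand; the helper name `float_preserves` is made up for illustration.

```cpp
#include <cstdint>

// Returns true if converting n to float and back loses nothing.
// Assumes float is IEEE 754 binary32 (24-bit significand), so integers
// above 2^24 = 16777216 are no longer all representable.
inline bool float_preserves(std::int32_t n) {
    float as_float = static_cast<float>(n);
    // double can hold every 32-bit integer exactly, so compare there.
    return static_cast<double>(as_float) == static_cast<double>(n);
}
```

Under that assumption, 16777216 survives the round trip but 16777217 does not. The range problem has been pushed further away, and a precision problem has quietly taken its place.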

Ralph Chaléon
Tuesday, November 12, 2002

I can't say I understand the attacks of this article. Disregarding the suitability of the examples chosen and the semantics, there's a real topic here.

No matter how well you try to make a black box, something will slip through now and again. The article was *not* about users - you know, where such "leaks" are usually bugs. The article was concerned with the perspective of the developer; if you are completely unaware of the implementation of an abstraction, you're going to be stuck for days when you find a leak yourself.

The SQL example was dead on. Here's another I used to worry about myself. There seems to be very little difference between the C++ statements x+=y and x=x+y. However, if x and y are actually huge matrix structures and we're overloading the operators to represent matrix addition, then these operations are a great deal different. The former is likely to be the performance winner. In the problems I used to work on in fluid dynamics, it was important to appreciate this difference.
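
A minimal sketch of that difference, assuming a naive dense matrix class. The `Matrix` type and the `matrix_buffers_allocated` counter are hypothetical and exist only to make the hidden cost visible; a real numerics library may use expression templates or other tricks to avoid the temporary.

```cpp
#include <cstddef>
#include <vector>

// Counts how many element buffers have been allocated (illustration only).
int matrix_buffers_allocated = 0;

struct Matrix {
    std::size_t rows, cols;
    std::vector<double> data;

    Matrix(std::size_t r, std::size_t c, double fill = 0.0)
        : rows(r), cols(c), data(r * c, fill) { ++matrix_buffers_allocated; }

    // Copying duplicates the whole buffer; moving just steals it.
    Matrix(const Matrix& other)
        : rows(other.rows), cols(other.cols), data(other.data) { ++matrix_buffers_allocated; }
    Matrix(Matrix&&) noexcept = default;
    Matrix& operator=(const Matrix&) = default;
    Matrix& operator=(Matrix&&) noexcept = default;

    // x += y: updates x in place, no new buffer.
    // Assumes both matrices have the same dimensions.
    Matrix& operator+=(const Matrix& rhs) {
        for (std::size_t i = 0; i < data.size(); ++i) {
            data[i] += rhs.data[i];
        }
        return *this;
    }
};

// x + y: has to build a full-size temporary to hold the result.
inline Matrix operator+(const Matrix& lhs, const Matrix& rhs) {
    Matrix result = lhs;  // one extra buffer allocated here
    result += rhs;
    return result;
}
```

With this class, `x += y` leaves the counter unchanged, while `x = x + y` bumps it by one for the temporary. Both statements read the same at the call site, which is exactly the leak.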

Leaks in abstractions may be avoidable, but probably as avoidable as bugs in code.

A.J.
Tuesday, November 12, 2002

First question: how many people here know how to take apart a non-digital wristwatch and put it back together again? Second question: how many people know how to tell time on that same wristwatch?

I don't need to understand how the tool works in order to use it. The abstraction works just fine. But then again, clocks have been around just a little longer than, say, Visual Studio.

Cake mix also works well at "abstracting" the complicated process necessary to bake a cake. Same applies to "EasyMac," which simplifies making mac-and-cheese to the point that my five-year old can do it herself. Paint-by-numbers. A gas fireplace.

Note that none of these "abstractions" allows me to create world-class cakes, or paintings, or whatever, but they do successfully simplify an otherwise difficult or complicated process. Next time you buy a pound of ground beef, or a bag of frozen vegetables, think about the "abstraction" and whether or not it worked.

Every example Joel gives here (with the exception of raining on cars, which is just plain inaccurate -- a roof, windshield, headlights, etc. is not designed to allow me to drive 55 MPH in any weather condition; rather, the design enables me to drive, period, in all but the most severe weather) describes a tool failure, not a "leaky abstraction."

What we're really dealing with here is the immaturity of the software development industry, where developers are forced to build increasingly complex systems with ever changing tools, in ever-changing environments.

Take an imperfect development tool, running on an imperfect operating system, and ask an imperfect developer to create something from their imagination, and you're surprised that things fail along the way?

Robert K. Brown
Tuesday, November 12, 2002

The thing that is puzzling me is that anyone finds the notion of a "leaky abstraction" surprising. Abstractions are suitable for a certain context - in that context they represent an accurate model of the world.  As you venture out of that context, though, the information that the abstraction tosses out as "irrelevant detail" becomes increasingly relevant. 

For instance, Newton's equations are a very useful abstraction of the physics of motion.  There are boundaries where these abstractions begin to break down, however.  As velocities approach the speed of light, Newton's model becomes leakier and leakier.  As you look at smaller and smaller particles, Newton's model leaks like a sieve.

Physicists do not find this surprising - they understand that when they use Newtonian mechanics it applies in a certain context.  They accept reduced scope in order to gain reduced complexity.  They make intelligent choices about which level of abstraction to use when solving problems based on the context of the problem.

One of my biggest gripes about our "discipline" is that much of the work I see being done out there does a horrible job of partitioning the problem context.  Most of the APIs that I use (from Microsoft, primarily) bounce from context to context (and therefore abstraction to abstraction) with carefree abandon, with the result that I often can't figure out what the hell is going on without a lot of study and experimentation.

Leaky Abstraction
Tuesday, November 12, 2002

Joel isn't saying that abstractions are bad and should not be used. He just said that they leak and that dealing with the leaks requires detailed knowledge of what the abstraction is meant to hide.

fool for python
Tuesday, November 12, 2002

As soon as I read the first part of the article, my first thought was:

"Oh gee, there are going to be a dozen ninnies nitpicking his definition of the problem of TCP/IP."

And sure enough.....It should have been self-evident that the purpose of the analogy was to demonstrate an abstraction, not a discourse on the fundamentals of TCP/IP. If you couldn't get past the technical issues to see the bigger point, then you need to...well...nevermind.

I thought it was a good article that brought up some good points.

Mark Hoffman
Tuesday, November 12, 2002

This was a horrible article.  Nothing more than an observation of the obvious.  Does Joel suggest an alternative to this said catastrophe?  No, of course not because one doesn't exist.  Abstractions have their place, we all know that.  To point out that there are issues with the notion of abstractions is really a frivolous academic exercise.


Joel, I smell your BS and I'm calling you on it.

been there, done that
Tuesday, November 12, 2002

#include http://jw.servebeer.com:8080/space/On+Leaky+Abstractions

(btw, why are there no "Discuss!" links, anywhere? to keep out the riffraff?)

Jeff Winkler
Tuesday, November 12, 2002

Bleh.  First of all, the use of the word "abstraction" in this article is very loose. 

Just because you layer a tool on top of another, does not really mean it is an "abstraction" of the first layer.  It may simply use the facilities to do something different.

I finished the article wondering what the non-obvious point was supposed to be.  That abstractions have limitations? (Duh.)  That layered software tools tend to inherit the limitations of the underlying layers? (Duh.)

What's the point?

Bob

Robert Anderson
Tuesday, November 12, 2002

The IP layer is used for the transmission of packets to and from IP addresses. There is no reliability when sending packets at this level. Joel is absolutely correct about this point. The TCP layer performs the necessary ordering, error checking, etc.
Please read RFC 791 before attempting to correct people. http://www.ietf.org/rfc/rfc0791.txt?number=791
For those who are lazy sections 1.2 and 1.3 will suffice.
We must be importing people from Slashdot. Rather than discussing the issue, people are just attempting to find errors in the article.

Mark Brown
Tuesday, November 12, 2002

It's a small compensation I know - but in C and C++, two string literal "tokens" next to each other are considered to be concatenated.

So while you cannot write "foo" + "bar", you can
write
"foo" "bar"
or
"foo"
"bar"

So at least there's that - no need to resort to "+" or ugly backslashes for really long string literals.
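
A quick C++ sketch of both forms, for reference. The function names are made up for illustration; the literal-merging rule and `std::string` come from the language and standard library.

```cpp
#include <string>

// Adjacent string literals are merged by the compiler during translation,
// so this is the single literal "foobar" with no runtime work at all.
inline const char* joined_at_compile_time() {
    return "foo"
           "bar";
}

// "foo" + "bar" does not compile: both operands decay to const char*,
// and two pointers cannot be added. Making one side a std::string
// selects std::string's operator+ and concatenates at runtime.
inline std::string joined_at_runtime() {
    return std::string("foo") + "bar";
}
```

Both functions yield "foobar"; the difference is when the joining happens and what type comes back.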

On the whole, I agree 100% with this article.
IMO, you pay the highest price for an abstraction if there is a bug in the implementation - but a mis-represented performance cost is almost as bad and often harder to "debug". People don't complain about C because
a) there are no bugs in modern C compilers
b) for the most part it accurately reflects the performance cost of your code

The only thing that saves us is that most abstractions are still "deterministic" - making problems/bugs in the implementation reproducible.

Oh wait - I just thought of multi-threading :-(

http://PaulHollingsworth.com

Paul Hollingsworth
Tuesday, November 12, 2002

Abstractions don't need to be leaky. Sure your TCP connection can fail: time out, power failure, inquisitive cat, whatever, but this reason for failure _does not_ need to be propagated up the abstraction hierarchy: it is sufficient to know that an error has occurred. The abstraction remains useful without requiring knowledge of the underlying.

So, I think that Joel has missed the point: it is not the case that 'abstractions are leaky' but rather 'if an abstraction can fail then it is a poor abstraction (or implementation thereof)'.

Examples:

1) MFC, when I used to use it a few years ago, was a poor abstraction. It claimed to provide an abstraction of the Win32 API but was impossible to use without a detailed knowledge of the API.

2) The UNIX model of processes connected by pipes is a good abstraction. Sure grep/awk/sed could fail but I never really need to know _why_: it is sufficient to know that a component failed and even this is typically handled well enough by the shell (i.e. I don't need to worry about it).

3) C++ strings are a poor abstraction. Joel's example of "foo" + "bar" demonstrates that you do need to know the difference between <char*> and <string>. But lots of languages get it right: Perl, Ruby, and PHP for example.

To conclude: I disagree that abstraction implies leaky. Rather, I assert that leaky implies poor abstraction.

Tom Payne
Tuesday, November 12, 2002

"To conclude: I disagree that abstraction implies leaky. Rather, I assert that leaky implies poor abstraction."

I thought that one of the points Joel was trying to make was that *all* abstractions tend to have leaks.

I cannot come up with a decent abstraction off the top of my head that does not.

So from a practical pov, abstractions are leaky.  Whether or not they *have* to be is a different question.

A good example of this is creating code for the average operating system.  An OS basically consists of layer upon layer of APIs, each abstracting the underlying layer a little more.
..until finally we get a 'CreateWindowAtLocation(x,y)' type function.
Every layer of abstraction contains its own 'leaks'.

In order to find a bug in my code, it's often useful to understand how the APIs I'm calling work underneath; this allows me to work out why calling 'CreateWindowAtLocation(x,y)' occasionally results in a file being deleted from the desktop...

His point about VB was a good one: people who use it *may* not have a good enough understanding of what it's doing when they program in it to uncover the reasons for the occasional burst of odd behavior.

Personally I think it's a miracle that operating systems work at all...<g> but maybe that's a different thread as well..

god will punish us all when he finds out
Tuesday, November 12, 2002


In 25th Anniversary edition of "The Mythical Man Month", Frederick Brooks wrote that he was wrong when he initially postulated that all programmers involved in a project should know as much as possible about the entire project.  He admitted that abstraction is critical to the success of large projects.

There are two main reasons (as I understand Mr. Brooks): 1 - Humans can only juggle so much complexity in their brains.  2 - Once projects grow sufficiently large, any individual will spend too much of their time understanding the project rather than moving it forward.

Joel's article suggests that Mr.Brooks is really only half wrong.  While you might not have to understand everything about a project, you have to understand enough about the abstractions with which you interact to plug the holes when they leak.

Uh oh, gotta go plug another leak!

Jason Reusch
Tuesday, November 12, 2002

"The proof is in the dictionary."

How strange your dictionary must be. Mine contains definitions of words.


Wednesday, November 13, 2002

Exactly. If you hadn't stopped reading after the sentence you quoted, you would read a dictionary definition of the word abstraction, and the explanation why that proofs that abstractions are leaky.

Erik
Wednesday, November 13, 2002

Sorry, proves...

Erik
Wednesday, November 13, 2002

"So, I think that Joel has missed the point: it is not the case that 'abstractions are leaky' but rather 'if an abstraction can fail then it is a poor abstraction (or implementation thereof)'."

Then by this definition, mankind is incapable of making abstractions that aren't "poor" at one point or another. 

Personally I'd like to think of this along the quantum mechanics, in the sense that there aren't any absolutes, but merely probabilities.  You cannot say with absolute certainty that an abstraction won't fail, but merely what the likelihood/frequency of failure is.

Let's forget the modern hi-tech age for a moment and drop back to basics: Pottery.  You'd think, after millennia of knowing how to make bowls out of clay, that humanity would have refined the concept of a "clay bowl" into a complete, non-leaky abstraction, right?  After all, we're talking about one of the oldest pieces of "technology" that our species has to offer.  And yet, if you manage to accidentally take a chip out of the side of such a bowl, the abstraction fails.  It's still a bowl, and you know it's still a bowl... but suddenly you have a "leak" in the sense that you now have to think about the underlying clay as well, in order to repair the damage.

The treatment of a clay bowl as a bowl (rather than clay) is a very effective abstraction; it is not "poor" in any way, shape or form.  Nonetheless, there are situations (at different levels of probability, depending on whether or not you have young children in the house) where one must see through the abstraction in order to contend with the unignorable details that comprise it.

I have to side here with the people who believe that "non-leaky abstraction" is an oxymoron.  Abstraction is at the heart of how our minds work, and it's one of the most effective tools we have in dealing with the world around us... but just because we abstract away details in order to simplify our everyday lives, it doesn't mean those details don't exist.  All we're doing is reducing how often we have to confront them.  Unfortunately, in young industries (such as ours), the frequency of confrontation is often much higher than we'd like.

Chris Hargrove
Wednesday, November 13, 2002

"why that proofs that abstractions are leaky."

While you are there you might also consider looking up the word "all".


Wednesday, November 13, 2002

Err, "along the quantum mechanics" -> "along the lines of quantum mechanics".  My keyboard abstraction is leaky. ;)

Chris Hargrove
Wednesday, November 13, 2002

"While you are there you might also consider looking up the word "all"."

Fine, have you looked up "also" recently?

Erik
Wednesday, November 13, 2002

"have you looked up "also" recently? "

Here is the quote again: "All non-trivial abstractions, to some degree, are leaky."

"Also"... nope, doesn't seem to be there. What definition of "all" did you find?


Wednesday, November 13, 2002

"I thought that one of the points Joel was trying to make was that *all* abstractions tend to have leaks."

Yup.

"I cannot come up with a decent abstraction off the top of my head that does not."

You don't have to. The assertion has been made that all "X"s have a certain condition or behaviour, but no proof has been offered - all that has happened is that various people have said that if that condition exists for the one or two "X"s they are thinking about then it must be so for all cases of "X". Personally I have no idea whether all abstractions are leaky, but then I never asserted that they are.


Wednesday, November 13, 2002

When individuals begin referring other individuals to dictionaries it usually means that all goodwill in a conversation has been compromised.

Come on people, surely we can do better than this?  Discuss the idea..avoid getting obsessed with the detail...who cares whether Joel was exactly correct with his IP/TCP example....(<g> although I cannot stop myself from pointing out that he *was* correct)

Did anyone get the hidden detail?  Why did he use an abstracted term 'leak' instead of 'bug'?

...he was abstracting the concept of abstractions....

<g> ..and his abstraction contained leaks...

god will punish us all when he finds out
Wednesday, November 13, 2002

OK here's an attempt to logically "prove" Joel's assertion (obviously not rigorously):

* Any abstraction, by definition, leaves out some details of the problem at hand
* These details may either be relevant in some situation, or irrelevant in all situations
* If these details are irrelevant in all situations, then the abstraction applies to the problem with any other "set of details" (using the "bowl" example someone made earlier... if the material the bowl is made of is irrelevant in all situations, then a ceramic bowl behaves identically to a metal bowl in all situations).
* If any "set of details" makes the abstraction work identically, then these details do not matter.
* If the details do not matter, I would say the abstraction is trivial, because you are just ignoring irrelevant information.
* Therefore, if the details are irrelevant in all situations, the abstraction is trivial.
* Thus, if the abstraction is non-trivial, the details must be relevant in some situation.

Obviously there are one or two leaps there, but that is more or less the thought process I went through in thinking about the "law of leaky abstractions".
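
The last bullet is just the contrapositive of the one before it. A small Lean 4 sketch of that single step, where `DetailsMatter` and `IsTrivial` are hypothetical propositions standing in for "the omitted details are relevant in some situation" and "the abstraction is trivial". It checks only the logic of the final step, not the earlier leaps:

```lean
-- Hypothetical propositions; the hypothesis h encodes the bullet
-- "if the details are irrelevant in all situations, the abstraction is trivial".
theorem nontrivial_abstraction_leaks (DetailsMatter IsTrivial : Prop)
    (h : ¬DetailsMatter → IsTrivial) : ¬IsTrivial → DetailsMatter :=
  fun notTrivial =>
    -- Classical step: assume the details never matter and derive a contradiction.
    Classical.byContradiction (fun noDetails => notTrivial (h noDetails))
```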

Mike McNertney
Wednesday, November 13, 2002

Notice that the key in my "proof" is Joel's use of the word "non-trivial".  In this he left himself an "out", in that any abstraction you manage to come up with that has no leak, he can say is "trivial".

Since he didn't define what "non-trivial" means, he in effect could say anything he wants.  "All non-trivial abstractions make hell freeze over" is a perfectly valid law if "non-trivial" means nothing will satisfy that condition.

Of course this is sort of a silly nit-pick, which fits in perfectly with this whole thread.

The point of the article is the idea, people.  He's not trying to give a mathematical proof here.  I will agree with some that using "all" may be inappropriate, but since he also places a restriction on the set of abstractions, I think it is sort of silly to argue that point.

Mike McNertney
Wednesday, November 13, 2002

<quote>
Also, a frequent use of "foo" + "bar" is when a string goes beyond 70 or 80 chars in length. In a lot of languages, you can solve that problem by splitting the string into smaller chunks and then using + to concat them back together.
</quote>

You sure can solve this in C++, just look:

std::string str("some text "
        "some more text");

Passater
Wednesday, November 13, 2002

"Any abstraction, by definition, leaves out some details of the problem at hand"

If it leaves out details entirely then it is by definition "leaky", because there are situations to which it cannot be applied although the thing it is abstracting could be. But I disagree that "all" abstractions leave out details. An abstraction may only leave details out of the interface, because the abstraction itself has enough information to deal with what is being abstracted without "external stimuli". (Example: in programming, by maintaining and checking the state of itself and the environment and behaving accordingly, rather than expecting the user of the abstraction to do so. E.g. if a call to function X1 must always be followed by a call to X2, then the abstraction can leave this detail out of the interface and deal with the detail itself.)
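
A minimal sketch of the X1/X2 case, written in Python purely for illustration. The names `x1`, `x2`, `paired` and the `calls` log are all hypothetical stand-ins, not part of any real API:

```python
from contextlib import contextmanager

calls = []  # hypothetical log so the call order can be inspected


def x1():
    """Stand-in for the call that must come first."""
    calls.append("x1")


def x2():
    """Stand-in for the call that must always follow x1."""
    calls.append("x2")


@contextmanager
def paired():
    """Keep the 'x1 must be followed by x2' rule out of the caller's hands."""
    x1()
    try:
        yield
    finally:
        # Runs even if the body raises, so the caller cannot forget x2.
        x2()


with paired():
    calls.append("work")
```

The caller only ever writes `with paired():`, so the ordering rule is no longer part of the interface. Whether that counts as "not leaking" is the open question: the rule still exists, it has just moved inside the wrapper.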


Thursday, November 14, 2002

"using "all" may be inappropriate"

This is what my argument is about, Mike, not whether some abstractions are leaky. I happen to think that when you're chatting about stuff between friends the wording is less important than the idea, you're right. But if you put forward concepts in a published medium you have to be very careful indeed in what is said - the audience is far wider, and what is said is far longer lasting, than some chat in the kitchen while waiting for the coffee machine to brew.


Thursday, November 14, 2002

I liked this article.  Just wanted to put in my $0.02 as to why.

The key point of the article to me was the implications for system development in the last paragraphs.  Anyone building complex enterprise systems today has to have a very broad toolkit, and the abstractions that are the basis of each tool in that toolkit can and will leak on each other, even if they play nice internally.  As we increase the number of tools that each person is expected to use, it is likely that the time that must be spent understanding the underlying abstractions and their weaknesses will grow and the deployment issues will surface later in the development life cycle. 

I take this to mean that the agile development methods will get increasing traction in this new tool-rich environment because they encourage early deployment discovery.

On the "do all abstractions leak" question, that is a rhetorical question that can't be answered.  Abstractions are assumptions (simplifying assumptions from my view) and whenever you simplify, you have removed information. In those rare cases when the removed information is critical to understanding, the abstraction will leak, not because it was defective, but because it was an abstraction:)

The only way that an abstraction is defective is if it does not perform per its specification, a problem I see as distinct from whether it leaks, which I always assume it will.

Craig Johnson
Thursday, November 14, 2002

Hey, Joel's article is on Slashdot!

Prakash S
Thursday, November 14, 2002

For the record:

"By comparison, there is another method of transmitting data called IP which is unreliable. Nobody promises that your data will arrive, and it might get messed up before it arrives."

This would be a comparison of a network protocol and a transport protocol.  Why is raw IP data not a commonly recognized method for sending data?  Because, if you open a raw IP socket, you would get every packet available on the interface - including all TCP and UDP data - to the exclusion of none.  You get it all.

"Imagine that we had a way of sending actors from Broadway to Hollywood that involved putting them in cars and driving them across the country. Some of these cars crashed, killing the poor actors." <blah, blah, blah...>

There is nothing magic about TCP that it can reconstruct lost data out of thin air.  (Enter the twin analogy, or triplets, or the cloning of sheep...)
Perhaps this analogy was created as irony - to illustrate how crappy abstractions can be.  How humorous.

"If a large UFO on its way to Area 51 crashes on the highway in Nevada, rendering it impassable, all the actors that went that way are rerouted"

Additionally, TCP lacks this capability.  Interestingly, routers operate on IP headers only.  And when a router gets a table update, it may be instructed to route around trouble spots.  TCP plays no part in this action.

"This is what I call a leaky abstraction. TCP attempts to provide a complete abstraction of an underlying unreliable network, but sometimes, the network leaks through..."

TCP is a specification, not an abstraction.  I suppose that TCP is an abstraction in the same way that reality is an abstraction of the metaphysical.  Beauteous.

Next time:
<bong>
...
</bong>

There isn't anything leaky about TCP, UDP, or IP.  They are what they are, no more and no less.

However, having said that, and also having done time at the minimum security facility in Redmond, I can see the thought progression:

1. Make money by abstracting hardware.
2. Make money by abstracting (and extending) existing standards.
3. Make money by keeping developers dependent on your abstractions.

Perhaps Joel is feeling rebellious and pining for the good old days of calling open() on a socket and actually getting what he asked for, unmolested by layers upon layers of crap all vying for attention.  With that I can sympathize.

Nat Ersoz
Friday, November 15, 2002

... replace open() with socket() ...

now I feel better.

Nat Ersoz
Friday, November 15, 2002

I thought this was an excellent article, and I think it should be required reading for all the undergrads here in my Uni. Not because we can expect them to all know all the machine-level stuff under the Java they are taught, but so that they at least *appreciate* there are machine-level things under the Java. It would at least be a start.

I thought of another example too: timezones. Timezones abstract the way that sun up and sun down times actually change continuously over the Earth's surface, and mean we don't have to change our clocks when travelling between Paris and Berlin. But they leak at the dateline, where a distance of just a few feet changes our measured time by 24 whole hours, despite the sun up/down times changing by just a few seconds.
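
A rough illustration of the dateline jump in Python. It assumes two fixed offsets, UTC+12 and UTC-12, on either side of the line; real zones near the dateline use other offsets too, so treat the numbers as a sketch:

```python
from datetime import datetime, timedelta, timezone

# Assumed fixed offsets on either side of the date line.
west_of_line = timezone(timedelta(hours=12))   # UTC+12
east_of_line = timezone(timedelta(hours=-12))  # UTC-12

# One single instant, expressed in UTC.
instant = datetime(2002, 11, 15, 12, 0, tzinfo=timezone.utc)

# The same instant as shown on a wall clock on each side of the line.
local_west = instant.astimezone(west_of_line)
local_east = instant.astimezone(east_of_line)

# Compare the wall-clock readings, ignoring the zone information.
gap = local_west.replace(tzinfo=None) - local_east.replace(tzinfo=None)
```

Both values describe the same instant, yet the wall clocks differ by a full day, which is the kind of detail that only shows up when code compares local times directly.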

Richard Gaywood
Friday, November 15, 2002

Maybe the range of reactions to this article is due to the fact that the "leaky abstraction" idea is itself a leaky abstraction. Hence when applied to some things that at first seem like good examples of "leaky abstractions", on closer inspection we find that they aren't.

However, wizards (as they occur in MS Access) are the very essence of "leaky abstractions" and I spend a great deal of my time trying to explain this to people; so it is quite wrong to dismiss complaints about "leaky abstractions" as a worthless academic point.

Also, I wouldn't take this article as bashing the whole idea of abstractions. It should be noted that the more mathematical "pure" abstractions tend to be watertight. The leaky ones are the ones where people kept adding features in response to every idea that came up. The more sprawling the abstraction, the less abstract it actually is, and the harder it is to check for leaks. Wizards are full of little extra buttons that probably seemed like a good idea at the time.

I'd define a leaky abstraction as one that is liable to be misunderstood. So to me, TCP over IP is a perfectly good encapsulation, because I know what "reliable" means in that context (if it fails, you find out.) Wizards are absolute failures as abstractions because they are packed full of stuff that you have no way to understand unless you make the effort of understanding the layer below first!
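
To make that sense of "reliable" concrete, here is a toy Python sketch. It is not how TCP is implemented; `send_reliably`, `lossy_channel` and `DeliveryFailed` are hypothetical names, and the "channel" is just a function that reports whether a packet was acknowledged:

```python
class DeliveryFailed(Exception):
    """Raised when every retransmission attempt was lost."""


def send_reliably(payload, channel, max_attempts=3):
    """Resend payload until the channel acknowledges it, or give up loudly.

    channel is a hypothetical callable that returns True when the packet
    was acknowledged and False when it was lost.
    """
    for attempt in range(1, max_attempts + 1):
        if channel(payload):
            return attempt  # number of tries it took
    raise DeliveryFailed(f"no acknowledgement after {max_attempts} attempts")


def lossy_channel(drops):
    """Build a fake channel that loses the first `drops` packets."""
    remaining = [drops]

    def channel(_payload):
        if remaining[0] > 0:
            remaining[0] -= 1
            return False
        return True

    return channel
```

With two drops the call succeeds on the third try; with more drops than attempts the caller gets a `DeliveryFailed` exception instead of silence. The lost data is not conjured from anywhere: the sender still holds the payload and simply sends it again.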

Daniel Earwicker
Monday, December 02, 2002

Hello,

thank you for linking my London tube page.  If you have the time could you update the link to this site: 
http://www.geocities.com/alplatt//tube.html

The site you use while still available, is no longer accessible to me for updating.

I think you might find the Geocities site somewhat faster to load in spite of the popup page.

Thanks again

allan platt
Friday, February 27, 2004
