Fog Creek Software
Discussion Board

Leaky Abstractions

  If technology is about synthesizing existing materials,
concepts, or methods into new materials, concepts or
methods, aren't we also in trouble if we adopt a culture
of merely accepting that certain things work? ( By the
way, I think we already have. )
  Someone once said, I believe Arthur C. Clarke, that to a
given culture, a sufficiently advanced technology is
indistinguishable from magic.  We've already reached this
phase.  Worse, even if we know how something works,
there are many things "unfixable" by the layperson
because of the need for special tools.
  The analog in the programming world is the programmer
who gets something to work, and then doesn't quite
understand how or why, but doesn't care.  A mentality like
this is dangerous, and a test for it should be designed and
added to Joel's Interviewing Guide.  Those who fail the test
should be thrown out the back door into the alley.

Thomas R. Dial
Thursday, December 05, 2002

Hmm.  If it works and people pay money for it, the how and why are irrelevant.

I don't think that Joel's point about leaky abstractions was that abstractions are bad - only that they are imperfect.  This necessitates keeping one person around who knows the inside of the black box.

The fundamental principle of self-interest allows us to trust others, most of the time.  I don't worry that the food I eat is poisoned or that the car in the oncoming lane is going to swerve.  Why?  It isn't in their best interest.  When self-interest fails to provide protection, we have laws and the courts.  What does this have to do with software?  Provided code is signed, it can be trusted as much as you trust the signer.  I don't know how my Magnavox TV works, but I trust Magnavox to make sure it doesn't explode.  If other Magnavoxes had exploded, I wouldn't have bought one.

As a general rule, grand societal plans that rely on people getting a lot smarter just aren't going to work.  Specialization is here to stay.

Bill Carlson
Thursday, December 05, 2002

As a practical matter, everyone has to rely on abstractions since it's impossible to master all aspects of any topic.

For example, my knowledge of electronics, semiconductors, assembly language, operating systems, compilers, networks, and most other computer-related topics is negligible. Nonetheless, I'm a rather productive programmer, since my mental abstractions are good enough for the programming tasks that I face.

Additional knowledge, in terms of both breadth and depth, has definite value. That extra knowledge improves your internal abstractions, but it doesn't replace them.

Thursday, December 05, 2002

>>Hmm.  If it works and people pay money for it, the how
>>and why are irrelevant.

Right, but if you're talking about software, which we are,
that attitude can get you into trouble.

Thomas R. Dial
Friday, December 06, 2002

Don't forget, the last time someone tried to "plug" all the leaks, the end result was PL/1 :-P

James Wann
Friday, December 06, 2002

I agree with Thomas on his last post.  I try to maintain competence over all the layers of abstraction that a programmer faces.  Even with .NET, I can still write code and mentally imagine what processor instructions execute and which traces light up on the motherboard.

However, it may not be possible to do this for very much longer.  I can't know every protocol, every optimization technique used, etc.  A couple generations from .NET, one person may not be able to trace a single byte from "cradle to grave".

This is scary, but inevitable.  Simpler interfaces, with 3rd party independent black-box testing might assist, but what will it all look like 30 years from now?

In my mind, this is the problem with web services.  Now, when I distribute an app, all that really needs to happen is that the user's computer is running.  If I begin to use a few web services, I'm all of a sudden dependent on a hundred people I don't know doing their job correctly EVERY DAY or my app is toast.  One could argue this is already the case with the internet, I suppose.  However, this is a single entity, with enough usage to garner financial subsidy if needed.

Bill Carlson
Friday, December 06, 2002

Regarding leaky abstractions -

Two points:

1) There's a quote I really like regarding models, which are after all, abstractions:

"All models are wrong; some are useful."

I don't know who said it, but I came across it while I worked for a US Army analytical agency (basically an agency full of Operations Research professionals working on materiel acquisition and operational issues). We had to do a lot of systems modeling of various types, from mathematical modeling to large-scale distributed computer simulations of corps-level combat. Models were everywhere for us, and you'd find this little quote by most folks' desks someplace. I like it because it acknowledges the utility of models as abstractions from reality, while it contains a caution to be careful when you use them.

When presented with a model, IMO it's essential, before using it or interpreting any insights you derive from it, to understand how the abstraction differs from reality, what it can tell you, and what it cannot, since it is not, in fact, reality itself. It comes down to this:  the abstraction is a tool; to use the right tool for the right job, you have to know what it will and won't do. If you step outside those limits, you're probably not going to get full value out of the tool (i.e. the abstraction).

I think this basic idea--abstractions are useful, even necessary, but one must be careful in their application, especially because they are not, in fact, reality--is the core concept of Joel's article. At least when I read it that's what I came away with. No disrespect intended to Joel, but those are not new ideas. Still, it's good that he wrote the article because, unlike many of us, he has a bit of a voice; if he says something it'll get heard by more people than if many of us do, and given the remarkable dearth of critical thinking evident in so many circles, it's good to have the reminder to take care, whether it's a new idea or not.

2) In whichever of Arthur C. Clarke's 2001: A Space Odyssey series was last (I think it was set in 3001 or thereabouts, 1000 years after the previous episode), Clarke describes something very like what I'm hearing projected in this group.  I believe it was the character Frank Poole who is the focus of Clarke's last episode. The point is that during that time, the protagonist takes a trip through space with a contemporary space crew. He's appalled to find that the crew know next to nothing about how any of the systems on their ship work. He can't help contrasting that with the extreme amount of hard-engineering background he had had to master to get into the space program, get selected for the original Odyssey mission, etc. The contemporary crew knows little more than how to execute specific functions via the system interfaces provided. They don't know how their systems work, and they couldn't fix them if they did.

IMO, it's a disturbing, but probably fairly accurate prediction of how our interaction with technology is going to unfold. As we increase the incidence of systems designing other systems, there may in fact be few if any humans (well, organic ones anyway :) who can understand what's going on at any but the most abstracted levels.


Saturday, December 07, 2002

One interesting "leaky abstraction" that is going to cause a serious scientific meltdown in the next 5 years, is all the genetic data and computational biological models in use.

I got bored with straight up software engineering and got into bioinformatics about a year ago. 
At first I was freaking out because all the models and results being published seemed really sketchy. However, I just attributed that to my lack of domain knowledge; surely, when I understood biochemistry and molecular biology, I would see what these guys were talking about. Unfortunately, now I am freaking out because, now that I know the molecular biology and biochemistry behind these articles, I realize all the models and results being published ARE really sketchy!
There are a lot of proclamations in the popular scientific press that are being made on the basis of really flimsy experimental evidence. My prediction is that a lot of people have really overstated their results, and in the next 5 years the genetics research/industrial sector will undergo a serious dot-com style meltdown.  (businesses imploding, drug company R&D budgets slashed, no more grant money, etc)

Sunday, December 08, 2002

slappy, thanks. will keep in mind when investing in the market.

could you be more specific?

Prakash S
Sunday, December 08, 2002

I was advised in another forum to study the Law of Leaky Abstractions and did so over the weekend. I have come up with a list of comments. All these comments are of course my humble opinion, and not in any way meant to flame.

Mr. Spolsky argues that object orientation, or non-trivial abstractions, always leak, and that this is making programming unnecessarily hard. I would like to disagree. He starts his text by describing a very good abstraction that has been used and reused for many years in thousands of applications: the TCP layer, which uses the IP layer to provide easy message passing over the internet (or any other IP net, by the way). He claims that the TCP abstraction leaks when, for example, a network cable breaks so that the communication cannot take place. I would like to say that this is a part of the abstraction: TCP connections can break. It is a natural part of a communications protocol and must be taken into account. TCP is, to me, a protocol used to send messages in one or more IP packets and to get them back the way I sent them. If the cable is broken, of course I do not expect my message to arrive. To put it in plain English, TCP allows me to send longer messages over a network and takes care of the splitting into packets and reassembly. That is all; it does not perform magic like sending messages through broken cables.

Mr. Spolsky goes on with a number of examples of leaking abstractions. I'll comment on them one by one here:

When iterating over two-dimensional arrays, the order in which you access the items leads to different performance. This is mainly due to page faults (I'll ignore cache misses here). To solve this, simply use the right tool. If you want predictable response times and the ability to force a page to stay in memory, use a real-time OS. The general, flat memory space, OS abstraction does not guarantee access times. It does not even guarantee that your process is running all the time; a UNIX system might even swap out your entire process if you're out of luck.
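A minimal sketch of the array example, assuming a matrix stored row by row in one contiguous vector (the function names are hypothetical and not taken from Joel's article). Both functions return the same sum; only the order in which memory is visited differs, and that order is what the flat memory abstraction does not price for you.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative only: a rows x cols matrix stored row-major in a single
// contiguous vector, so element (r, c) lives at index r * cols + c.

// Visits elements in the same order they sit in memory.
long long sum_row_order(const std::vector<int>& m,
                        std::size_t rows, std::size_t cols) {
    assert(m.size() == rows * cols);
    long long total = 0;
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t c = 0; c < cols; ++c)
            total += m[r * cols + c];
    return total;
}

// Same result, but consecutive accesses are `cols` elements apart, so on
// a large matrix each step may land on a different page or cache line.
long long sum_column_order(const std::vector<int>& m,
                           std::size_t rows, std::size_t cols) {
    assert(m.size() == rows * cols);
    long long total = 0;
    for (std::size_t c = 0; c < cols; ++c)
        for (std::size_t r = 0; r < rows; ++r)
            total += m[r * cols + c];
    return total;
}
```

How large the timing gap is, if any, depends on the matrix size, the hardware, and the OS, so it is worth measuring rather than assuming.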

As for the SQL problem, this must be a problem in the language specification. It is a leaky abstraction, but you can always use another access method to retrieve data from the database if performance is an issue. It is always hard to combine a high-level abstraction with really high performance.

He goes on with files accessed over a network. I must ask a simple question: what makes this a network problem? Local files can also fail. If the .forward file is located on a local hard drive that fails (CRC error in the actual file) you still experience this problem. This is a part of the file abstraction; things can fail, just as in the TCP example. This would not change if we went back to writing individual blocks to the hard drive platters; it would just complicate access, *a lot*.

As for the string class, I discuss this later.

The last example is just not true. The roof, windscreen and climate controls of a car are not used to abstract away the weather. Neither are traction control systems and such devices. These are simply tools that increase the comfort of the driver and passengers.

Now, let's deal with the C++ string class. The example "foo" + "bar" is wrong. To declare a string constant, you should type string("foo") + string("bar"). This is just as odd as the declaration of long constants: 1L + 2L. This is just a part of the language and not an abstraction problem. Another point: I believe that the reason that C++ does not have a native string type is that C++ only knows scalar values (a string is a pointer, which is a scalar value). This is due to the heritage from C.
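A small sketch of the string point, assuming std::string from the standard library (the function names are hypothetical). The commented-out line is the form that does not compile; the functions show the wrapped form suggested above, and that wrapping a single operand is already enough for the library's operator+ to apply.

```cpp
#include <cassert>
#include <string>

// std::string broken = "foo" + "bar";
// Does not compile: both operands are char arrays that decay to
// pointers, and two pointers cannot be added.

// Wrapping both literals, as suggested above.
std::string concat_both_wrapped() {
    return std::string("foo") + std::string("bar");
}

// Wrapping only one operand is enough: std::string has an operator+
// overload that accepts a const char* on the other side.
std::string concat_one_wrapped() {
    return std::string("foo") + "bar";
}

// Hypothetical helper that joins two C strings.
std::string join(const char* a, const char* b) {
    // Constructing std::string from a null pointer is undefined behavior.
    assert(a != nullptr && b != nullptr);
    return std::string(a) + b;
}
```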

He goes on to discuss accessing OUT LPTSTR arguments, COM objects, ASP.NET flaws, etc. These are bad abstractions by Microsoft and should not be considered useful examples. Then he complains about Visual Basic failing now and then for issues unrelated to Basic itself; I dare to call VB an ugly hack. It is not a real tool to be used in professional software development. I get frightened when I see how many business-critical systems are based on VB code.

Then he goes off topic. He starts to attack code generation tools and RADs. These are not abstractions; they are tools that allow people without the right competence and knowledge to try out programming. RADs are in fact _a_very_bad_thing_ as they encourage bad software development practice. They make it easy to forget how important it is to sit down and think before you start implementing.

The next thing he complains about is that you need to know more to develop software today than ten to fifteen years ago. I do not find this strange, as we construct far more complex and interdependent systems today, while (in my experience) spending less and less time on planning.

I'd say that the law of leaky abstractions is greatly exaggerated, but not entirely invalid. If you suffer from leaky abstractions you should consider changing tools or approach, but not run away crying that abstraction is bad. Abstraction is a great tool, but as with all other tools it takes time to master. When used correctly it can reduce debugging time and increase reusability. As a great example of code reuse I must point out that the TCP abstraction has saved thousands (if not millions) of lines of source code. Just imagine if everyone had to rely on IP directly; how complex the code would be.

Abstraction helps us develop better, more complex software and avoid reinventing the wheel every five minutes!

Johan E. Thelin
Monday, December 09, 2002

"There are a lot of proclamations in the popular scientific press that are being made on the basis of really flimsy experimental evidence."

This applies to any modern science work, so it seems a bit unfair to single out one particular field and nail them to the cross on it.

Just me (Sir to you)
Monday, December 09, 2002

Johan, how do you suggest that Joel accomplish the miracle of switching all users of CityDesk to "a real-time OS"?

Yes, certain problems have solutions when you use different abstractions, but those have other "leaks".

Monday, December 09, 2002

Tekumse, CityDesk does not require real-time abilities. I have a hard time understanding why predictable memory access times would be a requirement in an application such as CityDesk. Isn't it better if it runs on less expensive hardware, just more slowly, than not at all, but at a predictable pace?

When developing applications such as CityDesk the flat memory space view _greatly_ simplifies development. And when sharing memory with other (perhaps less or more reliable) applications memory protection is always a Good Thing. It is better to only have the faulty application fail, than your entire system (as in Win 3.x and the old Mac OS).

Johan E. Thelin
Tuesday, December 10, 2002


"Mr. Spolsky argues that object orientation, or non trivil abstractions always leak, and that this is making programming unnecesarily hard. "

The first clause of your summary is correct: the article does argue that non-trivial abstractions always leak.  But I contend that the second clause of your summary is flat out wrong, and misses the point of the article.  The article does not imply that object orientation, or abstraction in general, makes programming unnecessarily hard.  The point the article makes is that even though we use these abstractions, and create our own abstractions, we must, if we are to be successful, understand underlying mechanisms -- or at least be able to understand them -- should a "leak" in an abstraction cause us problems.

The article therefore *isn't* suggesting not using TCP, not using SQL, not using VM, not using any particular abstraction.

I agree with you that some of the abstraction examples, especially the TCP one, are perhaps ill-considered.  Certainly, if you defined the TCP abstraction as a transport mechanism that guarantees delivery of arbitrary length messages over devices connected via a network, then a broken or unplugged cable is not a leak in the abstraction -- because the devices are no longer connected by the network.

Many of the examples boil down to the fact that abstractions can usually abstract away functional details, but cannot so easily abstract away performance details.  However, I do believe that these are valid examples of 'leaks' in abstractions.  It is impractical to suggest that all applications for which performance is an issue use a real-time OS or use a non-relational data store.  Performance can be an issue for most applications, and understanding how to optimise an application's access to a relational data store, or to optimise its memory access in a demand paging environment, can be crucial.

For most applications, it is better to use the 'leaky' abstractions (relational databases, VM) and understand how to optimise performance, than to completely abandon the abstractions.

Tuesday, December 10, 2002

I highly doubt that Joel would ever attempt to say that abstraction is "bad", or for that matter, that abstraction is "good", any more than I would. Three points seem to have been missed here. First, abstraction, or the "want to hide the complexities", is one of the driving factors in human nature; we want it all, and we want it now. Second, the NEED to make things more complex than required drives Point 1 (above); this is also human nature.
Third, CityDesk (a fine piece of software) IS an abstraction built from many abstractions, and produces even more abstractions.

I think the real point was that abstractions are inherently flawed IF the details that they try so hard to mask are forgotten, or unknown, and that regardless of platform, language, or "world", abstractions can, and often do... Leak!

Robert French
Thursday, December 12, 2002
