Fog Creek Software
Discussion Board




Impedance Mismatch

I've had the misfortune of inheriting a code base that suffered from impedance mismatch. From the digging I've done into the subject of ODBMS, Joel's definition seems a little awkward to me, as it focuses more on libraries or the glue in between. Based on what ODBMS vendors state, the mismatch is the result of trying to map a hierarchical data structure onto a relational system. While it can be done intelligently, I suspect that just how to do so isn't immediately apparent.

In the system in use at my previous (woo hoo!) employer, there were tables for classes, objects, and object relationships, plus one huge table called objectData that stored all the data. The amount of code needed to get one piece of information out of the system was ridiculous. The system first had to determine what class the object was an instance of, then its relationships (its place in the tree) and properties, and then finally yank that information out of the objectData table.
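
To make that lookup chain concrete, here is a minimal sketch in Python with sqlite3. Only the objectData table name comes from the description above; every other table name, column name, and sample row is a guess made up for illustration, and the real system was surely more involved.

```python
import sqlite3

# Hypothetical reconstruction: only "objectData" is named in the post;
# all other table/column names and the sample rows are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE classes (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE objects (id INTEGER PRIMARY KEY,
                          class_id INTEGER REFERENCES classes(id));
    CREATE TABLE objectRelationships (parent_id INTEGER, child_id INTEGER);
    CREATE TABLE objectData (object_id INTEGER, property TEXT, value TEXT);

    INSERT INTO classes VALUES (1, 'Customer');
    INSERT INTO objects VALUES (10, 1), (11, 1);
    INSERT INTO objectRelationships VALUES (10, 11);
    INSERT INTO objectData VALUES (11, 'name', 'Alice'), (11, 'city', 'Austin');
""")


def fetch_property(conn, object_id, prop):
    """Fetch one value the long way: class, then tree position, then data."""
    # Step 1: determine which class the object is an instance of.
    class_name = conn.execute(
        "SELECT c.name FROM objects o JOIN classes c ON c.id = o.class_id "
        "WHERE o.id = ?",
        (object_id,),
    ).fetchone()[0]

    # Step 2: find its place in the tree (None for a root object).
    row = conn.execute(
        "SELECT parent_id FROM objectRelationships WHERE child_id = ?",
        (object_id,),
    ).fetchone()
    parent_id = row[0] if row else None

    # Step 3: finally pull the value out of the one big data table.
    value = conn.execute(
        "SELECT value FROM objectData WHERE object_id = ? AND property = ?",
        (object_id, prop),
    ).fetchone()[0]
    return class_name, parent_id, value


result = fetch_property(conn, 11, "name")
```

Three round trips for a single attribute, and every value comes back as text, which is roughly why this kind of design struggles once the data set grows.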

This was an extremely slow process that was well masked by a fast machine and small data sets. However, when it came time to throw a large data set at it, which was about the time I was hired, the system suddenly felt as though it were running on an old 286!

Even so, I'm not a proponent of ODBMS. Perhaps I've read too much Fabian Pascal.

Anyway, how does this fit with what Joel said? Is it possible to efficiently store object hierarchies in an RDBMS? If we insist on using an RDBMS, isn't this first a question of schema?

Cheers,
BDKR

BDKR
Sunday, April 04, 2004


Sounds like the problem there was they tried to build an object-relational database on top of a regular relational database.

Normally this should not be necessary. Just take the simple approach, and map classes to tables and any persistent, non-derivable member variables to columns in those tables.
Use sequences / auto-number columns for primary keys.
If you need inheritance, use joins with parent tables based on the parent object ID.
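
A minimal sketch of that mapping, in Python with sqlite3 rather than Java. The Person/Employee classes, their columns, and the helper functions are hypothetical examples chosen for illustration, not anything from a real system.

```python
import sqlite3

# Hypothetical classes: Employee inherits from Person.
# One table per class; the subclass table shares the parent's key.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (
        id   INTEGER PRIMARY KEY AUTOINCREMENT,  -- auto-number primary key
        name TEXT NOT NULL
    );
    CREATE TABLE employee (
        person_id INTEGER PRIMARY KEY REFERENCES person(id),
        salary    INTEGER NOT NULL
    );
""")


def save_employee(conn, name, salary):
    """Insert the parent row first, then the subclass row with the same ID."""
    cur = conn.execute("INSERT INTO person (name) VALUES (?)", (name,))
    person_id = cur.lastrowid
    conn.execute(
        "INSERT INTO employee (person_id, salary) VALUES (?, ?)",
        (person_id, salary),
    )
    return person_id


def load_employee(conn, person_id):
    """Rebuild the full object by joining the subclass table to its parent."""
    return conn.execute(
        "SELECT p.id, p.name, e.salary "
        "FROM employee e JOIN person p ON p.id = e.person_id "
        "WHERE p.id = ?",
        (person_id,),
    ).fetchone()


emp_id = save_employee(conn, "Alice", 50000)
employee = load_employee(conn, emp_id)
```

Each member variable is an ordinary typed column, so ad hoc SQL against these tables works without going through the application.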

If you think about it, you can probably automate 95% of this (using something like the Java Hibernate package.)

Dwilliams
Sunday, April 04, 2004

It seems simple but gets complex quickly, particularly with relations between different classes.

Hibernate does a good job with this, I agree.

Will
Sunday, April 04, 2004

It’s not a matter of trying to model hierarchies relationally.  Since you know Fabian Pascal’s work you’ll know that the relational model is quite capable of modeling hierarchical structures.  The “mismatch” problem is three-fold:
1) Current SQL DBMS vendors have poorly implemented the relational model leading to difficulties in dealing with hierarchies and objects
2) Developers consider the RDBMS to be nothing more than a place to stick data – hence terms like “persistence” and “data store”
3) As a consequence of points 1 and 2 there is no pressure on DBMS vendors to “get it right” (so they won’t) – thus leading us to such bad ideas as ODBMS and XML DBMS (and past bad ideas such as IBM IMS)

The relational model and object-oriented “theory” are much more similar than you might think.  The RM includes the idea of “complex” data types, which consist of (from “Practical Issues in Database Management”):
* a name
* One or more named possible representation(s), of which
--- one is physically stored
--- at least one is declared to users
* Possible additional type constraints
* A set of operators permissible on the type’s values 

Note that this is a superset of what objects offer.  If objects offered a table-equivalent then it would be, in this context, no different from the relational model – indeed there would be no mismatch at all!

Of course, since the RM is based on predicate logic and set theory there are other significant, practical advantages that it has over OODBMS.

Depending on your outlook this raises one of two questions:
1) So what?
2) How do we move forward? 

Question one is an example of the quintessential close-minded “I just need to get my work done leave me alone with those sorts of theoretical problems” person.  Certainly we have a limited set of tools with which to work and have real-world problems which must be solved – otherwise we don’t get paid and then all sorts of bad things happen.  But that is no reason to simply ignore better, proven solutions which are out there.  If everyone was of this type we’d not have light bulbs, or automobiles, or electricity, etc.

Question two is a harder question to answer.  Certainly we have to put pressure on SQL DBMS vendors to “get it right” and allow true relational domains and relational hierarchical operators (Oracle’s CONNECT BY is not a good example of one).  The payback is obvious – decreased development time, decreased likelihood of bugs, safer data, etc. 

In either case it’s good to be aware of the limitations of the DBMS product you’re using so you know the practical implications and how to work around them (and demand that your vendor fix them).

And some good/interesting Wiki links are here:
http://c2.com/cgi/wiki?RelationalModel
http://c2.com/cgi/wiki?ObjectRelationalPsychologicalMismatch

MR
Sunday, April 04, 2004

Thanx for the responses.

DWilliams:
"Sounds like the problem there was they tried to build an object-relational database on top of a regular relational database."

That's exactly what they did! As for what you said about mapping classes to tables, that's exactly what I did with the last piece of code I wrote for the company. It would read the existing class, property, and relationship tables and dynamically generate a relational structure in a target db, then migrate the data from the old structure into the new one.

That was a tough piece of code to write at first, but it got easy once things became clear.
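
Roughly, the idea looks like the sketch below, in Python with sqlite3. This is not the original utility: apart from objectData, all table and column names are assumptions, relationships are left out to keep it short, and every generated column is plain TEXT.

```python
import sqlite3

# Hypothetical source schema; only "objectData" is a name from the old system.
source = sqlite3.connect(":memory:")
source.executescript("""
    CREATE TABLE classes (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE properties (class_id INTEGER, name TEXT);
    CREATE TABLE objects (id INTEGER PRIMARY KEY, class_id INTEGER);
    CREATE TABLE objectData (object_id INTEGER, property TEXT, value TEXT);
    INSERT INTO classes VALUES (1, 'Customer');
    INSERT INTO properties VALUES (1, 'name'), (1, 'city');
    INSERT INTO objects VALUES (11, 1);
    INSERT INTO objectData VALUES (11, 'name', 'Alice'), (11, 'city', 'Austin');
""")


def quote_ident(name):
    """Quote an identifier that comes from metadata, not from a literal."""
    return '"' + name.replace('"', '""') + '"'


def build_ddl(source):
    """Generate one CREATE TABLE statement per class, one column per property."""
    statements = []
    classes = source.execute("SELECT id, name FROM classes ORDER BY id").fetchall()
    for class_id, class_name in classes:
        props = source.execute(
            "SELECT name FROM properties WHERE class_id = ? ORDER BY name",
            (class_id,),
        ).fetchall()
        columns = ["id INTEGER PRIMARY KEY"]
        columns += [quote_ident(p[0]) + " TEXT" for p in props]
        column_sql = ", ".join(columns)
        statements.append(
            "CREATE TABLE " + quote_ident(class_name) + " (" + column_sql + ")"
        )
    return statements


def migrate(source, target):
    """Create the per-class tables, then copy each value into its column."""
    for ddl in build_ddl(source):
        target.execute(ddl)
    rows = source.execute(
        "SELECT o.id, c.name, d.property, d.value "
        "FROM objects o "
        "JOIN classes c ON c.id = o.class_id "
        "JOIN objectData d ON d.object_id = o.id"
    ).fetchall()
    for object_id, class_name, prop, value in rows:
        table = quote_ident(class_name)
        # Make sure the object's row exists, then fill in one column.
        target.execute(
            "INSERT OR IGNORE INTO " + table + " (id) VALUES (?)", (object_id,)
        )
        target.execute(
            "UPDATE " + table + " SET " + quote_ident(prop) + " = ? WHERE id = ?",
            (value, object_id),
        )


target = sqlite3.connect(":memory:")
migrate(source, target)
migrated = target.execute('SELECT id, city, name FROM "Customer"').fetchall()
```

Once the data sits in per-class tables like this, a plain SELECT gets a whole object back in one query.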

MR:
Great links! I'd forgotten just how powerful and useful Wiki can be.

As for this:
"Question one is an example of the quintessential close-minded “I just need to get my work done leave me alone with those sorts of theoretical problems” person.  Certainly we have a limited set of tools with which to work and have real-world problems which must be solved – otherwise we don’t get paid and then all sorts of bad things happen.  But that is no reason to simply ignore better, proven solutions which are out there.  If everyone was of this type we’d not have light bulbs, or automobiles, or electricity, etc."

You sound like Ayn Rand. :-) But seriously, my curiosity about what they (the developers who hired me and then took flight when they saw how badly the project was going) were doing was what led me to take the job in the first place. By that time I had already chewed on a bit of FP and had some ideas formulated going in. But none of that prepared me for what I saw when I took a look at the schema. Keeping in mind what DWilliams said and something I had come to realize, they very well could have treated each object as a table and then had the system dynamically generate new tables whenever they felt their business logic warranted a new class. At least that way, it would be possible to run ad hoc queries against the database and share that same data with other applications.

But that was just the sound of my dreaming while I was working there and the impetus for writing that data migration utility.

Anyway, great links! I've spent hours today reading and following links here and there.

Cheers,
BDKR

BDKR
Monday, April 05, 2004

I've enjoyed working with both ODBMS and RDBMS from Smalltalk and have some observations.

The problem between OOPLs and RDBs isn't the impedance mismatch, but the attempt to force squares into circles.  Programmers desire to make the database an extension of their program instead of treating it as an object unto itself with its own methods and API.  Do we think HTTP should be bent to OOP's will?  Is everything OOP bumps up against supposed to bend to it?  I don't think so.

OODBMS, on the other hand, do blend more naturally with OOPLs.  Their biggest failing is the lack of a common query language.  What you save in programming your application (a consistent 25%) will be made up in writing reports in your OOPL instead of a simple SQL-like language.

Don't be fooled, however.  Both approaches ultimately require tuning.  Tuning an ODBMS requires some of that tuning code leaking into your application code.  Having just gotten rid of it (remember the 25% savings?) it begins working its way back in.

Of course, tuning a SQL db can be done outside the application.  It's isolated behind its interface (API) like any good object ought to be.
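
A small illustration of that isolation, sketched in Python with sqlite3. The orders table, the index name, and the sample data are all made up for the example. The application's query text never changes; only the database does.

```python
import sqlite3

# Hypothetical table and data, purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total INTEGER)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, i) for i in range(1000)],
)

# The "application" query: identical before and after tuning.
QUERY = "SELECT total FROM orders WHERE customer_id = ?"


def query_plan(conn):
    """Return SQLite's plan description for the application query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, (42,)).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " ".join(row[-1] for row in rows)


def totals_for(conn, customer_id):
    """Run the application query and return its results in a stable order."""
    return sorted(r[0] for r in conn.execute(QUERY, (customer_id,)))


plan_before = query_plan(conn)
result_before = totals_for(conn, 42)

# The tuning step happens entirely inside the database.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")

plan_after = query_plan(conn)
result_after = totals_for(conn, 42)
```

Comparing plan_before with plan_after shows the database switching from a table scan to the new index, while the results and the application code stay the same.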

Thomas Gagne
Wednesday, April 07, 2004
