Fog Creek Software
Discussion Board




N-Tiered Solutions/In-Memory Databases

Folks,

I was wondering if anyone here has produced n-tiered solutions which utilized an in-memory database?

The product we are developing has reached critical mass. We have a calculation engine that makes use of a cache mechanism we built in-house. As the product requirements have expanded, this cache needs to become more sophisticated. At the moment I am analyzing the feasibility of implementing sophisticated indexing algorithms ourselves versus buying one of the commercial tools on the market.

In theory, these products sound great and I could see many applicable uses for them.  However, we have all been bitten by black box solutions at one time or another.

So, in summary, I would appreciate any input, positive or negative, based on your experiences using any in-memory database technology.

beach bum
Thursday, November 07, 2002

If you're doing J2EE development, an intriguing solution could be Prevayler ( www.prevayler.com ). It's a system for keeping all of your business objects available in memory. I guess the idea is that 1) if you can keep your whole DB in memory, it will run much faster, and 2) if you eliminate all of the N-tier connections and put data and logic together in the same tier, you eliminate needless parsing and connectivity overhead.

The system seamlessly backs up the in-memory database on a continuous basis using Java object serialization. After any change is made to the data, an incremental update is saved to disk. After a certain number of incremental updates, the system saves a massive snapshot of all the data. Then, if you experience a crash (or a shut-down), you can recover your data into memory by loading the most recent snapshot and then stepping through all of the incremental updates saved since that snapshot.
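The snapshot-plus-incremental-log idea described above can be sketched in a few lines. This is only an illustration of the general technique, not Prevayler's code or API: the names `PrevalenceSketch`, `apply_command`, and `KNOWN_OPS` are made up for this example, and it logs changes as JSON lines where Prevayler uses Java object serialization.

```python
import json
import os
import tempfile

# Hypothetical command set for this sketch; a real system would have many.
KNOWN_OPS = {"deposit"}


def apply_command(state, command):
    """Apply one logged change to the in-memory state (a dict of balances)."""
    if command["op"] == "deposit":
        account = command["account"]
        state[account] = state.get(account, 0) + command["amount"]


class PrevalenceSketch:
    """Keeps all data in memory; persists a command log plus snapshots."""

    def __init__(self, directory):
        self.snapshot_path = os.path.join(directory, "snapshot.json")
        self.log_path = os.path.join(directory, "commands.log")
        self.state = {}
        self._recover()

    def _recover(self):
        # Load the most recent snapshot, if one exists...
        if os.path.exists(self.snapshot_path):
            with open(self.snapshot_path) as f:
                self.state = json.load(f)
        # ...then replay every change logged since that snapshot.
        if os.path.exists(self.log_path):
            with open(self.log_path) as f:
                for line in f:
                    apply_command(self.state, json.loads(line))

    def execute(self, command):
        if command.get("op") not in KNOWN_OPS:
            raise ValueError("unknown command: %r" % (command,))
        # Write the change to disk before applying it in memory.
        with open(self.log_path, "a") as f:
            f.write(json.dumps(command) + "\n")
            f.flush()
            os.fsync(f.fileno())
        apply_command(self.state, command)

    def take_snapshot(self):
        # Write the full state, then discard the log it makes redundant.
        # (A crash between these two steps would replay old commands twice;
        # a real implementation has to guard against that.)
        tmp_path = self.snapshot_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(self.state, f)
        os.replace(tmp_path, self.snapshot_path)
        if os.path.exists(self.log_path):
            os.remove(self.log_path)


def demo():
    with tempfile.TemporaryDirectory() as directory:
        store = PrevalenceSketch(directory)
        store.execute({"op": "deposit", "account": "alice", "amount": 50})
        store.take_snapshot()
        store.execute({"op": "deposit", "account": "alice", "amount": 25})
        # Simulate a restart: a new instance recovers from snapshot + log.
        recovered = PrevalenceSketch(directory)
        return recovered.state
```

Calling `demo()` builds a store, snapshots it, logs one more change, and then recovers everything from disk into a fresh instance.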

A possible drawback (or benefit, depending on how you see it) of this system is that there is no built-in querying mechanism. So you have the additional task of developing a way to query all of your objects. Of course, you eliminate the overhead of parsing SQL queries, and you can do things with your data structure that are impossible to do with SQL.
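To give a rough idea of what "developing a way to query your objects" can look like, here is a small sketch in plain Python. The `Customer` class, the sample data, and both helper functions are hypothetical; nothing here comes from Prevayler itself.

```python
from collections import defaultdict


class Customer:
    """A made-up business object that lives only in memory."""

    def __init__(self, name, city, balance):
        self.name = name
        self.city = city
        self.balance = balance


customers = [
    Customer("Ann", "Austin", 120.0),
    Customer("Bob", "Boston", 35.5),
    Customer("Cy", "Austin", 0.0),
]


def names_in_city_with_balance(rows, city):
    # Roughly: SELECT name FROM customers
    #          WHERE city = ? AND balance > 0 ORDER BY name
    return sorted(c.name for c in rows if c.city == city and c.balance > 0)


def build_city_index(rows):
    # A hand-built secondary index, standing in for what a database
    # would maintain for you: city -> list of customers in that city.
    index = defaultdict(list)
    for c in rows:
        index[c.city].append(c)
    return index
```

The first function is a full scan; the index trades memory and upkeep for faster lookups, which is the kind of work a query engine would otherwise do on your behalf.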

Having said all of that, I've had my eye on this project for a while now. The project administrators are very active. A new version was recently released, and very few bugs were reported. Those bugs that were reported seemed to be fixed quickly. I've looked over the code myself and it looks pretty clean. It's small (less than 2 KLOC, if you don't include sample projects, etc). And it seems to have very good benchmarks.

It's LGPL, so you can incorporate the code into a proprietary system if you want. The only reason I've never used it is that I haven't had a project recently that would call for a massive persistent object database. But if I had the opportunity, I would jump at the chance to develop a project with this system.

<DISCLAIMER: I have no affiliation with the Prevayler project. I just think it looks pretty cool.>

Benji Smith
Thursday, November 07, 2002

BTW, there are also Prevayler implementations ported to C#, Python, Smalltalk, and Ruby. There are links at http://www.prevayler.com/wiki.jsp?topic=PrevaylerPortsToOtherLanguages

Benji Smith
Thursday, November 07, 2002

"and you can do things with your data structure that are impossible to do with SQL."

Benji, can you give an example of a data structure that SQL cannot handle?

Rover Still
Thursday, November 07, 2002

Hi!

It's very nice to see that people are talking about Prevayler, as I'm one of its developers! :)

An example to your question, Rover, is that when you're working with Prevayler, you can use pure Java Objects. So, you can get full object-orientation support: inheritance, polymorphism and encapsulation of your data.

So, it gets a *LOT* easier to do trees, N:M relationships, and graphs in general. If you're storing more than a simple Customer:Product:Stock graph, then Prevayler can help you a lot... well, Java can help you a lot. Prevayler is just the persistence trick added to it ;)

Shameless self promotion: there's an article I wrote about it at IBM developerWorks, here:  http://www-106.ibm.com/developerworks/web/library/wa-objprev/

Carlos Villela
Saturday, November 09, 2002

Okay, I suppose that "impossible to do with SQL" was a bit of an exaggeration. But in many many many cases, a relational database model is a lousy representation of the data you're trying to work with.

I worked as a tech writer recently for a project that I thought was a database nightmare. (Since I wasn't a developer on the project, my input about the architecture didn't carry much weight.) The data structure consisted of an enormous tree of information, with different types of attributes at each level of the tree. The tree could contain variable levels of depth in its different branches, and each branch of the tree could make reference to one or more branches of another distinct tree structure.

Knowing the purpose of the data structure (organization of CAD standards information for massive engineering firms), it made complete sense to organize it as such, but storage and retrieval of the data was a huge issue (consuming lots of time and server resources), since it was all stored relationally in an MS SQL Server database.

The product was implemented as a server application that would receive an API call from the client and return an XML fragment based on that call. (The XML fragment was essentially just a portion of the main tree structure, without any transformations applied to the structure or data.) So, the API call would generate a series of SQL queries (sometimes in excess of 1000 different queries) that each had to parse and execute. The rows returned from the queries were used to assemble some really complex XML documents, which were then returned to the client.

I thought that this was a pretty stupid way of doing things. If the data structure is a tree, why not store it as a tree? Why take the trouble of breaking a natural tree structure into dozens of different tables just so that it will fit into a relational database? What's even more absurd is that after breaking the tree structure down into a row/column structure, you just have to turn around and re-assemble the whole thing back into a tree before returning it to the client. It's preposterous.

Relational databases are lousy at storing *some* complex data structures. Of course, working with native XML all the time can also be an idiotic way of doing things, since you're constantly parsing a bunch of text.

What makes much more sense is to work with objects as objects. Save them as objects. Query them as objects. And return them as objects. Then, if you need to turn the object hierarchy into an XML document, it's trivial to implement. Or, if you need to return some rows and columns from the hierarchy, that's trivial, too.
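As a rough sketch of that claim, here is a tree of plain objects turned into XML and into (name, parent) rows. The `Node` class and the sample names ("standards", "layers", "C-ROAD") are invented for illustration; they are not taken from the project described above.

```python
import xml.etree.ElementTree as ET


class Node:
    """One node of an in-memory tree, with free-form attributes."""

    def __init__(self, name, attributes=None):
        self.name = name
        self.attributes = dict(attributes or {})
        self.children = []

    def add(self, child):
        self.children.append(child)
        return child


def to_xml(node):
    # The object tree maps one-to-one onto nested XML elements.
    element = ET.Element("node", {"name": node.name, **node.attributes})
    for child in node.children:
        element.append(to_xml(child))
    return element


def to_rows(node, parent=None):
    # Flatten the same tree into (name, parent_name) rows, depth first.
    rows = [(node.name, parent)]
    for child in node.children:
        rows.extend(to_rows(child, node.name))
    return rows


root = Node("standards")
layers = root.add(Node("layers", {"discipline": "civil"}))
layers.add(Node("C-ROAD", {"color": "red"}))

xml_text = ET.tostring(to_xml(root), encoding="unicode")
```

Both conversions are a single recursive walk over the objects, with no intermediate tables to join back together.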

The *only* problem with using some other system (other than an RDBMS) is that you lose a very handy query language. SQL is a fantastic language for working with data that fits well in an RDBMS, but not all data structures fit that mold. I think the reason that so many people use an RDBMS is that they don't want to develop another method to query their objects.

Benji Smith
Wednesday, November 13, 2002

> Relational databases are lousy at storing *some* complex data structures.

This is not true. A relational database handles complex data structures much more gracefully than any other model. The value of a relational database is the "relation". A common mistake in understanding the relational model is to think of relations as just "rows and columns". They are not.

For example, a tree can be modeled as a single table:

create table tree (
    child  char(16),
    parent char(16),
    value  char(128)
);

A sample tree:

child       parent   value
----------  -------  -------------
boss        NULL     company villa
manager     boss     company car
programer1  manager  new mouse pad
programer2  manager  nothing

Manipulating this tree is much easier in SQL than in, say, Java.
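One way to walk the single-table tree above is a recursive query. The sketch below loads the sample rows into an in-memory SQLite database through Python's `sqlite3` module and lists a subtree with `WITH RECURSIVE`. Both choices are assumptions made for illustration: the post does not say which database or query style it has in mind, and not every SQL engine supports recursive queries. The `subtree` helper is a made-up name.

```python
import sqlite3

tree_conn = sqlite3.connect(":memory:")
tree_conn.execute(
    "create table tree (child char(16), parent char(16), value char(128))"
)
tree_conn.executemany(
    "insert into tree values (?, ?, ?)",
    [
        ("boss", None, "company villa"),
        ("manager", "boss", "company car"),
        ("programer1", "manager", "new mouse pad"),
        ("programer2", "manager", "nothing"),
    ],
)


def subtree(conn, root):
    """Return (child, depth) for root and everything below it."""
    return conn.execute(
        """
        with recursive sub(child, depth) as (
            select child, 0 from tree where child = ?
            union all
            select t.child, sub.depth + 1
            from tree t join sub on t.parent = sub.child
        )
        select child, depth from sub order by depth, child
        """,
        (root,),
    ).fetchall()
```

For example, `subtree(tree_conn, "manager")` returns the manager at depth 0 and both programmers at depth 1.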

A general graph can be modeled with two tables:

create table node (
    node  char(12),
    value float
);

create table edge (
    node1 char(12),
    node2 char(12)
);
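A minimal sketch of querying the two-table graph, again using SQLite through Python's `sqlite3` module. Several things here are assumptions: the sample nodes and edges are invented, edges are treated as directed from node1 to node2 (the schema does not say), and the helper names `successors` and `reachable` are made up.

```python
import sqlite3

graph_conn = sqlite3.connect(":memory:")
graph_conn.executescript(
    """
    create table node (node char(12), value float);
    create table edge (node1 char(12), node2 char(12));
    """
)
graph_conn.executemany(
    "insert into node values (?, ?)",
    [("a", 1.0), ("b", 2.5), ("c", 4.0)],
)
# Includes a cycle (c -> a) to show that traversal still terminates.
graph_conn.executemany(
    "insert into edge values (?, ?)",
    [("a", "b"), ("a", "c"), ("b", "c"), ("c", "a")],
)


def successors(conn, name):
    """Nodes one hop away from `name`, with their values."""
    return conn.execute(
        """
        select n.node, n.value
        from edge e join node n on n.node = e.node2
        where e.node1 = ?
        order by n.node
        """,
        (name,),
    ).fetchall()


def reachable(conn, start):
    """All nodes reachable from `start`, including `start` itself."""
    rows = conn.execute(
        """
        with recursive reach(node) as (
            select ?
            union
            select e.node2 from edge e join reach r on e.node1 = r.node
        )
        select node from reach order by node
        """,
        (start,),
    ).fetchall()
    return [row[0] for row in rows]
```

The one-hop lookup is an ordinary join; the multi-hop version relies on `union` (rather than `union all`) to drop nodes it has already seen, which is what keeps the cycle from looping forever.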

Rover Still
Tuesday, November 26, 2002
