Fog Creek Software
Discussion Board

Strings are actually good

There was quite a long discussion on the use of strings, and how slow they are. However, strings are the most basic and useful data structure that one can deal with. The idea was presented that a string-based (XML) database would be far too slow. Anyone who suggests this has never used a database system based on strings. Notice that I said strings, not text files. All too often, those Perl scripts and the like that crunch through a large text file give string processing a very bad name.

It is possible to design a very efficient and high-speed database around XML. In fact, a design exists now that has been around for about 30 years.

I am not going to toot about some OS that we all must run out and use. However, anyone who has used a multi-valued database will understand how an XML database can be structured. Changing the architecture of an MV system to an XML system would be a match made in heaven.
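To give a rough idea of why the mapping is so natural, here is a minimal Python sketch. It assumes the Pick-style convention of an attribute mark (character 254) between fields and a value mark (character 253) between repeated values. The record layout, the field names, and the record_to_xml helper are made up for illustration and do not come from any particular product.

```python
import xml.etree.ElementTree as ET

# Assumed Pick-style delimiters: attribute mark and value mark.
AM, VM = "\xfe", "\xfd"

def record_to_xml(record_id, raw, field_names):
    """Turn one delimited multi-valued record into an XML element.

    Each attribute becomes one element per value, so a repeated
    value simply becomes a repeated child element.
    """
    root = ET.Element("record", id=record_id)
    for name, attribute in zip(field_names, raw.split(AM)):
        for value in attribute.split(VM):
            ET.SubElement(root, name).text = value
    return root

# Hypothetical customer record: one name, two phone numbers, one city.
raw = AM.join(["Smith", VM.join(["555-1234", "555-9876"]), "Edmonton"])
elem = record_to_xml("1001", raw, ["name", "phone", "city"])
print(ET.tostring(elem, encoding="unicode"))
```

The whole record is a single string, and the repeated phone number needs no second table; it just shows up as a second phone element.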

As for multi-valued databases…well they have been around for a very long time. It is interesting that the speed of these systems is really based on how fast strings can be processed.
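As a rough illustration of what "processing strings" means here, the sketch below pulls one field out of a record by scanning for delimiters rather than parsing the whole record. It assumes Pick-style marks (character 254 between attributes, character 253 between values); the field and extract helpers are hypothetical names, not the API of any real system.

```python
# Assumed Pick-style delimiters: attribute mark and value mark.
AM, VM = "\xfe", "\xfd"

def field(text, delim, n):
    """Return the n-th (1-based) delimited piece of text, or "" if absent."""
    if n < 1:
        return ""
    start = 0
    # Skip over the first n-1 delimiters.
    for _ in range(n - 1):
        pos = text.find(delim, start)
        if pos == -1:
            return ""
        start = pos + 1
    end = text.find(delim, start)
    return text[start:] if end == -1 else text[start:end]

def extract(record, attr_no, value_no=0):
    """Return a whole attribute, or one value inside it when value_no > 0."""
    attribute = field(record, AM, attr_no)
    return field(attribute, VM, value_no) if value_no else attribute

# Hypothetical record: name, two phone numbers, city.
record = AM.join(["Smith", VM.join(["555-1234", "555-9876"]), "Edmonton"])
print(extract(record, 2, 2))
print(extract(record, 3))
```

Every lookup is a couple of substring searches, which is why the speed of the string routines ends up setting the speed of the database.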

If you want to read up on why these databases are so useful…then you can jump to the following:

When I get a chance, I’ll outline exactly how these amazing systems work.

By the way, the above web site was created in one afternoon (last Saturday) with CityDesk. Joel’s idea to give away a trial version that only limits the number of files was brilliant…as now I am starting to get hooked on this product…and approaching that file limit!

Albert D. Kallal
Saturday, December 22, 2001

Pick was very popular in vertical market data transaction systems in the 80's and I've seen a number of excellent implementations.  I've been involved in replacing some of them as well.

The most common reason for replacement was that the original hardware needed to be upgraded and the rest of the organisation was running on desktop systems that were difficult to interface with their Pick system (actually, probably not difficult, just expensive to go back to their Pick consultants).

I'd not even claim that the replacements were any better than the original systems; often it just made the Board feel better.

If there are interfaces to Pick systems now which make integrating straightforward, then it would be worthwhile pursuing it.  An Open Source Pick OS would be very interesting (he says, not having bothered to find out if there was one already).

Simon Lucy
Sunday, December 23, 2001

Multi-valued... XML (tree-structured data) orientation... primarily string-based... sounds like you're describing an LDAP directory.

LDAP (Lightweight Directory Access Protocol) and DAP predate XML, I suppose.  But their concepts of a tree orientation, free-form attributes hanging off entries, primarily string-based attribute values and queries... all map well to the sort of back-end you're describing.
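To make the comparison concrete, here is a toy Python model of a directory tree with multi-valued string attributes. It is only a sketch of the data model, not a real LDAP client or server: the entries, the example.com names, and the search function are made up, and matching is plain case-sensitive string comparison, which is simpler than what a real directory does.

```python
# Entries keyed by distinguished name (DN); every attribute holds a
# list of strings, so multiple values need no extra structure.
directory = {
    "dc=example,dc=com": {"objectClass": ["domain"]},
    "ou=people,dc=example,dc=com": {"objectClass": ["organizationalUnit"]},
    "cn=Jane Doe,ou=people,dc=example,dc=com": {
        "objectClass": ["person"],
        "cn": ["Jane Doe"],
        "mail": ["jane@example.com", "jdoe@example.com"],
    },
}

def search(entries, base, attr, value):
    """Return DNs at or below base whose attribute attr contains value."""
    hits = []
    for dn, attrs in entries.items():
        # The tree is implied by the DN: children end with ",<parent DN>".
        in_subtree = dn == base or dn.endswith("," + base)
        if in_subtree and value in attrs.get(attr, []):
            hits.append(dn)
    return hits

print(search(directory, "dc=example,dc=com", "mail", "jdoe@example.com"))
```

The second mail value is found the same way as the first, much like a repeated value in a multi-valued record.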

In fact, LDAP's underlying encoding (ASN.1) certainly _would_ have incorporated XML had it been on the radar at the time LDAP was defined.  Instead, ASN.1 is sort of a binary-encoded version of XML in which data objects can be encapsulated inside other objects... inside other objects.  Again, really nothing more than a harder-to-use, binary version of XML.
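The "objects inside other objects" idea can be sketched with a tiny tag-length-value encoder in the style of BER, the binary encoding LDAP uses for its ASN.1 messages. This is a simplified illustration only: it handles single-byte tags and definite lengths, the tlv and decode helpers are hypothetical names, and the sample payload is invented rather than a real LDAP message.

```python
OCTET_STRING = 0x04  # universal, primitive
SEQUENCE = 0x30      # universal, constructed (bit 0x20 set)

def encode_length(n):
    """Short form below 128, otherwise long form with a byte count."""
    if n < 0x80:
        return bytes([n])
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body

def tlv(tag, value):
    """Wrap value bytes in a tag and a length."""
    return bytes([tag]) + encode_length(len(value)) + value

def decode(data):
    """Decode bytes into a list of (tag, value); constructed values recurse."""
    items, i = [], 0
    while i < len(data):
        tag, length = data[i], data[i + 1]
        i += 2
        if length & 0x80:  # long form: low bits give the number of length bytes
            count = length & 0x7F
            length = int.from_bytes(data[i:i + count], "big")
            i += count
        value = data[i:i + length]
        i += length
        items.append((tag, decode(value) if tag & 0x20 else value))
    return items

# A sequence nested inside a sequence, roughly <a><b>cn</b><b>Jane Doe</b></a>.
inner = tlv(SEQUENCE, tlv(OCTET_STRING, b"cn") + tlv(OCTET_STRING, b"Jane Doe"))
outer = tlv(SEQUENCE, inner)
print(outer.hex())
print(decode(outer))
```

Where XML marks nesting with matching open and close tags, this style marks it with a length prefix, which is more compact but much harder to read by eye.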

But if you would like a high-performance back-end that supports many of the concepts you've outlined, definitely check out open-source projects like slapd/openldap.

D Ross
Sunday, December 23, 2001
