Fog Creek Software
Discussion Board

Integrating data access into programming languages

There was a research project at Microsoft run by a couple of my co-workers which involved embedding knowledge of relational and XML data access directly into the programming language. The research project was called Xen and I blogged about a presentation on it at

It seems inevitable that people will gravitate in this direction.

Dare Obasanjo
Thursday, March 25, 2004

Ummm, Xbase, Visual Foxpro and I guess Delphi and a few others wouldn't qualify?  For data aware languages anyway.

XML aware, well that depends on whether the XML is purely data representational or involves events as well.  There are a number of language dialects of XML which are themselves XML, then again there's XSLT which transforms XML.

Simon Lucy
Thursday, March 25, 2004

Data-aware means that the language primitives and operations actually understand the underlying data. There was a time when types like string and float were only available in libraries and weren't an intrinsic part of programming languages (in fact in C/C++ string still isn't).

The next step is for notions like relational tuple or XML tree to be intrinsic parts of the programming language, complete with operators, as opposed to having to jump through hoops using various third party libraries.

Dare Obasanjo
Thursday, March 25, 2004

Just a simple idea...

It would be really nice if there were an operator/keyword
nameof(), like there is sizeof().

Connecting struct members to GUI or DB would be so
much easier.

typedef struct idea  {
  char  job[256];
  char  title[256];
} IdeaRecord;

IdeaRecord  myIdea;

printf ("%s", nameof(IdeaRecord, myIdea.job));

should print "job" and

char  *chPtr = myIdea.job;
printf ("%s", nameof(IdeaRecord, chPtr));

should print "job" again.

Well, how to implement this into a language like C
is beyond me, but if only they could do it... somehow.
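
For comparison, languages with runtime reflection already expose member names, which is most of what nameof() would buy you. Here is a minimal sketch in Python rather than C, assuming a hypothetical IdeaRecord class and a made-up bind_fields helper. It only illustrates the wish, it does not solve the pointer case above:

```python
from dataclasses import dataclass, fields

@dataclass
class IdeaRecord:            # hypothetical counterpart of the C struct
    job: str = ""
    title: str = ""

def bind_fields(record):
    """Return {member name: value}, the mapping a GUI or DB layer needs."""
    # fields() reports the declared member names, so no name is typed twice.
    return {f.name: getattr(record, f.name) for f in fields(record)}

my_idea = IdeaRecord(job="editor", title="nameof")
print(bind_fields(my_idea))
```

The names come from the declaration itself, so renaming a member changes the binding automatically.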

Thursday, March 25, 2004

There is a language that integrates the database rather naturally - K. But any APL is rather good at it. The 'where' syntax is algebraic, that is:

select y from table where x > 3;

is written

(x>3) / y

and

select x, y from table where x > 3 and y < x;

is written

((x>3)&(y<x)) / (x,y)

(APL has a special font, and this ASCII transliteration corresponds to no standard, but I think should make the point clear).
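
The same "boolean mask, then compress" idea can be sketched in Python. The column-oriented table and its values below are made up for illustration, and itertools.compress stands in for the APL "/" operator:

```python
from itertools import compress

# Hypothetical column-oriented table: each column is a plain list.
x = [1, 5, 4, 7]
y = [9, 2, 6, 3]

# Equivalent of: select x, y from table where x > 3 and y < x
# Step 1: build a boolean vector, one entry per row.
mask = [(xi > 3) and (yi < xi) for xi, yi in zip(x, y)]

# Step 2: keep only the entries of each column where the mask is true.
rows = list(zip(compress(x, mask), compress(y, mask)))
print(rows)  # [(5, 2), (7, 3)]
```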

Ori Berger
Thursday, March 25, 2004

A few days ago as an academic exercise I made "data aware" objects that derived from a data aware base class which, when accessed or saved would probe for data-setting attributes. i.e.

[SQLRead("dbo.myclass_get"), SQLWrite("dbo.myclass_set")]
class MyClass : MyDataAwareClass
{
    public string Field1;
}

It worked, and it removed the rote database access code from the data objects (instead making them imperative modifiers). The downside was that performance was hundreds of times slower (reflection is slow) than simply embedding a specialized load in the data object.
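
A rough sketch of the same pattern in Python, with a class decorator standing in for the attributes. The sql_access decorator, the DataAware base class and the call_proc callback are all hypothetical names invented for this illustration, not the code described above:

```python
def sql_access(read, write):
    """Class decorator that records stored-procedure names as metadata."""
    def mark(cls):
        cls._sql_read, cls._sql_write = read, write
        return cls
    return mark

class DataAware:
    def save(self, call_proc):
        # Probe the class for its metadata at run time. This lookup and the
        # generic vars() walk are the reflective steps a hand-written save
        # would avoid.
        proc = type(self)._sql_write
        return call_proc(proc, dict(vars(self)))

@sql_access(read="dbo.myclass_get", write="dbo.myclass_set")
class MyClass(DataAware):
    def __init__(self, field1):
        self.field1 = field1

# A fake database call that just records what it was asked to do.
calls = []
MyClass("hello").save(lambda proc, params: calls.append((proc, params)))
print(calls)  # [('dbo.myclass_set', {'field1': 'hello'})]
```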

Dennis Forbes
Thursday, March 25, 2004

PL/SQL provides that integration.  Yes it ties you to Oracle and has many other weaknesses as a language, but the data access integration makes things convenient.

Ideally the compiler should take care of DB access.  For different databases, different plug-in modules for the compiler could be used, rather than providing libraries for programmers to explicitly call.

Thursday, March 25, 2004


Apple's Enterprise Objects Framework (part of WebObjects) addresses exactly the problem Joel describes.

You model your "entities" in a tool that maps your db tables to Java classes and columns to properties.  Then at run time, EOF lets you fetch objects directly from the database.  It turns rows into instances and sets the instance properties appropriately.  It also allows you to follow relationships to other tables as object references from your fetched objects.

EOF generates all SQL for you, and you never have to see it if you don't want to.
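
To make the row-to-instance idea concrete, here is a small sketch in Python using the standard sqlite3 module. It is not EOF's API; the Customer class, the table and the fetch_all helper are assumptions made up for the example:

```python
import sqlite3
from dataclasses import dataclass, fields

@dataclass
class Customer:              # hypothetical entity mapped to "customer"
    id: int
    name: str

def fetch_all(conn, entity, table):
    """Turn each row of `table` into an instance of `entity`."""
    # The SQL is generated from the class definition, not written by hand.
    columns = ", ".join(f.name for f in fields(entity))
    rows = conn.execute(f"SELECT {columns} FROM {table}")
    return [entity(*row) for row in rows]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER, name TEXT)")
conn.execute("INSERT INTO customer VALUES (1, 'Ada')")

customers = fetch_all(conn, Customer, "customer")
print(customers)  # [Customer(id=1, name='Ada')]
```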

Like so many Apple products, it's the world's best solution to an existing problem, but it went nowhere because Apple has no idea how to market to the Enterprise.  Hopefully, that's starting to change.


Jim Rankin
Thursday, March 25, 2004

Jim, that's no different than what the MFC Data Wizard does; it's not unique to Apple.

The very presence of the wiring up means that if your schema changes in the database, you need to regenerate the "wiring," otherwise your app fails at runtime, not compile time.

Just because it does the wiring up for you doesn't mean the language knows what's going on, and that's the real problem.
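
A small Python illustration of that failure mode, using sqlite3 and a made-up table. The query string plays the part of the stale wiring: nothing checks it against the schema until it actually runs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The schema has changed: the column used to be called "name".
conn.execute("CREATE TABLE customer (id INTEGER, full_name TEXT)")

# Stale "wiring" that still refers to the old column name. To the
# language this is just a string, so nothing flags it ahead of time.
QUERY = "SELECT name FROM customer"

def run_stale_query(conn):
    """Return the error message if the query fails, else None."""
    try:
        conn.execute(QUERY)
    except sqlite3.OperationalError as err:
        return str(err)
    return None

error = run_stale_query(conn)
print(error)
```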

Joel Spolsky
Fog Creek Software
Thursday, March 25, 2004

Try the following link; the guy's website is full of similar topics about integrating table operations into the core of languages.

Joe Booth
Thursday, March 25, 2004

Oh man, this discussion is as old as the hills. Go to CiteSeer to find the ruins of hundreds (ok maybe tens) of database-coupled programming languages that never made it.

I worked on Apple's (then NeXT's) EOF, I worked on all of the major Microsoft database access libraries, and I have worked directly with the teams developing runtime database engines such as SQL Server.

In my estimation, the runtime capabilities of SQL (or your database engine of choice) are what make the separation of language semantics from storage/query semantics both useful and necessary. Concurrency, transaction boundaries, and partitioning, in particular, are pretty darn useful. But things like this really clutter up programming languages, and more importantly, they introduce vectors for all sorts of difficult to debug programming flaws.

Until programmers stop thinking in terms of datatypes and start thinking natively in abstractions such as concurrency and autonomy, a truly integrated database language is unlikely to appeal to large groups of programmers as practical. (In an only slightly tangential vein, as someone recently pointed out: if LISP is so great, why did the market not adopt it?)

Embedding a good programming language into the database server is a much more attainable goal, imho. But this approach, of course, maintains the separation.

Still waiting for LISP
Thursday, March 25, 2004

What about Prolog?  It's a natural language for describing restrictions on a database and munging the data in various ways.

As for a language with intrinsic support for XML as a first class type, that's Lisp (in its 'purest' form), or Haskell, ML, etc.  The kinds of problems that XSLT is being applied to have traditionally been done using Lisp or mini languages written in Lisp.

A language with a dataset as a first-class type might help with some of these problems.  I can think of one hitch off the top of my head with merging the semantics of an SQL where clause with logical predicates in most common programming languages: the "in" operator.  In an SQL statement you can say "... WHERE x in (SELECT ..." and that's not a concept that most other languages support by default.
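
For what it's worth, some languages do get close with a membership operator over a computed set. A sketch in Python, with made-up in-memory tables standing in for the database:

```python
# Hypothetical tables held as lists of dicts.
orders = [{"customer": 1, "price": 150}, {"customer": 2, "price": 40}]
customers = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Bob"}]

# Inner query: SELECT customer FROM orders WHERE price > 100
big_spenders = {o["customer"] for o in orders if o["price"] > 100}

# Outer query: SELECT name FROM customers WHERE id IN (SELECT ...)
names = [c["name"] for c in customers if c["id"] in big_spenders]
print(names)  # ['Ada']
```

This only covers data that is already in memory; it says nothing about pushing the predicate down to a database engine.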

Thursday, March 25, 2004

Still waiting, I think that you raise an important point about language complexity.  I think that pushing new development into libraries is such a common thing for this very reason.  It seems like, past a certain point, the easier you make it to do one thing, the harder you make it to do something else.  Since you're still waiting for Lisp, that's a great example.

Lisp makes it really REALLY easy to do anything that has to do with tree parsing.  If you want to make a compiler, an interpreter, or a language translator, Lisp is very well suited for the task.  Plus the fact that it's dynamically typed allows you to do a lot of stuff that'd require obscure template syntax in something like C++ (provided that the Lisp functions you're using support the value types that eventually go into them).

But if you want to use an algorithm that's best represented by a process on contiguous bytes, it can be very very awkward in Lisp.  Sure, you can do it, but typically it's not pretty.

Guy Steele has said a lot of (I think) valuable things on this topic.  He's the first person I've heard say that data structures are small, stupid programming languages, and that large common programming languages are just further extensions of that theme.  From that perspective, it makes sense that certain problems would be awkward in certain programming languages -- because certain algorithms are awkward with certain data structures too.

Thursday, March 25, 2004

But what about Prevayler and its non-Java equivalents?  With Prevayler, you write your storage and query code in the same language the rest of your code is in.  It's not 100% transparent, but it does mean that there is no "SQL layer" between your application and data.  There's no need to learn a separate query language, and you get to deal with first class objects within your code.
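
The general approach (keep the objects in memory, journal every change as a command, replay the journal on restart) can be sketched in a few lines of Python. This is not Prevayler's API; the Prevalent class, the COMMANDS table and the replay function are hypothetical names for illustration, and snapshots are left out:

```python
import io
import json

def add_customer(state, name):
    """A command: the only way the in-memory state is allowed to change."""
    state.append(name)

COMMANDS = {"add_customer": add_customer}

class Prevalent:
    def __init__(self, journal):
        self.journal, self.state = journal, []

    def execute(self, name, *args):
        # Write the command to the journal first, then apply it in memory.
        self.journal.write(json.dumps([name, args]) + "\n")
        COMMANDS[name](self.state, *args)

def replay(journal_text):
    """Rebuild the state after a restart by re-running every command."""
    state = []
    for line in journal_text.splitlines():
        name, args = json.loads(line)
        COMMANDS[name](state, *args)
    return state

journal = io.StringIO()          # stands in for a file on disk
system = Prevalent(journal)
system.execute("add_customer", "Ada")
system.execute("add_customer", "Bob")

# Queries are ordinary code over ordinary objects.
starts_with_a = [n for n in system.state if n.startswith("A")]
recovered = replay(journal.getvalue())
print(starts_with_a, recovered == system.state)  # ['Ada'] True
```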

Thursday, March 25, 2004

Prevayler is just an object dumping ground.  It is great at being an object dumping ground because it has cut out all of the database features that get in the way of dumping objects.

Prevayler is more like changing the database to match the way that a program might use it than vice versa.  Not that I'm sniping; one makes just as much sense as the other in the right context.

Keith Wright
Thursday, March 25, 2004

Off the top of my head, MUMPS (M, Open M, Caché) and COBOL are good examples of relatively popular languages with strong, native integration of data access calls in the language syntax.

Friday, March 26, 2004

Progress 4GL has native DB support (including data types), which is much easier to use than SQL. In addition, you can write any imaginable business application in this language (Console, GUI, Web etc).

It's widely used by manufacturers and financial organisations like SEB, Toyota, Ford etc.

The catch is

Vlad Gu
Friday, March 26, 2004

it being 4GL.

Vlad Gudim
Friday, March 26, 2004

The AllegroStore add on to Franz Lisp has the best select statement-like support of any of the object persistence systems I've seen.  See section 6.11 of the AllegroStore manual, "Queries and Iterators":

You can pretty much see the "spiritual select-nature" of the simpler examples, and translate them pretty one to one into SQL select syntax.  But you can use (gasp!) macros in the queries too. 

The syntax to make a class with persistent slots is dead easy too.  You simply put the ":allocation :persistent" keywords on the slots in a class.  Bingo, it persists.

I've been wishing for an open source alternative that was half as slick for years!

David Mercer
Friday, March 26, 2004

Quick Example of Progress 4GL Code:

FOR EACH customer:
    DISPLAY customer.cust-name.

    FOR EACH order OF customer WHERE order.price > 100:
        markup_ratio = 1.0 - order.wholesale / order.price.
        IF markup_ratio > 0.2 THEN DO:
            DISPLAY order.number order.price.
        END.
    END.
END.

The variables (order.number) represent the contents of their respective tables, where the first part is the table name and the second part is the attribute of that table.

Shawn Leslie
Friday, March 26, 2004
