Fog Creek Software
Discussion Board

Database access in Web applications

Assuming I'm writing a simple web page that displays an article. It needs to display the content of the article, but also the name of the author.

The author is linked to the article using an author Id field. The authors table also contains information that is irrelevant to this page, like the username of the author, or his phone number.

In order to keep the actual code clean of SQL, I would write two classes that represent an Article and an Author. When initializing the Article it would normally select everything it can about the article, and then initialize the Author object, again, selecting everything it can about him.

Obviously, this model can be very resource hungry, especially when using inside a bigger application. Is there a better way to design this without falling back to a plain, single SQL query?

Tuesday, August 31, 2004

Assuming there are likely to be small set of authors, maintain the entire set of Author objects in memory as a shared resource; load all the data in one go at system startup, and make sure all Author create/update/delete operations go through a broker class so you can write through your cache in memory.

That way, you only ever need the author ID which is already in the article table; this gives you the lookup you need to fetch the Author object directly without another query.

Obviously, this pattern only really works if there are a limited number of authors, and that number is likely to stay limited. If not, you can use a fetch-on-demand, cache-recent-requests strategy. Either will offer you better performance than just fetching the author info from the database every time.

.NET Guy
Tuesday, August 31, 2004

Do you need to deal with the abstract notion of "Article" and "Author" in a lot of different places?  Is there a lot of business logic, aside from just the data, that is associate with "Articles" and "Authors".  If not, just write a simple SQL query and be done with it.

How is "select article.content, from article, author where article.authorid =" more complicted than creating classes, get/set methods for each property, save/get/delete methods etc.?

I'm more and more convinced that OO designs which consist of a bunch of classes that have the same attributes and names as underlying database tables (which often seems to be the case) are just a waste of effort and a layer of complexity that hides what's really going on.

Why write, debug, and have to maintain a bunch of code that re-implements what the underlying database is already modeling natively?

OO yeah whatever..
Tuesday, August 31, 2004

If there's not a lot of domain logic, there's nothing wrong with a transaction script style architecture. And it doesn't sound like you have much complex logic there. Of course, moving a transaction script to a full OO-style domain model is hard, so if you are sure you'll need it, don't hesitate to make the classes. Code for the complexity level you're *sure* you'll need, and no more.

I plug the book a lot, but it really is that good: go read Fowler's Patterns of Enterprise Application Architecture. This question sounds like you're interested in his chapter on object/relational mapping.

Mike Swieton
Tuesday, August 31, 2004

Make one class call "Backend", another called Result [was discussed here in the past] and put the SQL behind it. Use proper wording for the methods.


Result r = backend.findAuthorById(id)
if (r.isSuccess()) {
  ResultSet rs = (ResultSet)r.getExtraData();
  <<loop on rs & display or forwardToDisplayPage(rs)>>
} else {

In fact you can generalize the backend methods:

    whereClausen default null,
    orderClause default null,
    view default 'list'
): Result

Then a bunch of :

backend.UpdateXXX and
backend.calculateXX and

methods will do the job.

Why avoid showing that we have a DB when it is so powerful to use its features ?

Why putting a ton of abstractions on top of each other to display a list in  a rectangle on a web page ?

Too much clutter in backend ?

Yeah sure, put some delegates in it if you need to so that it stays manageable by a team. If you are the single developer, don't mind and put everything in there.

What you can also do is to pass a Command object as an extra parm for added flexibility.

Wednesday, September 1, 2004

I only gave the simplest possible example. My actual code is much more complicated than this.

Assuming I am building an e-commerce website with many different supplier. In any page I need to access several variables from the supplier table, like his name, email, etc.

In some pages I also need to access a function which calculates the possible amount of payments depending on the cost of Product and some variables that are different to each supplier.

The supplier class must also do some other actions, like validation when updating the supplier details through the site admin interface.

So the options I have:
1. Wrtie a proper Supplier class that initializes itself with all it can about the Supplier, including details I may never need anywhere on this part of the website.

2. Load only the basic supplier details in each request, and only select the payment details when I need them, in another sql query. Here I may have another problem - where do I put the payments code, and how do I initialize it? in this model the Supplier object is not complete.

This is, again, only a small example - in a real website I may have those issues again and again in many different objects. Until now I found myself just falling back to normal SQL in many places, but I always wonder what is the "right way" to do it.

Wednesday, September 1, 2004

What language is this?! If you have a language which will allow you to easily override the accessors, create a class with lazy initialisation for the parameters that you only need rarely, and/or would be expensive to compute/request. This should be trivial in something like C# or PHP5.

And definitely look in to books on Patterns of Data Access. Well worth reading.

Andrew Cherry
Wednesday, September 1, 2004

The following Microsoft whitepaper discusses options for what you describe:

Their recommendation is to design your Business Entities around your application functionality rather than mapping directly to tables.  To do your example in .Net, this translates to creating a Dataset to represent the important fields of the Author and Article tables rather than all the fields.

Does this help?

Wednesday, September 1, 2004

> What language is this?

Currently using PHP4, but slowly migrating to 5.

> And definitely look in to books on Patterns of Data Access. Well worth reading.

I will. I will also be thankful if anyone could point me to some well designed open source projects that implement those patterns.

Wednesday, September 1, 2004

One word: Class::DBI ( It's Perl, not PHP, but looking at it should give you an idea how to better approach the problem.

Availability of tools like this was one of the reasons I said good-bye to PHP after four yeas of fighting with it. It never matures, just grows.

Wednesday, September 1, 2004

The DBI model is available in several languages, in fact, or something close to it.  PEAR::DB, for instance, is a great choice for database abstraction in PHP.  libdbi offers the same features for C/C++ (although admitedly for a very small number of database engines).

When faced with this very same situation I decided that what my classes needed was a way to get information from an query, not just one that it had initiated.  This let me populate my author object from the same query that populated my article object.  It cut down on the number of queries, and ensured that all access to the database would be treated identically by the object.

Clay Dowling
Wednesday, September 1, 2004

*  Recent Topics

*  Fog Creek Home