Fog Creek Software
Discussion Board




Design by Database

In my current job, I've noticed a strong tendency towards designing an entire system based on the database tables and relationships required. 

To me this seems backwards, as I believe the system should be modeled based on the functionality of the system.  This could easily be translated to a database representation that is appropriate, depending on the needs of the data (persistence, search speed, etc).

As a result of this designing by database, there are several systems within my company that are a one to one mapping from database tables to objects in the code.  This makes maintanence a nightmare to say the least.

Have any of you had similar experiences?  Any feedback is appreciated and welcome.

Seeker
Wednesday, July 16, 2003

Certainly.  To speed this up I use Jakarta Torque (though I obviously work within a Java environment) - so I define my data structure and Torque generates the SQL to create the database structure, and the java code so that I have a set of objects that correspond exactly to rows in the table, and classes to aid searching etc.  All connection-pooled.

Konrad
Wednesday, July 16, 2003

Something to think about: system functionality is generally easier to change than database schemas. Often the database schemas will outlive everyone in a department and through several iterations of applications sitting on top of them.

Scott Ambler has lots of good docs on this sort of stuff: http://www.agiledata.org/

Chris Winters
Wednesday, July 16, 2003

There was a long discussion about this on the Extreme Programming mailing list awhile back.  XP believes in doing nothing until it's needed.  Some folks argued that they sometimes absolutely *know* that their application will need a database.  XP'ers, including Ron Jeffries, pointed out that even if *you* know that you need a database, that doesn't make it good policy to start development with a database.

I've developed applications where I was tempted to start with a database.  I've always been glad I didn't, *especially* on projects where I ended up with a database.

For example, I'd start a project where I "knew" I needed a database, and would plan out the tables.  Then I'd tell myself, "No, you don't need that yet."  So I'd write the application without the database, then add the database when I needed it.  I found that the database design that grew out of that process was much simpler and elegant than the design I'd initially planned.

Brent P. Newhall
Wednesday, July 16, 2003

I think the problem is that normalisation is such a simple process.

By going through a few simple processes you can go from a prose description to a database design in just a few hours.

Then you just have to create the forms and objects on top of the data.

It gives you a nice clear route to take that gives you a fully formed computer system with very little thought.

There lies the problem, very little thought is put into these systems and it shows.

Ged Byrne
Wednesday, July 16, 2003

XP is saying you are too weak or stupid to use judgement.
So  please substitute this simple rule for your judgement.
No thanks.

valraven
Wednesday, July 16, 2003

For enterprise systems always remember that your application does not own the database. The same data will be used by many applications, some of which you will never know.
In guess you could say that data is forever, whereas applications come and go.

Just me (Sir to you)
Wednesday, July 16, 2003

I wholly disagree with many of the sentiments given on here. In many cases the whole reason for an application's existence is the underlying database and the data stored within, and the applications that you build around it are nothing more than transient views (and there can be many views onto the same database, each using different technologies and architectures, be it a n-tier .NET application or a client-server command shell application).  As another said on here: In many cases the database will live long after your carefully considered and prototyped componentized front end.

I am entirely for modeling the database based upon the true storage and business needs, and then developing front-end and middle-tier applications that intelligently use it. Without doing that you end up with trash databases (unbelievably common) because someone was more interested in their piece of candy technology that they developed upon it.

Dennis Forbes
Wednesday, July 16, 2003

Chris,

The AgileData link is excellent.  Thanks for that.

Ged Byrne
Wednesday, July 16, 2003

Yes, thanks for the link Chris.

Seeker
Wednesday, July 16, 2003

"I am entirely for modeling the database based upon the true storage and business needs, and then developing front-end and middle-tier applications that intelligently use it. Without doing that you end up with trash databases (unbelievably common) because someone was more interested in their piece of candy technology that they developed upon it."

There is no excuse for trash, lets be clear about that. Whether it is the database or the front-end.

But "modeling the database upon the true storage and business needs" sounds to me as "modelling the database on the true business applications."

Because I think businesses exist to do achieve results. In order to better achieve results, they may decide to store information, which they can later process to support business activities.

So, they don't just store data, they store data they need in some form or another. Now, to determine which data that is, you need to look at the business activities.
If that data is modelled well, there is a clear distinction between real-world concepts and "artificial" data, which may be transient, or semi-permanent.

The data that represents real-world concepts will not have to change shape, unless the world suddenly changes.
The "artificial" data, however, is there to represent relationships, or intermediate results. This may change over time, depending on how the activities of the business change.

But these changes always preceed the change in data model. Because of the change, some relationships may no longer be necessary, or new ones may be needed. Or if the business changes substantially, new real-world concepts may be needed.

If your database environment was not selected with this in mind, you will be forever trapped by choices once made.

Unless something is wrong, your applications represent your business activities, so your database models had better support anything your applications might require.

All within bounds of financial and technical feasibility of course...

Practical Geezer
Wednesday, July 16, 2003

Thanks for all the responses.  After reading through them so far I have come to the following conclusion.  I do believe the core database design is crucial for all of the core data and should be taken into consideration very early in the design stage of the project.

I suppose part of the problem is that no distinction is being made between core data, and data that is "artificial" or transient.  This data really falls into a category of runtime data, or perhaps persistent data which is dependant on the business layer requirements.

In addition, there is yet another category for data storage of application specific configuration data.  This could include console options, runtime business layer options, etc.

Therefore, I do think that perhaps for the core data, it may make sense to design this initially at the onset of a project.  But as for the persistence and configuration data that is specific to this one project, I feel this should be dictated by the business layer modeling requirements.

Seeker
Wednesday, July 16, 2003

valraven,

XP says no such thing.  What it does say is that you should withhold judgement until you know the relevant facts.

The only way to correctly understand the nuances of a problem is to try and solve it.  Thinking about a problem isn't the same as solving it because you don't have a testable result.  The very nature of solving a problem fleshes out all the issues you couldn't think of because you didn't have all the facts, not because you were weak or stupid.

Oren Miller
Wednesday, July 16, 2003

I believe the design of the middle tier datastructures should be based on the Objects you are creating and the Database should be normalized appropriately. (not necessarily a 1 to 1 relationship class to table).

If you had 10 tables in the database you don't necessarily have 10 middle tier classes (one for each table - that's silly).
The middle tier objects should be based on what they really are from a business sense.

If an object is really a join between 2 tables then you have a class with properties reflecting the fields you need out of the 2 tables. The the DB access method uses a SQL Join to bring the fields into the class. So you would have one class with fields from 2 tables.

For example:

Maybe you have a Customer class with properties CustomerName and DepartmentName. There are 2 tables in the DB, a Customer table which contains a CustomerName field and a Department table which contains the DeprtmentName field.
Your DB method Customer.getCustomer() would do a join on the 2 tables and pull in the fields from different tables, but actually place them in the Customer Object (which is one class, not two just because 2 tables are involved)

GenX'er
Wednesday, July 16, 2003

"The end of database-centric design" - Sean MCGrath

http://www.itworld.com/nl/ebiz_ent/06242003/

Slowly we begin to realize that the database is what's been holding us back....

fool for python
Wednesday, July 16, 2003

hmmmm

So XP means developing in a cloud of fog following a lot of twisty passages all alike until we run out of paper to make the map and reinvent the wheel of data organisation to store it all, no doubt thinking of all of the fields as we need them?

The words amateur and foolish spring to mind.

No the data representation need not be exactly mirrored in the UI, nor the business rules.  Business objects will likely span entire structures of tables and fields sometimes dynamically. 

Tough that the real world data doesn't conform to the UI, its not meant to, they're too entirely different things.  The UI is the abstraction of a way of looking at data (which might come from multiple sources).  There can be many UIs looking at the same set of data for different purposes and different audiences.

The database does not impose on the UI in how it represents data and relationships but it does mitigate what is represented and what is and is not valid.

Simon Lucy
Wednesday, July 16, 2003

Simon, that's a masterful troll.  I don't even know where to begin with your straw man argument

XP says nothing like what you wrote.

Brent P. Newhall
Wednesday, July 16, 2003

Regarding that IT world article, what can I say. Been there, done that, came back.
The problem with all the meta-model driven archectures is that they do not solve the problems in practice.
The problem is not "execution speed", as the writer implies. the problem in my experience is that although in theory such systems are desigened for change, in practice they break even earlier as the inevitable mistakes in the system now become almost impossible to hunt down/debug due to the mountain of intractable layers of indirection.
Yes, in a perfect world these could be the solution. In this world, sticking to KISS almost always pays of.

Just me (Sir to you)
Wednesday, July 16, 2003

Oren, i've done dozens of database systems. The only way
to solve the problem is not to pretend that i haven't done
it before. The best way is to use my experience and make
the best decisions i can.

There is no point in the process where i will ever have
all the facts. So if i make a decision now based on
experience i am a head of the game. I always know
i can make mistakes and will have to make changes.
But this is true at any point in the project. If not we
wouldn't have to refactor or test.

If you have done the planning game, talked to the customer, and make your stories, it's no secret that you
will need a database. To pretend otherwise is silly.

I suspect you don't have this attitude anywhere else in
your life. When you hit 65 do you start saving?
When you get up in the morning do you dress or wait
until later because you don't know if you need to be
dressed? Do you brush when you get a cavite? Do you
say honey, you look fat in that, because you expect what
happened last time not to happen? No.

We use judgement all the time. When and how to use judgement is part of judgment.

And just because i don't agree with XP in this area
doesn't mean i don't understand it. I do. I think it is
wrong.

valraven
Wednesday, July 16, 2003

Usually early on you will discover business objects and their relationships. These objects usually will have to be persisted somewhere. So it is not a big leap to take the objects and normalize them into a database.

I think where a lot of people go wrong they think objects = tables

DJ
Wednesday, July 16, 2003

I've seen this said tangentially here, but I want to clarify:

I start with a database. I design the database based on the real-world entities involved in the process. Then the UI comes from the business requirements for the current application. The business objects do the translation in between.

By basing the database schema on the real world entities you're insuring against change - real-world entities simply don't change that quickly. Relationships and attributes do, but those are easier to manipulate in business logic anyway.

So yes, I always design by database. But I don't let the database drive the design.

Philo

Philo
Wednesday, July 16, 2003

valraven,

I expect I do have this attitude.

Do I start saving when I'm 65? No.  But do I come up with a long term investment strategy when I'm 18 and expect it to be at all reasonable? No.  What is my risk tolerance?  What is a reasonable amount for me to set aside?  What will my other expenses be? Things like this I don't know until I try a few things out, and these and other variables are likely to change as my age, and income change.  I start saving early (which is what XP says to do, start DOING early).  What I would not do is waste my time planning out a complete investment strategy from 18-65 and expect I can possibly know any of the multitude of variables that can and will change as I learn more about how my life and investments play out.

We actually do go out without dressing before we learn otherwise (this is how I imagine you and most other children learned such things).  And although I get dressed everyday, I don't wear the same clothes everyday because I have very little forsight except very near in the future what occasion I will have to wear a particular type of outfit.  These details only become clear as I begin to find out what my day will look like.

I suspect most people don't brush their teeth to avoid cavities.  Some other form of feedback makes you learn much more quickly to do this.  Our mouth begins to smell and crud grows on our teeth.  So we learn very quickly to do this.  On the other hand I would bet very much that many people do not begin *flossing* until a cavity shows up.

As for the example with the wife.  We learn through socialization what is and isn't appropriate.  XP doesn't say forget what you know.  It just says you may not really know what you think you do. This is why when you go to another country you will make many awkward social mistakes because your experience told you things that are not true in this new environment.  This is why so many first dates are awkward (did I do/say something wrong?) The reason we learn these things is through feedback, we say something, she gets angry, or stops talking, or something.  Until that happens, we are guessing what may or may not happen.  Saying your wife looks fat is fairly obvious, but so many other things that a woman takes seriously are just not that obvious.

The point is just knowing you need a database isn't enough information to do much at all.

Oren Miller
Wednesday, July 16, 2003

>XP doesn't say forget what you know.  It just says
>you may not really know what you think you do...The >reason we learn these things is through feedback

So to be consistent if you have learned from other
projects, your feedback, you should feel free to use
that information in your new project. Every date is not
the fist date. And if you didn't use feedback learned
from earlier dates in later dates you would be far
less efficient.

valraven
Wednesday, July 16, 2003

Disclaimer: I don't do XP; I follow some of the discussions, and contribute around the edges on some test related issues.

That said, it appears that valraven's understanding of XP doesn't match mine.  So I'm going to attempt to describe how I understand XP to apply here, and ask that valraven point out "the mistake" I'm making.

The foundation of XP is the notion that "best code" can be delivered in stages, evenly spaced in time, such that stage N+1 always offers more total value to the customer than was delivered in stage N, that the increase in value is always greater than the cost to deliver it, and that the changes made for version N+1 are always those that maximize the value increased during that time interval [note - I'm less certain about this last bit than much of the rest].

This is achieved by delivering code that supports stories.  A collection of stories is created; customers describe what they want, these descriptions are divided into units of manageble size, the team estimates costs for the stories.  Work is determined by the customer prioritizing the stories based on business value and cost.  Stories can be added as the project goes along, and cost assessments may be changed along the way.  Each iteration begins with the customers prioritization of the unfinished stories.

During an iteration, developers attack stories as prioritized by the user.  Work done is specific to delivering the current story - look ahead development is frowned upon.  Work to simplify the existing code set is encouraged.

Now, one concern is that the team may paint itself into a corner - the short sited planning leading to implementation choices which, appropriate in the short turn, add to the long term cost.  XP theory asserts that, because of the continuous simplification in the existing codebase, the problem system can be cheaply shunted and replaced with a subsystem more in tune with the later goals.

That set up, where does experience come in: how do you "use my experience and make the best decisions i can"?

What I see are:
1) Story refinement - it should become easier to distinguish those story elements which call for a database implementation from those which do not, and to present those stories in such a way that the customer can correctly choose where her value lies.

2) Cost evaluation - you ought to get better at estimating the real cost of delivering those story elements, in present tense and future tense, so that the customer can make the correct determination of $1M now and $1.5M later vs $2M now.

3) Implementation cost - I expect experience to translate as improved development time, especially with regard to knowing which tests to write to ensure that the implementation behavior is appropriately constrained.


So how'd I do?

Danil
Wednesday, July 16, 2003

How does simplifying code relate to persistence?
Will you simplify your code such that it doesn't
need persistence anymore?

valraven
Wednesday, July 16, 2003

Hmmm was I trolling?

I don't think so, it was a response to something someone said about not deciding whether you need a database until you need it.  I agree, that's not XP.  I think what I was doing is called asking a rhetorical question.

XP is abused, its eminently abusable.  Its a sketch at a methodology, not Holy Writ.

Simon Lucy
Wednesday, July 16, 2003

Danil - you're close, but didn't specify one major point of XP, and that's the size of your "stories". What you described could apply to traditional software development - 1.0, 1.5, 2.0, 97, XP, whatever.

One of the radical aspects of XP is making the releases much, much smaller. For example, if you were building Excel under XP, you wouldn't wait until you had formulae and printing and cutting/pasting done - you'd deliver your first release to the customer when it's a bunch of cells that do basic math. The point is that *those functions* (cells, navigation, math) are complete - tested and version 0.9 beta. While the customer is working with the rudimentary spreadsheet, the developers are working on, say, cutting/pasting and formulae. Customer comes back with "we'd like to use arrow keys to navigate" and that gets folded into the "Page of cells" module, tested, and released.

Development is much more piecewise, but the customer is involved earlier and there's more give & take going on.

Does that make sense?

Philo

Philo
Wednesday, July 16, 2003

"How does simplifying code relate to persistence?"

The same way it relates to algorithms, I suppose.  What have you got in mind?

(Good catch by philo - my original draft did specify short iterations, but I failed to notice that para got nuked in an edit).

Danil
Wednesday, July 16, 2003

The point about the IT World article is that databases are inherently inflexible. This is usually considered a feature.  The point is we should question whether databases need to be so inflexible just because the current implementations are. We need more flexible databases while retaining most of what we like about them.

If you can't have your cake and eat it too, get another cake.

fool for python
Wednesday, July 16, 2003

Why is the practice of db refactoring not more widespread?


I see the evolution business slightly different. You start of with a simple CRUD need - best generated form the db structure. Now you need more logic - so you start layering views and sp's on the db. Then you reach the point where you want to have the sp's in busines objects in their own right - few projects would get here if they followed the evolution.

Karel
Thursday, July 17, 2003

Databases need not be seen to be inflexible, inflexible in terms of shoving data into them perhaps, though even then the collection of that data can be quite flexible.

The view the middle tier, or rather intermediary tiers provides to clients can show the data in any of the manifold combinations its possible to do.  What does need work is  OLAP.

I want to play with Titanium,  http://www.mdbs.com/html/titanium.html

Simon Lucy
Thursday, July 17, 2003

*  Recent Topics

*  Fog Creek Home