Fog Creek Software
Discussion Board




Development process for database applications

Adding new code or making changes to existing code for a pure software application is relatively easy... you can simply make your change, compile your new code and make the new components (DLLs or what-have-you) available to your customers.

But what about in a database application where you need to make changes to either the structure of the database or the data itself?

How do you handle this type of situation? It would seem that you could have a database script or set of SQL commands that provide the necessary changes. But in this case it would also seem you need a very strict process in place to ensure that any time a database change is made, that change makes it into a script that is released with the new code.

Does anyone have a "best practice" for updating database applications (more specifically, in a Windows environment with something like Oracle or SQLServer)?

Lenny Fitzman
Monday, May 03, 2004

If you find a best practise, please tell our suppliers ;-)

SQL scripts work pretty OK at times, but still there is a major risk that customers does not run all the scripts, or run them in the incorrect order so there is dependencies which results in the scripts not being executed correctly.

I would make a standalone EXE that ships with my application, and supply script files to be run with that utility.
That way you can assure that the DDL is run in the database correctly.

Also keep a table in the database of the DDL patch history to aviod scripts being applied twice. This is a sort of versioning of the database schema. The text file can then be shipped with the installer of the new version.

2 Textfiles gets shipped on the installation media:

DBFix1.fix:

CREATE TABLE CUSTOMER_BK
AS SELECT * FROM CUSTOMER;

DELETE FROM CUSTOMER;

ALTER TABLE CUSTOMER
ADD(NEW_FIELD VARCHAR2(10));

INSERT INTO CUSTOMER
(ID,NAME,NEW_FIELD)
SELECT ID,NAME,'New stuff' FROM CUSTOMER;

COMMIT;

[CheckSum=0xc383728d]


The second file is a text-based file that controls the execution of your DB-schema patching program:

[My DB Patch]
RequireDBSchemaVer=1.0
RunSQL('DBFix1.fix');
SetVersion('1.1');
[End]

This may seem somewhat tedious but it allows you some control over what gets executed in your database. If you feel like it this allows you to checksum your database scripts so that clients wont get creative and try to "fix" your scripts.

Also, you could have a specific requirement so that your client software will not run if it sees a DB-schema that is the wrong version.

HTH

Patrik
Monday, May 03, 2004

>I would make a standalone EXE that ships with my >application, and supply script files to be run with that
>utility.

Just to clarify on this; The executable SQL statements would be vendor-specific SQL still. Do *NOT* try to implement your own SQL parser for this utility. I cant even begin to tell you how much vendors that implement own SQL parsers to support multiple databases with "their" SQL language suck.

Trying to abastract SQL to a "platform independent" level is a sure way of running into problems.

Patrik
Monday, May 03, 2004

We did this on a project with SQL server.  We would use the standard SQL server tools to modify/update the database structure and static data.  SQL server has an API that can reverse engineer the database script from the database itself.  This script would then be checked into CVS with the code.  Our automated build would then use this script to rebuild the correct version of the database with the code.

Oren Miller
Monday, May 03, 2004

Martin Fowler discusses this topic in his article named Evolutionary Database Design:

http://www.martinfowler.com/articles/evodb.html


Monday, May 03, 2004

This is one of those reasons the folk over at dbdebunk.com are so angry[1]--apparently, if we all use what they call sixth normal form, database structure updates are entirely additive, so updates are entirely pain-free, so updates can be deployed more easily, so the database structure is more lightweight.

By the way, in 6NF, your fields do not use Null values whatsoever.  Instead, you use a 1:1 relationship for each attribute, so if there is a value for that attribute, you add the attribute to the related table.  If there is 'no known value' for the attribute, i.e. the null value, then you just don't have an entry for that attribute.

It's all laid out in this document:
http://www.hughdarwen.freeola.com/TheThirdManifesto.web/Missing-info-without-nulls.pdf


It gave me something to think about, but not much more than that.  Updating something like an "Employee" table would go from a routine single-table data set, to a mammoth set of multiply-joined tables, i.e. one table for employee name, one table for employee address, one table for employee hire date, one table for ....  Multiply this by every table in your database, and you can see why I wouldn't use this, unless I REALLY want a flexible structure.


[1] - Read any five sentences written by Fabian Pascal and you'll understand.  This guy writes books?

pds
Monday, May 03, 2004

pds:
You insert into the view which has them already joined.

MR
Monday, May 03, 2004

> But in this case it would also seem you need a very strict process in place to ensure that any time a database change is made, that change makes it into a script that is released with the new code.

We put, in the application, the SQL-specific functionality to create the database if it didn't already exist, verify the database schema if it does already exist, and change or update the database schema if it doesn't match the expected schema.

Christopher Wells
Monday, May 03, 2004

MR: yes, but...

...but if you want to remove an attribute (equivalent of setting it to null), you actually have to delete that ... tuple in the attribute table, for lack of a better word.  So in a view, you don't get that functionality by default, i.e. deleting the row from EMP_ZIPCODE is not the same as just setting the "ZIPCODE" field to null.  So again, it's not entirely the same.  But yeah, for reading/"viewing", it's great.

pds
Monday, May 03, 2004

pds:
I see what you mean. Well, if you've read any Pascal :) you'll know that's a limitation of current DBMS products. If your DBMS supported it, it could automagically translate setting something = NULL to mean deleting it from the base tables.

Or in their 6NF parlance you'd say:
UPDATE employee
  SET job = Unemployed

and it would delete from all of those other tables. Provided the DBMS did that for you I think their solution is pretty cool.


In the case of current SQL products you'd have to put a 'before' trigger on the view which then said something like:
If zipcode IS NULL
  delete from emp_zipcode
  where employee = foo

if job = 'Unemployed'
  delete from employer
  where employee = foo

etc.

Kind of clumsy -- but in that case all you'd have to do is update a single trigger when schemas changed...

MR
Monday, May 03, 2004

Although very simplistic, we implemented a version table in the db consisting of a Build ID and a memo field containing the update script.

We have a generic installer that copies any updated components then appends and runs the script in the version table. 

This makes it really easy to check for missing or skipped updates.

-John

John Murray
Monday, May 03, 2004

Fabian Pascal is a being from a different plane of existence generously donated to our dimension to show how inept and woeful are our database management systems.  Unfortunately software from this other pure plane somehow becomes polluted when translated here.  Its all very strange.

6th normal used to have a different name, meta data.

Meta data is fine and I've used when I've had to, but it can take broken wrist mangling to get anything like good performance out of it.

Last night I was helping a friend with an ORM mapping problem that lives in a purely academic world (in other words its a class project).  I happened to point out that  the more nearly complete a data model is the less likely that any database engine is going to implement it efficiently.  It didn't solve the problem but it did cast it in a somewhat more proportionate light.

In the applications I ship I include an external source script which is called by a wrapper form in the application itself.  That script is generated every time there's a change released in the database.  We make no incremental changes in the database and we certainly don't make changes to an installed database's structure.

So the process of updating the database for the user becomes a matter of backing up the data, running the update form which creates a new blank database and then copies the existing data to the new database.  The user then has to manually copy the database back to where it should be.

Simon Lucy
Tuesday, May 04, 2004

*  Recent Topics

*  Fog Creek Home