Fog Creek Software
Discussion Board

DBAs vs. Coders

I recently overheard a conversation between a programmer with numerous years of experience and a data base administrator.  The programmer was complaining to the DBA for allowing nulls in his database which caused the programmer's application to crash.
The DBA maintained that the database was for holding data and that null was a perfectly acceptable value. 
The DBA also maintained that the onus should be on the programmer to catch exceptions in her
code.  The programmer prevailed though (after appealing to management) and the DBA had to remove all
of the nulls from the database. 

Who is right in this situation?  Is this argument common
in the programming world?

Friday, June 25, 2004

The programmer must be able to handle null data. That's a fundamental rule I learned in CS courses back in the 70's. No reason to believe the rule has been recinded.

Friday, June 25, 2004

The programmer should definitely check for nulls.

The DBA is slightly wrong - Null is not a value - it is an absence of a value. Nulls are perfectly acceptable in certain situations. He/she should have been more persuasive. Too bad.

Friday, June 25, 2004

I've seen it go both ways - allowing nulls, and disallowing nulls (typically by putting in a default value).  It's been my experience that either way the code ends up looking similar, but instead of "or is null" you get "or is '000000000'".

Null is perfectly valid.  If someone doesn't give me a data value, how am I supposed to store it? (unless the business has rules governing such things... if gender isn't specified use X for uknown, etc).

Friday, June 25, 2004

> Who is right in this situation?

It depends on the data. Sometimes data may (for some valid reason) be absent, in which case allowing "nul" in the database is probably the best way to implement support for that absence.

Christopher Wells
Friday, June 25, 2004

If the data is only being accessed by a single application (something that never happens) and the application cannot tolerate nulls, then I'd think that the programmer ought to be trapping for that in her code.  I've never heard of anybody relying on the database to validate input.  I certainly never would.

muppet from
Friday, June 25, 2004

Now, if multiple applications (especially homegrown crap like for example, Bob in accounting got permission to use his special MDB to manipulate data) are accessing the database, then I can understand enforcing some constraints to prevent the Bob's of the world from peeing in your pool.

muppet from
Friday, June 25, 2004

In my oppinion it depends on exactly where the NULL values was in the database. I maintain my standpoint that programmers should not check for NULLs if the data model is broken.

As far as possible, you should disallow NULLs in your datamodel and use database constraints to keep data integrity between tables. In some tables NULLs may be perfectly fine, in which case the data model is not broken and the programmer should handle it.

I have a problem with "The programmer should always check for NULLs" or DBAs saying "we allow NULLs everywhere, its a perfectly valid value"...Its not a who was right/wrong problem.

You need to evaluate design descisions in your data model on a case by case basis.

Friday, June 25, 2004

Maybe it's a personal preference, but I've always maintained that an application should be aware of the data model as much as the database should be set up to enforce it.

I wouldn't want to validate input from users based on an error returned (or not returned) from my database driver.  That just seems messy to me.

muppet from
Friday, June 25, 2004

"I wouldn't want to validate input from users based on an error returned (or not returned) from my database driver.  That just seems messy to me."

It depends what you mean by "validate input". It's the DBMS' job to ensure integrity of your data. The client should not be peforming integrity checks!

Of course, if you are talking about things like "Enter a number" and the user enters a string then sure have your UI throw a fit. But the rest is the domain of the DBMS (that's why they were invented, after all).

Captain McFly
Friday, June 25, 2004

To be back on point -- if the program is expecting valid data always and the DBMS is set up to allow nulls, then there is a disconnect there which must be resolved by looking at the specs. If the specs say "Attribute XYZ can be NULL" then it's the programmer's fault for not implementing the specs correctly. If it's absent, then the specs were inadequately defined to begin with and it's nobody's fault. If the specs say "Attributes cannot be null" then the DBA should have paid attention.

Captain McFly
Friday, June 25, 2004

My opinion is, if you're storing NULLs . . . you're probably doing something wrong.

NULLs can be more tricky than crashing a program . . . having a NULL value will also wreck havoc on queries, as well (a NULL value is neither equal to nor NOT equal to an integral value). For example, you couldn't look for a integer field that is NOT equal to 7 without including NOT equal to 7 and equal to NULL . . . sucks, eh?

Bottom line, don't store 'em and continue smiling. If you feel like you must store it . . . dig just a bit deeper.

Friday, June 25, 2004

While the NULL concept is the same everywhere (or is it?), the implementation varies from platform to platform and language to language. For example, a NULL in TransactSQL is not the same thing as a NULL in VB.

NULLs are a nice academic dream, but in reality they only cause problems and headaches. Just because some people have learned how to use them and cope with the problems they cause doesn't mean NULLs are good.

I guess this could be compared to the discussion between managed code and unmanaged code, pointers vs references, etc...

Friday, June 25, 2004

As much as possible, the database should validate data from the application to protect its own integrity. This is even more important when multiple or untrusted applications access the database.

As much as possible, the application should also provide its own validation on the data going into the database, because this prevents the need for a round trip to the database and makes for a better user experience. But it should also check for errors coming back from the database instead of just ignoring them and pretending like they never happened (like when you put bogus data into the password retrieval feature of a web forum and it just says "password sent" anyway).

I agree with the comments on the data model deciding whether NULLs are allowed. If a NULL shouldn't be allowed in a specific field, then it is the database's responsibility to maintain its own integrity by putting the NOT NULL constraint on that field. It is the application programmer's responsibility to know which fields may or may not contain NULLs, and write the application accordingly. If a field that should not contain a NULL has a NULL, then the integrity of the data is broken, not the application. And that's a more serious problem than the application crashing.

Friday, June 25, 2004

There's a school of thought that says allowing NULLs in a DB is a serious mistake. I don't entirely agree with them but I do agree that NULL provides significant headaches and little/no practical benefits. NULL 'values' simply move the headache from the DBMS to the application.

The real question is -- what do NULLs mean in your DBMS? A DBMS is a collection of facts and derived facts. NULLs are a meaningless assertion.

One quite interesting solution to the NULL problem is to have a series of shadow (DBMS-maintained) tables which indicate the *reason* for a missing value. See Chapter 10 of “Practical Issues in Database Management” (Pascal).

Captain McFly
Friday, June 25, 2004

I would argue that allowing NULLs for a column depends on the application that the database depends on, but most certainly they shouldn't be eliminated altogether.  Returning a NULL is one of the most reliable ways to tell a program that no data was returned, versus "0", "zero", or any other mish-mash of custom implementations.  Generally I like to block insertion of NULLs, that way if I am seeing NULLs returned from the db, I know something's wrong. 

Friday, June 25, 2004

Most DBAs are clueless, so the programmer is probably right.

the real world
Friday, June 25, 2004

But the Coders are on a faster train to India.

Thanks Microsoft
Friday, June 25, 2004

In my opinion, the coder is wrong. You should always check for null. Relying on someone is not the smartest thing to do.

Yes, I know if the DB Admin did his job...
But that's the rub. "IF someone did X," is true in any situation. But in the real world things happen. It is our job as professionals to guard against these failings.

Is there is disconnect in this situation? I don't know. Did the developer look at the data model before programming? Did the DBA provide a complete data model?

As for the argument over whether null is valid or not, it’s up to the data base designer to decide. From personal experience I’ve seen several instances where the absence of a value made more sense than one picked arbitrarily.

I hope they made this decision to remove nulls, after making sure that this change wouldn’t break any existing applications. Who knows what queries rely on a null value being there. Yikes!

Friday, June 25, 2004

So if NULL is bad news in a database, how does one deal with creating a record with partial information?

For example, I'm supposed to create a customer record so that I can get billing done. As part of that process, I'm supposed to enter a phone number, but the data input sheet does not contain a phone number because someone else either forgot to get it or the customer refused to supply it. The input mask requires a properly formatted phone number, but there is no phone number. Is there supposed to be a 'standard' fake number? If so, what happens if that fake number is mis-keyed--now it looks like a real phone number. Is it not better to just leave the phone number field blank (NULL)?

I was taught, perhaps incorrectly, that NULL is the absence of data and was the correct choice when data was absent. I remember thinking at the time that this was logical and smart and was glad that somebody bothered to think of how to deal with the absence of data.

Ron Porter
Friday, June 25, 2004

Good job Ron, you've done it.  You've addressed the central issue of NULLS.

How would you deal with partial records without a NULL?

Exactly.  Most programmers treat data as having 2 values -- It IS Zero, or it IS something else.  But the third state of data is that the data is Undefined.

Returning a zero for data that is really Undefined confuses the issue.  When that happens, you don't know if the data is a Defined 0, or was really Undefined.  This doesn't bother most programmers, because they either don't care about the distinction, or they define it to 0 almost immediately.

Of course, 'C' does not have a value to identify this distinction -- so 'NULL' graduates to 0, or a 'null string', almost immediately.  In the database, you can only keep a NULL if you never really set that columns value -- which is sometimes important to know.

It is a problem which needs to be worked out -- how to handle NULL.  I don't think it is a problem with easy answers, or one that either side should just dismiss with a 'That's YOUR problem' kind of response.

Friday, June 25, 2004

To say "NULLs are a meaningless assertion," is false.  NULL in data says you don't know the value.  NULLs are not an academic dream, and do not "only cause headaches" for skilled professionals.  They are very valuable, and very easy to use.

Whenever it is useful to discriminate between knowing you know a value, and knowing you do not, NULLs are the absolutely correct tool.  Many business applications do not benefit from such a discrimination, and so do not need NULLable columns.  Empty strings suffice for some things.  Numeric types in the data world usually have no good analog to empty strings, so learn how to use NULL.  But also, many business applications -- even some with only textual data -- *do* benefit from such discrimination.  Scientific applications very often do.  And whether to use NULLs is a columnwise decision for the person designing the database.  There is no rule of thumb.

To answer the original post: the decision is not a question of preference for either for the DBA or for the programmer.  The need for discrimination is the only determinant.  The nature of the situation decides this.

The programmer should be perfectly happy to find a NULL in any column allowed to have NULL, behaving properly when one appears, but should in nearly all cases handle unexpected NULLs at a higher level than expected NULLs, treating that as an exception, and not a legitimate situation.  (This last aspect is to keep the code uncluttered.)  The programmer should not explicitly "always check for null" as stated earlier, except where NULL is legitimate.  This is one of the reasons modern languages provide exceptions.

Yes, "there's a school of thought that says allowing NULLs in a DB is a serious mistake," and that school is wrong.  The basis of that view is that NULLity causes problems for the mathematical foundations of relational databases, but that foundation is quite distant from the implementations we all use.  See Chris Date or Ted Codd (may he rest in peace) for the theory.

Ron Porter, NULL is *not* bad news despite claims above.  All skilled programmers know how and when to allow them, as do all skilled database professionals.  The answer to your question is that the "standard fake number" idea is shamefully bad user interface design, the perpetrator should be flogged then fired, the application corrected, and the existing data amended.

Friday, June 25, 2004

And having just reread the subject line, I'll reiterate that this is not a question of DBA vs. Programmer, but rather of correct for the situation or not correct.

Friday, June 25, 2004

Nulls are useful. How do you represent, say in C# code, an unspecified DateTime? As it's a value type you can't set it to null, so then you have to choose a "magic value" such as DateTime.MinValue... yuk. So with "nullable types" coming in the next version of .NET it'll make this sort of thing much easier. So nulls definitely don't seem to be going out of fashion.

Duncan Smart
Friday, June 25, 2004

Oooh, goodie, a rant on NULL values!! :)

I agree that NULL is a perfectly valid (lack of) value.  Think of a survey in which people must fill in little circles representing "yes" and "no."  And since this is a paper (gasp!) survey, there's no JavaScript to tell them they forgot a question.  There is a lot of real statistical value to knowing which questions people answered "no" to, and which ones they didn't answer at all.  Without NULLs, you'd have to have two bit fields in your DB to represent this -- one for answered/not answered and one for the yes/no response.

There are lots of situations where NULL is quite fact, I don't think I'd even consider using a DBMS product that didn't support it.

The problem however crops up when you start accessing DBMS systems with modern programming languages.  As someone mentioned above using C as an example, most of the popular languages don't support NULL for value-types, so there is a disconnect between the code world and the data world. 

I know it's done for performance, but it'd be nice if the language designers let me choose for myself when I need the performance of, for example, a regular datetime vs a nullable datetime, without forcing me to lose type safety and compatibility between the two, which happens when you resort to clunky hacks like wrapper structs and DateTime.MinValue.

Friday, June 25, 2004

>The programmer must be able to handle null data. That's a fundamental rule I learned in CS courses back in the 70's. No reason to believe the rule has been recinded.

>You need to evaluate design descisions in your data model on a case by case basis.

Yes, regardless, the real issue here is some standard, and then you stick to it. You can’t just decide at one time oh..yea..lets now allow nulls, or not allow nulls.

The big issue is what will your code expect for a CompnayName that is left blank. You can agree that a blank Company field is going to be either a zero length string, or null.

However, you sure as the hell better decide before you start writing code. This is not something you leave to the wind. All of a sudden, you start allowing zero length names in the company field that was previously null, then you are in big trouble, as now you must deal with two different conditions for the same meaning (that meaning is that a Company name was not filled out).

Select CompanyName from tblCustomers where tblCustomers is null


Select CompanyName from tblCustomers where tblCustomers = “”

Ouch..what a mess you instantly create!

While in theory the idea of having CompanyName un-defined sounds cool, in practice this is usually a laughable issue!

However, you still MUST nail down as to what a blank company name is going to be when it is not filled out. It is that simple.

As a rule, I setup things up as:

Blank text fields are null, and number fields are defaulted to 0 (and required).

However, it really depends on your environment.

Further, there was comments about records that are un finished. So, sure, we can often have a mailing list, but no company name. So, the real question becomes what value do you want to allow for a blank company name?

Lets assume that you hate nulls, and thus say, ok…my standard is a zero length string for blank text fields, and the value of 0 for number fields (this is ok). At this point, we never have to worry about nulls..right?


The problem is that anytime you use sql to join two tables together, and use a left join (which by the way is about 90% of my joins), then you get nulls in the empty child records. In other words, the parent record might be completely filled, or perhaps might have a blank company name with a zero length string. However when there is NOT a corresponding child record, then all fields for that child record returned are NULL in that data set.

What this means, is that even if your designers, and programmers DO NOT allow any nulls anywhere…the instant you use sql to join tougher two tables…poof! You now have  ton of nulls to deal with.

Since you HAVE TO deal with nulls (even if you hate them!), then I tend as a preference to prefer blank fields as null.

This does mean that often you can’t tell the difference between a non existent child record, or simply that some fields in the record are not filled out (this is the best argument in practical terms to allow nulls and zero length strings in the same system (I mention this just in case you were wondering how one could use null vs zero lenght things!!)). However, this situation is VERY VERY VERY rare that you need to make this distinctions between not filled out, and simply empty !

99.9999% of the time, you simply want to know if the field is empty, or not. So,  you might as well adopt the standard that empty fields have a value of null. This makes the whole application easier to deal with, and no thinking as to what values empty fields will have. Why get caught out in a whole bunch of database mumbo jumbo when all you need here is to know if a field is empty? Save that very rare exception(s) for some other people with too much time on their hands.

As mentioned, since joins produce nulls all the time, then having blank fields = null means you only have to deal with one type of field when it is empty…not two types. So, thus blank fields = null favors my choice.

Regardless, whatever you adopt…STICK TO IT, and make sure everyone knows about the assumptions made.

There might not be one rule as to using nulls, ….but the final rule is to at least stick to SOME design assumption and run with it.

Albert D. Kallal
Edmonton, Alberta Canada

Albert D. Kallal
Friday, June 25, 2004

>There is a lot of real statistical value to knowing which questions people answered "no" to, and which ones they didn't answer at all.  Without NULLs, you'd have to have two bit fields in your DB to represent this -- one for answered/not answered and one for the yes/no response.

And that above is a problem? Golly, why not just have:

1 – yes
2 – no
3 – not answered.

Don’t get me wrong, I see the point. but we don’t have to do that anymore.

Surely, we are not trying to save disk space here?  Perhaps null can mean the machine failed to read the value?

You see, the instantt YOU ARE applying an attribute to a null value, then you just gave it meaning! You can’t tell everyone that null means nothing. Then, 2 seconds later, you now tell me that null now means = “question was not answered”.

You are applying meaning to the null value here! You are free to do this..but at least be aware you applying a meaning to the null value! In your case. you ARE defining what null means...and you should not!

So, I can’t really buy into the above use of nulls. Simply build your data structures that you need to model the situation at hand. Why even being to bother with some issues of nulls vs non nulls in a simple defined problem as above? If we need our system to deal with un-answered questions, then lets golly well design the system to work with the 3 choice.

Nulls have their place, but not for: save disk space?
……a processing point of view?
……. a coding point of view?

And, what about using domain functions on the above test score examples? What about using sql on the data? The above "1,2,3" is MUCH easier to use domain, and group by stuff to total on. In fact, often you can’t even total nulls by using group by, so how would you even get a total of un-answered questions for question #34

With the above design, sql can easily generate a total for 1, 2 and 3 very easy. (group by with sum. After all, we do want a count of un-answered questions here also. right?).

Further, as mentioned, if pull that data via a sql query, and that data is in a child record , then those missing child records return null values. And, I betting in MOST automation survey systems, each question code, and answer is going to be joined to a parent record. (and, thus, all of a sudden those nulls that had meaning, actually mean something else (like the record for that one question does NOT this will ALSO produce null values for those fields!).

Albert D. Kallal
Edmonton, Alberta Canada

Albert D. Kallal
Friday, June 25, 2004

There is an interesting article here from Hugh Darwen:

I think he makes a strong argument, which is the same as Albert's.

A null is an absense of data, so you shouldn't use to to say something like 'Not completed.'  What happens if there is more than on reason for missing data.  Then null might mean 'not completed' or 'not applicable.'  How can you know which one is relevant?

The solution he offers is dependant on 'The Third Manifesto,' Hugh Darwen and Chris Dates proposal for how object oriented relational databases should work.

For the time being, however, NULLs are unavoidable when working with SQL.  As Albert points out, some SQL features are dependant upon NULLS to work.

I think a world without NULLS would be much nicer place to code, but I don't think we're going to see it anytime soon.

Ged Byrne
Saturday, June 26, 2004

>I think a world without NULLS would be much nicer place to code, but I don't think we're going to see it anytime soon.

Great stuff there Ged!

I also agree that modem dbs engines should have a “no nulls” setting. That means even joins that do NOT have a child record would return blank fields (their defaults), and NOT nulls. This feature would fix this whole mess right fast!

However, since we do have nulls, then we have to deal with them. This means for purposes a true/false field can have 3 values..and that just plain stupid! It really is a shame the whole industry adopted this null thingy!

By the way, while I make a good case as to why I want ANY empty field to be null (I just simply want to avoid the 3 value trap), there is ONE major drawback!

The ONE major shortcoming in using nulls is from a users point of view.

For end users..they can get instantly confused:

Select count(*) from tblCustomer.
10,000 records

Select count(*) from tblCustomer where Sex = “Male”
4000 records

Select count(*) from tblCustomer where Sex <> “Male”
3000 records

Hum, now why does the simple total of the above two queries not total 10,000?

It is those stupid nulls. (we have got 3000 of them in the above..right!)

Select count(*) from tblCustomer where Sex = “”
0 records

Select count(*) from tblCustomer where Sex is null
3000 records

Most of the time, we don’t use a not =, but it can bite you

So, while my standards is to say nulls for empty fields, users can VERY easily get bit by the above. Of couse..I always provide a good UI..and my users never see the sql.

However, if your users WILL in fact be using sql..then having blank fields as zero length is a major consideration (and, I have zero quarrel with anyone who would have brought up the above argument). (this means more pain for the developers, but more ease of use for the end users of sql).

Albert D. Kallal
Edmonton, Alberta Canada

Albert D. Kallal
Saturday, June 26, 2004

>then having blank fields as zero length is a major


You should never let you end users limited understanding of the datamodel influence your design descisions, and have empty textfields be space padded instead of NULL or whatever.

Create views for your SQL using users. They are happy, that way and can't really tell the difference. In the view you can perform whatever tricks is nessecary to simplify the users life, but you can also preserve your datamodel that contains the NULLs.

Saturday, June 26, 2004

Shouldn't the end user, whether you are considering their needs or their limitations, be the basis for all your design decisions?

If an application is conceptually perfect but nobody uses it, then it is a useless application.

Ged Byrne
Saturday, June 26, 2004

Well, it depends on how the users are going to get at that data. We might not have views (Does MySql have them yet?). And, even if we do, the instant a user starts to join the data, then again, MOST database systems will start to return null values again. So, even if you present data as views, you can’t allow users to join the data can you?

As said, if you can completely control the environment, and what the users sees, then there is no problem at all. There is no debate. If you don't let usrs join data..then likey we don't have a problem.

However, you can’t always control the users environment, and thus you DO HAVE to take into consideration how the users, and the data will be used. A compromise that results in extra work might in be order…but you don’t always have this choice.

Again, it really depends on your users, and what tools are available. Fact is, consideration has to be give to if you allow nulls in your system, and using a view can’t even come close to solving this problem if you allow users to join the data.

Albert D. Kallal
Edmonton, Alberta Canada

Albert D. Kallal
Saturday, June 26, 2004

To emphasise the point.  If you make select data from two tables in, say, a left join, then if the right hand side doesn't exist for the left hand table you will have NULL values.

This is normal.

I'd be interested in what would be replaced by the NULL, in numerical fields 0 might be acceptable but not in all cases, since 0 is data and NULL is not and the difference is important.

For foreign keys, you'd have what?  Some made up key that pointed at something like 'Empty'?  That belongs in the interface, not the data.

Simon Lucy
Saturday, June 26, 2004


But ofcourse. But in my experience when users present data models of what they need they are often poorly implemented. If I as a developer understand the underlying need of the user, I can often make a better data model.

I remember systems where we redid the entire data model, but presented via views and other functionality parts of the old datamodel to keep old reporting tools working.


Who uses MySQL for anything serious anyway? I've used some DB2 and much Oracle, but I also use PostgreSQL for the no-budget deals.

I've done systems that had an entire view layer over
all the tables for security purposes. No joins in the views, other than security stuff. That gives you perfectly joinable views as far as the application is concerned. The view layer simply  selects the stuff you are allowed to see.

Consider this (somewhat bad) example:





Then your users can join stuff, just like they would any "real" tables, and you can ensure both security and consistency, since you can grant users access to the views, but not necessarily to the underlying table.

This join would be perfectly possible:


But ofcourse, this might be overkill at times, and also not possible if your RDBMS doesnt implement views.

Saturday, June 26, 2004


>Who uses MySQL for anything serious anyway?

LOL! Apologies to all MySql lovers around the world!

You examples do show that you are now casting values for those nulls (NVL). Also, as mentioned, if you had any left joins..then those child records will STILL return null values. (for example, give me a list of all persons and their departments. Some persons might not yet be assigned ANY department yet (your constants have the department as a requirement, and  I might not want that).

Look, I don’t think there is any real big disagreements here.

My point at the end of the day is are nulls really needed, and on a daily bases, how do they help you?

I can NOT remember the last time I talked to a developer, and that developer said oh..boy, I am sure glad this database has null support, as it saved my a whole bunch of trboule..or solved a problem.

I been developing software for about 20 years, and virtually MOST of my development has centered around using databases. I love and enjoy database systems, as they tend to be the life blood of any good application, and in fact tend to be the most valuable computer assets that a company has.

I have no agenda against nulls, and I deal with the nulls as a regular part of development.

However, lets get away from the theory stuff (which I love), and take this discussion to the practical.

When is the last time you found real benefits by using a system with null support? I worked extensively with systems that have nulls, and other systems that do not. I just can’t say this whole null idea was any real big benefit to the whole IT industry, and in fact most of the time it likely has been an additional cost/pain that we did not need (those bunch of views and now having to use NVL to cast those nulls just supports this whole extra effort..and for what??).

I need some real good concrete examples where nulls really solve a real world problem, and not some example that simply casts a field that should only have two values (true/false for example) into a field that now has 3 values?

It is certainly not that I don’t understand nulls…but have to admit that with my considerable experience in the industry, I still find myself with a embarrassing lack of arguments to support the whole null idea.

Albert D. Kallal
Edmonton, Alberta Canada

Albert D. Kallal
Saturday, June 26, 2004

Field Name: BirthDate
Data Type: DateTime

When entering data, the birthdate wasn't given.

What do you put in that field?

Then there's the issue of data migration - you can't always control the data at inception...


Saturday, June 26, 2004

Philo, the date example is a very good example (likely the best one can come up with).

I should say that I have worked on other systems that handles this date problem equally well without the concept of nulls. String based database processors (Advanced Revelation / pick) systems store all values as a string. So, “” is for no value, and 0 = is some internal date value that I don’t know (In this system you can subtract two dates and you get number of days (just like most systems).

Thus, in pick we can go:

With SomeDate = “06/30/2004”
With SomeDate = “”

The same goes for the old dos FoxPro

For someDateField = {06/20/2004}

For someDateField = {}

In the above, the 2nd date field is simple empty, just like you test for a empty Company Name field.

The FoxPro date functions will thus return :

    year({01/01/2000})  will return 2000

    year({}) will return zero

So, the above is two systems (of many) that I have worked with, and they just work just fine with a empty date value. In sql, for that null date we go:

Where SomeDateField is null

Why not have just used:

Where someDateField = “”

The above would be easer to code also. Further, it is easer for users, since they don’t have to learn two syntaxes.

Users tend to like:

Where SomeCity = “” and someDateField = “”

Much better then:

Where SomeCity is null and someDateField is null

Further, as mentioned, it is MUCH easer to code, since code that needs to search for user input value can both work for empty vaules..or a given value.

In other words, why have two sets of syntaxes here? One to find fields with values..and then the Null to find empty fields?

So, sure, date fields are an excellent example of nulls, but most database systems that don’t have a null concept handle empty date values with great ease.

Seems to me a heck of a lot of design stuff was done JUST to have nulls, and most systems don’t miss this feature at all.

Albert D. Kallal
Edmonton, Alberta Canada

Albert D. Kallal
Sunday, June 27, 2004


>I need some real good concrete examples where nulls >really solve a real world problem

I have worked with systems that use NULLs and I also have worked with systems that don't.  If you use the CHAR datatype in Oracle you can not store NULL, since an empty CHAR field is space-padded to its declared length.

An empty CHAR(1)-filed is contains 1 space, and is not NULL. This is a major pain in the you know where.

NULLs or not are not really a major deal, its just that I find it annoying to write SQL for systems that implement such a data model without the NULLs.

This is a just an emotional thing with me, no logical argument :)

Sunday, June 27, 2004

I'll tend to make fields NOT NULL in the database whenever it's suitable to do so. Sure, the application should check for null input too, but if a null value should never occur for a certain field and this needs to be enforced, it should be enforced on both levels. Reasons:

-Your client app might not be the only piece of software accessing the database

-Even if it is, having the database check things like nulls, foreign key integrity etc is a useful sanity check against bugs in your code

Sunday, June 27, 2004

Why are nulls useful? well for an integer field, say, there's often a clear need to distinguish between '0' and 'no data entered'. For a boolean field between 'yes', 'no' and 'don't know/not known/no answer given', for example.

For character fields they are less useful I admit, but still sometimes you want to distinguish between 'this field intentionally left blank' and 'we don't have any information about this and whether it should be blank or otherwise'

Sunday, June 27, 2004

the problem with:

1 – yes
2 – no
3 – not answered (so we don't use NULL)

is that it won't work for integral types only if you make them a composite one.

eg how do you say "age":

-1 : not answered
0-99: age

sounds good, but then for the following question" what is the avg age" you do not need to count -1, so it's a shit solution in this example

Sunday, June 27, 2004

NULL or not NULL. Stupid questions. NULL is a special value for special case. Even if you replace NULL with some valid value (eg. Dec/31/9999) it will be still a special value which need special attention.

I prefer NULL, because it screws up the arithmetic operations and at least I know where should I pay special attention for a special value. Replacing it will a 'valid' value and you can start check your whole codebase.

Sunday, June 27, 2004

Albert, there are tons of examples of good uses of NULL.  In addition to those given by Matt and Philo, a great example that comes to mind is ship date on an order that has not yet shipped.  Account close date for a still-open account?  Temperature, when the sensor was not responding?

Also... SQL database systems don't store date attributes as strings, nor accept string literals for dates in insert or update statements, nor other date expressions such as where-clause comparisons.  I don't know what you were reaching for with your dBase or Pick examples, but how using an empty string in those systems differs from using NULL in the systems of today would deserve a lot more explanation to be relevant here.

Sunday, June 27, 2004

Using sentinel values is a hack.  If you have no other tool available to mark a value as special, then fine, use the sentinel.  But relational databases do have such a tool to say "none" and that tool is NULL.  Using a sentinel as a substitute for NULL in a system that provides NULL seems a mark of shame.

Sunday, June 27, 2004

This may have been said (so many posts, so little time) but it definately deserves to be repeated...the use of NUL values in a db is indirectly proportional to the "normalizedness" of that db. The more NULs the less normalized. If you have a condition where one piece of data could be left out, causing the "need" for a NUL value, that column should be in its own table. Simple as that.

anon-y-mous cow-ard
Sunday, June 27, 2004

Say what?  How do you figure?

Monday, June 28, 2004

seems like he's saying that 0-1 relationships should always be modeled as a separate table. since that's effectively what a NULL-or-valid-data column is.
then when you do a join, you'll get the NULL again.

Monday, June 28, 2004

>Also... SQL database systems don't store date attributes as strings, nor accept string literals for dates in insert or update statements

Virtually all update and insert statements in sql are straight ASCII.  Some systems requite a delimiter, and some don’t.

Select * tblCustomer where BirthDate = “05/05/2003”

Some systems want:

Select * tblCustomer where BirthDate = #05/05/2003#

What in the above is not straight ASCII?


Update tblCustomer set BirtheDate = “05/05/2003” where id = 123

The above is MOST certainly a simple straight ASCII string. What exactly do you mean that string literals are not used (what are earth is then used? Please explain this to me!!!!).

Further more, in fact a good many database systems do in fact store all data as ASCII. Those pick (and IBM’s U2 line of database servers) in fact do store all data as straight ASCII. The trick is that dates like 05/11/2004 are converted to a internal string which is a number, but IS STILL storied as a ASCII string. In fact, for today, the IBM’s U2 date function for today (06/28/2004) is 13329

The next day will be 13330

Of course, that 13329 is the number days from some date (who cares!). So, you type in 06/28/2004 as input string, and it stores the string for 13329. Fact is, most systems have some internal date format, but don’t assume that the value stored is not in ASCII. As mentioned Pick and IBM’s u2 systems will store the above date as the string 13329. You can use a raw disk editor to see this fact.

And, in the FoxPro examples, and in the Pick/U2 examples, if you create a record with a blank date, then a empty string is stored (which is the SAME for all blank fields, date, or time, or number, or currency in these systems as mentioned are ASCII, and thus a empty string works just fine an dandy here).

Obviously many have not worked with systems that store data as ASCII strings (but, this whole ascii thing is moot!).

IBM’s U2 and Pick are 30 years old database systems, but that is a VERY POOR excuse to not be aware of such systems. I will point out that XML is also straight ASCII, and those systems like Pick, and U2 are virtually carbon copies of the XML model. Yes, these systems are multi-valued, and JUST like xml, you can represent a invoice, and the invoiced details in one record. (and, any blank date,or even invoice amount is a empty string in xml. Just like it should be!!! Can you imainge if the XML spec had included nulls….what mess these nulls are!

A single record in these Pick/U2 systems are simply strings with delimiters, and tags in xml for today’s new systems.

If you worked with these database systems, then you instantly see the data structures and concepts are virtually identical, and nearly the same in execution.

Further, if you understand the architecture of how these 30 year old database engines work, you will instantly un-lock the key as to how you can have a VERY VERY high performance xml database.  These systems are very fast for record retrieval.

So, as for the internal representation? Golly..who cares about that? You mean you care if it uses magnet bits vs. some chemical process to store the bits?  You care if the disk is ASCII, or some binary number for string numbers? Lets not get high strung here and confuse the issue of logical view vs physical view. As long as the system has some date type defs, and  it gives things like the ability to subtract dates and get the number of days…really, much else don’t matter.

>I don't know what you were reaching for with your dBase or Pick examples, but how using an empty string in those systems differs from using NULL in the systems of today would deserve a lot more explanation to be relevant here.

Gee, just like systems that store ASCII, you obviously never learned the difference between nulls, and empty strings.

A huge difference, since null as a concept is NOT DEFINED!!!!!!

You can  NOT do conditional comparisons with nulls.

For example, the following is NOT legal:

If (CompanyName = null ) then

The above will not work. Any expression that you compare a un-defined value to will of course (and rightly so) result in a un-defined value.

You can can’t even go

If (null = null) then

The above will not work, since (null = null) actually evaluates to null, and NOT a true/false (Boolean) value.

However, with a empty string, you can most certainly go:

If (CompanyName = “”) then


If (“” = “”) then

So, there is a rather large difference in systems that don’t support, or have the concept of nulls.

From a users point of view, in systems that DO NOT have nulls, you can teach them to select all blank company names like:

Select * from tblCustomer where CompanyName = “”

Now, train them to select all people without a birthrate entered? 99 out of hundred will guess:

Select * from tblCustomer where BirthDate = “”

Well, guess what, the above will not work if the system is built on null values. You have to use:

Select * from tblCustomer where BirthDate is null

What this means is that you have different syntax for searching null values.

Why torture the uses, and have to train them for these different situations?

Are you really telling me that the use of nulls in simple, and intuitive? (not!!!).

Take the following common type code snip:

strBirDate = input(“What Birday to search for,”).

strSql = “select * from tblCustomer where Birthday = “ + strBirDate

Execute the above sql.

In a system that does NOT use nulls, then the user can simply enter nothing (blank), and we can search for all people with a blank birthday. (this makes perfect sense to me, and I bit you million dollars if we set up a booth in the mall to ask people how to search for blank dates..they would say to enter a blank at the search prompt……they ALL would agree on this!).

However, If your system uses null, then the above code snip will NOT work, and you have to put in a test:

strBirDate = input(“What Birday to search for,”).

If strBirDate = “” then
  strSql = “select * from tblCustomer where Birthday is null”


  strSql = “select * from tblCustomer where Birthday = “ + strBirDate

end if

Execute the above sql.

My WHOLE point here is that searching for empty fields is different then searching for null fields.  When you use Null, your results are NOT a Boolean value.
This is a real pain, and for VERY little gain!

As for the other comments about null for birthdates, why do we have to introduce the concept of concept of null? Why not use what FoxPro (12 years ago), or Pick/U2  (30 years ago), or xml  (a few years ago) use? Why does the systems from 30 years ago, 12 years ago, and the new xml all agree with me? Hum????

Select * from tblCsuotmers where BirDate = “”

The above seems like a perfectly reasonable way to search for empty dates (as to what the system stores, or does on disk, or behind the scenes…golly….do we want start that debate???).

No one here has come up with ANY good reason as to why the above syntax for a empty birthday is worse then

Select * from tblCsuotmers where BirDate is null

Albert D. Kallal
Edmonton, Alberta Canada

Albert D. Kallal
Monday, June 28, 2004

"This is a real pain, and for VERY little gain!", Albert said...

It really matters what language you use when implementing your programs, some programming languages support NULLs so you can do:


If BDAY is NULL then
end if;

But if your language doesnt support stuff like the above, NULLs can be a pain.

Monday, June 28, 2004

Some use cases for those who believe NULLs are useless:

- you seem to forget that the data types can be quite complex. One example given here is the 'date'. Imagine I have a PaymentReceivedDate column which must be populated upon receiving the payment. If I don't use nulls I have to store the default value. What is the default value for a date? 0 ? MinDate? What is the correct default date value for Unix, Windows or my Samsung cell phone? Is the MinDate portable across all platforms?  Do I have different MinDate when I am compiling my C# on Mono on Linux?

- imagine one has to add a date column to a existing table. If the column is NOT NULL I have to insert a default value. If  the column to be added is PaymentReceivedDate, I have to forcibly populate all records with dates that I know are incorrect.

- when processing financials, 0 is a meaningful value. For instance, 0 means somebody already made a payment. NULL means the payment is yet to be received. Using 0 as a default value to emulate absence of data will screw up the audit. How do you retrieve how many payments have been received in a certain month? By retrieving payments that are 0? But 0 means there *is* a payment.

- checking against a default value, shifts the onus from checking NULLs to checking default values. You still have to check them and branch the code because NULLs are there to model a business requirement. Somehow you have to do it one way or another.

- as somebody said, NULLs can be normalized. In theory at least. Imagine you have an 10 years old enterprise database with 1000 tables and a billion records (I know, this is not big by any means). In 10 years the business requirements have changed, the datamodel shifted, columns have been added and you decide normalization would help solving the NULL issues. How do you proceed? Changing a single column in this database would take a day. Changing the whole datamodel would take weeks of conversion. Updating the applications would take years of effort. Try selling the normalization to your managers. By the time you are done and tens of millions $ later, you realize the datamodel shifted again and you have to re-normalize the database once more.

Monday, June 28, 2004

NULLs are bad. This is a provable fact (read Chapter 10 “What You Don’t Know Can Hurt You: Missing Information” from “Practical Issues in Database Management” by Fabian Pascal or this link, which was provided earlier, here:

Of course I am not suggesting that, currently, in all cases their use is inappropriate (although most of the time it is). This is due to the fact that current DBMS products are not very good vis Relational Model and so we are *forced* to use NULLs when a better solution would be better. If they supported view updating better then perhaps it would be more practical to support completely normalized DBs which the view then decides what an absence of a row in that table means.

Or in the PDF link, it would automatically change to why it is null.

People keep saying “NULL means XYZ”. Remember that a DB is just the sum of its parts. Integrity constraints, business rules, keys, etc. are what the DBMS uses to give the data *meaning*. When you have a NULL, your DBMS basically says “This fact is unknowable” – namely the DBMS now has a logic hole in it. You then move this unknowable fact into your application saying:
If column supplier in table part is null, then print out “Part ABC no supplier”.

But, this might not be the case. It could be that Bob forgot to fill in that part ABC is supplied by Acme Company. It could be that the part has been discontinued and the company folded. Either case you can say NOTHING about the relation between part and supplier with a NULL. The best you can say is “I don’t know about part ABC’s supplier”.

In the Third Manifesto link (to which I alluded to in my NULL post) they suggest a DBMS controlled set of shadow tables *and operators* which maintain why an attribute is missing. The operators portion is interesting because obviously if your domain is “Salary” having an employee with “NO SALARY” may cause problems. So, the DBMS defines SUM, AVG, etc. over the domain and says for “NO SALARY” it is not supposed to be included in the counts.

This eliminates the NULL ambiguity that has been mentioned earlier.

"Imagine I have a PaymentReceivedDate column "

This is solved with the above solution. A NULL received date may mean that it hasn’t been received yet. But what if customer XYZ goes bankrupt? They are never going to pay, yet your A/R system will keep firing off letters after the due date because payment has yet to be received. What if the customer returned the item? What if the customer died? These cases would all be covered by the

"imagine one has to add a date column to a existing table."

One point of the relational model was, through views or other analogous structures, to hide DB schema changes from the application! This is not a new problem and was solved back in 1970 when Codd wrote up his paper, yet it still remains un-implemented (hint: it’s somewhat difficult for a DBMS to do this correctly and deterministically).

"when processing financials, 0 is a meaningful value."

Again, solved by the Darwin et. al submission.

"checking against a default value, shifts the onus from checking NULLs to checking default values."

It depends on what your application does – but remember that the application’s forte is dealing with *presentation*. The use of the Darwin method may require branching logic but it would also be required in the NULL method as well. The key difference is the Darwin method eliminates the problems (redundancy, consistency, integrity, etc.) associated with NULL.

"Updating the applications would take years of effort. "

In current SQL systems adding a new column will not change this – you’d still have to code your app to handle the new column (unless it was doing some neat behind-the-scenes stuff to read the DB schema and change accordingly).

But, in the case that you mentioned the correct thing would be to not have an un-normalized schema in the first place. Think about all of the supporting applications which must perform consistency and constraint checking to work around the denormalized data. Consider the maintenance of those programs, and the possibility of corrupt or inconsistent data. That is more costly than doing it correctly.

Monday, June 28, 2004

Forgot to change my name!

Captain McFly
Monday, June 28, 2004

>Who uses MySQL for anything serious anyway?

Ever heard of Sabre? Flown a plane? A pretty serious application that one. They use mysql.

Oh and NULLs ... honestly, what's the beef. You KNOW that you need to indicate a value was just not known. So then it may come down to the HOW but certainly not the why. And of course some data models can suck. But don't argue whether NULL is good or bad ... it's a necessity. Argue if its' needed in the data modle where it is used and then waste your time arguing some other way to say NULL.

Monday, June 28, 2004

"If they supported view updating better..."

And if I had wheels, I'd be a wagon.  What's your point?

When this Fabian Pascal builds his UberRelational Database product, and all our employers demand we move to it, then we can talk hypotheticals and mathematical theory.  Until then, NULLs are the right way to say "I don't have one" in the tools we have (Oracle, Sybase/MS, and DB2), and anybody who can't grok NULLs or doesn't know how to cope with them should just find work in some field they have an aptitude for.

Monday, June 28, 2004

>When this Fabian Pascal builds his UberRelational Database product, and all our employers demand we move to it, then we can talk hypotheticals and mathematical theory.

Yes, and as a matter of fact, I just read his book, and his comments on using nulls is quite interesting. He also complains, and mentions that nulls often introduce 3 values for a field, and that is not wise. (and, this is surprising, as I have a LOT of disagreements with Fabian).

I have enjoyed this thread very much. As I mentioned more then once, I do accept the valid uses of nulls, and I work with them just fine thank you.

However, I also worked with other systems that solve the problems of non entered date fields just fine also. These systems just naturally return a empty string in place the null concept. Once again several people pointed out the date examples, or number fields that can have 0, or no value? As mentioned, many string based database processors work just fine.

If you don’t ever a value for a date field, then they return a empty string (zero length), and the same goes for number fields (and, as I pointed out, there is a HUGE difference between a null field and concepts of how nulls are implemented, that that of a simple and easy to use zero length string).

All as I am saying, that MANY Other database systems work just fine WITH OUT the concept of nulls, and those systems ALSO SUPPORT non entered values for things like currency type amounts, and the common date examples that all are so eagar to share.

Further, this idea is not old, since now XML also does this.

Anyway, nulls have not been a real big benefit to the industry, and what benefits we got are often mitigated by the extra work involved.

Yes, I accept nulls are usefull..but as a whole...not a lot.

Albert D. Kallal
Edmonton, Alberta Canada

Albert D. Kallal
Monday, June 28, 2004

some more reading from Date,289142,sid41_gci962949,00.html

Monday, June 28, 2004


Your example of Customer Sex field is very poor.  The DB very correctly returns that there are 4000males, 3000 non-males and 3000 unknowns.  Say you wanted to put "Mr." in front of all male names, then it would really suck if you got those NULLs.

Every normal person understands that unknowns are neither equal to a constant, nor different.  They are just unknown.

In some cases as you said you have an option of using other values say 3 for Uknown Sex, but that has to be in the specs.  Plus it really annoys users when they are forced to make up data.  I've seen enough "go_f%^$" in stupidly required email fields to last me a lifetime.

Tuesday, June 29, 2004

Database constraints are the last line of defence for ensuring correct data. 'Nuff said.

Arron Bates
Tuesday, June 29, 2004

>Ever heard of Sabre? Flown a plane?

No, and yes. The Piper PA-28 I flew didnt have MySQL in it however ;-)

Tuesday, June 29, 2004

+a better solution would be better+


muppet from
Tuesday, June 29, 2004

>Your example of Customer Sex field is very poor.  The DB very correctly returns that there are 4000males, 3000 non-males and 3000 unknowns.  Say you wanted to put "Mr." in front of all male names, then it would really suck if you got those NULLs.

I can throw the above right back at you, and ask why in the world would one risk:

Where  Gender <> “Female”

When the following is SOOOOO MUCH better:

Where Gender = “Male”

If I use the above, then I don’t have a problem. Further, why do you make the assumptions that the 3000 are unknown? Hum, perhaps those 3000 names in the mailing list are companies? (now who has a poor example?).

Further, it would be VERY bad to use Gender <> Female, since later on, the gender might be extended to include Companies, or perhaps sexual orientation. We as a rule will thus ALWAYS use Gender = “Male”, and avoid Gender <> “Female”, as that would fail the instant more values are allowed for the gender field.

My point as to using always Gender = “Male”?

Well, then your argument about “really suck if you got those nulls” would not occur.

I can also further my argument to mention that companies don’t have a they, and that is NOT a unknown!

The ONLY way for the genders fields in the “Mr” example you give is to use a negative search, and as I pointed out that is a bad idea.

Further, why not use a zero length string in this example, and forget about the nulls?

As I said, the nulls really don’t’ solve this problem anymore so then does a zero length string.

The same goes for the email example. What is the problem with:

Where EmailName = “”

Why introduce a whole new data type, all the new sql commands (nvl, is null), and extensions to programming languages to include null data types? You could have equally solved the problem with a simple zero length string.

Why do we NOW have to go:

Where EmailName is null

I really sorry, but you have not make a good case for nulls vs a empty string in your examples. (there are SOME good examples…but the gender one is not one of them!).

Albert D. Kallal
Edmonton, Alberta Canada

Albert D. Kallal
Wednesday, June 30, 2004

"and avoid Gender <> “Female”, as that would fail the instant more values are allowed for the gender field."

LOL!  (Yes, I know it's possible, in theory.  But it's still funny.)

Thursday, July 1, 2004

With all due respect, the null is a fundamental part of the Relational Model which Dr. E. F. Codd developed in the 70s.  In 1985, Codd published a paper called "Is your database really relational?".  In this paper, he set down a number of rules (called Codd's 12 rules) with which a system must comply to be considered fully relational.

The paper came about because of an increasing trend in the early 80s for database vendors to describe their existing products as relational, or as being 'based on the relational model'.  At that time the term 'relational' was a little understood industry buzzword (yes, like OOP or XML)

Rule #3 is as follows:  "Null values (distinct from the empty character string or a string of blank characters and distinct from zero or any other number) are supported for representing missing information and inapplicable information in a systematic way, independent of data type."

Now, you may ask: "What gives Codd the right to set the rules?"  To which I respond: "What gives the makers of Monopoly or Trival Pursuit the right to set the rules?"  Certainly, you can (and many people do) change the rules to suit themselves.  But if you are going to participate in the larger gaming world (say Monopoly tournaments), you better know and stick to the rules. 

--Roger Carlson

Roger Carlson
Thursday, July 15, 2004

"and avoid Gender <> “Female”, as that would fail the instant more values are allowed for the gender field.

LOL!  (Yes, I know it's possible, in theory.  But it's still funny.) "

I created a database for an Obstetrics department and discovered that there they have a Gender value called "Indeterminate".  It is more common than you think and is actually is very sad.

Roger Carlson
Thursday, July 15, 2004

*  Recent Topics

*  Fog Creek Home