Fog Creek Software
Discussion Board


Joel on Software

Removing Nulls from Sql Server 2000 DB?

Hi all.

I've been tasked with "speeding up" a mid-sized production system Sql Server 2000 DB.  It
is riddled with nulls...  "IsNull" all over the procs, etc.

Is it worth it to get rid of the nulls and not allow them in the
columns anymore?

If so, how to go about removing the nulls with a script?

Thanks so much,

Steve in Norcal

S. Walker
Monday, May 24, 2004

If you're having performance issues, the first thing to check on is the indexing scheme.  Adding indexes can provide dramatic speed increases.  Have you looked into your indexes yet?
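As a starting point, something like the following can show what indexes already exist and what adding one looks like. This is only a sketch — the table and column names ("Orders", "CustomerID") are hypothetical:

```sql
-- List the existing indexes on a table (SQL Server 2000 system proc)
EXEC sp_helpindex 'Orders'

-- If a frequently-filtered column turns out to have no index,
-- adding one is a single statement:
CREATE INDEX IX_Orders_CustomerID ON Orders (CustomerID)
```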

Monday, May 24, 2004

Apart from truly extreme cases, I suspect getting rid of nulls will have negligible effect on overall system performance.

The first thing to do is determine where the bottlenecks actually are. There's no point spending time "fixing" things like nulls if that isn't the problem.

You should talk to your DBA (or JFDI yourself if possible/appropriate) with a view (no pun intended) to establishing what is wrong.

Then, and only then, can you determine the best course of action. After all, it may well be perfectly legitimate for these nulls to be there and "removing" them may cause more problems than it solves.

Indexing is one area to look at, though bear in mind that adding indexes can slow some parts of the database down dramatically, even as it speeds up others. Adding indexes should always be "by design", based upon your empirical evidence, not just done in the hope that it'll somehow speed things up. Your DBA should be able to help you here.

If you find yourself without the aid of a DBA then you can use the tools that come with SQL2K to reasonable effect, although you should be very careful about applying changes, especially to a production database. Building a new index can take a long time and cause severe disruption to users, although you can use the scheduler to execute jobs at night, if you prefer.

To address the nulls issue, if you really do decide that's the best course of action, after suitable investigation...

First of all you need to identify what value you will use for each occurrence of null in every column/table, as for any existing data you will need to update these nulls to something non-null before you can change the table definitions.

In an ideal world, it may be that for all data in a particular column that is null, you can default to a single value (e.g. 0 for numeric columns, '' for strings, etc). This is relatively simple, but some cases will be more complex than this.

Assuming you can determine the replacement value for all of your nulls, then you can apply these to update your tables. This may be in a script, or may require something more complex, depending upon your replacement rules, etc. You will now be in a position where no data is null.
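For the simple single-default-per-column case, the update script is short. A minimal sketch, assuming made-up table and column names ("Customers", "Discount", "Notes"):

```sql
-- Replace NULLs with the agreed defaults before tightening
-- the column definitions (names here are hypothetical)
UPDATE Customers SET Discount = 0  WHERE Discount IS NULL  -- numeric default
UPDATE Customers SET Notes    = '' WHERE Notes IS NULL     -- string default

-- Verify nothing was missed before changing the table definition
SELECT COUNT(*) FROM Customers
WHERE Discount IS NULL OR Notes IS NULL
```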

Obviously, for the benefit of the explanation, I am considering the case where you end up with no nulls at all. In reality, you will likely keep some data as nullable.

Now that your data is all non-null, you can change the table definitions so that they no longer allow nulls. This can be achieved using the tools within Enterprise Manager. If you have foreign keys, this will involve a lot of work and take a while to execute, as Enterprise Manager will drop and re-create the relationships and the tables in order to apply your changes: it drops the FKs, copies the data to a temp table, drops the table, constraints and indexes, rebuilds the table, constraints and indexes, copies the data back, discards the temp table, and re-creates the FKs. You can easily do this in script if you want, but Enterprise Manager does it all for you.
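For what it's worth, once the data is clean the plain T-SQL route can be much lighter than the designer's drop-and-recreate dance: in SQL Server 2000, ALTER TABLE ... ALTER COLUMN can usually tighten nullability in place. The names below are hypothetical, and you must repeat the column's full type:

```sql
-- Tighten a column once every row already holds a non-null value.
-- This fails with an error if any NULLs remain, which is a useful check.
ALTER TABLE Customers ALTER COLUMN Discount int NOT NULL
```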

At this stage, your database will not have any nulls in it.

Now comes the fun part.

You need to go through all the SPs and ensure that they are still valid. You can get rid of all the ISNULL() bits (though look out for COALESCE, IS [NOT] NULL, etc too), but the most critical part is that they mustn't try to set any data to null (via INSERT or UPDATE).
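One rough way to take an inventory of the affected procs is to search the stored source text in syscomments (the SQL 2000 system table). A caveat: very long procedure bodies are split across multiple rows, so a keyword can straddle a boundary and be missed — treat this as approximate:

```sql
-- Approximate list of stored procedures mentioning ISNULL or COALESCE
SELECT DISTINCT OBJECT_NAME(c.id) AS ProcName
FROM   syscomments c
JOIN   sysobjects o ON o.id = c.id AND o.xtype = 'P'  -- procedures only
WHERE  c.text LIKE '%ISNULL%'
   OR  c.text LIKE '%COALESCE%'
```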

If you're lucky, all access to the data will be via SPs and that'll be the end of the story. However, you should check any scripts, reports, applications, etc that use the database to see if they make any changes directly, or make any assumptions about the data being null, or not being null.

Clearly, this may turn out to be a huge undertaking, with very little return, in terms of performance gain. It could easily take weeks and introduce many subtle bugs that will take months to track down.

Realistically, you really need to investigate what is causing the performance problem first. It may well be that one more index will fix it for now, or maybe you just need to add another Gb of RAM. Both of these options are likely to be vastly cheaper and easier than a complete re-factoring of the database, and all its attendant stored procedures, scripts, etc.

As a wise man once said, "If it ain't broke, don't fix it".

Steve Jones (UK)
Tuesday, May 25, 2004

Forgot to mention before...

Bear in mind that nullable columns can actually improve performance: SQL stores no data for variable-length columns that are null*, so more rows can fit on a data page, and fetches, updates, etc. are faster.

Conversely, making those columns non-null and filling them with default values forces SQL to store data for every column of every row, taking more space and slowing things down.

Sadly, there are many variables to consider, so there is no one-size-fits-all solution. This is why we hire DBAs: although I know a fair bit about how SQL stores and manipulates its data, I don't have time to do a proper DBA's job.

Also, I hope this is obvious, but I'll say it anyway: please, please test any proposed changes before you apply them to the production environment. One of the most important aspects of a DBA's job, imho, is to act as gate-keeper and protect the production environment (and its users) from disruption.

* Okay, so there is a little per-row overhead: each row carries a null bitmap (one bit per column, rounded up to a whole byte) recording which columns are null.

Steve Jones (UK)
Tuesday, May 25, 2004

NULLs do not slow down databases. What on earth gave you the impression they did? You should look elsewhere for your optimisations.

Use a profiler [*] to profile the code - it WON'T be your IsNull code slowing anything down I can assure you.

Learn how to use SQL Query Analyzer (Query > Display Estimated Execution Plan) to see where the time is being used up in the query. Use Profiler to see what your app is actually sending to the db. In addition, Profiler has the "Index Tuning Wizard", which will have a guess at which indexes could be added or changed to speed up your database.
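The same information is available from script in Query Analyzer via SET options, which is handy when comparing before/after versions of a query. The query below is hypothetical:

```sql
-- Print the estimated plan instead of executing the statement
SET SHOWPLAN_TEXT ON
GO
SELECT * FROM Orders WHERE CustomerID = 42  -- query under investigation
GO
SET SHOWPLAN_TEXT OFF
GO

-- Or run the query for real and report logical/physical page reads per table
SET STATISTICS IO ON
```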

[*] or

Duncan Smart
Tuesday, May 25, 2004

In addition to all of the above - "Remove nulls to speed up the database" is logically equivalent to "Remove all instances of the letter 'm' to speed up the database." You can't just go changing the data willy-nilly to speed things up.

Of course, if the nulls are sprinkled around for no reason, rather than to indicate the absence of data, you have much bigger problems than performance.

Mike Gunderloy
Tuesday, May 25, 2004

"NULLs do not slow down databases."

True, but having to use

somefield = ISNULL(col, defvalue)
in your where clause pretty much hoses your query as far as index use is concerned.
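To illustrate: wrapping the column in a function makes the predicate non-sargable, so the optimiser can't seek on an index over that column, while the logically equivalent rewrite leaves the column bare. Table and column names here are made up:

```sql
-- Non-sargable: the function hides ShipRegion from any index on it
SELECT * FROM Orders WHERE ISNULL(ShipRegion, 'N/A') = 'N/A'

-- Equivalent, index-friendly form: the column appears unwrapped
SELECT * FROM Orders WHERE ShipRegion = 'N/A' OR ShipRegion IS NULL
```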

Wednesday, May 26, 2004
