Fog Creek Software Discussion Board

Computed Columns in SQL Server 2000

In a SQL Server 2000 view, I'd like to compute an average of some fields in each record.  There are 10 fields, and I would normally just add up the fields and divide by 10.  Unfortunately, this data can be null.  In this case, I would like to ignore the nulls and just average whatever is left.  So if 9 fields were null, and the 10th was 4, then the average would be 4. 

Is there an easy way to do this?  I'd rather not use cursors, if possible.

Thanks for your help.

nathan
Wednesday, April 21, 2004

COALESCE( MyNullableField, 0 )

Duncan Smart
Wednesday, April 21, 2004

I don't think you can use COALESCE like that in this case - the zeros will skew the average.
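For example, take the OP's own case of nine nulls and a 4: with COALESCE you'd get (0 * 9 + 4) / 10 = 0.4 instead of the 4 he's after.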

Ankur
Wednesday, April 21, 2004

What's wrong with:

SELECT AVG(BAR) FROM FOO ?

To the best of my knowledge, both MS SQL Server and Oracle support a bunch of aggregate functions.

Patrik
Wednesday, April 21, 2004

The AVG function works on one column down multiple rows. The OP wants the average across multiple columns.

Looks like you are going to have to create a user-defined function.

DJ
Wednesday, April 21, 2004

Thoroughly unpleasant little bit of business, but you could use something like

( ISNULL(@field1, 0) + ISNULL(@field2, 0) + ISNULL(@field3, 0) )
  / ( CASE WHEN @field1 IS NULL THEN 0 ELSE 1 END
    + CASE WHEN @field2 IS NULL THEN 0 ELSE 1 END
    + CASE WHEN @field3 IS NULL THEN 0 ELSE 1 END )

Which is the technique you mentioned, just with the divisor varying based upon the number of non-null values.

Dennis Forbes
Wednesday, April 21, 2004

How about a case statement?

For Example

Declare @a float, @b float, @c float

Select @a = 5, @b = 2, @c = null

Select
      (CASE WHEN @a is null THEN 0 ELSE @a END
    +
    CASE WHEN @b is null THEN 0 ELSE @b END
    +
    CASE WHEN @c is null THEN 0  ELSE @c END)
      /3 as average

double_dark
Wednesday, April 21, 2004

Dennis - that works great.  I'm going to have to put in some error handling, though, for when all the fields are null; it gave me a divide by zero error.
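For now I'm guarding it with a CASE, something like this (just showing three fields here, and the field and table names are made up):

SELECT CASE
         WHEN COALESCE(field1, field2, field3) IS NULL THEN NULL  -- every field is null, so no average
         ELSE ( ISNULL(field1, 0) + ISNULL(field2, 0) + ISNULL(field3, 0) )
              / ( CASE WHEN field1 IS NULL THEN 0 ELSE 1 END
                + CASE WHEN field2 IS NULL THEN 0 ELSE 1 END
                + CASE WHEN field3 IS NULL THEN 0 ELSE 1 END )
       END AS avg_value
FROM MyTable

(If the fields are ints rather than floats, the 1s in the divisor probably want to be 1.0 so it doesn't do integer division.)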

Thanks.

nathan
Wednesday, April 21, 2004

It's doable the way the OP requested with a number of UNIONs. If you always have exactly 10 fields, you could do it in plain SQL like this (no need for user-defined functions):

create table testsum(
p1 float,
p2 float,
p3 float,
p4 float,
p5 float,
p6 float,
p7 float,
p8 float,
p9 float,
p10 float
);

I had the following test data (N denotes null):

200,120,10,N,N,N,10,N,80,23  = 443
222,N,N,444,10,19,N,N,88,2   = 785
585,21,5,N,N,20,N,39,N,3     = 673

443 + 785 + 673 = 1901

The following SQL gives you that same total:

select sum(summary_field) from (
select p1 as summary_field from testsum
union all
select p2 as summary_field from testsum
union all
select p3 as summary_field from testsum
union all
select p4 as summary_field from testsum
union all
select p5 as summary_field from testsum
union all
select p6 as summary_field from testsum
union all
select p7 as summary_field from testsum
union all
select p8 as summary_field from testsum
union all
select p9 as summary_field from testsum
union all
select p10 as summary_field from testsum) t

I know the UNIONs are kind of ugly, but if you have a moderate and constant number of fields per row, I think this is a suitable solution.

Patrik
Wednesday, April 21, 2004

Patrik,

I don't understand your solution.  I would want to know the average of each of those three rows, not the overall average of all three rows.  Using your data, this is what I want:

200,120,10,N,N,N,10,N,80,23  = 443/6 =  73.83
222,N,N,444,10,19,N,N,88,2   = 785/6 = 130.83
585,21,5,N,N,20,N,39,N,3     = 673/6 = 112.17

I don't see where your solution gives me those averages.

Thanks,

nathan
Wednesday, April 21, 2004

Sorry nathan,

Here is a rewritten query using the same data that yields the correct results. I take it you have some id on the row you are going to calculate the average for.

Like this:

create table testsum(
id_s int,
p1 float,
p2 float,
p3 float,
p4 float,
p5 float,
p6 float,
p7 float,
p8 float,
p9 float,
p10 float
);

Then you can modify the query to the following:

select avg(summary_field) from (
select id_s,p1 as summary_field from testsum
union all
select id_s,p2 as summary_field from testsum
union all
select id_s,p3 as summary_field from testsum
union all
select id_s,p4 as summary_field from testsum
union all
select id_s,p5 as summary_field from testsum
union all
select id_s,p6 as summary_field from testsum
union all
select id_s,p7 as summary_field from testsum
union all
select id_s,p8 as summary_field from testsum
union all
select id_s,p9 as summary_field from testsum
union all
select id_s,p10 as summary_field from testsum) t
where id_s = 1;

I get the following:

73.8333333333333,
130.833333333333,
112.166666666667

for id_s=1,2,3 respectively.
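And if you want the averages for every row in one go, I guess you could drop the WHERE and group by the id instead, something like:

select id_s, avg(summary_field) as avg_value
from (
select id_s,p1 as summary_field from testsum
union all
select id_s,p2 from testsum
union all
select id_s,p3 from testsum
union all
select id_s,p4 from testsum
union all
select id_s,p5 from testsum
union all
select id_s,p6 from testsum
union all
select id_s,p7 from testsum
union all
select id_s,p8 from testsum
union all
select id_s,p9 from testsum
union all
select id_s,p10 from testsum
) t
group by id_s;

(I haven't run that version, but avg() skips the nulls either way.)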

HTH

Patrik
Wednesday, April 21, 2004

Or you could do something like this:

select avg(x) from (
  select col1 as x union all
  select col2 union all
  select col3 union all
  select col4
) dt

You'll likely have to build this into a user-defined function. SQL Server does let you put a scalar subquery in the select list, but as far as I know the derived table inside it can't reference the outer row's columns, so I don't think you can do it inline.
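As a rough sketch, the function could look something like this (untested; the name, the float type and the fixed ten parameters are just guesses at what the OP needs):

CREATE FUNCTION dbo.AvgNonNull (
    @p1 float, @p2 float, @p3 float, @p4 float, @p5 float,
    @p6 float, @p7 float, @p8 float, @p9 float, @p10 float
)
RETURNS float
AS
BEGIN
    -- AVG skips NULLs, so this averages only the non-null parameters
    -- and returns NULL if all ten are NULL (no divide-by-zero).
    DECLARE @result float
    SELECT @result = AVG(x)
    FROM (
        SELECT @p1 AS x UNION ALL SELECT @p2 UNION ALL SELECT @p3
        UNION ALL SELECT @p4 UNION ALL SELECT @p5 UNION ALL SELECT @p6
        UNION ALL SELECT @p7 UNION ALL SELECT @p8 UNION ALL SELECT @p9
        UNION ALL SELECT @p10
    ) dt
    RETURN @result
END

The view would then just select dbo.AvgNonNull(col1, col2, ..., col10) with the real column names plugged in.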

schmoe
Wednesday, April 21, 2004

If you search for:
average columns null
at http://groups.google.com you will find several newsgroup postings on this subject.

This one, for example (one of the replies in the first Google entry returned for a "sqlserver" group):
http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&oe=UTF-8&selm=94H6b.91185%24Sq.17702675%40twister.nyc.rr.com&rnum=4
provides the following SQL statement:

SELECT col1, col2, col3,
(COALESCE(col1, 0) + COALESCE(col2, 0) + COALESCE(col3, 0)) /
NULLIF(
CASE WHEN col1 IS NULL THEN 0.0 ELSE 1.0 END +
CASE WHEN col2 IS NULL THEN 0.0 ELSE 1.0 END +
CASE WHEN col3 IS NULL THEN 0.0 ELSE 1.0 END,
0.0) AS col_avg
FROM T

This statement can be extended to more than 3 columns. Note: This is very similar to the earlier reply by Dennis Forbes, but it also handles the case where all columns are NULL. [When you find an apparent solution to a complex SQL problem by searching Google groups, check it carefully for your situation because it may be based on some assumptions that are not relevant for you or it may not handle boundary cases (such as all NULL values) or it may even be incorrect.]

Philip Dickerson
Wednesday, April 21, 2004

You could normalize those columns out to another table, which would let you use a plain AVG the way you'd expect.

MR
Wednesday, April 21, 2004

Oops, just saw it was a view. 

MR
Wednesday, April 21, 2004

Philip,

What spurred the highbrow commentary about utilizing results from Google Groups? I think any rational person would conclude that you're not-so-subtly implying that I ran and copy/pasted my solution from a search result, and if that's the case then that is just laughable, and yet another sad commentary on the state of this industry (I mean, we're talking about ISNULL and CASE here... not the cutting edge of SQL programming).

This is a trivial case and I provided the basics for the OP (taking into account that they specifically stated that they wanted it for a computed column, which rules out SELECTs, UNIONs, etc.), though obviously I'm not going to waste my time dinking around with exception cases when they're pretty clear to fix. I don't think anyone in this group is a cut-and-paste programmer, so I think your commentary is misplaced.

Dennis Forbes
Wednesday, April 21, 2004

Dennis,

My comment wasn't directed at you in any way - I'm sorry if I gave that impression. I didn't mean to imply that you had cut-and-pasted a solution. It was actually intended as a compliment: you had independently (and very quickly) come up with a solution, and searching elsewhere turned up similar solutions, which shows yours was a very good one. (If you knew me personally, you would know that I would never insult people the way my entry apparently came across, but it is very hard to convey tone and intent in a forum posting.)

My only intent was to point the original poster to another source for a solution to the problem. Sorry again.

Philip Dickerson
Wednesday, April 21, 2004

Normalise your database structure and everything will look less ugly. Haven't you heard of the '0, 1 or infinity' rule? In this case, if there needs to be capacity to store more than one piece of this data for each record, there should be capacity to store an arbitrary number.

So put the data in a separate table with a foreign key tying it to the original record, rather than creating multiple fields in the original table. You'll thank yourself in the long run when it turns out an extra 4th (or was it 6th?) column needs adding. And the queries will be less ugly as you can just use aggregate functions on the subsidiary table.
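Something along these lines (made-up names, and assuming the values are numeric):

create table RecordValue (
    RecordId int not null,   -- foreign key back to the original record
    Value float not null
);

-- one row per value; nulls simply aren't stored,
-- so a plain aggregate gives the average the OP wants
select RecordId, avg(Value) as AvgValue
from RecordValue
group by RecordId;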

Matt
Thursday, April 22, 2004
