Fog Creek Software
Discussion Board




Arrggg!!

If iPreferredNumber = 0 Then
    sPhoneRecord = IIf(IsNull(rs!PHONE_NUMBER) Or Trim( _
        rs!PHONE_NUMBER) = "", "", rs!PHONE_NUMBER)
    If Trim(sPhoneRecord) <> "" Then
        sPhoneRecord = sPhoneRecord
    End If
    lCurrentNumberDialing = 0
ElseIf iPreferredNumber = 1 Then
    sPhoneRecord = IIf(Not IsNull(rs!phone1) And Trim( _
        rs!phone1) <> "", rs!phone1, IIf(Not IsNull( _
        rs!PHONE_NUMBER), rs!PHONE_NUMBER, ""))
    lCurrentNumberDialing = 1
ElseIf iPreferredNumber = 2 Then
    sPhoneRecord = IIf(Not IsNull(rs!phone2) And Trim( _
        rs!phone2) <> "", rs!phone2, IIf(Not IsNull( _
        rs!PHONE_NUMBER), rs!PHONE_NUMBER, ""))
    lCurrentNumberDialing = 2
ElseIf iPreferredNumber = 3 Then
    sPhoneRecord = IIf(Not IsNull(rs!phone3) And Trim( _
        rs!phone3) <> "", rs!phone3, IIf(Not IsNull( _
        rs!PHONE_NUMBER), rs!PHONE_NUMBER, ""))
    lCurrentNumberDialing = 3
ElseIf iPreferredNumber = 4 Then
    sPhoneRecord = IIf(Not IsNull(rs!phone4) And Trim( _
        rs!phone4) <> "", rs!phone4, IIf(Not IsNull( _
        rs!PHONE_NUMBER), rs!PHONE_NUMBER, ""))
    lCurrentNumberDialing = 4
Else
    sPhoneRecord = IIf(IsNull(rs!PHONE_NUMBER) Or Trim( _
        rs!PHONE_NUMBER) = "", "", rs!PHONE_NUMBER)
    If Trim(sPhoneRecord) <> "" Then
        sPhoneRecord = sPhoneRecord
    End If
    lCurrentNumberDialing = 0
End If

Whiner
Friday, April 02, 2004

So what are you whining about exactly?

...
Friday, April 02, 2004

Thas a whole lotta If statements, Good Gawd!

James B.
Friday, April 02, 2004

I see it is clearly related with Joel's post about how deficient is the integration between relational and object programming paradigms and languages.

Or in a broader sense: computer programming still sucks.

.NET Developer
Friday, April 02, 2004

Why not just do the equivalent in a stored procedure/sql statement and return the result(s)

...
Friday, April 02, 2004

Why is this table not normalized?

Having multiple columns (phone1, phone2, phone3, phone4) is not just wrong, it's sinful.  This hardcoded limit constrains flexibility and makes for rotten code (as you can see).

In databases, only three numbers have meaning: zero, one, and infinite.

Alyosha`
Friday, April 02, 2004

This code would look much better with For loop or select statments.

Raju Patel
Friday, April 02, 2004

I agree -- normalization, normalization, normalization!!! :)

MR
Friday, April 02, 2004

Yet another "Bad Code" thread.

Yes this code is ugly and someone should have rewritten it a while ago rather than continue to add new cases. But, if it works and is making money, what are you complaining about?

An oil well isn't pretty either, unless it's your oil well and it's making you rich.

pdq
Friday, April 02, 2004

It may *function* correctly but in the long run the difficulties encountered  (higher likelihood of buggy code, increased probability of inconsistent/incorrect data, increased development time to deal with inconsistent/incorrect data, etc.) by non-normalized tables are typically more costly than what a redesign would be (of course, it was done right the *first* time…).

MR
Friday, April 02, 2004

Actually this if-then-else statement is repeated a total of 10 times within this 1,000+ line function.

The logic of the whole function is repeated twice within the same function since the function takes some data and creates two files (should have been a file parameter).

My favorite part of this code is the following:

If Trim(sPhoneRecord) <> "" Then
    sPhoneRecord = sPhoneRecord
End If

Whiner
Friday, April 02, 2004

And yes, the reason I'm even looking at this code today is so that I can fix it, because this whole application is rife with surprises like this and it *barely* works (it's very temperamental).

Whiner
Friday, April 02, 2004

>> "I agree -- normalization, normalization, normalization!!! :)"

What the heck is so hard about normalization?!?  Why can't 90% of the developers out there normalize?  It would make their lives (and mine!) so much easier! 

anon
Friday, April 02, 2004

So how would I normalize the function?

Author of the function
Friday, April 02, 2004

Sadly, that code is good compared to most.


Friday, April 02, 2004

>> "So how would I normalize the function?"

Good one...  I'll clarify.  I've worked almost exclusively at places where developers design & implement the databases.  What the heck is so hard about DATABASE TABLE normalization?!? 

anon
Friday, April 02, 2004

"If Trim(sPhoneRecord) <> "" Then
    sPhoneRecord = sPhoneRecord
End If"

I like that bit.  It has a certain Zen stupidity about it.  :)

Kyralessa
Friday, April 02, 2004

Whiner, you got some of my people there with you? Welcome to the shop. I see that stuff everyday. ;)

Sathyaish Chakravarthy
Saturday, April 03, 2004

Feeling really pukish. Honest! Chi! Chi! Chi!

Sathyaish Chakravarthy
Saturday, April 03, 2004

On normalisation:
Generaly if you find your self making several fields for the exact same kind of data you should break those fields out into a table of their own.
This saves you alot of nulls and also removes the limits on how many phonenumbers you can store for each parent entity.

Eric Debois
Saturday, April 03, 2004

Normalization is not *too* difficult.  A good rule of thumb is if it feels wrong, it is. :)

But more seriously -- normalization forms are defined by a clear set of rules precisely so that you don’t have to think about *how* to normalize (indeed Codd envisioned the DBMS actually running the normalization checks).  If your relation passes all the rules then it is normalized.  If it fails one, fix it, then try again.

The problem I've seen is that 99% of 'tutorial'-based normalization guides get the normal forms totally wrong, thereby teaching you nothing or worse -- teaching you incorrect habits/rules.

MR
Sunday, April 04, 2004

I'm not at all sure we are talking about normalization here.

If an office has got up to four extensions, and you need to store a number for each in the database, then having a separate filed for each witnin the table for offices follows all the rules for normalization.

It does have the disadvantage Eric points out that you will get a lot of nulls and you have to create a new field if the number goes above four so if you have a separate table for all phone numbers and then assign them to each office you avoid that problem, though you may create a greater one with the dsplay of the number of phones per office.

Incidentally Outlook has seperate fields for each phone number, so the original design can't be that bad.

Stephen Jones
Monday, April 05, 2004

If your DBMS table is created like this:
phone_number( ... phone_number, extension1, extension2, ... )

Then you have a repeating group, are in violation of 1NF, and are not normalized. QED.

Outlook may present them as a separate field for each phone number, but that doesn't mean that only one database tuple (row) is used to create/display that form.

MR
Monday, April 05, 2004

Stephen.

Normalising databases is a good thing (tm).

I have taught DMS (MS Access) to some people. I have taught SQL (mySQL and PostgreSQL) to others. No different.

The everything-is-a- spreadsheet just don't get it. If you do not nomalise data, use Excel. One of the primary reasons I move anything from excel to access is the ability to normalise data.

Some people get it. Like someone else noted, a lot of book writers out there do not seem to get normalisation.

Anyone that builds databases without understanding why

tblTelephone {name, telepone1, telephone2, telephone3, ... } is bad should be taken out back for a number six!

In this instance, searching for a telephone number requires iterating through every telephone# field.

How do you show all the people who share one number?

How do you display all the unique telephone numbers in your account?
All your SQL statements would have to be rewritten to accomodate another number field. BAD!!!

Oh yeah, and when the local telco decides to update (read change) your number, how much harder is it than if you had normalized the data.



eg.
tblName {name_id (PK), fullname, ..}
tblTelephone {tele_id (PK), telephone_number, name_id(FK)}

You could of course normalize further, but that is a different story.

Tapiwa
Monday, April 05, 2004

Coming back to the code in question, replace all those ElseIfs with a Select Case. It won't fix the underlying problems, but it might make the code more readable, and therefore make it easier to fix the underlying problems.

Martha
Monday, April 05, 2004

You are presuming that all telephone numbers are the same entitiy.

That is not necessariy true.

let us presume I am doing a list of staff. I will have one column for home telephone number, one for office telephone number and another one for mobile telephone number.

There is no braach of normalization there.

Normalization depends on the data you are normalizing and the purposes for which you want to nomralize that data.

In the case of the Outlook telephone directory or the staff telephone directory I am talking about you want to treat somebody's home telephone numbers and office telephone numbers as separate entities. Sure you may get duplications ( or reporting violatins as Tapiwa calls them). In the Outllook book you may have two or three contacts that actually share the same extension, and in the staff directory you are talking about you may get two or three members of staff who live in the same house and share the same home number. However that is still not a reason to design the database with all phone numbers as one tuple.

The truth is that there are plenty of cases where you get the same reporting violations and the knee-jerk "This isn't normalized, go back to doing your shopping lists in Excel" doesn't kick in>

How about names. Do you have  separate look up columns for all possible last names and first names, and a table for each possible last name and each possible first name?  The answer is almost certainly not and the reason is that there are going to be no attributes for each name. Now, I am going to have to design a database for the attendances and marks at a new college that is starting up, and I am going to do precisely this. The reason is that for one Arabic name, such as Mohammad, there are half a dozen possible English variations, and I am sick of  having to search for half-a-dozen possible alternatives, but this will be one of the few databases I know off that stores all names in a table. I've seen plenty of tables with customer names, but not with Jones, Smith and Spolsky all stored in a table of last names that the customer table references. Yet each time your database has two peoiple with the surname Jones you have a reporting violation.

What about prices. If you have a list of thousands of stocks and shares or widgets you will be storing the price for each one. Do you have that price field dependent on a lookup tablle that stores all possible ranges of prices from $0.001 through to $99999999.9999? Of course you don't, yet when you have two goods that sell for, or two shares that trade for, the same price you have a technical reporting violation. On the other hand a company that  has a limited range of price points may well decide to keep a separate table for each individual figure.

The relational database model depends on the real life situation of the data. Unless you are well aware of this you can't even begin to draw up a useful database schema.

Stephen Jones
Tuesday, April 06, 2004

>"let us presume I am doing a list of staff. I will have one column for home telephone number, one for office telephone number and another one for mobile telephone number."

Entity type bundling requires nulls which are BadThings and are not normalized.  And as you can see it results in annoying application code to boot.

Normalized would be:
HomeTele( User, PhoneNumber )
OfficeTele( User, PhoneNumber )
etc.

(of course, you could have a view which combined the separate tables into one which looked like your example).

>"How about names. Do you have  separate look up columns for all possible last names and first names, and a table for each possible last name and each possible first name?"

This has nothing to do with normalization.  This has something to do with *domains* -- but remember a domain does not enumerate all acceptable values.  For example, a domain for names might have the limitation that first names can only be character values (no numbers, non-printable characters, punctuation except for -, etc.). 


>"Yet each time your database has two peoiple with the surname Jones you have a reporting violation."

I don't know what a "reporting violation" is but the situation you describe does not violate any relational tenants that I am aware of (unless surname is some sort of a primary key, but the DBMS will disallow duplicates in a primary key).

> "What about prices."

Again, nothing to do with normalization.  This example has to do with domains (again) and perhaps constraints.  If you defined your domain "price" as a money value from 0.01 to infinity you could then make a subdomain for a particular type of product (say, lawn mowers) which had a range from $500 to $3000.  Or, you could achive the same thing with a constraint that says the price must be between a certain range.

In any case, it's obvious that the OP's design he was working with was VeryBad(TM) and the primary reason was a failure in correctly defining a logical data model.

MR
Tuesday, April 06, 2004

ARen't we going to have as many nulls whether the home telephone, office telephone and mobile fields are in a query as you suggest, or in a table. After all the guy either has a telephone or he doesn't!

More to the point, aren't you simply transferring the normalization breach to another table. If the same telephone number is only used by one user then having a field for telephone number in the user table doesn't cause a normalizaiton breach. And if the same telephone belongs to more than one user, then you are going to have to have more than one user field in the telphone number table, and most of those second and third user fields will be nulls.

I'm not confusing normalization with domain with regard to prices or names. I am pointing out that nobody thinks of keeping a table of all possible last names, or all possible prices. Yet not doing so is as much a technical breach of normalization rules as ren-entering the same telephone number more than once.

Stephen Jones
Tuesday, April 06, 2004

>"ARen't we going to have as many nulls whether the home telephone, office telephone and mobile fields are in a query as you suggest, or in a table. After all the guy either has a telephone or he doesn't!"

I'm not sure what you're trying to say but the existence of a row in the business_phone relation tells us whether or not john doe has a business phone number.

>"aren't you simply transferring the normalization breach to another table. "

NO.  more below:

"And if the same telephone belongs to more than one user, then you are going to have to have more than one user field in the telphone number table"

IF a phone number belongs to more than one user then you need a *DIFFERENT* table structure to avoid redundancy.  The "null" values to which you refer are to your original, bad, table design, not the normalized version.

>"I am pointing out that nobody thinks of keeping a table of all possible last names, or all possible prices"

But you have not pointed out how that is germaine to the problem of normalization.

MR
Tuesday, April 06, 2004

Stephen.

i agree with you one one point. Without knowledge of the data, the constraints, domain sizes, and the use of said databases, we can't really give a definitive answer.

That said, I have found in my experience, that  tblWidgets(value1, value2, value3, ... ) data structures are a bad thing.

I am not against redundancy at all in tables. I just think you can have faster data access, and better management with just a bit of normalising effort.

In the telephone example, I might have two tables.

tblUser (user_id (PK), username)
tblTelepone (tel_id (PK), user_id(FK), telephone_no, telephone_label, ... )

All the telephone numbers would then go into one table. I might even go as far as to have a tblLabel, which has a list of the available telephone categories (home, business, mobile, emergency etc) and have this reference by the tblTelephone.

The beauty of such a design is that it is easy to have give one person an infinite number of telephone numbers. Should we decide on another category (they did not have mobiles in widespread use 10 yrs ago, it is just a question of creating another record in the tblLabels, without having to mess with the database structure.

Similarly it is pretty easy to search for a number (only one column in the table).

Tapiwa
Wednesday, April 07, 2004

Stop prefixing tables with tbl! :)

Tapiwa -- if I understand correctly your design means that a non-key attribute is functionally dependent on *part* of the key (e.g. the label) and so not in 2NF.

I think the normalized version would be the separation into tables as I proposed earlier.  It might be something like this, too:
User-Phone( User, PhoneNumber )
HomePhone( PhoneNumber )
WorkPhone( PhoneNumber )

In order to avoid changing your application when new phone types appeared you would create a view which joined all of the phone base tables together and simply attach new tables to the view.

MR
Wednesday, April 07, 2004

I always attach tbl to tables, frm ot forms and qry to queries. Hungarian is alive and well among Access programmers and with good reason!

Going back to the idea of having all the phones as separate tables, what happens when more than one user shares the same phone number?

Stephen Jones
Wednesday, April 07, 2004

----">"I am pointing out that nobody thinks of keeping a table of all possible last names, or all possible prices"

But you have not pointed out how that is germaine to the problem of normalization. "------

You seem to have a blind spot here, though I am sure you are not alone. If I have a table of sales where one of the entries is a product you can bet your bottom  dollar that every database programmer who has ever heard of normalization will have that  product kept in a separate table of products. If there he has a customer for that transaction, you can bet your bottom dollar that he has all those customers kept in a list of customers.

Now the one thing almost nobody does is have the last names of the customers kept in a separate table of last names. It is considered a sin against normalization to have the same product entered twice in a row in a table, instead of having that row linked to a product table, yet you can have millons of Smiths and Jones's in a table and nobody will even think of putting all the possible last names in a table of their own.

The point of this observation? You can't talk about normalization as being something that is applied automatically. This has nothing to do with optimization. It is simply that duplicate entries for names, dates and numbers are so routine nobody even realizes they go against normalizaton.

Stephen Jones
Wednesday, April 07, 2004

Logically, a database table represents a set of true propositions about real-world entities of interest.  Tables representing propositions about entities of *one type* are said to be fully normalized.

Less formally a base table can be viewed as representing an entity type, defined as the collection of all attributes of interest of entities with regard to that type.

Remember that relations are sets of tuples and that a tuple is a set of <A,T,v> triples, where A is an *attribute* name, T is a *type name*, and v is a *value* of type T.

Keeping a list of products in a products table and a list of customers in a customers table is done so because products and customers are *entities*.

You wouldn't (normally) create a table of "last names" because (normally) they are not *entities* by themselves.  They are *attribute values* -- the attribute is the PERSON SURNAME.

You are stating that it is a *requirement* of some normal form (in order to violate normalization there must be a normal form that you are violating) that EVERY attribute value be a foreign key.  Please, tell me which normal form requires this?

Normalization *is* something that can be done automatically – that is why it is a set of rules which can be applied in a mechanical fashion.

Of course, if your business rule required that everyone’s last name be part of a certain set there is nothing stopping you making a set of acceptable last names and then using that.  But, not doing so is not a violation of any normal form that I know of.

MR
Wednesday, April 07, 2004

You're making my point. The database designer defines what are entities based on his analysis of the real world situation.

That situation says that products are normally nearly always entities and last names are not. As a result there is never any discussion.

The question arises when you have a gray area. Are telephone numbers an attribute of the employee like his last and frst names, or is it better to view them as entities. The decisio os a conscious one and automatic application of normalization rules doesn't enter into it.

Stephen Jones
Wednesday, April 07, 2004

>"That situation says that products are normally nearly always entities and last names are not. As a result there is never any discussion."

But no "sin of normalization" as you have continually posted.

>"The question arises when you have a gray area. Are telephone numbers an attribute of the employee like his last and frst names, or is it better to view them as entities. The decisio os a conscious one and automatic application of normalization rules doesn't enter into it. "

Well, remember that normalization does not handle nulls.  It is clear by the IF block that the phones *can* be null, meaning they need to be normalized away into separate tables so you don't end up with this garbage.

Also, normalization is something that assumes certain things -- namely that you're working with a true RDBMS.  In a true RDBMS, you have far richer features and functionality which enable *full* automatic normalization. 

However, there is enough of the RM in current SQL products that you can *almost* apply every normalization rule automatically.

MR
Thursday, April 08, 2004

If you have shared phones then puting them in separate tablles merely transfers the nulls to the user field in that table.

Select Case would certainly make the code a lot more reasonable.

Stephen Jones
Thursday, April 08, 2004

*  Recent Topics

*  Fog Creek Home