Tuesday, September 18, 2007

What's worse than finding a NULL in a NUMBER column?

Finding half a worm! Oops, wrong punchline. No, the worst thing is finding a zero. Twice in the last week I have had to rewrite some code to handle an application which was using zero as a magic value instead of populating a numeric column with null.

One of these columns was a code which described the nature of some free text held in another column. 1 through 6 were assigned to pre-determined categories but the users can enter text outside of those categories. My program did a lookup to expand those codes with the category description; what I hadn't anticipated was that the rows for extra-mural text would have a code of 0 instead of null. NO_DATA_FOUND. Irritating but simple enough to handle.

The other case was slightly more worrying. Here the column was one of two; the other column had a data type of date. One or the other column or neither could be populated, but not both. Overwhelmingly the date column was populated . But when the date column was null roughly have the number columns were zero. Given the nature of the data it was not credible that half those rows would really have a value of zero. Most of them should have been null. However, for a handful of those records zero was probably a legitimate value. It was just impossible to tell which rows they were.

And that is the problem with "handling" nulls by using magic values instead. The magic values often need special handling themselves. Also they can subvert our data integrity. For instance substituting a null with zero may not affect the calculation of a total but will it will change the value of an average.

Nulls are annoying and it is a good idea to avoid them in a data model. But few of us are prepared to go to 6NF. Thus we have columns for which we cannot assign values. So be it. Much better to face the fact and use nulls than to fill our columns with magic values. In a very real sense disguising nulls in such a fashion is just like having an exception handler to suppress exceptions:

...
exception
when no_data_found then
null;
....

We need to know.

2 comments:

Patrick Wolf said...

Iiihhh, magic values. That developer should be ... :-) Maybe he is used to programming language where NULL isn't easy to work with.

Patrick

Anonymous said...

Null in a boolean is worse. Whatever braindead moron invented that, should be shot. Oh right, Oracle uses it. Then it must be good.