03 Mar SQL Server and NULL Values
I seemed to have struck a minor chord with my hatred for database triggers. I had very few comments, but a handful of emails and exchanged even more tweets on the subject. But one thing was made clear to me and that is people wanted to hear about why I disdain NULL values.
Fortunately for me I was reminded by Louis Davidson that it is not the feature I dislike, it is the abuse (or misuse) of that feature. This would not be the first time that Louis has reminded me that I was (or about to) make an ass of myself, so I took note of his words very carefully. Indeed, it is not necessarily database triggers that I hate, but rather the fact that they are abused…misused…actually, I hate the fact that they are used and it is so difficult to troubleshoot.
But NULLs, well, they are a different animal. Why do I say that I hate NULLs? I mean, there is nothing to them, right? They do not perform any actions like triggers, they just sit there, in your tables, doing nothing. Some of them are waiting to be discovered and some hope to be updated to have real value. It makes me think of that old Schoolhouse Rock video about that little scrap of paper. So, why so much dislike for them? What harm do they cause?
Why NULLs Can Be Harmful
The harm they cause is usually the result of a person not knowing they are there. It’s that simple, and that is why I hate them. OK, not hate, but I would rather design a database with as few NULL values as possible.
Oh sure, I understand that there are times when you do not know the value of something, and it is easy to simply define that particular column to allow for NULL values and you will come back later to update that value. Sure, sounds great. I always hear about people that plan on coming back later to fix the things they didn’t do right the first time around. I only wish I could say that I actually see that happening as often as I hear it is going to happen. So, you put a column into a table, allow it to have NULL values, and are asking yourself right now “WTF? What’s your problem?”
Again, it is because the end user is not always aware that NULL values exist. And sometimes, the end user is a developer (*ahem*) who may be brought into the middle of a project and be building some widget only to have the widget crap out once a month because their code never accounted for NULL values that get inserted (or updated) after a monthly batch load. Not that I know anyone that has ever experienced such things, but I hear about these people (*ahem*). What’s that? Oh, so my hatred for NULLs is really a hatred for the fact that I am not someone is not good enough at coding or T-SQL or database design to begin with? Maybe, but consider this…
What tool do you use to write a majority of your T-SQL queries? OK, do you know what the default connection properties are? What I am looking for is: Do you know what your ANSI_NULLS is set to? How about this question: Do you even know what ANSI_NULLS setting you want to use? Did you know you even had that option? Do you know the difference in behavior if ANSI_NULLS is set to ON or OFF? I have a handful of tools at my disposal and surprisingly they have different default settings when making connections to an instance. Some of those differences can be with ANSI_NULLS, which makes no sense to me because the whole idea around ANSI is to have a standard so you would think the rocket surgeons in charge would have thought about mandating a default ANSI_NULLS setting whenever someone makes an application that wants to connect to an database server in order to avoid crap like I have had to deal with. Sure, you want to be able to change the option, but the behavior out of the box should be a standard, in my opinion.
Now, how many DBA’s know everything there is to know about ANSI_NULLS, connection strings, the various applications being used to connect to data these days, and each and every database design in their shop? I have no idea how many of you just raised your hand, but I know mine stayed on my keyboard. Now, let’s go one level away from the DBA, and ask: how many customers understand all of this? I supposed that you could only have your users connect to pristine cubes or data marts, right? Good luck with that, because I know most people want to connect to the raw data so that they can go through it themselves. And guess who will get blamed if it turns out that data is “missing” because the end user missed the fact that a NULL value altered their result set?
So, if database designs minimized the use of NULLs, then we can minimize the amount of issues faced downstream. Ever since I took part in some Six Sigma Black Belt training I have come to understand that what I really hate are inefficient processes. The use of and troubleshooting triggers is inefficient. And the same case can be made for NULL values. Their use can lead to a lot of bad result sets being accepted as a “version of the truth”, which is something we will ultimately be on the hook for at some point. Why not save people some time by looking to minimize their use? That would seem to be more efficient for everyone.
So, maybe I don’t hate NULLs. I just hate how so many groups are all operating under the assumption that everyone knows what they are, why they are there, and how best to use them. Seems to me that with all of those assumptions chances are lots of bad business decisions are being made every day.
Wait a minute…maybe we could blame the economic downturn on the use of NULL values? Hey, maybe I found a reason to love the little buggers after all.