Welcome back to another installment of SQL University. This week we will be talking about a topic that is often overlooked: database design. On second thought, “overlooked” may not be the correct choice. I think a more accurate description would be NULL. That’s right, I just said that database design is an unknown entity to many. Because it is so easy to create a database, and because CPU, memory, and disk I/O is often able to overcome horrendous database design, most people don’t have any concept of what it means to actually design a database.
And for those that have heard of the words “database design”, a majority of them simply think that it means nothing more than listing out the tables they will be using to store their data. The concept of a foreign key, for example, is quite, well, foreign to them. Never mind normalization, or constraints, or anything else for that matter. I wish more people understood how a good design up front can save a lot of time in support later on.
Let’s get started.
So many times I see a database “designed” by this method:
- Create table
- Insert data into that table
And that’s it. Sometimes the steps are repeated, and many tables are created and populated, but I have seen the case where there was one database, with one table, with no indexes (or key, or constraints), and a few hundred million rows. I came across this gem when I was asked to help find out why performance was so poor for the database. Luckily I had a LOT of options to explain why performance was less than optimal.
Another thing to be aware of at this level is the need to accurately define your datatypes. Don’t just go with nvarchar(50) for everything because some design tool chooses that by default. Be aware of the choices you are making when you create the table. If you have no idea what I am talking about then go here and read this article by Greg Larsen. I’ll wait for you to come back.
Now that you have a basic awareness of some of the value in a good database design, it is time for you to expand your horizons a bit more. You need to start mapping out relationships between the tables in your database. Start thinking about what columns in your table compose a unique combination of values. While at TechEd I was able to attend Buck Woody’s presentation on database design. My favorite part of his talk, besides his comments about Brent Ozar’s sleeping habits, was how you can use nouns in order to help you design a better database. Buck starts his overview of his database design process at about the sixteen minute mark and it goes like this:
- Create and/or refine the business requirements
- Break out your nouns
- Refine nouns
- Define relationships
- Define datatypes
Steps two through four are part of a process called normalization, which is something we will discuss more in the master level.
Another item you need at this level is the ability to find your way around an ER diagram. There are a handful of tools you can use to create an ER diagram, and I won’t recommend any particular one here. The important point to take away from this level is that you are familiar enough with ER diagrams so that you can understand them to some degree.
To be at this level you should have a database that has primary and foreign keys defined, perhaps even some unique constraints. You should be able to make changes to your database based upon the ever-changing business requirements (trust me, they will always change). To truly be a model database designer, however, you will need to be adept at communicating with your business end users. You need to understand the business as well as, or perhaps better than, the end user. Otherwise you will find yourselves repeating step one (create/define business requirements) over and over again. Knowing your business will allow for you to help streamline that part of the process, and by streamline what I really mean is “make it slightly less painful than chewing on a piece of aluminum foil”.
If you can speak the language of your end user, allowing you to better define and create the necessary business requirements, then you will be well on your way to being a model database designer.
If you want to be a master database designer then you need to understand normalization, what it means to normalize (or denormalize) a database, and how to apply techniques to move your database into and out of particular normalization forms. In short, you need to dream about normalization much the same way as I dream about the elusive quad-thick slice of peppered bacon wrapped around a sea scallop and dipped in bourbon.
If you are looking for a completely dry explanation of database normalization then go here. If you want a slightly more readable format covering the basics, then go here. Personally, I like pictures, and here is a good one to help you understand normalization, especially after you read a little bit about it.By now you may be asking yourself “Self, why isn’t every database normalized by default somehow?” That’s a fair question. After all, why not have tools like SSMS force the user to create everything necessary in order for every database to be in the third normal form (3NF)? It’s because that, as with anything in life, there is a trade off whenever you want something. In this case the trade off can be performance. It could very well be the case that you want to denormalize your database, or at least parts of your database, in order to maximize performance. But as a master database designer you already knew that!
The easiest thing to do is to tell you to go and get every book Codd has ever written on the subject of database design and theory. But that would be rather shortsighted of me, considering how much of the world has changed since Codd did his most famous work. In today’s world we have “eventual consistency“, which can be viewed as a manner for which you remove the need to ever have a normalized database (or even transactions for that matter). Recently I had a person tell me that stored procedures are dying, which only served as a reminder to me that technology is always evolving. So, start with Codd, but don’t end there.
I have a section of my library dedicated to database design books. I really like Beginning Database Design: From Novice to Professional, and would recommend that highly for anyone that wants to get started by learning the fundamentals for database design. After that book, the rest is up to you.