MORE ABOUT ME
Welcome! I’m Thomas…
Resources

Why You’re Wrong: The Data Professional’s Guide To Contentious Issues

Why You’re Wrong: The Data Professional’s Guide To Contentious Issues

Codd hates you: Data Professional’s Guide To Contentious Issues

Image courtesy of Michael J. Swart: http://michaeljswart.com/

The Internet is an angry place.

Lots of angry people, arguing with strangers over useless topics, or blaming former executives about things that they cannot fix. So many people trying to show that they are the smartest person in the room. So much wasted energy.

Except, of course, when it comes to arguing about databases, or data. Such debates are worth every minute if only to watch the other persons head explode when you tell them how wrong they are to use NULLs, or using tbl_ for table names, or tabs instead of spaces.

I’m here to put an end to the debates. Today we will examine the data professional’s guide to contentious issues.

You’re welcome.

SQL pronunciations

First and foremost is the fact that most don’t know how to even pronounce ‘SQL Server’. Is it ‘sequel’, or do we sound out the letters ‘Ess-Queue-Ell’? Early in my career I was repeatedly corrected to include ‘Microsoft’ whenever I said ‘sequel’ server because Sybase also had SQL Server and it was confusing. For that person. They had issues. Anyway, it is rare for me to hear anyone say ‘ess-queue-ell’ server these days, but those that do are well respected and I’d be an idiot to think about correcting them publicly on anything even when they are quite wrong. And don’t get me started on the whole MySQL crowd.

Maintenance plans

We all start out with zero knowledge of SQL Server, and we often end up using maintenance plans to handle backups and other tasks. Then, as our knowledge increases, we realize that custom scripts work best for these tasks. And then, some of us go WAY out of their way to tell mock the folks using maintenance plans how wrong they are for doing so. I really don’t care what you use, as long as you are using something and that something would be custom scripts.

Object Relational Mapping (ORM) tools

When used properly, ORM tools such as nHibernate and Entity Framework (EF) can add value to your business. Unfortunately they are rarely used properly, making ORMs one of the most hated things touching your data other than @TrumpDBA. If you want happy servers, good performance, and data quality then don’t use ORMs.

Database tools on servers

Installing database management tools, such as SQL Server Management Studio (SSMS), on your database servers can be viewed as a security risk by your audit team. But for the people that can’t stand working from a command line (see next item), the thought of a server without SSMS drives them into fits of rage. The fact is there is no reason to install management tools on the server itself. You can connect remotely just fine. And those times that you can’t? You know, for the one or two edge cases? Well, that’s when you have script(s) ready to be run from the command line.

Server Core

Server core is a wonderful way for you to keep data professionals away from your database that shouldn’t belong touching it anyway. For the data professionals that can’t live without a GUI, server core is a nightmare. I can’t wait to see how they react to SQL Server on Linux.

Triggers

Triggers, once a necessity to maintain referential integrity for your data, are still a favorite for many. They allow you to do wonderful things such as increase administrative overhead, cause performance problems, and modify data without anyone knowing. Your life as a data professional is better without them. Let’s agree to leave them in the 1990s where they belong.

Schemas

About the only time I see schemas are inside of the AdventureWorks database. Every other database appears to use the dbo schema and nothing else. No, I don’t know why, but I suspect it is because permissions are hard. While the use of schemas make ownership chaining more difficult at times just because something is a tad bit harder at times isn’t a valid reason for avoiding it altogether. If that were the case then we’d have people arguing against the use of Server Core (oh, wait…)

Object Naming

Objects should be given names that help another person to understand what they are, or what they contain. A common way to do this is “Hungarian notation”, which leaves us with common prefixes such as sp_*, vw_* and one of my all-time favorites tbl_*. None of these bother me; all I ask for is consistency throughout the schema to make things easier for anyone trying to understand what they are looking at.

Remote Dedicated Admin Connections

Another line item for the audit team: not allowing the use of the Dedicated Admin Connection (DAC). Because, you know, someone could use that to perform admin tasks. It’s as if those pesky auditors don’t trust anyone! Well, remind the auditors that since you don’t allow management tools to be installed on the server (see above), and you may not always be able to remote to the server (or walk into the datacenter), it’s a good idea to have the DAC enabled for remote connections.

Tabstops

I can’t believe that “tabs versus spaces” is even a topic for discussion at all. The answer is tabs, people. And the tabstop should equal four spaces. Anything else is an abomination. Oh, and your KEYWORDS need to be in UPPERCASE otherwise I’m not debugging your code. Let’s move on.

Test Matches Production

I’ve lost many hours of my life trying to root out performance issues based upon the statement “it ran fine in test”. Test data rarely matches production data, and if you have run your code against 10 rows and it suddenly chokes the production server’s 10 million rows, that’s not my fault. Your test data MUST match your production data. If you can’t have test match production, you should at least have an idea of the production volumes you will be hitting, or perhaps grab the stats. Or just test in production because we know that’s what you’ve been doing anyway.

The above is really only a partial list of the silly arguments I’ve seen people have over the years. While I’ve shared my opinions on the issues above, I feel it necessary to remind you that there is never one right answer. It always comes down to cost, benefits, and risk.

Well, except for triggers. And NULLs.

I hate them both and you are wrong for using them.

One Pingback/Trackback

  • Pingback: Field and Record vs. Column and Table | Michael J. Swart()

  • David Wilkinson

    LOVE IT….

    • ThomasLaRock

      Thanks!

  • Ess Queue Ell or Sequel… Neither. The correct pronunciation is, of course, “Squeal”. (Mind you, I do see the occasional job posting asking for “Sequel Server Database Administrators”…)

    • ThomasLaRock

      Oh! I forgot about that one!

  • danielsun

    Spaces, dammit, not tabs! Never tabs! And 3 spaces, not 4. Don’t ask: I have my (patented, copyrighted) reasons. 😉

    Now, as far as pronunciation is concerned: is query pronounced “queerie” or “qwarey”–the dictionary says both are valid; I tend toward the latter, but others have corrected me and claimed I should be saying the former. Tomayto, tomahto?

    • ThomasLaRock

      qwarey? Nope. Besides, they aren’t queries, they are requests. That’s why we have dm_exec_requests.

      • danielsun

        Dude, you just ruined a limerick I’ve been working on involving an old man DBA from Tucumcari….

  • Mike W

    “They allow you to do wonderful things such as increase administrative overhead, cause performance problems, and modify data without anyone knowing.”

    I laughed out loud at this, when asked by other half what I was laughing at and told her, she wouldn’t believe me so thinks I’m keeping secrets from her. Thanks, random Internet person. You got me in trouble.

    And what’s wrong with nulls?

    • ThomasLaRock

      Unknown.

  • Mìcheal Thomson

    Much of this seems sensible, but I see two problems with your approach:

    A total ban on triggers will make decent auditing difficult

    A total ban on NULLs could only make sense if both (a) you also have a total ban on OUTER JOIN and (b) either (b 1) you are happy to let defaults for values you don’t know be indistingushable from real values that just happen to equal the default value or (b 2) you always know everything.

    Tom