17 Dec Data Modeling Is Dead; Long Live Data Modeling!
That is no longer the world in which we live. That era was one that had high costs associated with building and maintaining a database of customers.
Today’s era is one where you can subscribe to Salesforce.com for just a few dollars a day. You can decide for yourself to run a new report. How much did that same report cost in the old era? How long would it take for IT to deliver that report? That’s why businesses today are using such services: they reduce time and costs.
Recently Karen López (blog | @datachick) wrote an article that gives a good background on the differences between conceptual, logical, and physical data modeling. The TL;DR version is here: Conceptual modeling is when the business helps to map out their needs at a strategic level. Logical modeling is when a data architect gets involved and describes the business data requirements independently of the DBMS, uses, or organizational constraints of the data. Physical modeling is when a data architect and a DBA ensure that the logical model will meet the business needs for things like performance and recovery. A physical model is designed for a specific version of a DBMS or other data store.
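To make the logical versus physical distinction concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the Customer and Order entities, the table and index names, and the choice of SQLite as the target data store. The dataclasses stand in for a logical model (entities and relationships, no DBMS detail), while the DDL is one possible physical model of the same entities for one specific engine.

```python
import sqlite3
from dataclasses import dataclass

# Logical model (hypothetical entities): what the business data is,
# independent of any particular DBMS.
@dataclass(frozen=True)
class Customer:
    customer_id: int
    name: str

@dataclass(frozen=True)
class Order:
    order_id: int
    customer_id: int   # every order belongs to exactly one customer
    total_cents: int

# Physical model: the same entities expressed for one specific data store
# (SQLite here), with types, constraints, and an index chosen for it.
PHYSICAL_DDL = """
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE customer_order (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    total_cents INTEGER NOT NULL CHECK (total_cents >= 0)
);
-- Index to support the "orders for a customer" lookup.
CREATE INDEX ix_order_customer ON customer_order(customer_id);
"""

def build_physical_model() -> sqlite3.Connection:
    """Create an in-memory SQLite database from the physical design."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
    conn.executescript(PHYSICAL_DDL)
    return conn

def save(conn: sqlite3.Connection, customer: Customer, orders: list[Order]) -> None:
    """Persist logical-model objects into the physical tables."""
    conn.execute(
        "INSERT INTO customer VALUES (?, ?)",
        (customer.customer_id, customer.name),
    )
    conn.executemany(
        "INSERT INTO customer_order VALUES (?, ?, ?)",
        [(o.order_id, o.customer_id, o.total_cents) for o in orders],
    )
    conn.commit()

def total_for_customer(conn: sqlite3.Connection, customer_id: int) -> int:
    """Sum order totals for one customer; returns 0 when there are none."""
    row = conn.execute(
        "SELECT COALESCE(SUM(total_cents), 0) FROM customer_order WHERE customer_id = ?",
        (customer_id,),
    ).fetchone()
    return row[0]
```

Moving to a different DBMS would mean rewriting the DDL and perhaps the index strategy, while the two dataclasses could stay exactly as they are. That is roughly the separation the logical and physical models are meant to give you.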
On paper and in classrooms these mythological ideals sound great. In the non-academic world, database modeling is not done as much as you might think. With the advent of cloud services such as Azure, I can build and deploy applications without ever needing to have one database design or modeling session.
Here are the reasons why modeling doesn’t matter for most of us anymore.
Third Party Software Packages
We’ve all been there: someone on the business side gets a phone call from a salesman for an invite to a three-martini lunch. By the time lunch is over the business has decided that the salesman’s product is *THE* product that your business must have in order to do…well, it doesn’t matter what it promises to do, really. The point here is that your business decides to purchase some software that has a database backend. If you are lucky it is a database platform you can support, but that is not always the case. But hey, a database is a database, right? [Despite our shop being a dedicated Microsoft shop I once had a manager demand to know “how much longer” it would take for my team to support Oracle, because we were holding the business back, apparently.]
This software comes in house, gets installed, and guess what? All of your best practices and policies get thrown out the window. You make exceptions for every rule because this software requires ‘sa’ access, it needs a dedicated server, and you aren’t allowed to make any schema changes (not even additional indexes) in order to improve performance (I’m looking at you, SharePoint).
While it is a recommended best practice to do logical and physical modeling together, those tasks are not possible when purchasing a third-party product. More than three-quarters of the apps supported by my customers right now are from third-party vendors. Very little in-house development is being done. That means they get the conceptual, logical, and physical model that was built for them by somebody else. As a DBA, you won’t even be able to touch anything, either.
You’re stuck with a pickup truck and your business expected a Ferrari.
Software as a Service (SaaS) is the same thing as above, except that you don’t host the system. The DBA doesn’t have any input into…anything! Of course the DBA will get blamed for anything that goes wrong, but that’s a topic for a different blog post.
With SaaS you get someone else’s conceptual, logical, and physical design, and you simply hope it will be good enough for your needs most of the time. Salesforce is the prime example here. Your business does not need to purchase and host its own customer relationship management system these days; it can just pay for the service. That saves money and time, because it also means you don’t need to spend time in those pesky database design or modeling meetings.
Data as a Service (DaaS) is just like the above, but you don’t do anything except pull in data feeds. Now, once you pull in the feeds you might be thinking “hey, we’ll need to store that data”, which would mean the need for some actual modeling. But the reality is that it won’t get stored inside of a traditional database. It is much more likely to be stored inside a spreadsheet, inside of another application such as SharePoint. Don’t pretend as if you’ve never seen this before. If you haven’t, then just wait your turn; you will see it soon enough.
Here’s the crazy part: as much as the need for data modeling seems to be diminishing these days, I also see that it is more important now than ever before. All of those companies that are providing SaaS and DaaS need to have data modelers on staff who can ensure their services can be consumed and shared easily. That stuff cannot just happen by magic. As easily as I can deploy an application to Azure without needing to know anything about modeling, my application will not scale beyond a certain point without a proper database model and design. User needs must be anticipated, logical models are needed to ensure data quality, and physical designs are needed for performance and recovery. Companies run a huge risk by not having these.
Every time I meet with a customer who says “we can’t touch the app” I feel their pain. I know that our careers as data professionals are heading into the Cloud, and that oftentimes our hands are tied. But as long as people are still building applications, and as long as data needs to get shared between two endpoints, there is always going to be the need for someone to understand how best to keep that data organized.
That’s where the data modeler, or architect, is needed most.