Last week I was fortunate enough to attend Enterprise Data World in Washington, DC. I was not just at EDW as an attendee; I was also presenting two sessions. Since it was my first EDW, I had no idea what to expect either as an attendee or as a presenter. On top of that, I was only in town for two full days, so I had to make the most of my time. I did my best to meet as many people as possible, and I walked the exhibit hall floor talking with vendors for much of Wednesday afternoon.
I thought that EDW was well executed, considering the hotel was undergoing lobby renovations that made it difficult at times to find a place to network effectively. Having helped run a few events you might have heard about, I know how challenging it can be to pull everything together. I enjoyed many aspects of EDW, but two items in particular stand out to me because I believe I see a relation between them.
First, during my session on Wednesday morning, I asked the audience how many were attending their first EDW. Almost every hand went up. It reminded me how at PASS events we often find 40-50% of the people in the room are new to the event.
The second item happened in the exhibit hall, where I came across a product that claims to be the “only Hadoop RDBMS”. Hortonworks describes what Hadoop is: distributed computing. So, not quite a traditional relational database. The need for this product must exist; I cannot imagine a company would invest its time and money in making a product without having a market in which to sell it.
Now, think about those two things put together. Not only did a company decide that there was a need to take something non-relational and somehow add a layer of abstraction to make it relational, but there are more and more people looking to gain knowledge about data, data modeling, and data architecture.
I think it’s about time we realize that 50% new attendees at data-focused events is not the result of marketing efforts; it’s a result of the changing nature of data professionals.
In other words, the future is bright for all data professionals. Our world keeps evolving; always something new to learn.
Looks like I need to learn about relational Hadoop.
You were very gracious in your defeat. I really admire that about you. Class. Courage to come back, every time. Awesome stuff.
Defeat? WTH? I don’t recall you winning anything. Also, you didn’t even mention anything about the NULL I was wearing.
Unknown
I was at EDW last week as well and agree with your assessment. Regarding SpliceMachine, I have to admit I don’t grasp how this can really work. I thought the whole idea behind Hadoop was to partition data among many different machines and use parallel processing to scan it quickly. It scales precisely because there is no need for transaction consistency, and SQL DBMSs scale up instead of out precisely because there is that need. I’ll be interested to see how they do. It seems to me they think they are combining two technologies with distinct benefits and limitations to create one product with all of the benefits and none of the limitations. I think instead they have unwittingly created one product with all of the limitations and none of the benefits.
The trend I see creating a bright future for data professionals, and an equally disastrous one for corporate IT, is this notion that you need a lot of DBMSs, each with a narrow use case. Many corporations already have 3 DBMS platforms in house – DB2 on z/OS for their 1980s mainframe systems, Oracle for their 1990s client-server systems, and then SQL Server for when they discovered it provided a core feature set just as good as Oracle for a fraction of the cost. Three entire software stacks and three distinct sets of specialists to administer them. Now, instead of trying to reduce those to one, the industry seems to think the answer is to actually increase it to five or seven. There will be data professional jobs everywhere. Codd must be turning over in his grave right now. He invented the relational model precisely because a relation can represent any kind of data structure at the logical level and thus simplify data management for all applications. Seems to me the goal should be to make data management for applications simpler, not more complicated.
Wonderful insights Todd, thanks for sharing. Funny, but more than once last week I thought to myself “Codd must be turning over in his grave”.
I think the situation you have described here is not entirely new. As the future for data professionals gets brighter the difficulties for corporate IT become greater as well. I’m not certain I have the answers here, but I believe part of the solution is to reduce the administrative overhead for IT and allow them to focus on helping to manage the data, and not the servers.
“Codd must be turning over in his grave”
ROTFL
Thanks, I will begin to use this phrase… 🙂