During his keynote at AWS re:Invent, Andy Jassy made some statements that seemed…questionable. Well, questionable to me, at least. Not surprisingly, the questionable statements focused on databases, data services, and storage.
If you are interested in watching the keynote for yourself, you can see it here: https://youtu.be/ZOIkOnW640A
The keynote is 2 hours and 44 minutes. It’s not action-packed, so I recommend you adjust the speed to 1.5x. Doing that will save you nearly an hour of viewing time. YouTube offers a transcript as well, making it easy to grab the quotes.
Now, I’m not writing this post to make Jassy or AWS look like fools in any way. The keynote is long, filled with a lot of wonderful information. AWS is doing wonderful things with databases and data services. I’m a fan of all things data.
What I have here today are a handful of statements, out of a very long keynote. I found these statements to be unfair. As someone who works in marketing, I know how keynotes work. But as a data professional, and an Azure fanboi, I don’t like seeing bad information presented as truth.
Thus, today’s post is my effort to fact-check the statements that irked me the most.
You’re welcome.
Let’s get started.
“AWS has 11 relational and non-relational databases. Which is much more than you’ll find anywhere else, nobody has close to half of that.”
Well, AWS has 13 databases now, because later in the keynote Jassy announced Timestream and QLDB. But let’s focus on the original 11, and the statement that “nobody has close to half that”.
The 11 databases that Jassy refers to, listed on the screen behind him, are as follows: RDS (MySQL, PostgreSQL, MariaDB, Oracle, SQL Server), Aurora (MySQL, PostgreSQL), DynamoDB, ElastiCache (Memcached, Redis), and Neptune.
That’s a confusing list to me, because it does not include SimpleDB, or Redshift. And Aurora is counted twice, but Aurora is really just RDS but at a higher performance tier. I don’t see how Jassy can count Aurora as something different, but he’s probably using SKU math that folks with MBAs like to use.
So, let’s count up the databases available in Azure today. And to keep it fair, I will also go by SKU, and leave off the Azure SQL Data Warehouse service.
Azure SQL Database
Azure Database for MySQL
Azure Database for PostgreSQL
Azure Database for MariaDB
Azure Cosmos DB
Azure Cache for Redis
So, that’s six. Last I checked, 6 is more than half of 11. But we are not done yet.
Cosmos DB is really three engines, while the AWS equivalent is two separate services (DynamoDB, Neptune). I wrote about this earlier in 2018, for reference. So, Azure offers you one SKU, and AWS offers you two. But if we break Cosmos DB out into its three engines, then Azure has 8 database services. And 8 is also more than half of 11.
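If you want to check my math, here is a quick sketch in Python. The counts come straight from the lists above; the variable and function names are made up for illustration, and treating Cosmos DB as three engines is my own accounting, not an official one.

```python
# Counts are taken from the post; all names here are illustrative only.
AWS_CLAIMED = 11  # the number of databases cited in the keynote

# Azure database services, counted by SKU as in the list above.
azure_by_sku = [
    "Azure SQL Database",
    "Azure Database for MySQL",
    "Azure Database for PostgreSQL",
    "Azure Database for MariaDB",
    "Azure Cosmos DB",
    "Azure Cache for Redis",
]

# Assumption from the post: Cosmos DB counts as three engines.
COSMOS_ENGINES = 3


def more_than_half(count, claimed):
    """Return True when count exceeds half of the claimed total."""
    return count > claimed / 2


sku_count = len(azure_by_sku)
# Swap the single Cosmos DB entry for its three engines.
engine_count = sku_count - 1 + COSMOS_ENGINES

print(sku_count, more_than_half(sku_count, AWS_CLAIMED))
print(engine_count, more_than_half(engine_count, AWS_CLAIMED))
```

Either way you count, the answer clears the 5.5 mark that "half of 11" sets.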
More to the point, what does it matter if AWS has two databases (Dynamo, Neptune) and Azure only has Cosmos DB? I fail to understand why the number of databases offered is as important as the functionality that those services offer. At the end of the day, functionality is what should matter most for those “builders” that AWS is coveting.
I get that counting the number of databases is a convenient metric. It’s also useless.
Ok, let’s move on to the next.
“In AWS, it’s the only place where you have a database migration service that allows you to switch from SQL to NoSQL or actually be able to migrate your data warehouse.”
Well, Jassy certainly makes it sound easy to switch between relational and non-relational. Just a few clicks, export tables to JSON, and you are done, right? Maybe… maybe not.
The AWS Database Migration Service (DMS) documentation doesn’t talk about this “SQL to NoSQL” functionality. I did, however, find this other documentation that states you can use DynamoDB as a target for DMS, and have a relational database as a source. And then this page describes that you can use DMS to extract your database to S3 buckets, which are then imported into Redshift.
So, yeah, his statement is true. He just doesn’t talk about the nightmare of deconstructing your relational database prior to the migration. Note he didn’t use the word “only” here, as Azure offers a robust Database Migration Service, along with a playbook for data migrations that includes sources such as Cassandra and Access (sources not offered by AWS, by the way).
“AWS has 11 different ways to get your data into the cloud depending on the nature of your data and your application. Nobody else has a little bit more than half of that.”
This is the list of data transfer services on stage when Jassy makes this statement:
AWS Direct Connect
AWS Snowball
AWS Snowball Edge
AWS Snowmobile
AWS Storage Gateway
Amazon Kinesis Firehose
Amazon Kinesis Data Streams
Amazon Kinesis Video Streams
Amazon S3 Transfer Acceleration
AWS DataSync
AWS Transfer for SFTP
My first issue here is counting Kinesis three times. That seems to be a bit of a stretch, but OK. Oh, and Kinesis is listed under “Analytics”, not with the migration products. Warrants mentioning.
Now, let’s consider similar offerings from Azure. I’ll use the same method of accounting that AWS did for that slide.
Azure Data Box
Azure Data Box Disk
Azure Data Box Heavy
Azure Data Factory
Azure Event Hubs
SQL Server Stretch Database
Azure StorSimple
Azure VPN Gateway
That’s 8, and 8 is more than half of 11.
“Amazon Neptune which we launched here a year ago and it’s off to really a raring start.”
Amazon Neptune is currently ranked #129 on the DB-Engines rankings. Not exactly a fiery meteor cutting a path to the top of the leaderboard. But watch out Db4o, Neptune has you in its sights!
“S3 is the most secure object store. It’s the only object store that allows you to audit any access to an object.”
I don’t know what Jassy means by “most secure”. And the word “audit” can mean many different things. But Azure offers a lot of security features as well as logging.
“S3 is the only object store that allows you to do cross region replication.”
This is false. Azure Storage has offered this feature for years. No, I don’t know why or how a statement this false was allowed in the keynote. It’s disappointing.
“The world of databases in the Old Guard commercial grade databases has been a miserable world for the last couple decades.”
I won’t argue otherwise, but I would say that it’s not just the world of databases.
Warrants mentioning that those miserable databases are the exact ones that AWS wants to host for you. In other words, “commercial databases are miserable unless you are using them in our cloud”.
Seems legit.
Summary
OK AWS, listen up. You’ve got a great set of services for data and databases. And a lot of stuff said in the keynote is true, too. For example, you have the most powerful GPU offering on the market. You are a leader in many areas of cloud computing.
You don’t need to resort to these tactics, where you stretch the truth in order to make a point. Just focus on the awesome stuff you have. Talk about the wonderful support you offer your customers. You’re better than this.
When I hear statements like the ones above, it makes me think twice about all of the messages that are coming out of AWS.
I know that these are a handful of statements in a long keynote. But I still believe this was a poor effort on your part. A simple 5 minutes of research to compare and contrast services would have fixed everything above.
Hugs.
(Please don’t read this and decide to delay delivery of our Christmas gifts.)
References
https://azure.microsoft.com/en-us/services/
https://aws.amazon.com/products/
https://db-engines.com/en/ranking
https://thomaslarock.com/2018/03/azure-versus-aws-data-services-comparison/
https://thomaslarock.com/2018/03/azure-vs-aws-analytics-and-big-data-services-comparison/
Nice write-up. In the list of databases, you can possibly add Snowflake to both AWS and Azure.