Last week I wrote a post that helped visualize the different data services offered by Microsoft Azure and Amazon AWS. This week I’m comparing the Azure and AWS analytics and big data services. This comparison took a bit longer because there are more analytics services than data services. Making the chart was also challenging: because both Microsoft and Amazon offer so many wonderful analytics and big data services, it was hard to fit them all on one page.
Just like last week, I made a cheat sheet to help make sense of all the services offered. It is my hope that this post will be a starting guide for you when you need to research these analytics services. I have included relevant links for each service, along with some commentary, in the text of this post below. I’ve done my best to align the services, but there is some overlap between offerings.
OK, let’s break these down into their respective groups. I’m not going to do a feature comparison here because these systems evolve so quickly I’d spend all day updating the info. Instead, you get links to the documentation for everything and you can do your own comparisons as needed. I will make an effort to update the page as frequently as I am able.
Data Warehouse
Azure offerings: SQL Data Warehouse
AWS offerings: Redshift
It feels like these two services have been around forever. That’s because, in internet years, they have. Redshift goes back to 2012, and SQL DW goes back to 2009. That’s a lot of time for both Azure and AWS to learn about data warehousing as a service.
Data Processing
Azure offerings: HDInsight
AWS offerings: Elastic MapReduce
Both services are built upon Hadoop, and both are built to hook into other platforms such as Spark, Storm, and Kafka.
Data Orchestration
Azure offerings: Data Factory, Data Catalog
AWS offerings: Data Pipeline, AWS Glue
These are true enterprise-class ETL services, complete with the ability to build a data catalog. Once you try these services you will never BCP data again.
Data Analytics
Azure offerings: Stream Analytics, Data Lake Analytics, Data Lake Store
AWS offerings: Kinesis Analytics
Last week I talked about how Cosmos DB was all-in-one billing for your NoSQL needs. Well, here AWS is the one with the all-in-one option: Kinesis is a single service, whereas on Azure you need three. I didn’t list Event Hubs here for Azure, but if you want to stream data you are likely going to need that service as well. (In other words, “Analytics” is an umbrella term, and is one of the most difficult things to compare between Azure and AWS.)
Data Visualization
Azure offerings: Power BI
AWS offerings: QuickSight
I saw some demos of QuickSight while at AWS re:Invent last fall, and it looks promising. It also looks to be slightly behind Power BI at this point. Of course, we all know most people are still using Tableau, but that is a post for a different day.
Search
Azure offerings: Elasticsearch, Azure Search
AWS offerings: Elasticsearch, CloudSearch
Elasticsearch for both is just a hook to the Elasticsearch open source platform. For Azure, you have to get that from their marketplace (that’s what I link to because I can’t find it anywhere else). One of the biggest differences I know of between the services is the number of languages supported. AWS CloudSearch claims to support 34, and Azure Search claims to support 56.
Machine Learning
Azure offerings: Machine Learning Studio, Machine Learning Workbench
AWS offerings: SageMaker, DeepLens
DeepLens isn’t available yet, but I’ve got one on pre-order as an attendee gift from re:Invent last year. I enjoyed using Azure Machine Learning Studio during my data science certification journey last year. I’m currently using it for my big data certification, too. If I get a chance I will try SageMaker and do a comparison post in the future.
Data Discovery
Azure offerings: Data Catalog, Data Lake Analytics
AWS offerings: Athena
Imagine a library without a card catalog and you need to find one book. That’s what your data looks like right now. I know you won’t believe this, but not all data is tracked or classified in any meaningful way. That’s why services like Athena and Data Catalog exist.
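If you would rather have the groupings above in a form you can query, here is a minimal Python sketch. The names `SERVICES` and `counterparts` are my own inventions for illustration; the contents are copied straight from the lists in this post. Keep in mind the pairing is by category only, so a match here does not mean the two services are feature-equivalent.

```python
# Category -> offerings per cloud, transcribed from the groupings in this post.
# SERVICES and counterparts() are hypothetical names used only for illustration.
SERVICES = {
    "Data Warehouse": {"azure": ["SQL Data Warehouse"], "aws": ["Redshift"]},
    "Data Processing": {"azure": ["HDInsight"], "aws": ["Elastic MapReduce"]},
    "Data Orchestration": {
        "azure": ["Data Factory", "Data Catalog"],
        "aws": ["Data Pipeline", "AWS Glue"],
    },
    "Data Analytics": {
        "azure": ["Stream Analytics", "Data Lake Analytics", "Data Lake Store"],
        "aws": ["Kinesis Analytics"],
    },
    "Data Visualization": {"azure": ["Power BI"], "aws": ["QuickSight"]},
    "Search": {
        "azure": ["Elasticsearch", "Azure Search"],
        "aws": ["Elasticsearch", "CloudSearch"],
    },
    "Machine Learning": {
        "azure": ["Machine Learning Studio", "Machine Learning Workbench"],
        "aws": ["SageMaker", "DeepLens"],
    },
    "Data Discovery": {
        "azure": ["Data Catalog", "Data Lake Analytics"],
        "aws": ["Athena"],
    },
}


def counterparts(service):
    """Return (category, other_cloud, offerings) for each category listing `service`."""
    results = []
    for category, clouds in SERVICES.items():
        for cloud, offerings in clouds.items():
            if service in offerings:
                other = "aws" if cloud == "azure" else "azure"
                results.append((category, other, clouds[other]))
    return results


if __name__ == "__main__":
    # Data Catalog shows up in two groups, so it gets two sets of counterparts.
    for category, cloud, offerings in counterparts("Data Catalog"):
        print(f"{category}: {cloud} -> {', '.join(offerings)}")
```

A service that appears in more than one group, such as Data Catalog, returns one entry per group, which mirrors the overlap between offerings I mentioned earlier.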
Pricing
Azure Pricing calculator: https://azure.microsoft.com/en-us/pricing/calculator/
AWS Pricing Calculator: https://calculator.s3.amazonaws.com/index.html
Same as last week, you will find it difficult to do an apples-to-apples comparison between services. Your best bet is to start at the pricing pages for each and work your way from there.
Summary
I hope you find this page (and this one) useful for referencing the many analytic and big data service offerings from both Microsoft Azure and Amazon Web Services. I will do my best to update this page as necessary, and offer more details and use cases as I am able.
While good, there needs to be an OLAP layer as well (MSTR, SSAS, Cognos, etc.).
That’s a great idea. And with the announcements at re:Invent this week, I think I need to update a lot of content, too.
A neat comparison of products from both vendors on capabilities. Thanks for taking time to research and put this together.