Cloud Computing Archives - Thomas LaRock
https://thomaslarock.com/category/cloud-computing/
Thomas LaRock is an author, speaker, data expert, and SQLRockstar. He helps people connect, learn, and share. Along the way he solves data problems, too.
Tue, 22 Aug 2023 20:27:02 +0000

Why AWS and Azure Benchmarks Don’t Matter to Me
https://thomaslarock.com/2020/02/why-aws-and-azure-benchmarks-dont-matter-to-me/
Sat, 01 Feb 2020 17:06:23 +0000

The post Why AWS and Azure Benchmarks Don’t Matter to Me appeared first on Thomas LaRock.

Last October I wrote a review of the Gigaom SQL Transactional Processing Price-Performance test. That post references the original data warehouse test also published by Gigaom. I believe both Gigaom reports were funded by Microsoft. I found the first report on data warehousing to be of good quality. The SQL transaction report had some weak areas (IMO) which I detail in this post.

This latest report, well, I didn’t bother promoting it, or writing a review. I felt this report was incomplete and suggested to Microsoft they have at least one more round of revisions. They did not agree.

Turns out AWS had similar concerns. Earlier this week I found this tweet from Corey Quinn (@QuinnyPig), linking to a blog post from AWS titled “Fact-checking GigaOm’s Microsoft-sponsored benchmark claims”.

You can read the AWS response for yourself; it details many of the same concerns I had with the report.

But one comment from AWS stood out to me: “Publishing misleading benchmarks is just one more old-guard tactic by Microsoft.”

OK, I’ve got some opinions to share.

Grow Up

I’m not going to defend Microsoft or the quality of the report. Microsoft may not be perfect, but right now AWS looks more “old-guard” than Microsoft does. I watched Andy Jassy spread misinformation from the re:Invent keynote stage. Jassy pulled similar tricks at the previous keynote, and it was awful to watch. There’s a lot to love about AWS, but the idea that they don’t use the same poor marketing tactics as other companies is laughable.

As for these AWS and Azure benchmark reports, I find them to be a fairly useless disk-measuring contest. Cloud technology changes fast; these reports are out of date before the ink dries. I do not believe there is a company today going “all-in” on AWS or Azure based on such marketing propaganda. “Oh, look, this article says they are the fastest, let’s use them!”

Look, anyone can build a contrived edge case showing their hardware outperforms someone else’s. Watching AWS and Azure bicker over these reports is like listening to two rich kids argue at lunch over who has the nicest sports car.

No one cares, Richie Rich.

Summary

As a user of cloud services, what I want is a reliable, secure, stable, and affordable solution. That’s it. I expect you to keep updating your hardware and configurations to make things better. I expect you to make your services easier to consume, administer, and monitor.

We don’t need these AWS and Azure benchmark reports to see whose disk is bigger. We need guidance on which servers to select, how to configure our workloads, and how to monitor and adjust as necessary.

Give us more of that content, please.

Focus more on building great services and creating happy customers, and less on poking holes in each other. </rant>

Updated Analytics and Big Data Comparison: AWS vs. Azure
https://thomaslarock.com/2019/05/updated-analytics-and-big-data-comparison-aws-vs-azure/
Thu, 02 May 2019 21:07:44 +0000

The post Updated Analytics and Big Data Comparison: AWS vs. Azure appeared first on Thomas LaRock.

Building upon my earlier post, today I want to share with you the updated graphic and links for the analytics and big data services offered by Microsoft Azure and Amazon Web Services.

It is my hope that this post will be a starting guide for you when you need to research these analytic and big data services. I have included relevant links for each service, along with some commentary, in the text of this post below. I’ve done my best to align the services, but there is some overlap between offerings.

[Image: comparison chart of AWS and Azure analytics and big data services]

I’m not going to do a feature comparison here because these systems evolve so quickly I’d spend all day updating the info. Instead, you get links to the documentation for everything and you can do your own comparisons as needed. I will make an effort to update the page as frequently as I am able.

 

Data Warehouse

Azure offerings: SQL Data Warehouse

AWS offerings: Redshift

It feels like these two services have been around forever. That’s because, in internet years, they have. Redshift goes back to 2012, and SQL DW’s lineage (through SQL Server Parallel Data Warehouse) goes back to about 2009. That’s a lot of time for both Azure and AWS to learn about data warehousing as a service.

 

Data Processing

Azure offerings: HDInsight

AWS offerings: Elastic MapReduce

Both services are built upon Hadoop, and both are built to hook into other platforms such as Spark, Storm, and Kafka.

 

Data Orchestration

Azure offerings: Data Factory, Data Catalog

AWS offerings: Data Pipeline, AWS Glue

These are true enterprise-class ETL services, complete with the ability to build a data catalog. Once you try these services you will never BCP data again.

 

Data Analytics

Azure offerings: Stream Analytics, Data Lake, Databricks

AWS offerings: Lake Formation, Kinesis Analytics, Elastic MapReduce

I didn’t list Event Hubs here for Azure, but if you want to stream data you are likely going to need that service as well. And Kinesis is broken down into specific streams, too. (In other words, “Analytics” is an umbrella term, and is one of the most difficult things to compare between Azure and AWS).

 

Data Visualization

Azure offerings: PowerBI

AWS offerings: QuickSight

I saw some demos of QuickSight while at AWS re:Invent last fall, and it looks promising. It also looks to be slightly behind PowerBI at this point. Of course, we all know most people are still using Tableau, but that is a post for a different day.

 

Search

Azure offerings: Elasticsearch, Azure Search

AWS offerings: Elasticsearch, CloudSearch

Elasticsearch for both is just a hook into the open-source Elasticsearch platform. For Azure, you have to get it from the marketplace (that’s what I link to, because I can’t find it anywhere else). One of the biggest differences I know of between the services is the number of languages supported: AWS CloudSearch claims to support 34, and Azure Search claims to support 56.

 

Machine Learning

Azure offerings: Machine Learning Studio, Machine Learning Service

AWS offerings: SageMaker, DeepLens

DeepLens is a piece of hardware, but I wanted to call it out because you will hear it mentioned. When you use DeepLens, you use a handful of AWS services such as SageMaker, Lambda, and S3 storage. I enjoyed using Azure Machine Learning Studio during my data science and big data certifications, but the same thing is true there: you use associated services. This makes price comparisons difficult.

 

Data Discovery

Azure offerings: Data Catalog, Data Lake Analytics

AWS offerings: Athena

Imagine a library without a card catalog and you need to find one book. That’s what your data looks like right now. I know you won’t believe this, but not all data is tracked or classified in any meaningful way. That’s why services like Athena and Data Catalog exist.

 

Pricing

Azure Pricing calculator: https://azure.microsoft.com/en-us/pricing/calculator/

AWS Pricing Calculator: https://calculator.aws/

As with the previous post, you will find it difficult to do an apples-to-apples comparison between services. Your best bet is to start at the pricing pages for each and work your way out from there.

 

Summary

I hope you find this page (and this one) useful for referencing the many analytic and big data service offerings from both Microsoft Azure and Amazon Web Services. I will do my best to update this page as necessary, and offer more details and use cases as I am able.

 

Updated Data Services Comparison: AWS vs. Azure
https://thomaslarock.com/2019/05/updated-data-services-comparison-aws-vs-azure/
Wed, 01 May 2019 16:59:25 +0000

The post Updated Data Services Comparison: AWS vs. Azure appeared first on Thomas LaRock.

Last year I wrote a post comparing the data services offered by both AWS and Microsoft Azure. Well, there have been some changes since, so it was time to provide an updated graphic and links.

Since both Microsoft Azure and Amazon Web Services offer many data services, I thought it worth the time to create a graphic to help everyone understand the services a bit more. Essentially, I wanted to build a cheat sheet for any data services comparison:

[Image: cheat sheet comparing AWS and Azure data services]

You might notice that there is no Data Warehouse category. That category is located in the Analytics and Big Data comparison chart which I will share in a future post.

It is my hope that this post will be a starting guide for you when you need to research cloud data services. I’m not going to do a feature comparison here because these systems evolve so quickly I’d spend all day updating the info. Instead, you get links to the documentation for everything and you can do your own comparisons as needed. I hope to have future posts that help break down features and costs, but for now let’s keep it simple.

 

Relational

Azure offerings: SQL Database, Database for MySQL, Database for PostgreSQL, Database for MariaDB

AWS offerings: RDS, Aurora

RDS is an umbrella term covering six engines in total: Amazon Aurora, MySQL, MariaDB, Oracle, Microsoft SQL Server, and PostgreSQL. I’ve listed Aurora as a distinct offering because it is the high-end service dedicated to MySQL and PostgreSQL. Since Azure also offers those distinct services, it made sense to break Aurora out from RDS. (Or, to put it another way: if I didn’t call out Aurora here, you’d finish this post and ask “what about Aurora?”, and now you don’t have to.)

 

NoSQL – Key/Value

Azure offerings: Cosmos DB, Table Storage

AWS offerings: DynamoDB, SimpleDB

Cosmos DB is the major NoSQL player for Azure, as it does everything (key/value, document, graph) except relational. DynamoDB is a workhorse for AWS. SimpleDB is still around, but there are rumors it will be going away. This might be due to the fact that you cannot create a SimpleDB service using the AWS Console. So, short story, look for this category to be just Cosmos DB and DynamoDB in the future.

 

NoSQL – Document

Azure offerings: Cosmos DB

AWS offerings: DocumentDB

Azure used to offer DocumentDB, but that platform was sunset when Cosmos DB arrived. AWS recently launched DocumentDB with MongoDB compatibility, in what some people see as a major blow to open source.

 

NoSQL – Graph

Azure offerings: Cosmos DB

AWS offerings: Neptune

As of May 2019, Neptune is in Preview, so the documentation is likely to change in the coming weeks, months, or years (well, that’s my assumption, because Neptune has been in Preview since November 2018). Cosmos DB uses the Gremlin API for graph purposes.

 

In-Memory

Azure offerings: Cache for Redis

AWS offerings: ElastiCache

Both of these services are built upon Redis, so the real question is whether you want Redis-as-a-service from a third-party provider or would rather run Redis yourself.
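Under the hood, both services exist to serve the same cache-aside pattern: check the cache first, fall back to the database on a miss, and populate the cache for next time. Here is a self-contained sketch of that pattern, with the Redis client swapped for a plain dictionary so it runs anywhere (names are illustrative):

```python
# Cache-aside pattern in miniature: the dict stands in for Redis/ElastiCache.
cache = {}

def get_user(user_id, db):
    if user_id in cache:        # cache hit: no database round trip
        return cache[user_id]
    value = db[user_id]         # cache miss: fall through to the database
    cache[user_id] = value      # populate the cache for next time
    return value

db = {"42": "Thomas"}
print(get_user("42", db))  # miss, reads the database, then caches
print("42" in cache)       # True: the second lookup skips the database
```

With a real Redis-backed service, the dictionary operations become GET/SET calls over the network, but the control flow is identical.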

 

Time Series

Azure offerings: Time Series Insights

AWS offerings: Timestream

If you are in need of a time series database for your IoT collections, then both Azure and AWS have a service to offer. Azure Time Series Insights was launched in early 2017, and AWS announced Timestream in late 2018. In other words, the world of data services is moving fast, and the two major cloud providers are able to roll out services to meet growing demand.

 

Ledger

Azure offerings: [Sad Trombone]

AWS offerings: Quantum Ledger Database (QLDB)

Setting aside the silliness of using the buzzword ‘Quantum’ in the name of this product, AWS does have a ledger database service available. As of May 2019, Azure does not offer a similar service.

 

Pricing

Azure Pricing calculator: https://azure.microsoft.com/en-us/pricing/calculator/

AWS Pricing Calculator: https://calculator.aws

I like using pricing as a way to start any initial comparison between data services. These calculators will help you focus on the important details, not just costs, but how the technology works. For example, Azure SQL Database is priced around the concept of a DTU (Database Transaction Unit), which has no meaning in AWS. Using the calculators forces you to learn the differences between the two systems. It’s a great starting point.

That being said, trying to compare the data services offered by AWS and Azure can be frustrating. Part of me thinks this is done on purpose by both companies in an effort to win our favor without giving away more information than is necessary. This is a common practice, and I’m not bashing either company for doing what has been done for centuries. I’m here to help others figure out how to make the right choice for their needs. At the end of the day, I believe both Amazon and Microsoft want the same thing: happy customers.

By starting at the pricing pages I can then dive into the specific costs and use them as a first-level comparison between the services. If you start by looking at resource limits and maximums, you will spend a lot of time trying to compare apples to oranges. Just focus on costs, core resources, throughput, and DR. That should be a good start toward determining the cost, benefit, and risk of each service.

 

Summary

I hope you find this page useful for referencing the many data service offerings from both Microsoft Azure and Amazon Web Services. I will do my best to update this page as necessary, and offer more details and use cases as I am able.

Quantum Computing Will Put Your Data at Risk
https://thomaslarock.com/2018/12/quantum-computing-will-put-your-data-at-risk/
Thu, 06 Dec 2018 18:26:54 +0000

The post Quantum Computing Will Put Your Data at Risk appeared first on Thomas LaRock.

“Too many secrets.” – Martin Bishop

One of the pivotal moments in the movie Sneakers is when Martin Bishop realizes that they have a device that can break any encryption methodology in the world.

Now 26 years old, the movie was ahead of its time. You might even say the movie predicted quantum computing. Well, at the very least, the movie predicts what is about to unfold as a result of quantum computing.

Let me explain, starting with some background info on quantum computing.

 

Quantum computing basics

To understand quantum computing, we must first look at how traditional computers operate. No matter how powerful, standard computing operates on binary units called “bits.” A bit is either a 1 or a 0, on or off, true or false. We’ve been building computers on that architecture for the past 80 or so years; computers today still use the same binary logic Turing relied on to crack German codes in World War II.

That architecture has gotten us pretty far. (In fact, to the moon and back.) But it does have limits. Enter quantum computing. Quantum computers work with logic gates, as classical computers do, but they use quantum bits, or qubits, and a qubit can be a 0, a 1, or both a 0 and a 1 at the same time. With one qubit, a gate is a matrix of four elements: {0,0}, {0,1}, {1,0}, and {1,1}. With two qubits, the matrix grows to 16 elements, and at three qubits we have 64. For more details on qubits and gates, check out this post: Demystifying Quantum Gates—One Qubit At A Time.
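That growth is just exponentiation: a gate acting on n qubits is a (2^n × 2^n) matrix, which is 4^n elements. A quick sketch of the numbers above (the function name is mine):

```python
# A classical n-bit register holds exactly one of 2**n values at a time.
# An n-qubit gate matrix is (2**n x 2**n), so its element count is 4**n.
def qubit_gate_elements(n):
    """Elements in a full n-qubit gate matrix: (2^n)^2 = 4^n."""
    return (2 ** n) ** 2

for n in (1, 2, 3):
    print(n, qubit_gate_elements(n))  # 1 -> 4, 2 -> 16, 3 -> 64
```

Each added qubit multiplies the state space by two and the gate matrix by four, which is the source of the speedups discussed below.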

This is how quantum computers outperform today’s high-speed supercomputers. This is what makes solutions to complex problems possible. Problems today’s computers can’t solve. Things like predicting weather patterns years in advance. Or comprehending the intricacies of the human genome.

Quantum computing brings these insights, out of reach today, into our grasp.

It sounds wonderful! What could go wrong?

Hold that thought.

 

Quantum Supremacy

Microsoft, Google, and IBM all have working quantum computers available. There is some discussion about capacity and accuracy, but they exist.

And they are getting bigger.

At some point, a quantum computer will outperform the best classical computer at the same task. This is called “Quantum Supremacy.”

The following chart shows the progression of quantum computing over the past 20 years.

[Image: chart of qubit counts in quantum computers over the past 20 years]

(SOURCE: Quantum Supremacy is Near, May 2018)

There is some debate about the number of qubits necessary to achieve Quantum Supremacy, but many researchers believe it will happen within the next eight years.

So, in a short period of time, quantum computers will start to unlock answers to many questions. Advances in medicine, science, and mathematics will be within our grasp. Many secrets of the Universe are on the verge of discovery.

And we are not ready for everything to be unlocked.

 

Quantum Readiness

Quantum Readiness is the term applied to define if current technology is ready for quantum computing impacts. One of the largest impacts to everyone, on a daily basis, is encryption.

Our current encryption methods are effective because of the time necessary to break the cryptography. But quantum computing will slash that processing time, by enough to make keys we consider safe today practical to attack.
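The post doesn’t name the algorithms involved, but the usual suspects are Shor’s algorithm (which breaks RSA-style public-key crypto outright) and Grover’s algorithm (which roughly halves the effective bit strength of symmetric keys). A back-of-the-envelope sketch of the Grover effect, purely illustrative:

```python
# Brute-forcing an n-bit symmetric key: a classical attacker needs on the
# order of 2**n trials, while Grover's algorithm needs roughly 2**(n/2)
# quantum queries -- effectively halving the key's bit strength.
def classical_trials(bits):
    return 2 ** bits

def grover_queries(bits):
    return 2 ** (bits // 2)

print(classical_trials(128) == 2 ** 128)  # a 128-bit key classically
print(grover_queries(128) == 2 ** 64)     # behaves like a 64-bit key
```

This is why "quantum-safe" symmetric guidance tends toward doubling key sizes, while public-key schemes need replacement rather than bigger keys.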

In other words, in less than ten years, everything you are encrypting today will be at risk.

Everything.

Databases. Emails. SSL. Backup files.

All of our data is about to be exposed.

Nothing will be safe from prying eyes.

 

Quantum Safe

To keep your data safe, you need to start using cryptography methods that are “Quantum-safe.”

There’s one slight problem—the methods don’t exist yet. Don’t worry, though, as we have “top men” working on the problem right now.

The Open Quantum Safe Project, for example, has some promising projects underway. And if you want to watch mathematicians go crazy reviewing research proposals during spring break, the PQCrypto conference is for you.

Let’s assume that these efforts will result in the development of quantum-safe cryptography. Here are the steps you should be taking now.

First, calculate the amount of time necessary to deploy new encryption methods throughout your enterprise. If it takes you a year to roll out such a change, then you had better get started at least a year ahead of Quantum Supremacy happening. Remember, there is no fixed date for when that will happen. Now is your opportunity to take inventory of all the things that require encryption, like databases, files, emails, etc.

Second, review the requirements around your data retention policies. If you are required to retain data for seven years, then you will need to apply new encryption methods on all of that older data. This is also a good time to make certain that data older than your policy is deleted. Remember, you can’t leave your data lying around—it will be discovered and decrypted. It’s best to assume that your data will be compromised and treat it accordingly.

One thing worth mentioning is that some data, such as emails, are (possibly) stored on the servers they touch as they traverse the internet. We will need to trust that those responsible for the mail servers are going to apply new encryption methods. Security is a shared responsibility, after all. But it’s a reminder that there are still going to be things outside your control. And maybe reconsider the data that you are making available and sharing in places like private chat messages.

 

Summary

Don’t wait until it’s too late. Data has value, no matter how old. Just look at the recent spike in phishing emails where scammers show you an old password and try to extort money. Scams like that work because the data has value, even if it is old.

Start thinking about how best to protect that data. Build yourself a readiness plan now so that when quantum computers can break today’s cryptography, you won’t be caught unprepared.

Otherwise…you will have no more secrets.

Why I’m Not Worried About Skynet
https://thomaslarock.com/2018/11/why-im-not-worried-about-skynet/
Mon, 19 Nov 2018 19:40:06 +0000

The post Why I’m Not Worried About Skynet appeared first on Thomas LaRock.

I’m 98% confident if you ask three data scientists to define Artificial Intelligence (AI), you will get five different answers.

The field of AI research dates to the mid-1950s, and even earlier when you consider the work of Alan Turing in the 1940s. So, the phrase “AI” has lasted for 60+ years, or roughly the amount of time since the last Cleveland Browns championship.

My preferred definition of AI is this one, from Elaine Rich in the early 1990s:

“The study of how to make computers do things which, at the moment, people do better.”

But there is also this quote from Alan Turing, in his effort to describe computer intelligence:

“A computer would deserve to be called intelligent if it could deceive a human into believing it was human.”

Here’s my take.

Defining Artificial Intelligence

When I try to define AI, I combine the two thoughts:

“Anything written by a human that allows a machine to do human tasks.”

This, in turn, allows humans to find more tasks for machines to do on our behalf. Because we’re driven to be lazy.

Think about the decades spent finding ways to build better programs, and the automation of traditional human tasks. We built robots to build cars, vacuum our house, and even flip burgers.

In the world of IT, alerting is one example of where automation has shined. We started building actions, or triggers, to fire in response to alert conditions. We added triggers until we reached a point where human intervention was necessary, and then we would spend time figuring out how to remove the need for a person.

This means if you ever wrote a piece of code with IF-THEN-ELSE logic, you’ve written AI. Any computer program that follows rule-based algorithms is AI. If you ever built code that has replaced a human task, then yes, you built AI.
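To make that concrete, here is rule-based “AI” in miniature: a trigger that stands in for a human watching an alert dashboard. The metric names, thresholds, and actions are all invented for illustration:

```python
# Rule-based alert triage: pure IF-THEN-ELSE logic replacing a human task.
def triage_alert(metric, value):
    if metric == "cpu_pct" and value > 90:
        return "restart_service"
    elif metric == "disk_free_gb" and value < 10:
        return "purge_temp_files"
    else:
        return "no_action"

print(triage_alert("cpu_pct", 95))       # restart_service
print(triage_alert("disk_free_gb", 50))  # no_action
```

No statistics, no learning, just encoded human judgment. By the broad definition above, it still counts as AI.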

But for many in the field of AI research, AI means more than simple code logic. It also means things like image recognition, text analysis, or a fancy “Bacon/Not-Bacon” app on your phone. AI also means talking robots, speech translations, and predicting loan default rates.

AI means so many different things to different people because AI is a very broad field. The field contains both Machine Learning and Deep Learning, as shown in this diagram:

[Image: diagram showing Deep Learning as a subset of Machine Learning, which is a subset of AI]

That’s why you can find one person who thinks of AI as image classification, but another person who thinks AI is as simple as a rules-based recommendation engine. So, let’s talk about those subsets of AI called Machine Learning and Deep Learning.

Machine Learning for Mortals

Machine Learning (ML) is a subset of AI. ML offers the ability for a program to apply statistical techniques to a dataset and arrive at a determination. We call this determination a prediction, and yes, this is where the field of predictive analytics resides.

The process is simple enough: you collect data, you clean data, you classify your data, you do some math, you build a model. That model is then used to make predictions on similar sets of data. This is how Netflix knows what movie you want to watch next, or how Amazon knows what additional products you would want to add to your cart.

But ML requires a human to provide the input. It’s a human task to define the features used in building the model. Humans are the ones to collect and clean the data used to build the model. As you can imagine, humans desire to shed themselves of some tasks that are better suited for machines, like determining if an image is a chihuahua or a muffin.

Enter the field of Deep Learning.

Deep Learning Demystified

The first rule of Deep Learning (DL) is this: You don’t need a human to input a set of features. DL will identify features from large sets of data (think hundreds of thousands of images) and build a model without the need for any human intervention thankyouverymuch. Well, sure, some intervention is needed. After all, it’s a human that will need to collect the data, in the example above some pictures of chihuahuas, and tell the DL algorithm what each picture represents.

But that’s about all the human needs to do for DL. Through the use of Convolutional Neural Networks, DL will take the data (an image, for example), break it down into layers, do some math, and iterate through the data over and over to arrive at a predictive model. Humans will adjust the iterations in an effort to tune the model and achieve a high rate of accuracy. But DL is doing all the heavy lifting.

DL is how we handle image classifications, handwriting recognition, and speech translations. Tasks once suited for humans are now reduced to a bunch of filters and epochs.

Summary

Before I let you go, I want to mention one thing to you: beware companies that market their tools as being “predictive” when they aren’t using traditional ML methods. Sure, you can make a prediction based upon a set of rules; that’s how Deep Blue worked. But I prefer tools that use statistical techniques to arrive at a conclusion.

It’s not that these companies are knowingly lying, it’s just that they may not know the difference. After all, the definitions for AI are muddy at best, so it is easy to understand the confusion. Use this post as a guide to ask some probing questions.

As an IT pro, you should consider use cases for ML in your daily routine. The best example I can give is the use of linear regression for capacity planning. ML would also help analyze logs for better threat detection. One caveat, though: if the training data never included a specific event, the model may not work as expected when that event occurs.
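Here is a minimal sketch of that capacity-planning example: an ordinary least-squares line fitted to invented monthly disk-usage numbers, then extrapolated forward. Real capacity planning would use far more data and sanity checks:

```python
# Fit y = slope * x + intercept by least squares, no libraries needed.
def fit_line(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

months = [1, 2, 3, 4, 5, 6]
used_gb = [100, 112, 124, 136, 148, 160]  # invented: grows 12 GB/month
slope, intercept = fit_line(months, used_gb)
print(slope * 12 + intercept)  # projected usage at month 12: 232.0 GB
```

The same fit tells you when you hit a capacity ceiling: solve slope * x + intercept = limit for x and you have a date to buy more disk.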

That’s when you realize that the machines are only as perfect as the humans that program them.

And this is why I’m not worried about Skynet.

Cloud Vampires
https://thomaslarock.com/2018/11/cloud-vampires/
Tue, 13 Nov 2018 20:18:20 +0000

The post Cloud Vampires appeared first on Thomas LaRock.


I don’t want to alarm you, but your cloud is infested with vampires.

No, not the kind who wear fashionable cloaks. I’m talking about vampire resources. These are the cloud resources you’ve created but are no longer used. Over-provisioned VMs, orphaned disks, load balancers, and whatever else you forgot about.

These cloud vampires are costing you money. They are also difficult to find. Neither AWS nor Microsoft Azure provides default reports to help identify vampire resources. This should not be surprising, as it is not in their best interest to remind you to spend less.

 

Cloud Vampire Resources

Here’s a list of resources you want to watch to bring vampire resources into the light of day.

Underutilized Virtual Machines – You built a VM according to the requirements. But the requirements were wrong. In the cloud you pay for resource consumption, disk storage, and network egress. Even using a minimal amount of VM capacity means you get billed the full amount for the hour. Either downsize or move that workload to a different VM.

Unused Virtual Machines – These are the VMs you built for Adam in Accounting a year ago that he’s never used. Or it’s a case of shadow IT, with employees spinning up cloud VMs for their personal sandboxes. Even with these rogue VMs powered off, you still pay for the storage used by their disks.

Orphaned Disks – You removed the virtual machine, but disks remained. This is by design, in case the VM removal was an accident. You’re paying for them, and there’s zero chance they are being used. Get rid of them.

Data Egress – The Cloud is like New Jersey—it’s free to get in, but you pay to get out. Your applications and systems should only pull data from the cloud when necessary. Too many extra API calls will lead to a bump in your monthly bill.

Geo-replication – Cloud resources often have options for automatic high availability and disaster recovery. Those services aren’t free. And they are not needed for every system. Check to make sure that systems using HA and DR need these options.

Load Balancing – Another HA feature that sounds great, but not needed by Developer Dan. You’ll want to review deployments of load balancers and ensure they are necessary.

Snapshots – Snapshots are a great way to roll back your VM in case an update goes awry. But don’t let those snapshots linger too long. The extra overhead leads to extra dollars out of your budget.

Unused IP Addresses – You have an option to create static IP addresses for your VMs. But those IP addresses are distinct objects, separate from your VM. So if you stop your VM, you are still charged for that static IP address.
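Since neither cloud ships a vampire report, hunting them comes down to simple filters over your own resource inventory. A sketch of that idea follows; the dictionary shape, field names, and thresholds are all invented here, and in practice you would populate the inventory from the AWS or Azure APIs:

```python
# Flag three kinds of vampires from the checklist above: orphaned disks,
# underutilized VMs, and unassociated static IPs. Thresholds are examples.
def find_vampires(resources):
    vampires = []
    for r in resources:
        if r["type"] == "disk" and r.get("attached_to") is None:
            vampires.append(r["id"])   # orphaned disk, pure waste
        elif r["type"] == "vm" and r.get("avg_cpu_pct", 100) < 5:
            vampires.append(r["id"])   # underutilized VM, downsize it
        elif r["type"] == "static_ip" and not r.get("associated"):
            vampires.append(r["id"])   # unused IP, still billed
    return vampires

inventory = [
    {"id": "disk-1", "type": "disk", "attached_to": None},
    {"id": "vm-1", "type": "vm", "avg_cpu_pct": 2},
    {"id": "ip-1", "type": "static_ip", "associated": False},
    {"id": "vm-2", "type": "vm", "avg_cpu_pct": 65},
]
print(find_vampires(inventory))  # ['disk-1', 'vm-1', 'ip-1']
```

Run something like this on a schedule against a fresh inventory pull and the vampires have nowhere to hide.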


Summary

I’ve listed some common vampire resources here. But this list is not meant to be comprehensive. It’s up to you to understand the cloud services you have deployed. You must track if they are in use, and the associated costs.

When transitioning workloads to the cloud, you must transition how you approach monitoring. Traditional methods of monitoring for outages and performance are not enough. You must also track resource usage, as well as use of cloud services.

And when you find cloud vampire resources, drive a stake through their heart. It’s the only way to be sure.

The post Cloud Vampires appeared first on Thomas LaRock.

No, You Don’t Need a Blockchain https://thomaslarock.com/2018/11/no-you-dont-need-a-blockchain/ https://thomaslarock.com/2018/11/no-you-dont-need-a-blockchain/#comments Thu, 01 Nov 2018 19:11:57 +0000 https://thomaslarock.com/?p=19371 The hype around blockchain technology is reaching a fever pitch these days. Visit any tech conference and you’ll find more than a handful of vendors offering blockchain in one form or another. This includes Microsoft, IBM, and AWS. Each of those companies offers a public blockchain as a service.

The post No, You Don’t Need a Blockchain appeared first on Thomas LaRock.


The hype around blockchain technology is reaching a fever pitch these days. Visit any tech conference and you’ll find more than a handful of vendors offering blockchain. This includes Microsoft, IBM, and AWS. Each of those companies offers a public blockchain as a service.

Blockchain is also the driving force behind cryptocurrencies, allowing Bitcoin owners to purchase drugs on the internet without the hassle of showing their identity. So, if that sounds like you, then yes, you should consider using blockchain. A private one, too.

Or, if you’re running a large logistics company with one or more supply chains made up of many different vendors, and need to identify, track, trace, or source the items in the supply chain, then blockchain may be the solution for you as well.

Not every company has such needs. In fact, there’s a good chance you are being persuaded to use blockchain as a solution to a current logistics problem. It wouldn’t be the first time someone has tried to sell you a piece of software you don’t need.

Before we can answer the question of whether you need a blockchain, let’s take a step back and make certain we understand blockchain technology, what it solves, and the issues involved.

What is a blockchain?

The simplest explanation is that a blockchain serves as a ledger. This ledger is a long series of transactions, and it uses cryptography to verify each transaction in the chain. Put another way, think of a very long sequence of small files. Each file is based upon a hash value of the previous file, combined with new bits of data and the answer to a math problem.

Put another way, blockchain is a database—one that is never backed up, grows forever, and takes minutes or hours to update a record. Sounds amazing!
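The hash-chained ledger described above can be sketched in a few lines of Python. This is a toy, not a real implementation: real blockchains add the “math problem” (proof of work) and peer-to-peer consensus, both of which this omits.

```python
# A toy ledger illustrating the idea: each block stores a hash of the
# previous block plus its own data, so tampering anywhere breaks the chain.
import hashlib

def make_block(prev_hash, data):
    """Create a block whose hash covers the previous hash and its data."""
    payload = prev_hash + data
    return {"prev_hash": prev_hash, "data": data,
            "hash": hashlib.sha256(payload.encode()).hexdigest()}

def chain_is_valid(chain):
    """Recompute every link; any edit to earlier data invalidates the chain."""
    for prev, block in zip(chain, chain[1:]):
        recomputed = hashlib.sha256(
            (block["prev_hash"] + block["data"]).encode()).hexdigest()
        if block["prev_hash"] != prev["hash"] or block["hash"] != recomputed:
            return False
    return True

genesis = make_block("0" * 64, "genesis")
chain = [genesis, make_block(genesis["hash"], "Alice pays Bob 5")]
print(chain_is_valid(chain))   # True

chain[1]["data"] = "Alice pays Bob 500"   # tamper with a transaction
print(chain_is_valid(chain))   # False
```

Notice what the validation catches: edits to data already in the chain. It says nothing about whether the data was true when it went in, which is exactly the weakness discussed below.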

What does blockchain solve?

Proponents of blockchain believe it solves the issue of data validation and trust. For systems needing to verify transactions between two parties, you would consider blockchain. Supply chain logistics is one problem people believe is solved by blockchain technology. Food sourcing and traceability are good examples.

Other examples include Walmart requiring food suppliers to use a blockchain provided by IBM starting in 2019. Another is Albert Heijn using blockchain technology along with the use of QR codes to solve issues with orange juice. Don’t get me started on the use of QR codes; we can save it for a future post.

The problem with blockchain

Blockchain should make your system more trustworthy, but it does the opposite.

Blockchain pushes the burden of trust onto individuals adding transactions to the blockchain. This is how all distributed systems work. The burden of trust goes from a central entity to all participants. And this is the inherent problem with blockchain.

[Worth mentioning: many cryptocurrencies rely on trusted third parties to handle payouts. So they use blockchain to generate coins, but don’t use blockchain to handle payouts, because of the very trust issues blockchain is supposed to solve. Let that sink in for a moment.]

Here’s another issue with blockchain: data entry. In 2006, Walmart launched a system to help track bananas and mangoes from field to store, only to abandon the system a few years later. The reason? Because it was difficult to get everyone to enter their data. Even when data is entered, blockchain will not do anything to validate that the data is correct. Blockchain will validate the transaction took place but does nothing to validate the actions of the entities involved. For example, a farmer could spray pesticides on oranges but still call it organic. It’s no different than how I refuse to put my correct cell phone number into any form on the internet.

In other words, blockchain, like any other database, is only as good as the data entered. Each point in the ledger is a point of failure. Your orange, or your ground beef, may be locally sourced, but that doesn’t mean it’s safe. Blockchain could show the point of contamination, but it won’t stop it from happening.

Do you need a blockchain?

Maybe. All we need to do is ask ourselves a few questions.

Do you need a [new] database? If you need a new database, then you might need a blockchain. If an existing database or database technology would solve your issue, then no, you don’t.

Let’s assume you need a database. The next question: Do you have multiple entities needing to update the database? If no, then you don’t need a blockchain.

OK, let’s assume we need a new database and we have many entities needing to write to the database. Are all the entities involved known, and do they trust each other? If the answer is yes, then you don’t need a blockchain. If the entities have a third party everyone can trust, then you also don’t need a blockchain. A blockchain should remove the need for a third party.

OK, let’s assume we know we need a database, with multiple entities updating it, none of whom trust each other and with no trusted third party available. The final question: Do you need this database distributed in a peer-to-peer network? If the answer is no, then you don’t need a blockchain.

If, instead, your answers line up the other way (a new database, multiple writers who don’t trust each other, no trusted third party, and a need for peer-to-peer distribution), then a private or public blockchain may be the right solution for you.
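The checklist above boils down to a short decision function. The question names below are mine, not any standard taxonomy; this is just the flow of the questions, sketched in Python.

```python
# The do-you-need-a-blockchain checklist as a function. Each early return
# corresponds to one of the questions in the text above.

def needs_blockchain(need_new_db, multiple_writers,
                     writers_trust_each_other, trusted_third_party_exists,
                     need_peer_to_peer):
    if not need_new_db:
        return False          # an existing database solves your problem
    if not multiple_writers:
        return False          # a single writer needs no distributed ledger
    if writers_trust_each_other or trusted_third_party_exists:
        return False          # trust already exists; use a normal database
    return need_peer_to_peer  # only now does a blockchain make sense

# A supply chain with many vendors, limited mutual trust, no clearinghouse:
print(needs_blockchain(True, True, False, False, True))   # True
# A typical internal line-of-business app:
print(needs_blockchain(True, False, True, True, False))   # False
```

Run it against your own project and notice how many of the early returns fire. That’s the point of the post: most paths end at False.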

Summary

No, you don’t need a blockchain.

Unless you do need one, but that’s not likely.

And it won’t solve basic issues of data validation and trust between entities. If we can trust each other, then we would be able to trust a central clearinghouse, too.

Don’t buy a blockchain solution unless you know for certain you need one.

[This article first appeared on Orange Matter. Head over there and check out the great content.]
