Technical feasibility of moving databases to the public cloud

Cloud…Cloud…Cloud. The hype around moving IT workloads to the cloud has reached a fever pitch in the marketplace. However, any migration to the cloud starts with the database. Once the data is moved, all of the app servers and other infrastructure can (relatively easily) follow. Just how realistic is it to move database workloads to the cloud today, in early 2015?

If you are an independent software vendor (ISV), fortunately the answer is “relatively easy”. Simply look at your database structures, and architect around the limitations inherent in the cloud offering. However, if you are a large Enterprise, moving the database in any significant way is much…much harder. To understand why, you first have to put yourself into the shoes of a large Enterprise CIO, and evaluate the following:

  1. Is it even technically feasible to move my database infrastructure? Will my data be secure?
  2. If it is technically feasible to move a large % of my database environment, does it make business sense?
  3. What is the timeline for doing this migration, and am I staffed to do it myself, or do I need 3rd party help?

All 3 of these questions are very hard to answer in any simplistic way, and require a significant amount of due diligence to determine. Let’s discuss these challenges and see if there is a path to success.

 

Technical options available

Both Amazon Web Services (AWS) and Microsoft Azure have had basic abilities to host databases in the cloud for some time. This can be done either with IaaS (Infrastructure as a Service – you run the Virtual Machine and its operating system yourself), or with PaaS (Platform as a Service – you utilize a hosted database service where the provider manages the OS and database software on your behalf).

PaaS

For ISVs, PaaS represents a significant opportunity IF you are able to architect your application around the limitations of the database service. For example, AWS released its SimpleDB service in 2009. To leverage this service, you were required to use a proprietary API that was not SQL based and had severe limitations in terms of functionality. Even for ISVs, these limitations meant only a very small number of use cases were even possible, and for Enterprise IT, migrating to this service was a pipe dream.

Since 2009, Amazon and Microsoft have introduced significant new database PaaS features. For example, Amazon recently released a new version of RDS (Aurora) that targets complete API compatibility with MySQL while architecting the back end for massive scale (up to 64TB) and removing the complexities of failover and disaster recovery. Microsoft has released SQL Azure, which allows for hosting databases up to 500GB in size (today) and supports most traditional on premise SQL Server functions. Both of these services have significantly advanced the ability to host OLTP applications in the cloud. Amazon’s RDS offerings are quickly approaching a point where it really no longer makes sense for an ISV to run their own datacenter. In addition to OLTP centric offerings, both Microsoft and Amazon provide fantastic PaaS offerings around Hadoop/Big Data, and more traditional data warehousing functions such as Amazon Redshift and Microsoft cube hosting/reporting services.

However, for large Enterprise IT, leveraging PaaS in the cloud is still a LONG way off. Enterprise IT is severely limited by a few critical factors:

  1. ISVs dictate architecture – Although some Enterprise IT shops do internal application development (this obviously varies by industry), the majority simply implement ISV solutions out of the box (buy vs build). Even if an IT organization could theoretically move an ISV application database to the cloud, many ISVs do NOT provide support on virtualized databases…let alone databases hosted in the cloud. This is a risk most IT shops are unwilling to take.
  2. API limitations – Many PaaS offerings are not API compliant with on premise versions. For example, Azure SQL does not currently support CLR in stored procedures. Amazon Aurora, although supposedly MySQL API compliant, may introduce potential incompatibilities based on hosting, replication, and security scenarios (currently Aurora is in preview). Even if a database is FULLY drag and drop supported, moving it will still require extensive testing…something an Enterprise IT shop would prefer to leave up to an ISV to perform.
  3. Every conceivable database vendor and version – Many Enterprise IT shops run products from several database vendors (SQL Server, Oracle, MySQL, etc). ISVs also often dictate what specific version of database software must be run. It is not uncommon for Enterprise IT to run database versions that are over 10 years old for fear of breaking a mission critical database. This mix of vendors and versions makes it almost impossible for Enterprise IT to use PaaS solutions on any sort of broad scale.
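The three factors above can be turned into a rough first-pass screen over a database inventory. The sketch below is illustrative only: the `Database` record, its field names, and the sample entries are hypothetical, and the only limits it encodes are the ones quoted in this post (500GB for SQL Azure, 64TB for Aurora, no CLR in stored procedures). A database that passes this screen is merely a candidate; it would still need ISV sign-off and testing.

```python
from dataclasses import dataclass
from typing import List

# Limits as quoted in this post (early 2015). Assumptions, not current facts.
PAAS_TARGETS = {
    "sqlserver": {"service": "SQL Azure", "max_gb": 500},
    "mysql": {"service": "Amazon Aurora", "max_gb": 64 * 1024},
}


@dataclass
class Database:
    """Hypothetical inventory record for one database."""
    name: str
    engine: str               # e.g. "sqlserver", "mysql", "oracle"
    size_gb: float
    isv_supports_cloud: bool  # has the ISV agreed to support cloud hosting?
    uses_clr: bool = False    # SQL Server CLR stored procedures in use?


def paas_blockers(db: Database) -> List[str]:
    """Return the reasons a database cannot move to PaaS (empty list = candidate)."""
    blockers = []
    target = PAAS_TARGETS.get(db.engine)
    if target is None:
        blockers.append("no PaaS target listed for engine '%s'" % db.engine)
    else:
        if db.size_gb > target["max_gb"]:
            blockers.append("exceeds %s size limit" % target["service"])
        if db.engine == "sqlserver" and db.uses_clr:
            blockers.append("uses CLR stored procedures")
    if not db.isv_supports_cloud:
        blockers.append("ISV does not support cloud hosting")
    return blockers


# Made-up sample inventory for illustration.
inventory = [
    Database("crm", "sqlserver", 120, isv_supports_cloud=True),
    Database("erp", "oracle", 900, isv_supports_cloud=False),
    Database("billing", "sqlserver", 750, isv_supports_cloud=True, uses_clr=True),
]

for db in inventory:
    reasons = paas_blockers(db)
    print(db.name, "->", "candidate" if not reasons else "; ".join(reasons))
```

Even in this tiny made-up inventory, only one of three databases survives the screen, which is the point of the list above: each additional vendor, version, or ISV constraint shrinks the set of PaaS candidates.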

IaaS

Many Enterprise IT shops ask themselves, “If hosted PaaS database offerings aren’t possible, at least I can do IaaS (Infrastructure as a Service)…right?”. Although hosting your own VM with the correct software vendor/version installed is absolutely doable, there are significant performance problems that get in the way of any sort of broad scale adoption. These performance challenges fall into the following categories:

  1. Network performance and latency – no Enterprise application is an island. Most applications deployed in the enterprise have integration requirements with other applications or databases. It may be as simple as connecting to the database for data warehousing purposes, or it may be more real time connectivity/application dependency. Either way, a low latency/high performance connection is required for other applications to integrate with the database. Up until late 2014, connectivity to the public cloud was limited to VPN connections that are unreliable, slow, and have unpredictable latency. With the release of Amazon’s Direct Connect and Azure’s ExpressRoute capabilities, it is now possible to interconnect the public cloud back to existing Enterprise IT datacenters with speeds up to 10GigE (equivalent to LAN speeds). However, even a high speed connection can still be hampered by latency if there are many network hops in between, so care must be taken to ensure ExpressRoute and AWS Direct Connect links don’t experience serious latency issues. Today, many Enterprise IT shops have yet to put in high speed interconnects to public cloud providers. This should accelerate in 2015 as large Enterprise IT shops put public cloud integration into their longer term Enterprise Architecture roadmaps.
  2. CPU/high capacity VMs – For an Enterprise IT organization to consider moving to the cloud, there must be VMs capable of handling large scale workloads. It wasn’t until late 2014 that VM instances existed that were even capable of handling medium size database workloads in IaaS. For example, AWS EC2 instances are now available that are optimized for max CPU, memory, or price/performance. However, this is very recent. In addition, SSD based VMs are also just now coming online.
  3. IOPS (Input/Output Operations Per Second) – IOPS is perhaps the hardest hurdle to overcome in the public cloud when hosting databases. Databases have tremendous sensitivity to IOPS, and it wasn’t until late 2014 that Amazon enabled “provisioned IOPS” where you can dial in the amount of IO necessary for any database solution. Microsoft does not yet have provisioned IOPS, but should have it available in early 2015.
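To see why latency can hurt even on a 10GigE link, consider a chatty application that issues its queries one after another, each waiting for the previous reply. The back-of-the-envelope sketch below is only an illustration: the function name, the 10,000-query batch, and the two round-trip times are made-up numbers, not measurements of any provider.

```python
def batch_seconds(round_trips: int, rtt_ms: float, server_ms_per_query: float = 0.0) -> float:
    """Wall-clock time for sequential queries: each one pays a full network round trip."""
    return round_trips * (rtt_ms + server_ms_per_query) / 1000.0


# Hypothetical scenarios: same job, same bandwidth, different round-trip time.
scenarios = {
    "same-datacenter LAN (0.5 ms RTT)": 0.5,
    "cloud over interconnect (20 ms RTT)": 20.0,
}

for label, rtt_ms in scenarios.items():
    print("%s: %.1f s" % (label, batch_seconds(10_000, rtt_ms)))

# Output:
# same-datacenter LAN (0.5 ms RTT): 5.0 s
# cloud over interconnect (20 ms RTT): 200.0 s
```

Bandwidth never appears in the calculation. For sequential, small requests the round-trip time dominates, which is why a fast interconnect alone does not guarantee that an integrated application will perform the way it did on the LAN.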

Because of these (and other) limitations, Enterprise IT hosting databases in IaaS simply wasn’t realistic in 2014. However, many of these limitations are being removed in 2015…opening up the ability for Enterprise IT to consider hosting a sizable % of Enterprise applications in the cloud. Tier 2-3 workloads should be able to move in 2015. Tier 1 database workloads, however, will likely require more time before they can be moved (probably 2016-2017).

Security in the public cloud

Security in the public cloud has been a large topic of discussion, but in my experience, it tends to be a tempest in a teapot. Although public cloud providers do represent a very large juicy target to any would-be hackers, the reality is that these cloud providers are very aware of this threat, and architect around it. They hire the best and brightest security experts available today, and have a large business reason to never allow an intrusion to occur. Any breach generates a large news event that can kill future business.

Enterprise IT, on the other hand, is an entirely different story. One only needs to look at major intrusions at Target, Home Depot, Sony, Sands Hotels, and many others to realize that Enterprise IT just isn’t up to the challenge of protecting its infrastructure from debilitating intrusions. Having been intimately involved with many IT organizations, I have observed security infrastructure that was mediocre at best, and downright frightening at its worst. Enterprise IT intrusions/outages seldom make the news unless they are so catastrophic that the entire organization is put at risk (e.g. Sony).

Although there are some legitimate national security cases to be made for avoiding the public cloud, the vast majority of organizations do not have a security use case stopping them from moving. Why, then, do IT organizations squash cloud migrations for security reasons? The answer is always simple…follow the money. IT professionals are smart individuals and do look out for their own self-interest. Nobody wants to have their job outsourced, or lose their job. IT professionals have a financial interest in throwing up roadblocks. In the sales world, we call this FUD (Fear, Uncertainty, and Doubt). My recommendation is that any CFO/CIO considering cloud bring in an independent 3rd party to separate real, legitimate security concerns from FUD.

My recommendations

PaaS

I have a very hard time technically recommending a database PaaS solution to Enterprise IT over the next 3 years. The services simply aren’t mature enough yet (nor will they be in the foreseeable future) to handle the complexities of a broad scale Enterprise database infrastructure migration. However, I absolutely believe ISVs should have a public cloud PaaS strategy…it just makes sense from a cost and complexity argument. The main argument I hear from ISVs against moving to PaaS is vendor lock-in. But with AWS Aurora running MySQL, I simply do not see vendor lock-in being a significant problem. Vendor lock-in to me has always been a bit of a red herring.

IaaS

IaaS is absolutely coming of age in 2015 for Enterprise IT from a database perspective. It is still in its technical infancy, though, and Enterprise IT should start planning for the day when databases can be deployed to the cloud thanks to the availability of provisioned IOPS and high speed network interconnects. I would not recommend moving to the cloud as the first step for most IT organizations. For 2015, it should be a future research project only, with first production deployments scheduled in 2016-2017.

Internal database Virtual Private Cloud (VPC)

The first step in any cloud migration should be creating an internal database virtual private cloud (VPC). In my research, an internal VPC can be less expensive than a public cloud provider because Enterprise IT already has a sunk investment in database licenses. I believe it will be a few years before public cloud providers can make database deployment less expensive than an on-premise VPC. Please see my upcoming post on what I believe a future cloud migration strategy should look like.

 
