We frequently hear about Azure Infrastructure-as-a-Service (IaaS) having a 99.95% uptime SLA. Here is the direct quote from Microsoft: “For all Internet facing Virtual Machines that have two or more instances deployed in the same Availability Set, we guarantee you will have external connectivity at least 99.95% of the time.” Wait, what are they really saying? Could I be under the false impression that my Azure cloud-based VMs are highly available? My emphatic answer is YES!
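To put the 99.95% figure in perspective, here is a minimal sketch that converts an uptime percentage into a downtime budget. The function name and the 30-day and 365-day periods are illustrative choices, not part of Microsoft's SLA wording.

```python
# Convert an uptime SLA percentage into the downtime it still permits.
MINUTES_PER_DAY = 24 * 60

def allowed_downtime_minutes(sla_percent: float, days: int) -> float:
    """Return the minutes of downtime an SLA permits over `days` days."""
    return (1 - sla_percent / 100) * days * MINUTES_PER_DAY

monthly = allowed_downtime_minutes(99.95, 30)   # roughly 21.6 minutes
yearly = allowed_downtime_minutes(99.95, 365)   # roughly 262.8 minutes
print(f"{monthly:.1f} min/month, {yearly / 60:.1f} h/year")
```

Even when the SLA applies, it leaves room for about twenty minutes of lost connectivity per month. A single VM outside an Availability Set is not covered by that guarantee at all.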
What does this mean?
The Azure team does upgrades, gear fails, and stuff just happens! Unless you build redundancy into your own Cloud implementation, you will be susceptible to these failures just as you are on-premises. It is time to spin up a few more VMs, assign them the same DNS name as your first VM, and farm out your role or service. You DO need to consider your database server as well if you implemented an Azure IaaS VM and installed SQL Server yourself. Standing up additional IIS, SSRS or SharePoint servers in a farm configuration, and running them behind a simple Kemp load balancer or DNS Round Robin, is very straightforward. Deploying a redundant database server is a completely different story.
Platform-as-a-Service (PaaS) is the BEST way to get Highly Available services in the cloud at an efficient cost. This blog is for the folks who are stuck on IaaS for whatever reason, with application compatibility and development costs being the two most prominent.
Help me understand Microsoft’s “Availability Sets”
You’ll notice the SLA statement is qualified with “two or more instances deployed in the same Availability Set”. You may have asked yourself, “Well, how can I deploy to an availability set?” Or, “Do I not already have an Availability Set?” The back story is this: Microsoft periodically updates the Azure Fabric behind the scenes, unannounced. All Azure IaaS customers must design their infrastructure to tolerate these maintenance windows, even when they occur during business hours.
That 99.95% uptime guarantee does not come into play if you deploy a single server for any role. To get that level of uptime, you need two IIS servers, two application servers, two database servers, and so on. Each tier needs to be configured for an Availability Set, and each tier needs to be separately configured for load balancing or failover in case a VM becomes unreachable. This approach essentially doubles the cost of your cloud deployment and adds complexity to your solution in the form of load balancers, health checks, failover clusters, and database mirroring.
Be careful not to confuse the term with “Affinity Groups”, which simply ensure that resources are deployed together in the same datacenter. “Availability Sets”, in a sense, do the opposite: the VMs may run in the same datacenter, but they are placed in different fault/update domains within that datacenter. Your VM can use both of these features at once!
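A rough reliability model shows why every tier needs a second server. The sketch below assumes that VMs fail independently and that failover is instantaneous, which real deployments will not fully achieve. The 0.99 per-VM availability is a hypothetical number chosen for illustration, not a Microsoft figure.

```python
def parallel(availability: float, copies: int) -> float:
    """Availability of redundant copies: up if at least one copy is up."""
    return 1 - (1 - availability) ** copies

def serial(*tiers: float) -> float:
    """Availability of a chain of tiers that must all be up at once."""
    result = 1.0
    for tier in tiers:
        result *= tier
    return result

SINGLE_VM = 0.99  # hypothetical per-VM availability (assumption)

# Three tiers: web, application, database.
single_stack = serial(SINGLE_VM, SINGLE_VM, SINGLE_VM)
paired = parallel(SINGLE_VM, 2)
paired_stack = serial(paired, paired, paired)

print(f"one VM per tier:  {single_stack:.4%}")
print(f"two VMs per tier: {paired_stack:.4%}")
```

Under these assumptions the tiers multiply, so one weak tier drags down the whole stack. That is why leaving the database on a single VM undoes the redundancy you paid for in the web tier.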
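The idea behind fault and update domains can be sketched as a toy placement model. This is not Azure's actual placement algorithm. The domain counts, the round-robin rule, and the VM names are all assumptions made for illustration.

```python
FAULT_DOMAINS = 2    # assumed count; check your own deployment
UPDATE_DOMAINS = 5   # assumed count; check your own deployment

def place(vm_names):
    """Assign each VM a (fault_domain, update_domain) pair, round-robin."""
    return {
        name: (i % FAULT_DOMAINS, i % UPDATE_DOMAINS)
        for i, name in enumerate(vm_names)
    }

def survivors(placement, fault_domain=None, update_domain=None):
    """List the VMs still running when one domain is taken offline."""
    return [
        name for name, (fd, ud) in placement.items()
        if fd != fault_domain and ud != update_domain
    ]

web_tier = place(["web-1", "web-2"])   # hypothetical VM names
lonely_db = place(["db-1"])

print(survivors(web_tier, fault_domain=0))    # one web VM is still up
print(survivors(lonely_db, fault_domain=0))   # nothing left to serve requests
```

With two instances in the set, a hardware failure or a Fabric update in one domain still leaves a VM running. With a single instance, the same event takes the whole tier down.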
So, where do I go from here?
If you rolled your own database server or other application solution and were expecting a highly available solution, you have one more big step to complete! Implement one of Microsoft’s SQL Server High-Availability solutions like failover clustering (Standard edition) or Always On Availability Groups (Enterprise only). For fastest failover, upgrade to two Azure VMs running SQL Server Enterprise edition. Create an Availability Group with synchronous commits and automatic failover. In this case, your ~$600/mo single VM with SQL Server Standard suddenly spikes up to ~$3,000/mo and the cost effectiveness of the Cloud begins to fall apart.
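The cost jump is easy to quantify. The sketch below uses only the approximate monthly figures quoted above. Actual Azure and SQL Server pricing varies, so treat the inputs as placeholders.

```python
SINGLE_STANDARD_MONTHLY = 600    # approximate figure from the text, USD
PAIR_ENTERPRISE_MONTHLY = 3000   # approximate figure from the text, USD

multiplier = PAIR_ENTERPRISE_MONTHLY / SINGLE_STANDARD_MONTHLY
extra_per_year = (PAIR_ENTERPRISE_MONTHLY - SINGLE_STANDARD_MONTHLY) * 12

print(f"{multiplier:.0f}x the monthly cost, "
      f"about ${extra_per_year:,} more per year")
```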
Time to take the plunge
NOW is the time to fix this mistake, before the database server gets rebooted unexpectedly and your application craters during business hours! We can do a Cloud Assessment for you and help you decide whether to bring that workload back on-premises, convert it to PaaS, or bite the bullet and re-deploy the IaaS solution to maximize availability while keeping costs in check where possible.