Azure SQL Failover Strategy – I

Azure SQL has multiple failover strategies , the amount of content to understand the different strategies seems to be overwhelming unless you are a database administrator and you had been working on the HA / DR for quite a while.

I am going to attempt to make it simpler and consolidate some of the strategies to make it easy for you to understand . Since the strategies are different for each type of cloud computing (IaaS and PaaS), i am going to focus on the 1st part only on the IaaS and the next part on PaaS, that way it would be easy to look back and understand the magnitude of differences in Azure SQL.

Azure VM based Failover Strategy

There are 3 important failover strategies that we will refer for Azure SQL VM

Option 1 – Availability Groups

Also known as Always On Availability Groups, AOAG, AG, …. you get the picture (literally, see below).

  • Always On availability groups on Azure Virtual Machines are similar to Always On availability groups on premises.
  • Requires an Internal Load balancer within the two SQL VMs
  • The load balancer holds the IP addresses for the availability group listener. If you have more than one availability group each group requires a listener. One load balancer can support multiple listeners.
  • Best Practice is to use Availability set within the two SQL VMs & Domain Controllers to ensure redundancy and high availability

Availability Set:

Azure makes sure that the VMs you place within an Availability Set run across multiple physical servers, compute racks, storage units, and network switches & physical building within the same region. If a hardware or software failure happens, only a subset of your VMs are impacted and your overall solution stays operational.

Pricing

To use BYOL with a SQL Server VM, you must have a license for SQL Server Standard or Enterprise and Software Assurance

Cost savings The Azure Hybrid Benefit offers up to 55% savings. For more information, see Switch licensing model
Free passive secondary replica Another benefit of bringing your own license is the free licensing for one passive secondary replica per SQL Server for high availability purposes. This cuts in half the licensing cost of a highly available SQL Server deployment (for example, using Always On Availability Groups). The rights to run the passive secondary are provided through the Fail-Over Servers Software Assurance benefit.
Windows License Windows license allows Dual-usage , which means you don’t have to pay for Windows license if you are brining your own License and carry EA

Recovery Objective

The time taken to make the secondary replica as primary.

RTO

If the Availability mode is set as Synchronous, then the data committed is replicated , so any data during transit is required to be sent over (retry from Application)

RPO

More info : https://docs.microsoft.com/en-us/azure/virtual-machines/windows/sql/virtual-machines-windows-portal-sql-availability-group-overview

Option 2 – Availability Group across Multiple Regions

This is an extension of the Option 1 , where we have instead of the SQL VM being replicated only within a single region, we can have SQL VM replicated across multiple regions. This provides higher redundancy and increased resiliency . This architecture helps when one region is completely down for some reason (read fire, earthquake, human error) , to protect business staying afloat customers use this strategy

In this deployment, The SQL VMs are in Availability group are across in two regions. The availability group replicas can have synchronous commit with automatic failover on SQL-1 and SQL-2 and SQL-3 is in a different Azure region. SQL-3 is added to the Windows Server Failover Cluster. SQL-3 can host an availability group replica. In this architecture, the replica in the remote region is normally configured with asynchronous commit availability mode and manual failover mode.

When availability group replicas are on Azure virtual machines in different Azure regions, each region requires:

  • A virtual network gateway
  • A virtual network gateway connection
Location Server
Instance
Role Availability Mode Failover Mode
Region 1 SQL-1 Primary Synchronous Automatic
Region 1 SQL-2 Secondary Synchronous Automatic
Region 2 SQL-3 Secondary Asynchronous Manual

Pricing

To use BYOL with a SQL Server VM, you must have a license for SQL Server Standard or Enterprise and Software Assurance

Cost savings The Azure Hybrid Benefit offers up to 55% savings. For more information, see Switch licensing model
Free passive secondary replica Another benefit of bringing your own license is the free licensing for one passive secondary replica per SQL Server for high availability purposes. This cuts in half the licensing cost of a highly available SQL Server deployment (for example, using Always On Availability Groups). The rights to run the passive secondary are provided through the Fail-Over Servers Software Assurance benefit.
Paid Third Replica SQL Licensing covers only one Passive replica, and the third replica will be a paid replica based on the configuration
Windows License Windows license allows Dual-usage , which means you don’t have to pay for Windows license if you are brining your own License and carry EA

Recovery Objective

The time taken to make the secondary replica as primary.

RTO

For the Availability mode is set as Synchronous, then the data committed is replicated , so any data during transit is required to be sent over (retry from Application), However for the Asynchronous replication there could be a data loss of ~5 minutes due to this being a region outage

RPO

Option 3 – Windows Failover Cluster

Failover Cluster Instances (FCI), which require shared storage, can be created in 4 different ways.

1. A two-node failover cluster running in Azure VMs with attached storage using Windows Server 2016 Storage Spaces Direct (S2D) to provide a software-based virtual SAN.

2. A two-node failover cluster running in Azure VMs using Premium File Share. Premium file shares are SSD-backed consistently-low-latency file shares that are fully supported for use with Failover Cluster Instance.

3. A two-node failover cluster running in Azure VMs with storage supported by a third-party clustering solution. For a specific example that uses SIOS DataKeeper, see High availability for a file share using failover clustering and 3rd party software SIOS DataKeeper.

4. A two-node failover cluster running in Azure VMs with remote iSCSI Target shared block storage via ExpressRoute. For example, NetApp Private Storage (NPS) exposes an iSCSI target via ExpressRoute with Equinix to Azure VMs.

Pricing

On Azure virtual machines, you can license SQL Server using pay-as-you-go (PAYG) or bring-your-own-license (BYOL) VM images.

The type of image you choose affects how you’re charged.

  • With pay-as-you-go licensing, a failover cluster instance (FCI) of SQL Server on Azure virtual machines incurs charges for all nodes of the FCI, including the passive nodes
  • If you are using your Enterprise Agreement with Software Assurance, you can use one free passive FCI node for each active node. To take advantage of this benefit in Azure, use BYOL VM images, and use the same license on both the active and passive nodes of the FCI.

Recovery Objective

The time taken to fail over between the nodes.

RTO

Because Always On FCI uses shared storage, the same view of the storage instance is available on failover.

RPO of Zero Data Loss