All posts by tcai

Disaster Recovery with Windows Virtual Desktop

If you are deploying Windows Virtual Desktop, the redundant, multi-region control plane that Azure provides is a very attractive proposition! If an Azure region fails, your WVD connection is routed through another Azure region at the platform layer. Your end users shouldn’t notice a thing, right?

However, what happens to the IaaS virtual machines that serve as your WVD session hosts and pools? Those aren’t protected. I will go over some scenarios you can set up to protect your environment from regional failures when running a Windows Virtual Desktop workload.

1) Active/Passive Setup – using FSLogix VHDx

In an active-passive setup, you have a single pool set up in a single region. You store your FSLogix profiles on the file server at \\fs01\Profiles\%username%.

You use a backup tool that copies the profiles, along with their permissions, to another file share, preferably on a different file server running in your paired failover region. As for the replication interval, I would suggest once every 4 hours. Remember, you will incur cross-region bandwidth charges between the two peered vNets you are replicating across.
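The post does not name a specific backup tool. As one possible sketch, assuming robocopy does the copying and that the share roots are \\fs01\Profiles and \\fs02\Profiles, the copy job could be assembled like this. The function name and the retry values are illustrative only.

```python
def build_profile_copy_command(source, destination, retries=2, wait_seconds=5):
    """Return a robocopy argument list that mirrors a profile share.

    /MIR     mirror the directory tree (also deletes files missing from source)
    /COPYALL copy data, attributes, timestamps, NTFS ACLs, owner and auditing info
    /R, /W   number of retries on a failed copy and seconds to wait between them
    """
    return [
        "robocopy", source, destination,
        "/MIR", "/COPYALL",
        f"/R:{retries}", f"/W:{wait_seconds}",
    ]


# Assumed share roots: fs01 in the primary region, fs02 in the failover region.
command = build_profile_copy_command(r"\\fs01\Profiles", r"\\fs02\Profiles")
print(" ".join(command))
```

A scheduler of your choice would then run the resulting command once every 4 hours.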

For example: if my primary Azure region, where Pool-A is located, is US East, Pool-B should ideally be located in Central US. In Central US would be my active fs02 server, running on a D2s v3 machine with Premium SSDs.

When you spin up Pool-B, go to your image for Pool-B and set up your FSLogix registry settings to point the FSLogix VHD location to \\fs02\Profiles\%username%. Your FSLogix redirections XML document should also be served from your \\fs02 folder path, using the same XML document as Pool-A. Your Pool-B image should be identical to your Pool-A image except for that change.
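One way to keep the two images identical apart from the profile location is to generate both sets of settings from the same template. The sketch below renders .reg file text for the HKLM\SOFTWARE\FSLogix\Profiles key with the Enabled and VHDLocations values. The function names are hypothetical, and writing VHDLocations as a plain string value is an assumption; check the FSLogix documentation for the value type your version expects.

```python
def reg_escape(value):
    """Escape backslashes and quotes for a string value in a .reg file."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def render_fslogix_reg(file_server):
    """Render .reg text pointing FSLogix profile containers at one file server."""
    vhd_location = rf"\\{file_server}\Profiles\%username%"
    return "\n".join([
        "Windows Registry Editor Version 5.00",
        "",
        r"[HKEY_LOCAL_MACHINE\SOFTWARE\FSLogix\Profiles]",
        '"Enabled"=dword:00000001',
        f'"VHDLocations"="{reg_escape(vhd_location)}"',
        "",
    ])


pool_a = render_fslogix_reg("fs01")
pool_b = render_fslogix_reg("fs02")

# Only the VHDLocations line should differ between the two images.
differing = [
    (a, b) for a, b in zip(pool_a.splitlines(), pool_b.splitlines()) if a != b
]
print(pool_b)
```

Comparing the two rendered files line by line is a quick check that nothing other than the file server name has drifted between the pools.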

When US East goes down, you immediately turn on Pool-B and re-assign all active WVD users to it. The last good copy of each FSLogix profile will already be synced to fs02, so your WVD environment is ready to go.

You can use Azure Site Recovery for all other VMs and LOB apps.

2) Active/Active Setup – using FSLogix CloudCache

In an active-active setup, we are going to use FSLogix CloudCache.

FSLogix CloudCache allows you to point a user profile to up to 4 providers. A provider is a location you specify that will contain a synced copy of every user’s VHD.

For example, provider 1 (primary) can be an SMB share on a file server in US East, provider 2 can be an Azure Files SMB path in US East, provider 3 can be a file server in US Central, and provider 4 can be an Azure Files share in US Central.

You specify the provider paths in the same FSLogix registry settings, but you cannot have both a VHD location and a CloudCache location configured at the same time; it’s one or the other.

In US East on Pool-A, you can set up the providers pointing to the 4 provider locations in the order you wish to use them, starting with the primary. We can prioritize as such (i.e. Provider 1 – Azure Files (US East); Provider 2 – SMB share on file server (US East); Provider 3 – Azure Files (US Central); Provider 4 – SMB share on file server (US Central)).
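As a rough sketch of how that ordering turns into a single setting, the snippet below joins the provider paths into a CCDLocations-style value, with each entry written as type=smb,connectionString=<path> and entries separated by semicolons. The storage account names and share paths are hypothetical, and the helper is only an illustration; verify the exact value format against the FSLogix documentation before using it.

```python
MAX_PROVIDERS = 4  # limit stated in this post


def build_ccd_locations(provider_paths, vhd_locations=None):
    """Join SMB provider paths, in priority order, into one CCDLocations value."""
    if vhd_locations:
        # A VHD location and a CloudCache location cannot both be configured.
        raise ValueError("Use either VHDLocations or CCDLocations, not both")
    if not 1 <= len(provider_paths) <= MAX_PROVIDERS:
        raise ValueError(f"Expected 1 to {MAX_PROVIDERS} providers")
    return ";".join(f"type=smb,connectionString={path}" for path in provider_paths)


# Hypothetical share paths, listed in the priority order from the example above.
providers = [
    r"\\stuseast.file.core.windows.net\profiles",     # Provider 1: Azure Files, US East
    r"\\fs01\Profiles",                               # Provider 2: file server, US East
    r"\\stuscentral.file.core.windows.net\profiles",  # Provider 3: Azure Files, US Central
    r"\\fs02\Profiles",                               # Provider 4: file server, US Central
]
ccd_locations = build_ccd_locations(providers)
print(ccd_locations)
```

For Pool-B in US Central, the same helper could be called with the list reordered so that the US Central providers come first.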

The great thing is that CloudCache manages all the failover and failback if a provider does go down. It will sync up automatically. On the WVD session host itself, there is a cache folder at C:\ProgramData\FSLogix\Cache that stores all the users’ cache files, so make sure you have enough space on your C drive for that. You can specify a different path for the cache, but make sure it’s fast. The temporary drive in a VM is one place to store the cache temporarily; otherwise, store it on a Premium disk.
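A quick way to sanity-check the space requirement is to compare the free space on the cache volume against the number of concurrent users times an expected cache size per user. The sketch below is only an illustration: the function name and the sizing figures are hypothetical, and the real cache size depends on your profiles.

```python
import shutil


def has_room_for_cache(path, expected_users, cache_gb_per_user, free_bytes=None):
    """Return True if the volume has room for every user's local cache file.

    free_bytes can be passed in for testing; otherwise it is read from the
    volume that holds `path`.
    """
    if free_bytes is None:
        free_bytes = shutil.disk_usage(path).free
    required_bytes = expected_users * cache_gb_per_user * 1024 ** 3
    return free_bytes >= required_bytes


# Hypothetical sizing: 10 concurrent users, 5 GB of cache each, 100 GB free.
print(has_room_for_cache(".", 10, 5, free_bytes=100 * 1024 ** 3))
```

On a session host you would pass the cache folder path and leave free_bytes unset so the actual free space on that volume is used.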

Some drawbacks are that during first-time logins it takes additional time to cache to your local disk, and at log off it takes time to sync back to all 4 providers. The log off process won’t complete until all 4 providers have been synced up. A potential issue here is that you are likely to run into IOPS limits on your disks. In a DR scenario, that should be acceptable for the average client.
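To get a feel for the log off delay, you can bound it from both sides: if the providers are written in parallel, the slowest provider sets the time, and if they are written one after another, the times add up. Which of the two applies is not stated in this post, so the sketch below simply reports both bounds. All figures are hypothetical.

```python
def logoff_sync_bounds(changed_mb, throughput_mb_per_s):
    """Return (best_case, worst_case) seconds to flush changes to all providers.

    best_case assumes providers are written in parallel (slowest one dominates);
    worst_case assumes they are written one after another.
    """
    per_provider = [changed_mb / rate for rate in throughput_mb_per_s]
    return max(per_provider), sum(per_provider)


# Hypothetical figures: 500 MB of changes; throughput in MB/s for each provider.
best, worst = logoff_sync_bounds(500, [100, 50, 25, 25])
print(f"between {best:.0f}s and {worst:.0f}s")
```

The cross-region providers will usually be the slow ones, so they dominate either bound.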

DATTO BDR VS MICROSOFT AZURE SITE RECOVERY BACKUP

I have been selling Datto BDR to my clients for the past 4 years. Datto is a BDR solution sold by many MSPs and VARs in the IT management space. It features a private-label SuperMicro chassis with a Linux-based OS running Datto’s software. It touts hybrid virtualization, which means you have the ability to spin up your servers virtually in the Datto cloud in a DR scenario or spin up the server locally on the Datto appliance itself (only available in select models). For the most part it works; however, it gets expensive as you start backing up larger amounts of data. Anything over 1TB becomes out of reach for most SMB customers.

Datto Pros:

  • One appliance takes care of it all
  • IT company manages health of the appliance and monitors backup
  • Centralized management portal for all Datto appliances in MSP fleet
  • Datto managed DR scenario – support from Datto Team
  • Backs up legacy OS, Windows 2003 included
  • Works with VMware, Hyper-V and physical servers

Datto Cons:

  • Requires 50% of free space to provide adequate protection
  • Non-scalable growth: if you surpass the unit’s GB allowance, you must upgrade to a larger unit
  • Expensive for a BDR solution for clients with more than 1TB of data

Microsoft’s Azure Site Recovery, on the other hand, was designed to back up on-premises physical and virtual machines. Both VMware and Hyper-V environments are supported. Certain Linux distributions such as RedHat, CentOS, Ubuntu and SuSE are also supported. There is a per-instance fee of $25 per month for each machine backed up, but the overall costs are much more reasonable. The technology is also much better. Datto uses StorageCraft’s backup technology, which in my opinion is antiquated. Azure Site Recovery uses a dedicated Site Recovery machine and backs up via Site Recovery agents. What you have to do first is design an Azure network for your emergency workload. This means creating a vnet, a vnet gateway and a VPN back to your premises, setting up Azure Site Recovery, backing up your environment and testing your disaster scenario. If this is done right, you eliminate the need for a high monthly Datto charge and you’ll end up with a highly flexible Azure solution.
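As a back-of-the-envelope sketch, the monthly cost is the per-instance fee times the number of protected machines, plus Azure storage. The $25 fee is the figure quoted in this post; the storage rate below is a placeholder, not an actual Azure price, and the function name is hypothetical. Compute charges for machines that are running in Azure during a failover are not included.

```python
ASR_TO_AZURE_PER_INSTANCE = 25.0  # USD per month, figure quoted in this post


def monthly_asr_cost(instances, storage_gb, storage_rate_per_gb,
                     per_instance_fee=ASR_TO_AZURE_PER_INSTANCE):
    """Estimate monthly cost: per-instance fee plus Azure storage."""
    return instances * per_instance_fee + storage_gb * storage_rate_per_gb


# Hypothetical: 5 servers, 2 TB (2048 GB) replicated,
# storage at 0.05 USD per GB per month (placeholder rate, look up the real one).
estimate = monthly_asr_cost(5, 2048, 0.05)
print(f"${estimate:.2f} per month")
```

Plugging in your own server count and the current storage rate gives a number you can set against a Datto quote.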

Azure Site Recovery Pros:

  • Very scalable: back up unlimited workloads and LOB applications, never run out of resources
  • Use Azure’s infrastructure for running your workload in a DR scenario
  • Priced very reasonably: $16/month for customer-owned sites or $25/month to back up to Azure, plus the cost of Azure storage
  • Works with VMware, Hyper-V and physical servers
  • Allows you to orchestrate DR ahead of a disaster, which forces you to think about DR before it happens
  • Can be used to migrate existing workloads to Azure

Azure Site Recovery Cons:

  • Requires more IT knowledge than Datto for set up and management
  • Requires Azure knowledge – vnet, storage, ASR and VPNs
  • Does not work with legacy OSes, i.e. Windows 2003 and older versions of Linux
  • Requires an instance for the ASR configuration server, essentially a backup management server