When I originally planned this blog series, my primary goal was to introduce the broader Citrix community to the Microsoft Cloud Adoption Framework (CAF) for Azure to help Citrix teams “speak Microsoft Azure” when planning or deploying a Citrix Landing Zone in collaboration with their Cloud Platform teams.
In Part 1 we reviewed the design principles of an enterprise-scale landing zone. These design principles highlight a methodology to create a modular and scalable architecture. In Part 2, we summarized the CAF critical design areas and covered the impact of Azure AD tenants on Citrix Cloud tenancy, feature compatibility with the varying types of Active Directory options, and ways to track subscriptions as a unit of scale for your Citrix deployment. I recommend reading both of these blogs before you dive into this part.
Part 3 will continue the Citrix on Azure – Enterprise-Scale Landing Zone blog series by exploring the following critical design areas:
- Business continuity and disaster recovery
- Security, governance, and compliance
- Platform automation and DevOps
I will highlight the relevant Azure capabilities and the considerations when applying them to Citrix design and operations. As before, each of these areas could fill a blog of its own, so I will focus primarily on key insights and lessons learned.
Business Continuity and Disaster Recovery
CAF recommendation: “The built-in features provide an easy solution to the complex task of building replication and failover into a workload architecture, simplifying both design and deployment automation.”
For any customer or partner interested in a primer on Citrix considerations for business continuity planning, I highly recommend reviewing the Citrix Tech Zone article on the topic written by my colleague Michael Shuster for a deeper dive. This whitepaper also covers key considerations for public cloud disaster recovery and planning. Another great resource is the Microsoft Azure Well-Architected Framework on reliability. For this design area I want to focus on the specific Azure components (built-in features) and key considerations for a Citrix deployment to provide a reliable architecture.
Azure Feature: Regions
What is it? Set of datacenters deployed within a latency-defined perimeter and connected through a dedicated regional low-latency network.
What are the Citrix considerations? Selection of an Azure Region should be based on proximity to existing datacenters, users, or required backend data. You should also be mindful of what services are available in the regions of choice.
From a Citrix perspective, organizations often start with a single region. However, two or more regions should be considered long term for geographic redundancy in a BC/DR strategy.
Azure Feature: ExpressRoute
What is it? A service that extends your on-premises networks into the Microsoft cloud over a private connection, with the help of a connectivity provider. ExpressRoute connections do not go over the public internet.
What are the Citrix considerations? ExpressRoute is an essential infrastructure component that bridges Azure and an organization’s datacenter or co-location facility. It is often a prerequisite for an enterprise-scale Citrix deployment.
ExpressRoute circuits are a shared service, so bandwidth capacity planning should be conducted to determine the overall bandwidth needs of the enterprise. If not enough bandwidth is available, user experience or access to key services in the datacenter can suffer. ICA performance is also affected if sessions traverse the ExpressRoute circuit to the datacenter.
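As a rough illustration of that capacity planning exercise, the sketch below estimates the bandwidth ICA sessions would add to a shared circuit. The per-session bandwidth, concurrency ratio, headroom, and circuit figures are hypothetical placeholders, not Citrix or Microsoft guidance; replace them with values measured in your own environment.

```python
def required_mbps(users: int, concurrency: float, kbps_per_session: float,
                  headroom: float = 0.25) -> float:
    """Estimate peak ICA bandwidth in Mbps. All inputs are planning assumptions."""
    if users < 0 or not 0 <= concurrency <= 1 or kbps_per_session < 0 or headroom < 0:
        raise ValueError("inputs out of range")
    concurrent_sessions = users * concurrency
    raw_mbps = concurrent_sessions * kbps_per_session / 1000
    # Add headroom so short bursts do not saturate the circuit.
    return raw_mbps * (1 + headroom)


def fits_circuit(required: float, circuit_mbps: float, other_traffic_mbps: float) -> bool:
    """True if the ICA estimate fits next to the other workloads sharing the circuit."""
    return required + other_traffic_mbps <= circuit_mbps


# Hypothetical example: 2,000 users, 60% concurrent, 300 kbps per session.
estimate = required_mbps(users=2000, concurrency=0.6, kbps_per_session=300)
print(f"Estimated ICA bandwidth: {estimate:.0f} Mbps")
print("Fits a 1 Gbps circuit carrying 400 Mbps of other traffic:",
      fits_circuit(estimate, circuit_mbps=1000, other_traffic_mbps=400))
```

Because the circuit is shared, the useful comparison is against the remaining capacity after other enterprise traffic, not against the full circuit size.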
Azure Feature: Availability Zones
What is it? Unique physical locations within a region. Each zone is made up of one or more datacenters equipped with independent power, cooling, and networking.
What are the Citrix considerations? Availability Zones provide a 99.99 percent SLA and should be used for all applicable Citrix infrastructure (Connectors, StoreFront, FAS, etc.) to provide datacenter redundancy within a region. Not all regions have Availability Zones, so it is recommended to plan accordingly.
Azure Feature: Availability Sets
What is it? VMs in an availability set are spread across several fault domains. A fault domain is a group of VMs that share a common power source and network switch.
What are the Citrix considerations? Availability Sets provide a 99.95 percent SLA and should only be used for Citrix infrastructure if Availability Zones are unavailable in the region. Availability Sets only provide hardware redundancy (like hypervisor anti-affinity rules); therefore you should also use a multi-region strategy to provide datacenter and geographic redundancy.
Azure Feature: Managed Disks
What is it? Managed Disks are managed by Azure and automatically placed in different storage scale units to limit the effects of hardware failure.
What are the Citrix considerations? Managed Disks should be used for all Citrix infrastructure and VDAs. Unmanaged Disks are not recommended. Single-instance virtual machines (for example, a user’s VDI or a published application server) have the following SLAs, depending on disk type:
- Premium or Ultra SSD – 99.9 percent
- Standard SSD – 99.5 percent
- Standard HDD – 95 percent
Each disk type has different performance and cost characteristics. Premium is recommended for Citrix infrastructure machines. The disk type for VDAs should be chosen based on cost constraints, performance, and availability needs.
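To make the SLA percentages quoted in this section easier to compare, the sketch below converts each one into the downtime it would permit per month. It assumes a 30-day month and simple arithmetic; it is not how Microsoft calculates SLA credits.

```python
MINUTES_PER_MONTH = 30 * 24 * 60  # assumes a 30-day month for simplicity

# SLA percentages as quoted in this article.
SLAS = {
    "Availability Zones": 99.99,
    "Availability Sets": 99.95,
    "Single VM, Premium or Ultra SSD": 99.9,
    "Single VM, Standard SSD": 99.5,
    "Single VM, Standard HDD": 95.0,
}


def max_downtime_minutes(sla_percent: float,
                         period_minutes: int = MINUTES_PER_MONTH) -> float:
    """Downtime (in minutes) that still satisfies the given SLA over the period."""
    if not 0 <= sla_percent <= 100:
        raise ValueError("sla_percent must be between 0 and 100")
    return period_minutes * (100 - sla_percent) / 100


for name, sla in SLAS.items():
    print(f"{name:<34} {sla:>6}%  {max_downtime_minutes(sla):8.1f} min/month")
```

Seen this way, the gap between disk types is large: under these assumptions 99.9 percent allows roughly 43 minutes a month, while 95 percent allows 36 hours.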
Azure Feature: Backup
What is it? The Azure Backup service provides simple, secure, and cost-effective solutions to back up your data and recover it from the Microsoft Azure cloud. A backup policy contains schedule and retention settings. These settings should align with RTO/RPO, operational, or regulatory compliance needs.
What are the Citrix considerations? Azure Backup is recommended for Citrix infrastructure, master image VMs, and persistent desktops. This service provides an automated means to protect these components.
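One way to sanity-check a backup policy against RTO/RPO targets is sketched below. The `BackupPolicy` and `RecoveryRequirement` classes are hypothetical, simplified models invented for this example (they are not Azure SDK types), and the policy name and numbers are placeholders.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class BackupPolicy:
    """Hypothetical, simplified view of a backup policy: schedule plus retention."""
    name: str
    backup_interval_hours: int
    retention_days: int


@dataclass(frozen=True)
class RecoveryRequirement:
    """Hypothetical business requirement for a group of Citrix VMs."""
    rpo_hours: int
    min_retention_days: int


def policy_gaps(policy: BackupPolicy, requirement: RecoveryRequirement) -> List[str]:
    """Return a description of every way the policy falls short of the requirement."""
    gaps = []
    # Worst case, a failure happens just before the next backup, so the
    # potential data loss window equals the interval between backups.
    if policy.backup_interval_hours > requirement.rpo_hours:
        gaps.append(f"interval of {policy.backup_interval_hours}h exceeds "
                    f"RPO of {requirement.rpo_hours}h")
    if policy.retention_days < requirement.min_retention_days:
        gaps.append(f"retention of {policy.retention_days} days is below the "
                    f"required {requirement.min_retention_days} days")
    return gaps


daily = BackupPolicy("citrix-infra-daily", backup_interval_hours=24, retention_days=30)
master_image_needs = RecoveryRequirement(rpo_hours=24, min_retention_days=90)

for gap in policy_gaps(daily, master_image_needs) or ["no gaps found"]:
    print(f"{daily.name}: {gap}")
```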
Azure Feature: Resource Locks
What is it? Resource Locks prevent other users in your organization from accidentally deleting or modifying critical resources. Two lock levels are available:
- CanNotDelete – Authorized users can still read and modify a resource, but they can’t delete the resource
- ReadOnly – Authorized users can read a resource, but they can’t delete or update the resource
What are the Citrix considerations? CanNotDelete Resource Locks should be applied to all Citrix infrastructure and master image VMs to protect these critical components. ReadOnly locks can be considered but may complicate troubleshooting, for example if a component is unresponsive and requires a reboot from the Azure portal. Resource Locks apply a restriction across all users and roles. If specific admin groups should have read-only access, Azure role-based access control is the preferred approach.
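The difference between the two lock levels can be summarized in a few lines of code. This is a minimal model of the behavior described above, not an Azure API: the operation names are made up for the example, the caller is assumed to be already authorized through role-based access control, and a restart is treated as a change to the resource, matching the reboot example in the text.

```python
from enum import Enum
from typing import Optional

OPERATIONS = ("read", "update", "restart", "delete")  # illustrative names only


class Lock(Enum):
    CAN_NOT_DELETE = "CanNotDelete"
    READ_ONLY = "ReadOnly"


def is_allowed(operation: str, lock: Optional[Lock]) -> bool:
    """Return True if an authorized user can perform the operation under the lock."""
    if operation not in OPERATIONS:
        raise ValueError(f"unknown operation: {operation}")
    if lock is None:
        return True
    if lock is Lock.CAN_NOT_DELETE:
        # Reading and modifying still work; only deletion is blocked.
        return operation != "delete"
    # ReadOnly: anything that changes the resource is blocked.
    return operation == "read"


for lock in (None, Lock.CAN_NOT_DELETE, Lock.READ_ONLY):
    label = lock.value if lock else "no lock"
    allowed = [op for op in OPERATIONS if is_allowed(op, lock)]
    print(f"{label:<13} allows: {', '.join(allowed)}")
```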
Security, Governance, and Compliance
CAF recommendation: “Use built-in policies where possible to minimize operational overhead.”
In Part 1 of the series we highlighted the importance of policy-driven governance as a key design principle for Azure cloud adoption. Azure Policy is a critical tool for every Citrix on Azure deployment. It can help you adhere to security standards set by your cloud platform team and ensure continuous compliance with regulations through automatic enforcement and reporting. Review your policy baseline with your platform team because, per Microsoft guidance, definitions may be applied at the top-level root management group so that they can be assigned at inherited scopes. Additional policies or exceptions can be applied to the underlying Citrix management groups or subscriptions to help automate key aspects of the Citrix environment. While I covered this at a high level in Part 1, in this blog I will provide a few more detailed examples.
Policy Definition: Allowed Virtual Machine SKUs
Suggested Effect: Deny
Citrix Behavior: This can be used if reservations are purchased to ensure admins use the applicable SKUs.
Machine Catalog creation succeeds if the allowed SKUs are used and fails if the policy is not met.
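Before creating catalogs, it can be useful to check planned VM sizes against the allowed list. The sketch below only mimics the outcome of a Deny effect locally; it does not call Azure Policy, the catalog names are hypothetical, and the SKU names are example values rather than a recommendation. SKU names are compared exactly, which is a simplification.

```python
from typing import Dict, Iterable, List


def evaluate_sku(requested_sku: str, allowed_skus: Iterable[str]) -> str:
    """Return 'allow' or 'deny' for a requested VM size (exact string match)."""
    return "allow" if requested_sku in set(allowed_skus) else "deny"


def catalogs_that_would_fail(planned: Dict[str, str],
                             allowed_skus: Iterable[str]) -> List[str]:
    """List the planned Machine Catalogs whose VM size is not on the allowed list."""
    allowed = set(allowed_skus)
    return sorted(name for name, sku in planned.items() if sku not in allowed)


# Example values only: sizes assumed to be covered by purchased reservations.
ALLOWED = ["Standard_D4s_v3", "Standard_D8s_v3"]

# Hypothetical catalog names mapped to the VM size each would request.
PLANNED = {
    "win10-pooled": "Standard_D4s_v3",
    "cad-workstations": "Standard_NV12s_v3",
}

print("Catalogs that would be denied:", catalogs_that_would_fail(PLANNED, ALLOWED))
```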
Policy Definition: Add a tag to resources
Suggested Effect: Modify
Citrix Behavior: Tags are often used to identify departmental ownership, production vs. dev, or the cost center of an Azure object.
This Azure policy can be applied to automatically tag the objects contained in the Resource Group used by MCS.
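The effect on an MCS-created resource can be pictured as follows. This is a local simulation, not the policy engine: the tag name and values are hypothetical, tag names are compared exactly for simplicity, and the sketch assumes an existing value for the tag is left unchanged.

```python
from typing import Dict


def add_tag_if_missing(tags: Dict[str, str], tag_name: str,
                       tag_value: str) -> Dict[str, str]:
    """Return a copy of the tags with tag_name added only when it is absent."""
    updated = dict(tags)  # do not mutate the caller's dictionary
    updated.setdefault(tag_name, tag_value)
    return updated


# Hypothetical resources in the Resource Group used by MCS.
untagged_disk = {}
tagged_vm = {"CostCenter": "1234", "Environment": "Production"}

print(add_tag_if_missing(untagged_disk, "CostCenter", "5678"))
print(add_tag_if_missing(tagged_vm, "CostCenter", "5678"))
```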
Policy Definition: Audit VMs that do not use managed disks
Suggested Effect: Deny
Citrix Behavior: MCS can use unmanaged disks, snapshots, or managed disks to create machines. The resultant machines can use managed or unmanaged disks.
As discussed in the BC/DR critical design area, managed disks are recommended for all Citrix virtual machines. This policy should be set to Deny, blocking VM provisioning with unmanaged storage from Citrix MCS.
Policy Definition: Azure Backup should be enabled for virtual machines
Suggested Effect: Audit
Citrix Behavior: Apply this policy to the Citrix infrastructure, master image VMs, and persistent desktops Resource Groups to audit resources and report on resources without backup enabled.
Platform Automation and DevOps
CAF recommendation: “Don’t force application teams to use a central process or provisioning pipeline for the instantiation or management of app resources. Existing teams that already rely on a DevOps pipeline for app delivery should still be able to use the same tools they have been using. Remember that you can still use Azure Policy to maintain guard rails, independent of how resources are deployed in Azure.”
A common practice when moving workloads to public cloud is to create an OS build pipeline for Windows workloads. An OS build pipeline is a DevOps methodology in which organizations leverage IT automation tools to establish version control for a base OS build. These automation tools can also be used to redeploy machines after updates or rebuild “cattle” machines after failure. “Cattle” machines are arrays of more than two servers built using automated tools and designed for failure, where no server in the array is irreplaceable. They are the opposite of “pet” machines: individual machines or pairs of machines that are treated as indispensable, unique systems that can never be down.
In a Citrix world there are “cattle” and “pets”. Let’s break them down.
Citrix “Pets” – Components that can be built with automation but should be persisted after initial deployment.
- Cloud Connectors – Each Cloud Connector can be initially deployed using automation. During installation, Cloud Connector certificates are verified and a trust relationship is established with Citrix Cloud for the associated Resource Location. This relationship is unique per Connector. Rebuilding would require reestablishing this trust relationship, making the Connector unavailable for brokering and connecting user workloads in the meantime. Effectively, it’s a full redeployment. Redeployment during production use would need to verify core services as part of error handling, adding complexity and risk. The Cloud Connector software is maintained automatically by Citrix Cloud; customers are only responsible for updates of the underlying Windows OS and applicable security/monitoring agents.
- StoreFront – Similar to the Cloud Connector, StoreFront can be built using automation. However, items such as a user’s subscription data (i.e. favorites) are dynamic. This data would need to be exported from the current state and imported as part of a rebuild process, creating a risk of user data loss.
- Persistent VDI – While a persistent VDI can be built from an OS build pipeline and distributed via Citrix Machine Creation Services (MCS), if a user’s desktop is unavailable, they will be unable to work. Additionally, replacing this machine can result in lost user configurations or data.
- Citrix Federated Authentication Services (FAS) – Citrix FAS can be deployed in Azure to establish SSO to Windows when using a SAML-based identity provider, such as Azure AD, Okta, or Ping. PowerShell can be used for deployment; however, FAS servers must be authorized with a Certificate Authority during initial configuration. Rebuilding would require reauthorization with the CA.
- Citrix Application Delivery Management (ADM) service Agent – Linux based appliance deployed from the Azure marketplace and maintained via Citrix Cloud.
Citrix “Cattle” – Components that can be maintained with automated tools and rebuilt as needed.
- Pooled Single-Session and Multi-Session Workloads – Similar to Persistent VDI these machines can also be built using the image established from the OS build pipeline and distributed via MCS. This enables the OS team to manage centralized images using established automation tools, such as Chef or Puppet, while allowing the Citrix team to scale out the necessary workloads using the built-in Citrix tools. Automation is targeted toward a single machine that will act as the “master image”, simplifying management and troubleshooting. This master can then also be used for updates of the existing VDAs. An example workflow is illustrated below:
- Citrix Application Delivery Controller (ADC) – A Citrix ADC can be automatically deployed to an Azure subscription with established prerequisites. When deploying highly available ADCs they are aggregated behind an Azure Load Balancer and can be managed using Citrix ADM. StyleBooks and Configuration jobs can deploy new nodes, replace existing, or update configurations. AutoScale can also manage ADC capacity including the deployment and resizing of the ADC nodes.
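The pet and cattle split above can be captured as data that an operations runbook or pipeline consults when a machine fails. The sketch below is one hypothetical way to encode it; the component keys and the remediation wording are my own shorthand for the lists above, not Citrix tooling.

```python
from enum import Enum


class Lifecycle(Enum):
    PET = "pet"        # persisted after initial deployment
    CATTLE = "cattle"  # rebuilt with automation as needed


# Classification taken from the two lists above.
COMPONENTS = {
    "Cloud Connector": Lifecycle.PET,
    "StoreFront": Lifecycle.PET,
    "Persistent VDI": Lifecycle.PET,
    "FAS": Lifecycle.PET,
    "ADM service Agent": Lifecycle.PET,
    "Pooled VDA": Lifecycle.CATTLE,
    "ADC": Lifecycle.CATTLE,
}


def remediation(component: str) -> str:
    """Suggest a default response when an instance of the component fails."""
    lifecycle = COMPONENTS[component]  # raises KeyError for unknown components
    if lifecycle is Lifecycle.CATTLE:
        return "rebuild with automation"
    return "repair in place or restore from backup"


for name in COMPONENTS:
    print(f"{name:<18} {COMPONENTS[name].value:<7} {remediation(name)}")
```

Keeping the classification in one place makes it harder for an automated rebuild job to be pointed at a component whose state or trust relationship would be lost.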
Up Next – A Deeper Dive on Azure Networking and Monitoring
If you have made it through the first three parts of the Citrix TIPs – Citrix on Azure: Enterprise-Scale Landing Zones series, I offer you a sincere thank you. I hope this series helps set you and your team up for greater Citrix and Microsoft Azure success.
For the final part of the series we will explore the Cloud Adoption Framework critical design areas of Azure network topology and connectivity and build off the above business continuity, compliance, and automation concepts by highlighting Azure management and monitoring.