The Citrix Alliances team, Citrix Consulting, and a few of our trusted community advocates conducted a series of scalability tests comparing the latest generation of cloud instances provided by Microsoft Azure, Amazon Web Services (AWS), and Google Cloud Platform (GCP). It’s been two years since we’ve updated our public cloud scalability guidance, and as you can imagine, a lot has changed! New VM instance sizes have been released and we have had multiple VDA releases. Our new data updates the original cloud scalability leading practices presented a few years back.
Cloud scalability is a combination of three primary considerations: performance, manageability, and cost. These variables influence the key metric for efficiency in the public cloud, $/user/hour. As we revisit our Citrix leading practices for scalability in a cloud world for 2018, let’s break down the considerations in greater detail.
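As a minimal sketch of how the $/user/hour metric can be computed, the snippet below divides an instance's hourly price by the number of users it supports. The function name and the sample numbers are assumptions for illustration only; they are not test results or published prices.

```python
def cost_per_user_hour(instance_price_per_hour: float, users_per_instance: int) -> float:
    """Return compute cost in dollars per user per hour for a single instance."""
    if users_per_instance <= 0:
        raise ValueError("users_per_instance must be positive")
    return instance_price_per_hour / users_per_instance


# Hypothetical inputs: an instance billed at $0.80/hour that supports 40 users.
example = cost_per_user_hour(instance_price_per_hour=0.80, users_per_instance=40)
print(f"${example:.3f}/user/hour")
```

A lower value means better cost efficiency, so either a cheaper instance or a higher user density improves the metric.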
In the cloud, you are no longer limited by physical hosts but by families of VM instance types, each with its own costs and resource allocations. Microsoft, Amazon, and Google each provide a variety of VM types, mapped to the primary resource they are optimized for and the workload they are intended to support. Choosing the right instance type is critical to the overall performance of your workload.
The VM types available at the time of this writing are summarized below.
Figure 1: Instance Types in Azure, AWS, and GCP
The general purpose and compute optimized families were the primary focus of the 2018 scalability testing. These instance types fit most task and knowledge worker use cases, are the most commonly deployed in production Citrix deployments and are loosely mapped to XenApp and XenDesktop workloads, respectively. The LoginVSI task and knowledge worker profiles were tested using general purpose and compute optimized instances on Azure. The LoginVSI knowledge worker profile was tested using general purpose instances on AWS and GCP.
When determining the size of cloud instances, you can either “scale out” or “scale up”. Both methods have a trade-off regarding manageability. A scale out approach with fewer users per server provides smaller failure domains (i.e. fewer users impacted by resource bottlenecks or server failure) and better flexibility for power cycling to save on-demand costs (I’ll provide more on this later). However, you must manage a larger number of VMs and Windows licenses. A scale up approach involves the inverse: larger failure domains and less flexibility for power cycling, but better manageability through fewer VMs and lower Windows OS licensing costs.
Azure, AWS, and GCP offer VM sizes up to 72+ vCPU per VM. Since 32+ vCPU VMs would create a fault domain that wouldn’t be tolerable for most enterprises, we limited the sizes under investigation to between 2-16 vCPU per instance. This allowed us to compare both scale out (2-4 vCPU) and scale up (8-16 vCPU) with sizes in alignment with field deployments.
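To make the scale out versus scale up trade-off concrete, here is a rough sketch that compares the number of VMs to manage against the size of the failure domain. The user counts and densities are hypothetical and are not taken from our test results.

```python
import math


def fleet_shape(total_users: int, users_per_instance: int) -> dict:
    """Summarize VM count and failure-domain size for a given user density."""
    if total_users <= 0 or users_per_instance <= 0:
        raise ValueError("total_users and users_per_instance must be positive")
    users_per_failure = min(users_per_instance, total_users)
    return {
        # VMs (and Windows licenses) that must be managed
        "instances": math.ceil(total_users / users_per_instance),
        # Users affected if a single VM fails or hits a bottleneck
        "users_per_failure": users_per_failure,
        "share_of_users_per_failure": users_per_failure / total_users,
    }


# Hypothetical densities: 10 users on a small VM, 80 users on a large VM.
scale_out = fleet_shape(total_users=1000, users_per_instance=10)
scale_up = fleet_shape(total_users=1000, users_per_instance=80)
print("scale out:", scale_out)
print("scale up: ", scale_up)
```

With these assumed numbers, scaling out means 100 VMs with 1% of users affected per failure, while scaling up means 13 VMs with 8% of users affected per failure.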
At a high level, cloud computing pricing typically falls into two categories: on-demand capacity and reserved capacity. On-demand capacity may be subject to availability and is charged only while virtual machines are running. Reserved capacity is guaranteed and can typically be purchased in multi-year increments. Because reserved capacity is paid for whether or not the VM is running, there is no cost benefit to power cycling it.
A summary of the on-demand instance and reserved instance models for each vendor is shown below:
- On-demand: EC2 On-Demand Instances (AWS)
- Reserved: Reserved VM Instances (Azure), EC2 Reserved Instances (AWS)
Determining the model that makes the most sense for Citrix workloads depends on the projected daily uptime of the VDAs supporting your use case. This is typically a function of the working hours plus idle and disconnect timers (since sessions can’t be migrated).
An example comparing VDA uptime to cost is shown in the table below. From this table, an F16s_v2 workload in Azure with a daily uptime of 17 hours or greater may have better cost efficiency as an Azure Reserved VM Instance.
Figure 2: Daily Instance Compute Cost vs. Uptime
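The uptime comparison comes down to a break-even calculation: on-demand cost grows with hours of uptime, while reserved cost is paid for all 24 hours regardless. The sketch below shows the arithmetic. The hourly rates are hypothetical placeholders, not vendor pricing, so substitute the published rates for your instance type and region.

```python
HOURS_PER_DAY = 24


def daily_on_demand_cost(on_demand_hourly_rate: float, uptime_hours: float) -> float:
    """On-demand instances are billed only for the hours they run."""
    return on_demand_hourly_rate * uptime_hours


def daily_reserved_cost(reserved_effective_hourly_rate: float) -> float:
    """Reserved instances are paid for around the clock, running or not."""
    return reserved_effective_hourly_rate * HOURS_PER_DAY


def break_even_uptime_hours(on_demand_hourly_rate: float,
                            reserved_effective_hourly_rate: float) -> float:
    """Daily uptime above which the reserved model becomes cheaper."""
    return daily_reserved_cost(reserved_effective_hourly_rate) / on_demand_hourly_rate


# Hypothetical rates: $1.00/hour on-demand, $0.50/hour effective reserved rate.
threshold = break_even_uptime_hours(1.00, 0.50)
print(f"Break-even daily uptime: {threshold:.1f} hours")
for uptime in (8, 17):
    on_demand = daily_on_demand_cost(1.00, uptime)
    reserved = daily_reserved_cost(0.50)
    cheaper = "reserved" if reserved < on_demand else "on-demand"
    print(f"{uptime}h/day: on-demand ${on_demand:.2f}, reserved ${reserved:.2f} -> {cheaper}")
```

A deeper reservation discount lowers the break-even point, which is why the threshold differs by vendor, instance type, and reservation term.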
What is the best way to deploy and operationalize these different cost models? For example, suppose that during the design phase you determine that a use case needs 100 instances and that 20% of them must be available at all times. Reserved instances are purchased to cover that always-on 20%. Citrix Smart Scale can then optimize a mix of reserved and on-demand instances of the same instance type. Smart Scale augments standard XenDesktop power management with load-based and schedule-based policies that automate the power operations of a delivery group to optimize costs.
Within Smart Scale, the default (or minimum) number of machines can be set to match the reserved instances identified during the design. In the scenario above, this is 20% of the instances. Then, using load-based scaling and additional schedules for peak and off-peak hours, the pay-as-you-go instances can be powered on and off based on demand. This is illustrated in the diagram below.
Figure 3: Smart Scale Sample Configuration
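The idea of a reserved baseline plus load-driven on-demand capacity can be sketched in a few lines. This is not Smart Scale's actual algorithm or API; it is a simplified model with hypothetical names and numbers (100 machines, 20 reserved, 10 sessions per machine) that only illustrates how a minimum machine count and a load-based rule combine.

```python
import math


def machines_to_power_on(active_sessions: int, sessions_per_machine: int,
                         reserved_minimum: int, total_machines: int) -> int:
    """Machines that should be running: at least the reserved baseline,
    at most the whole delivery group, otherwise enough to carry the load."""
    if sessions_per_machine <= 0:
        raise ValueError("sessions_per_machine must be positive")
    needed_for_load = math.ceil(active_sessions / sessions_per_machine)
    return min(total_machines, max(reserved_minimum, needed_for_load))


def on_demand_machines(active_sessions: int, sessions_per_machine: int,
                       reserved_minimum: int, total_machines: int) -> int:
    """Machines beyond the reserved baseline, billed at on-demand rates."""
    running = machines_to_power_on(active_sessions, sessions_per_machine,
                                   reserved_minimum, total_machines)
    return running - reserved_minimum


# Hypothetical session counts for off-peak, peak, and over-capacity demand.
for sessions in (50, 650, 2000):
    running = machines_to_power_on(sessions, 10, 20, 100)
    extra = on_demand_machines(sessions, 10, 20, 100)
    print(f"{sessions} sessions -> {running} running ({extra} on-demand)")
```

In this model the 20 reserved machines stay on even when load is low, since powering them off saves nothing, and only the remaining 80 are cycled with demand.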
These details are provided as guidance to give a deeper understanding of how cost affects a Citrix on public cloud design. We recommend consulting the applicable vendor regarding detailed pricing considerations and Microsoft licensing implications.
LoginVSI was used to determine the single server scalability (VSImax) of Windows Server 2016 workloads with Microsoft Office 2016. In order to generate conservative, real-world, easily reproducible results, test instances were unoptimized with no policy or OS optimizations applied and Windows Defender running with no exclusions.
With these workload parameters, the following instance types were optimal in terms of performance and cost ($/user/hr, published on-demand pricing):
- Azure – Standard_F16s_v2 (16 vCPU, 32 GB) – Detailed Analysis: Whitepaper – Scalability and Economics of Delivering Citrix Virtual Apps and Desktops from Microsoft Azure
- Amazon – M5.2XLarge (8 vCPU, 32 GB) – Detailed Analysis: LoginVSI Blog – Citrix Virtual App user density on AWS
- Google – n1-standard-8 (8 vCPU, 30 GB) – Detailed Analysis: Citrix Blog – Right-sizing Citrix XenApp on Google Cloud Platform
In an on-premises world, CPU consumption tends to be the primary metric for user density and single server scalability on a physical host. With more of each dollar per hour allocated to CPU, the compute optimized instances on Azure outperformed the general purpose instances for the LoginVSI task and knowledge worker profiles. Since IaaS is just workloads running in someone else’s datacenter, CPU is still the limiting factor for scalability in most Citrix deployments. We plan to test this hypothesis further as we continue testing the compute optimized instances in AWS and GCP.
In our original cloud scalability guidance, the $/user/hour sweet spot fell into a scale out model, contradicting the “go big or go home” approach we originally promoted for on-premises deployments. The latest instances shift the optimal $/user/hour to the scale up model, with larger instances providing better cost efficiency. So, the good news is that you can now apply the same best practice of scaling up to both on-premises and cloud-based workloads.
It is important to note that VSImax does assume “full load.” So, while your mileage could vary, these instance types are a pretty solid starting point for a public cloud POC or initial rollout. But what about configuring Smart Scale to optimize costs? What about more memory-dependent workloads? This is largely dependent on the overall workloads in your use case. Review the use case’s requirements and test. Evaluate if your performance is optimal and adjust accordingly. Stick with the on-demand pricing models, such as Microsoft’s pay-as-you-go, during this evaluation before reserving instances. Over time, determine the average uptime of your VDAs to understand if there is a cost benefit to reserving a percentage of your capacity.
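Once you have collected uptime data during the on-demand evaluation, one simple way to estimate how much capacity to reserve is to count the VDAs whose average daily uptime meets a break-even threshold. The sketch below assumes you already have per-VDA averages and a threshold in hours; the machine names, uptimes, and the 17-hour threshold are hypothetical inputs, not measurements.

```python
from typing import Dict, List


def reservation_candidates(avg_daily_uptime_hours: Dict[str, float],
                           break_even_hours: float) -> List[str]:
    """Return VDAs whose average daily uptime meets or exceeds the threshold."""
    return sorted(
        name for name, hours in avg_daily_uptime_hours.items()
        if hours >= break_even_hours
    )


# Hypothetical averages gathered over the evaluation period.
observed = {"vda-01": 22.5, "vda-02": 9.0, "vda-03": 17.0, "vda-04": 12.5}
candidates = reservation_candidates(observed, break_even_hours=17.0)
share = len(candidates) / len(observed)
print(f"Reserve candidates: {candidates} ({share:.0%} of capacity)")
```

The resulting share is only a starting point; seasonal peaks and planned growth may justify reserving more or less.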
If you’ve done similar testing with Citrix workloads in a public cloud it would be great to hear from you. Feel free to add a comment below and share your story. Thanks for reading and good luck as you continue your cloud journey.
Enterprise Architect, Citrix Consulting Services