That is the question…and it’s fitting for my first official Citrix blog post. Because I’ve been asked this question so many times by customers, partners and even colleagues, I figured it would be nice to simply point to a link instead of repeating myself over and over again. This is also one of those questions that you’re going to get different answers on depending on who you ask. It’s like discussing scalability numbers with a room full of Citrix Architects – everyone has an opinion, and at the end of the day, we’ll pretty much agree to disagree.
I first want to point out that the question I’m tackling is “Should I Virtualize PVS” and not “Can I Virtualize PVS”. Because you absolutely can virtualize PVS. But would you want to? That’s the harder question and what you’ll get different opinions on. I’m going to give you my opinion – that of a Citrix Consulting Architect who has implemented both physical and virtual Provisioning Servers many times in the real world – note this is just my 2 cents and not the official stance of Citrix Systems. Now that I’ve set the stage and done a little CYA like a true Consultant, let’s get into the good stuff…
The general consensus across the industry is that PVS should be physical because it’s not an ideal “virtualization candidate”. And why is that? First we need to define what makes a good or bad “virtualization candidate”. And in order to explain this concept, I’m going to steal a page out of my buddy Ron Oglesby’s famous ESX book:
- “When a decision is required on whether or not a server should be provisioned as (or migrated to) a virtual server, several criteria should be considered. In each case the amount of use, potential hardware utilization, number of users, and any other special hardware requirements will have to be reviewed before a decision is made”.
So Ron is basically saying “it depends” – our favorite answer in Consulting. 😉 But Ron also goes on to elaborate on the virtualization candidate concept and provides a nice magic quadrant, placing ideal virtualization candidates in the top right quadrant and poor candidates in the dreaded bottom left quadrant. An example helps explain this concept – the Citrix License Server is one of the better candidates for virtualization when it comes to Citrix infrastructure because it consumes very little memory and its average CPU utilization stays below 5%, even in a very large environment. On the other hand, Citrix Provisioning Server has resource bottlenecks that don’t translate well into the virtual world – specifically network and disk – making it a poor virtualization candidate in the classic sense. For this reason, the majority of our customers in the enterprise space have ultimately chosen to go physical for PVS. But it’s not that simple – the times are changing, and to be honest, some customers aren’t even sure why they’re going the physical route. If you had asked me this question a few years ago, I probably would have told you to go physical. But now that we understand the resource bottlenecks of PVS and exactly how it scales, virtualizing PVS might not be the worst idea in the world anymore, especially with a couple of recent advancements in technology. Let’s go deeper…
First let’s talk about the potential disk bottleneck, which can be largely mitigated if you design your PVS environment correctly. (I’m assuming you’ve read my colleague Dan Allen’s whitepaper by now – if you haven’t, it’s quite excellent and will help you understand how PVS uses the Windows system cache, which is key to this discussion and to overcoming the disk bottleneck.) Since designing/configuring PVS is out of scope for this post and would likely require a separate whitepaper, I’ll make a couple of key assumptions – you’re running PVS on a 64-bit OS to maximize kernel memory, and you’re leveraging a block-based protocol to serve vDisks. The whole “block-based protocol” concept is easier to understand in the physical world (vDisks reside on one or more LUNs and the PVS boxes are connected to the shared storage array via FC, iSCSI, etc.), but keep in mind that direct attached storage (i.e. local disks) will also allow PVS to leverage Windows’ built-in caching capabilities in the virtual world. We can also do things like pass-through LUNs in the virtual world, but back to the post. Assuming you’ve done those things and PVS is serving almost everything out of memory as opposed to reading from disk (which will cripple performance), then the disk bottleneck starts to become a non-factor. Keep in mind I’m also assuming you’re using the most popular write cache configuration – target device hard drive, which is typically a secondary virtual disk in the XenDesktop world that ultimately resides on shared storage. The storage that hosts these “wC drives” should be on separate LUNs (likely multiple LUNs in a large environment), preferably in a RAID configuration optimized for writes as opposed to reads. But we could talk about the PVS wC, IOPS and LUN sizing forever, so again, let’s get back to this post and move on to the network.
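To make the caching point concrete, here’s a back-of-the-napkin sketch in Python. The function, the 4 GB OS/services overhead and the example vDisk sizes are all my own illustrative assumptions, not official Citrix sizing guidance – the point is simply that the actively served vDisks need to fit in whatever memory is left over for the Windows system cache, or reads start hitting disk.

```python
def cache_fit(ram_gb, os_overhead_gb, vdisk_sizes_gb):
    """Can the Windows system cache hold every actively served vDisk?

    Returns (estimated cache size in GB, True/False).
    All numbers are illustrative assumptions, not official guidance.
    """
    cache_gb = ram_gb - os_overhead_gb   # memory left for system cache
    needed_gb = sum(vdisk_sizes_gb)      # total footprint of active vDisks
    return cache_gb, needed_gb <= cache_gb

# A 32 GB PVS box serving a 15 GB XP vDisk plus an 8 GB app vDisk:
# everything fits in cache, so reads rarely touch disk.
print(cache_fit(32, 4, [15, 8]))        # (28, True)

# The same box serving three 40 GB 2008 R2 vDisks: the cache can't
# hold them all, and disk reads start crippling performance.
print(cache_fit(32, 4, [40, 40, 40]))   # (28, False)
```

Simplistic, yes – Windows caches blocks, not whole files – but it captures why a 64-bit OS and generous RAM matter so much here.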
And this resource bottleneck (network I/O) is what makes most people say physical instead of virtual. And with good reason…the lack of true (recommended/supported) LACP support on XenServer, combined with 1 Gb NICs being mainstream, typically put the nail in the coffin and we would recommend physical (provided the number of target devices is high enough, which I’ll get to in a minute). But if you’re going to virtualize PVS on ESX/vSphere, then you might have a fighting chance because VMware has supported static LACP configurations for a while now. If you’re not up to speed on LACP or 802.3ad, this essentially means that when you team/bond two 1 Gb NICs, you get up to 2 Gb of effective throughput instead of being capped at 1 Gb. And this is obviously huge when it comes to scaling PVS and deciding whether it’s an ideal virtualization candidate in your environment – we’ve found that each 1 Gb NIC on PVS can serve around 300-500 target devices in the steady state. No matter how many 1 Gb virtual NICs you assign to a PVS VM on a hypervisor without LACP support, you’re going to be network constrained once you hit 1 Gb. The same holds true in the physical world – if you have a box with 4 NICs and don’t team them, or team them without 802.3ad support, then you’re still going to be network constrained once you reach the 1 Gb mark. But, as I said earlier, the times are changing – LACP support is just around the corner in XenServer, 10 Gb NICs are becoming the norm, and we already have support for Single Root I/O Virtualization (SR-IOV) in the current shipping version of XenServer. Assuming the rest of your networking gear (i.e. switching infrastructure) is also 10 Gb-capable (or SR-IOV compatible), then you probably have a great opportunity to virtualize PVS.
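A tiny sketch of the teaming math above – the function name and defaults are mine, but the rule it encodes is straight from this discussion: without 802.3ad/LACP, extra 1 Gb NICs buy you redundancy, not throughput.

```python
def effective_gbps(nic_count, nic_gbps=1.0, lacp=True):
    """Usable aggregate throughput of a NIC team (illustrative sketch).

    Without LACP/802.3ad the team can still fail over, but serving
    traffic is effectively capped at a single link's speed.
    """
    return nic_count * nic_gbps if lacp else nic_gbps

print(effective_gbps(2, lacp=True))    # 2.0 -- two bonded 1 Gb NICs with 802.3ad
print(effective_gbps(4, lacp=False))   # 1.0 -- four NICs, still capped at 1 Gb
```

Which is exactly why a virtual PVS server on a hypervisor without LACP support hits a wall at 1 Gb, no matter how many virtual NICs you throw at it.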
The last thing you need to look at is the number of target devices. You just heard me say 300-500 target devices per 1 Gb NIC. It’s important to note those are “steady-state” numbers – in other words, one PVS server wouldn’t be able to support 500 target devices all booting up simultaneously. Please also note I provided a range, and that’s because this highly depends on the number of concurrent vDisks you’re serving, the size of each vDisk and the operating system contained within it. Let me explain with an example – if you’re serving up a single 15 GB Windows XP vDisk, then you’ll probably be closer to that 500 number. If you’re serving up two or three 40 GB 2008 R2 vDisks simultaneously, then you’ll be lucky to get around 300. And remember, this is the “1 Gb scenario”. So if you’re only planning to use PVS for XenApp and have 100 XenApp servers in your farm, then you’re probably wise to virtualize PVS and save some money. Or if you have a small XenDesktop environment with only a couple hundred Windows 7 VMs, then virtualizing PVS is also probably a good bet. Or even if you have a XenDesktop environment with 1000 VMs and you’re limited by a hypervisor without LACP support or 1 Gb switching infrastructure (very common), you might be able to cut up a beefy physical host and deploy 3 or 4 virtual PVS servers (remember N+1 design as always). And since I know I’m going to get this question, a good starting spec for a virtual PVS server is 4 vCPUs and 16 GB RAM. And of course more than a single 1 Gb NIC if you can.
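The sizing logic above can be sketched in a few lines of Python. The 500-targets-per-server default assumes the best case (a single small vDisk, per the steady-state range earlier); drop it toward 300 when serving multiple large server-OS vDisks. The helper itself is my own illustration, not a Citrix sizing tool.

```python
import math

def pvs_server_count(target_devices, targets_per_server=500, n_plus_one=True):
    """Minimum PVS servers for a steady-state load (illustrative sketch).

    targets_per_server: roughly 300-500 per 1 Gb NIC, depending on the
    number, size and OS of the vDisks being served concurrently.
    """
    servers = math.ceil(target_devices / targets_per_server)
    return servers + 1 if n_plus_one else servers

print(pvs_server_count(1000))                          # 3: two for load, one spare
print(pvs_server_count(1000, targets_per_server=300))  # 5: four for load, one spare
```

Which lines up with the 3-4 virtual PVS servers for a 1000-VM XenDesktop environment mentioned above – the spread depends entirely on which end of the 300-500 range your vDisks land on.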
So if you asked me this question a few years ago after we acquired Ardence, my answer was “Probably Not”. But my answer to this question today is “Maybe”. And that’s because we understand PVS a lot better and there have been a couple key advancements in technology (10 Gb NICs, SR-IOV, etc.). If you ask me this question again in a couple years (after XenServer has true LACP support and the majority of networks are upgraded to 10 Gb), my answer will likely be “Probably”. Since I love football (and my Oregon Ducks), the equivalent analogy would be LaMichael James just got upgraded from Doubtful to Questionable to Probable (past, present and future, respectively).
I hope you enjoyed my inaugural post. So the next time someone asks you this question, hopefully you can articulate why “it depends” or point them to this article. Thanks for reading and let me know if you have any questions or feedback.