“New” Citrix Best Practices 2.0

Update, June 28, 2022: Since publication of this blog, we’ve provided updated guidance around recommended practices for Citrix Provisioning Services. Learn about our updated guidance.

It’s been a couple years since I published the first “New Citrix Best Practices” article, so I wanted to publish another article for a couple reasons.

The first is pretty obvious in that things change quickly in this industry – what we considered leading practices last year might not be anymore. Even I look back at that article from 2014 and laugh a bit at some of the stuff I wrote.

The second reason is that “Article 1.0” was one of the most popular pieces I’ve ever written, so the content must have resonated or proved valuable to some folks out there. And it was also one of the most commented articles on Citrix Community/Blogs with 93 total comments and counting. So, I feel like it’s a great time to refresh the list and continue to challenge some of our bad habits and old ways of thinking.

Now, let’s bust some myths …

Common Myths and “New” Best Practices (v2.0)

PVS Ports and Threads. I still see so many folks (including our own CCS team, so we’re guilty, too!) using non-optimal settings in terms of PVS ports and threads. If you haven’t done so yet, read my colleague’s post on updated guidance for ports and threads as soon as you can. Bookmark it and start using these leading practices on every PVS deployment going forward.
XenApp CPU Over-Subscription Ratio and COD. I still see so many XenApp administrators unwilling to over-commit any cores, let alone implement the “1.5x” ratio that I’ve been preaching for the last 5 years. But, as I have been saying for the last year or so now, it’s time to take a hard look at implementing “2.0x” for XenApp/RDS workloads. Why? Hardware is better, hypervisor schedulers are better, humans are lazier than ever and the list goes on and on. So, I’ve actually been recommending (and implementing) a 2.0x CPU over-subscription ratio on a lot of projects lately. If you put it to the test with LoginVSI or real workloads, I bet you’ll find it is the sweet spot in terms of SSS (single server scalability) or user density more often than not. And, sort of related to this XenApp scalability discussion, don’t be afraid to enable Cluster on Die (COD) if you have newer boxes with Intel HCC Haswell-EP+ chips. Because these Windows-based XenApp/RDS workloads that we are deploying on these hypervisors are highly NUMA optimized or “aware”, you can squeeze an extra 5-10% density out of your boxes by simply changing the default snooping method in the BIOS. If these concepts of CPU over-subscription or COD are foreign to you, I’d recommend reading the XenApp Scalability article I published last year.
Protocols and Codecs. This is another easy thing to do, but I still don’t see many customers doing it. I was in London a couple weeks ago presenting on this topic in great technical detail (check out my BriForum London Session for all the gory details). What all this really boils down to is this: if you’re deploying a “modern” MSFT operating system as part of your Citrix deployment (e.g. Win10, 2012 R2, etc.), then I highly recommend switching the default graphics codec from Thinwire H.264 to Thinwire non-H.264 (aka Thinwire Plus). Most apps and use cases do not need or benefit from H.264, and leaving it enabled can reduce your SSS, since it’s a CPU hog and 99.9% of XenApp/XenDesktop workloads these days are CPU-bound. And on the flip-side, if you’re deploying a “legacy” MSFT operating system (e.g. Win7, 2008 R2, etc.), then I recommend sticking with the proven Legacy Thinwire implementation. Legacy Thinwire is optimized for those legacy operating systems that relied on GDI/GDI+. If you don’t know how to toggle these graphics encoders, then the built-in policy templates are your best friend. I should also note that as of a week ago, when we shipped 7.9, we changed the default codec from Thinwire H.264 to Thinwire non-H.264. I personally think this is a great move.
Farm/Site/Zone Design. I’ve written a lot about Multiple Farms and Sites and if you’ve read my stuff, you’ll know I’m a big fan of multiple farms/sites and basically leveraging a pod architecture to increase resiliency and minimize failure domains. Because it’s not a matter of “if” you’ll go down, it’s a matter of “when.” But this is one I have to address head-on because there is some really bad guidance floating around out there in the blogosphere on this topic. Yes, FMA is bringing back pieces of the glorious Local Host Cache (LHC), but even with the 7.9 release, it’s not there yet. We still have reliance on a centralized SQL database and primary zone. And I’ve seen folks writing about this saying the LHC is back or all there. Let me be clear: it’s not. We have connection leasing that made its debut a few releases back and we introduced some multi-site or zone concepts in 7.7. But if you read the fine print in WilliamC’s awesome article (or test it yourself), these FMA-based “zones” aren’t like the old IMA-based zones. What I mean by that is if you’re doing just a couple thousand VDAs, then you need to have sub-10 ms links! For this reason alone we really haven’t been implementing these FMA-based zones just yet in the field; we tend to go with multiple sites just like before. And again, to be crystal clear, connection leasing 2.0 or the FMA-based equivalent of the LHC is not in the 7.9 release that shipped last week. You’ll just have to wait and see what we have planned next.
vSphere Cluster Sizing. A few years back, we used to stand really firm on this one, saying that you should probably cap vSphere clusters with XenDesktop or XenApp workloads at 8 and 16 hosts, respectively. But I’m now routinely recommending (and implementing) 8-16 hosts per cluster for XenDesktop workloads and as many as 24-32 hosts per cluster for XenApp workloads, especially if you go with larger XenApp VM specs like you should be! With all this said, it’s still wise to scale out and leverage multiple clusters, but we shouldn’t be capping clusters at 8 hosts these days simply because some consultant said so in 2011.
PVS and MCS Memory Buffers. First off, if you haven’t upgraded your PVS infrastructure and “converted” to RAM Cache with Overflow to Disk (RCwOtD) for the write cache (wC) method and VHDX for the images themselves, then it’s probably time to at least consider it. But the more important thing here that applies to both PVS and MCS (yes, TobiasK, we finally got it for you in 7.9!) is making sure you modify the default memory buffer that PVS’s latest wC method uses and that MCS can now use in 7.9. I’m talking about the amount of non-paged pool memory that PVS or MCS can use to effectively cache I/O before we spit it out to disk. While the amount will vary with the type of workload, I do not recommend going with the default since it’s pretty low. I recommend going with 256-512 MB RAM for XenDesktop workloads if you can, and 2-4 GB for XenApp workloads. I have found that is a great sweet spot to cache about 90% of the I/O being generated by the workload in memory vs disk. And by the way, if you’re interested in learning more about how the new MCS I/O optimization feature works in 7.9, one of our CTPs, Andrew Morgan, wrote up a nice article on MCS I/O Optimization.
XenMobile “Optimizations.” We’re not just about Virtualization here at Citrix. We have Mobility, Networking and Cloud products, as it turns out, too. 😉 We’ve been doing a lot more XenMobile deployments as of late, and my colleague, Ryan McClure, just published an outstanding article on XenMobile optimizations, documenting some of the “optimizations” we have been configuring on basically every deployment. I put optimizations in quotes because these tweaks really aren’t optimizations; they are more like better defaults that we’ll soon be incorporating into the product. So, if you’re embarking on that Mobility journey, be sure to read that article in its entirety because it could save your life.
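
To make the sizing numbers in the list above a bit more concrete, here is a minimal Python sketch of the arithmetic behind two of them: the 2.0x CPU over-subscription ratio and the RAM cache buffer ranges for PVS/MCS. Treat it as back-of-the-napkin math, not a sizing tool. The function names, the assumption that the ratio is applied to physical cores, and the example host (24 cores, 8 vCPUs per XenApp VM) are all hypothetical and only there for illustration; validate any result with LoginVSI or real workloads.

```python
from typing import Tuple

# Rule of thumb from the list above: 2.0x CPU over-subscription for XenApp/RDS.
# Assumption: the ratio is applied to physical cores on the host.
XENAPP_OVERSUBSCRIPTION = 2.0

# RAM cache buffer ranges in MB (low, high) from the list above.
RAM_CACHE_MB = {
    "xendesktop": (256, 512),
    "xenapp": (2048, 4096),  # 2-4 GB
}


def vcpus_to_allocate(physical_cores: int,
                      ratio: float = XENAPP_OVERSUBSCRIPTION) -> int:
    """Total vCPUs to hand out across all XenApp VMs on one host."""
    if physical_cores <= 0 or ratio <= 0:
        raise ValueError("physical_cores and ratio must be positive")
    return int(physical_cores * ratio)


def xenapp_vms_per_host(physical_cores: int, vcpus_per_vm: int,
                        ratio: float = XENAPP_OVERSUBSCRIPTION) -> int:
    """Whole XenApp VMs that fit on a host at the given ratio."""
    if vcpus_per_vm <= 0:
        raise ValueError("vcpus_per_vm must be positive")
    return vcpus_to_allocate(physical_cores, ratio) // vcpus_per_vm


def ram_cache_range_mb(workload: str) -> Tuple[int, int]:
    """Suggested (low, high) RAM cache buffer in MB for a workload type."""
    key = workload.lower()
    if key not in RAM_CACHE_MB:
        raise ValueError("unknown workload type: " + workload)
    return RAM_CACHE_MB[key]


if __name__ == "__main__":
    # Hypothetical host: 24 physical cores, 8 vCPUs per XenApp VM.
    print(vcpus_to_allocate(24))        # vCPUs at 2.0x
    print(vcpus_to_allocate(24, 1.5))   # vCPUs at the older 1.5x ratio
    print(xenapp_vms_per_host(24, 8))   # whole VMs at 2.0x
    print(ram_cache_range_mb("XenApp"))
```

On that hypothetical 24-core host, moving from 1.5x to 2.0x takes you from 36 to 48 vCPUs to allocate, which is the difference between four and six 8-vCPU XenApp VMs. Whether the extra VMs translate into more users per box is exactly what the load test is for.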

Wrap-Up

As I said in the first “New” Best Practices article, I really want to encourage everyone to challenge old ways of thinking and question some of these age-old best practices. Sometimes we get it wrong at Citrix. Sometimes our CCS folks might follow or implement outdated best practices. And if we do, please let us know about it and we’ll get it fixed. Please leave me a comment below and I’ll make it my personal mission to get it fixed. And then we’ll get it communicated throughout our organization and Community so the world is a better place.

Do you have another “myth” or best practice that you think has changed dramatically over the years? If so, please drop me a comment below. As I said a couple years ago, I’m a big believer that this type of transparency can help our customers and partners design and implement better Citrix-enabled solutions in the future.

Cheers,
Nick

Nick Rintalan, Lead Architect – Citrix Consulting Services (“CCS”)
