“New” Citrix Best Practices

Overview

There are best practices and then there’s reality…I have been saying that for years, because I believe too many folks think some best practices are set in stone or should be implemented or followed no matter what the situation. To be honest, this could not be further from the truth. It’s why many Professional Services organizations (CCS included) use the phrase “it depends”. There is also a reason why our Legal team at Citrix cringes when we use the phrase “best practice”…because for a certain situation at a certain customer with a certain set of business requirements, that practice we recommended might not actually turn out to be “best” or truly optimal, and it could result in, say, downtime, disaster, upset customers, lost revenue, etc. And that word “best” may end up having legal implications, which is why we tend to use “leading practices” in our documentation instead these days. But the point is that best or leading practices evolve…certain best practices change over the years…a “best practice” in one situation at one customer might be a worst practice at another customer. It really does depend.

For this article, as we start 2014 and as I was doing some reflection on my almost 10 years at Citrix, I was thinking about which particular Citrix best practices have changed the most over the years. Which old best practices are flat-out wrong these days? None of these are actually “new” as the title of my article might imply – but I still hear a lot of these common “myths” or old best practices being said or implemented every day. So the following is a collection of some of the more popular myths and leading practices that have changed the most dramatically over the last few years in this Citrix world we live in. I hope this finds everyone well in the New Year and please don’t be afraid to challenge leading practices in the future…best practices are meant to change and I create new best practices every day. 😉 (The list below is in no particular order by the way.)

Common Myths and “New” Best Practices

Session Reliability is Bad. I can’t tell you how often in the past I used to recommend disabling CGP or the “Session Reliability” feature. It honestly used to be coded pretty poorly and ended up causing excessive network traffic and masking real network issues, while providing little to no useful feedback to the end-user. But after our smart engineers in the UK got a hold of it a few years ago as we moved to an IMA-less XD architecture in v5, CGP changed. And it changed in a good way. Now we recommend leaving SR/CGP enabled! Check out this and this for more info.
XenApp VMs with 2 vCPUs. This was another leading practice for some time about 5 or 6 years ago. But with recent advancements in hardware, hypervisor schedulers and NUMA-awareness, I think we’ve finally proved this wrong. We’re actually recommending XA VMs with larger VM specs all the time nowadays, such as 3, 4 or even 8 vCPUs. Check out this and this for more info.
PVS Should be Physical. Our original stance on PVS was to make it physical. But after 10Gb+ networking, advancements around things like LACP and understanding how PVS works so we can size it correctly, we almost always recommend virtualizing PVS these days. Check out this and this for more info.
Isolate the PVS Stream Traffic. This is still somewhat controversial, especially because we have dated technotes saying that you should do this for performance or troubleshooting reasons (both invalid reasons in my opinion by the way). But again, with recent advancements in virtualization and networking, I’d argue there is very little to gain from doing this. Our customers that keep their designs simple usually have the most success with PVS. Check out this for more info.
Only Redirect Certain Shell Folders. Not really Citrix-specific, and more MSFT-specific, but still something I wanted to touch on since profile design is near and dear to my heart. We used to recommend only leveraging folder redirection for specific shell folders such as MyDocs and AppData (and maybe Desktop in some cases). But ever since MSFT re-designed the user profile namespace about 5 years ago and even told everyone that they recommend redirecting all shell folders, we have been saying the same thing. You can argue the merits of redirecting AppData, but that’s sort of beside the point here – we want you redirecting pretty much everything you can so only ‘ntuser.dat’ is roaming in and out!
MCS Cannot Scale. I still hear this one being said by a lot of folks and they don’t even know why half the time. MCS can scale. Especially if you couple it with thin provisioning and IntelliCache. And it doesn’t require or generate 1.6x IOPS compared to PVS like we thought a few years ago (it’s more like 1.2x today). Yes, there are still some operational challenges associated with using MCS in an enterprise setting…but it can absolutely “scale” and it works just fine. And it’s going to get a lot better in the future, without the complexity of extra infrastructure and networking configuration that PVS introduces.
Multiple Farms are Bad. I have been trying to debunk this myth for years. Multiple XenApp farms are not a bad thing in my mind. Neither are multiple XD “sites” (really, they are farms, too – we just call them sites). I’m a big believer in horizontal scalability and the “pod” architecture for known scalability and stability. We have lots of tools to manage multiple XA farms these days, too. And with virtualization and hypervisors being mainstream, I am always going to argue that spinning up and managing a few extra ZDCs or Controllers is well worth it compared to an outage caused by trying to scale vertically. Check out this for more info.
20 VMs per LUN Should be Strictly Followed. First off, this is a good rule of thumb. But it only applies to block-based storage (FC, iSCSI, FCoE, etc.). I see too many people still quote this number or design using this rule when using NFS, which is file-based. I also see people designing with this rule in mind, but then they are using vSphere and a VAAI-capable array! Save yourself the management nightmare of managing hundreds of tiny LUNs – you can probably go a little bigger. Check out this and this for more info.
Pagefile Should be 1.5x RAM. Thanks again to the wizard, MarkR, for debunking this myth for all of us. This best practice is probably a decade old now and should never, ever be followed! Please ask yourself if you even need the capability of a complete memory dump…and then do some simple testing and figure out what size to make the pagefile in your particular environment. Check out this for more info.
SSDs and Shared Storage are the Only Answer. I still get a ton of questions about whether SSDs are good, bad or ugly…and whether shared or local storage is the way to go. The bottom line is SSDs are still expensive and they are not all created equal in terms of write performance and longevity (yes, prices have come down and we’ve made some improvements over the last few years, but still…). And shared storage arrays from the big iron vendors are always going to be expensive. This is why many 3rd party companies have popped up over the last 5 years, the likes of Whiptail, Nutanix, Nimble and GreenBytes. This is also why I believe that dynamic storage tiering with QoS and storage virtualization are the future…many customers are getting smart in this area and going with a hybrid approach already in terms of combining SSDs and spinning disks…and maybe even using local storage for VDI deployments altogether to save a ton of money (especially when HA or reliability of non-persistent desktops is of little concern!). So before you simply buy SSDs or that next million dollar array from our friends over there at EMC, ask yourself if you really need it for your virtualization project. Take a look at the I/O your workloads are generating first. Analyze your requirements again. Look at how it might affect your operational model. Or can you maybe take a smarter, different approach this time around as opposed to what you’ve been doing for the last decade?
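
To put some rough numbers behind the MCS item above, here is a quick back-of-the-envelope sketch in Python. It only uses the two multipliers mentioned in that item (MCS at roughly 1.6x the IOPS of PVS as the old assumption, and roughly 1.2x as the more current figure). Everything else is an assumption for illustration: the function name is made up, and the desktop count and per-desktop PVS baseline are hypothetical values, not measurements. Swap in the I/O numbers you actually observe in your own environment.

```python
# Multipliers quoted in the article: MCS IOPS relative to a PVS baseline.
MCS_VS_PVS_OLD = 1.6  # the older assumption
MCS_VS_PVS_NOW = 1.2  # the figure the article cites as closer to today


def estimate_mcs_iops(desktops: int, pvs_iops_per_desktop: float, multiplier: float) -> float:
    """Estimate aggregate MCS IOPS from a PVS per-desktop baseline.

    This is plain multiplication, not a sizing tool: it ignores read/write
    ratio, boot storms, caching and everything else a real design must cover.
    """
    if desktops < 0 or pvs_iops_per_desktop < 0 or multiplier <= 0:
        raise ValueError("desktops and baseline must be >= 0, multiplier must be > 0")
    return desktops * pvs_iops_per_desktop * multiplier


if __name__ == "__main__":
    desktops = 500        # hypothetical pod size
    pvs_baseline = 10.0   # hypothetical steady-state IOPS per desktop on PVS

    old = estimate_mcs_iops(desktops, pvs_baseline, MCS_VS_PVS_OLD)
    now = estimate_mcs_iops(desktops, pvs_baseline, MCS_VS_PVS_NOW)

    print(f"Old assumption (1.6x): {old:.0f} IOPS")
    print(f"Current figure (1.2x): {now:.0f} IOPS")
    print(f"Difference:            {old - now:.0f} IOPS")
```

The takeaway is simply that the gap between the old and new multipliers adds up quickly at scale, which is part of why sizing MCS storage with the old figure can make it look worse than it is.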

Wrap-Up

That last bullet is probably a good one to end on since it’s sort of the point of this article – I really want to encourage everyone to challenge old ways of thinking and question some of these age-old best practices. Sometimes we get it wrong at Citrix. Sometimes our CCS folks might follow or implement old best practices. And if we do, please let us know about it and we’ll get it fixed. Even leave me a comment below and I’ll make it my personal mission to get it fixed. And then we’ll get it communicated throughout our organization and Community so the world is a better place.

Do you have another “myth” or best practice that you think has changed dramatically over the years? If so, please drop me a comment below and I’ll either respond or maybe even add it to the list above with an update to the article. I’m a big believer that this type of transparency can help our customers and partners design and implement better Citrix-enabled solutions in the future. 2014 is going to be a great year.

Cheers, Nick

Nick Rintalan, Lead Architect, Americas Consulting, Citrix Consulting Services (“CCS”)
