Have you ever heard conflicting information about a setting or best practice and wondered what the “real” best practice actually was?  Never from Citrix, right? 😉

I’ve recently heard some people saying to use fixed vDisks with PVS and others saying dynamic.  So what is the real best practice?  That’s precisely what I’d like to discuss in this article.

Before I begin, please note:

  1. I’m not going to provide an overview of the Virtual Hard Disk (VHD) format or explain the difference between fixed and dynamic VHDs.  I highly recommend reading Jeff Muir’s short article on the VHD format if you want a history lesson or refresher.  The actual VHD specification (now “owned” by Microsoft) is also worth checking out – it’s important to understand concepts like the footer and block sizes so you know how dynamic VHDs grow.  (There’s a short sketch right after this list if you’d like to inspect a VHD footer yourself.)
  2. I am only talking about fixed vs dynamic in the context of PVS.  Many other technologies and products out there use the VHD specification.  And just because one disk image type is the “best practice” for PVS doesn’t mean it’s the way to go for other products or technologies that use the VHD spec (XS, H-V, etc.).  In fact, I’ll explain why PVS is “special” in this article and why we can go against the original best practice of fixed!
  3. I am only talking about the vDisks – the VHD file that contains the operating system image that PVS streams to target devices.  It’s very important to distinguish between the PVS write cache and the PVS vDisk.  And again, just because one disk image type is the way to go for the vDisks doesn’t mean the same holds true for the write cache.  While we support the equivalent of “dynamic” for the secondary drive that hosts the write cache file (.vdiskcache) – that equivalent being thin-provisioning – the write cache is not an ideal candidate for thin provisioning.  We are constantly writing to that file in very small chunks (i.e. 4k random write IOPS), so the penalty associated with going dynamic or thin-provisioning that drive could be significant, depending on the type of storage that backs it.  So I want to be clear – this article is only about the vDisk (but maybe I’ll do another article about thin provisioning the write cache later since it is somewhat controversial!).
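
Since I just told you to go read the spec, here’s a quick illustration of what’s in it.  The following is a minimal sketch in Python – not a Citrix tool, and the vDisk path is obviously made up – that reads the 512-byte footer every VHD carries at the end of the file.  The field offsets come straight from the published VHD specification.

```python
import struct

# Per the VHD specification, every VHD (fixed or dynamic) ends with a
# 512-byte footer whose first 8 bytes are the cookie "conectix".
DISK_TYPES = {2: "fixed", 3: "dynamic", 4: "differencing"}

def read_vhd_footer(path):
    with open(path, "rb") as f:
        f.seek(-512, 2)          # the footer is the last 512 bytes of the file
        footer = f.read(512)

    if footer[0:8] != b"conectix":
        raise ValueError("Not a valid VHD footer")

    # Field offsets per the spec (all values are big-endian):
    #   bytes 40-47 = Original Size, 48-55 = Current Size, 60-63 = Disk Type
    original_size = struct.unpack(">Q", footer[40:48])[0]
    current_size = struct.unpack(">Q", footer[48:56])[0]
    disk_type = struct.unpack(">I", footer[60:64])[0]

    return {
        "type": DISK_TYPES.get(disk_type, "unknown"),
        "original_size_gb": original_size / 1024 ** 3,
        "current_size_gb": current_size / 1024 ** 3,
    }

# Hypothetical path - point it at any vDisk VHD you have handy.
print(read_vhd_footer(r"D:\vDisks\Win7Gold.vhd"))
```

Nothing fancy, but it makes the point that a VHD’s type and size are fully described by a standard footer anyone can parse.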

Now that I’ve done the typical Consulting CYA thing, let’s get down to brass tacks.  We acquired Ardence in 2006 and the best practice (after we adopted the VHD specification for vDisks, I might add…we didn’t originally use VHD!) was always something along the lines of “fixed for production vDisks, but dynamic is OK for POCs and testing”.  Sounds reasonable, especially because performance has always been king in this game.  So we went on for 3 or 4 years and published a number of documents saying to always use “fixed” vDisks.  Well, actually, our guidance on this topic has been pretty confusing – or non-existent – to be honest.  That’s partly why I’d like to “clear the air” today.  Let’s examine some of the Citrix resources on this topic.

In our XenDesktop with PVS best practices whitepaper, we didn’t even mention fixed or dynamic vDisks. Oops…and I think I co-authored that.  Guilty!  In our PVS 5.6 best practices whitepaper, we say to use fixed because “the internal structure of dynamic VHDs is different and can cause alignment issues”.  OK…at least that’s some guidance but is that really true?  Let’s hold that thought for a second.  In the PVS for XA Implementation Guide, we say to use dynamic but we don’t provide any reasoning or justification. Darn.  In the PVS eDocs page where we discuss the topic of imaging, we just state that you can go with either fixed or dynamic.  So no help there.  And finally, in the technote where we detail the best practices for creating a XD image, we also just state the two types and provide no guidance.  Fail.

So if you add it all up, that’s a few resources that don’t say anything, one resource that says fixed and one resource that says dynamic.  Awesome…now it makes sense why I see customers doing different things, partners saying different things and our own Consulting team not even knowing what we recommend on this topic!

So what’s the answer?  Well, if this were some technology or product other than PVS and performance was important, I’d recommend fixed.  But this is PVS…and PVS is “special” as I said earlier.  In my opinion, the reason we recommended fixed all this time was largely because we didn’t understand how PVS used system cache.  And I give a lot of the credit for understanding this to my buddy, Dan Allen.  If you don’t know how PVS leverages kernel memory, please do yourself a favor and read Dan’s excellent whitepaper called “Advanced Memory and Storage Considerations for PVS”.  But essentially, if we have a properly configured PVS server running on a 64-bit platform with plenty of RAM available and we’re truly caching the disk contents stored within the VHD in memory, then we don’t really care much about the disk or VHD because everything will be served out of memory!  And since most of the time we are using PVS with a standard mode image that is read-only, we aren’t really writing anything to that VHD, are we?  This is why I said PVS was special…as time goes to infinity, our reads to that vDisk (VHD) approach ZERO.  That’s because whether we’re using a block-level protocol to access the VHD (iSCSI, FCP or even local storage) or accessing it over NFS or CIFS, we’re tuning PVS so it actually caches the contents of the vDisk in MEMORY.  This is also why we say PVS-based XD deployments are approximately 90% write IOPS and only 10% read IOPS – because PVS leverages its system cache to serve all those reads to subsequent target devices.  This is super-important stuff, so if this is new to you or you’re scratching your head, please read Dan’s whitepaper again.  This can mean the difference between a smoking hot PVS implementation and a very poor one.
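
If you like seeing numbers behind that 90/10 claim, here’s some back-of-the-napkin math.  Every input below (target count, per-desktop IOPS, read ratio, cache hit rate) is an illustrative assumption on my part, not a measured Citrix figure – the point is simply that once the vDisk is warm in the PVS server’s system cache, the reads that actually hit storage shrink dramatically.

```python
# Back-of-the-napkin math: what server-side caching does to the read load.
# Every input below is an illustrative assumption, not a measured value.
targets = 500            # target devices streaming the same standard mode vDisk
iops_per_target = 12     # assumed steady-state IOPS per desktop
read_ratio = 0.40        # assumed share of that workload that is reads
cache_hit_rate = 0.85    # assumed share of vDisk reads served from PVS server RAM

total_iops = targets * iops_per_target
read_iops = total_iops * read_ratio
write_iops = total_iops * (1 - read_ratio)

# Only the cache misses ever touch the vDisk (VHD) on disk.
reads_hitting_storage = read_iops * (1 - cache_hit_rate)
storage_iops = reads_hitting_storage + write_iops

print(f"Total IOPS generated by targets:   {total_iops:,.0f}")
print(f"Writes (land in the write cache):  {write_iops:,.0f}")
print(f"Reads that actually hit the vDisk: {reads_hitting_storage:,.0f}")
print(f"Read share of storage traffic:     {reads_hitting_storage / storage_iops:.0%}")
```

With those made-up numbers the storage traffic works out to roughly 90% writes and 10% reads – and as the cache hit rate climbs toward 100%, the reads to the vDisk approach zero, which is exactly the behavior described above.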

So we can use dynamic for the PVS vDisks then, right?  That’s a fair assumption to make at this point, but there’s one more thing we need to know first.  Does PVS add anything to the beginning or end of the VHD like an extra header or footer?  Does the PVS VHD really have a “different internal structure” that doesn’t follow the VHD specification (possibly causing alignment issues resulting in poor performance)?  Do we grow the PVS VHD in non-standard block sizes or something?  I checked with our PVS Product Architect and PVS LCM team on this and the answer is NO to all of these.  And this is probably where even more of the confusion comes from…while we do add some PVS-specific info to the vDisk, we don’t add it to the actual VHD file.  We use the PVP file (which is the “properties” file left over from the Ardence days) to store this “side-car” info that is unique to PVS and PVS only.  That’s good news because we simply follow the VHD specification for dynamic disks and we grow the VHD in 2 MB or 16 MB chunks (which you can specify when you create the dynamic vDisk in the PVS GUI).  And while there is a header and footer in the VHD, it’s not PVS-specific and I’ve confirmed we don’t extend the header or footer…it’s simply the header and footer used in all dynamic disks according to the VHD spec defined by our friends at MSFT.  And let’s remember…it’s not like we are really writing to or expanding this standard mode PVS VHD file anyway, right?  We only read from disk initially after reboots and suck everything into system cache (memory) as soon as we can, so we are barely even touching the actual VHD file in the steady state.  If we were leveraging PVS with private mode images (which are read+write), then it might be a different story here and dynamic could be bad.  But 99% of the time we are using standard mode images and we’re going to serve everything from memory as opposed to disk in the steady state.
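
And if you don’t want to take my (or our Product Architect’s) word for it, you can verify that a vDisk is just a spec-compliant dynamic VHD yourself.  Here’s another minimal, unsupported Python sketch (the path is again hypothetical) that follows the footer’s Data Offset to the dynamic disk header and prints the block size – on a vDisk created with the 2 MB option you should see 2 MB, exactly what the spec defines.

```python
import struct

def dynamic_vhd_info(path):
    """Follow a VHD footer to the dynamic disk header and return (block size, BAT entries)."""
    with open(path, "rb") as f:
        f.seek(-512, 2)                              # standard 512-byte footer at end of file
        footer = f.read(512)
        if footer[0:8] != b"conectix":
            raise ValueError("Not a VHD")

        disk_type = struct.unpack(">I", footer[60:64])[0]
        if disk_type != 3:                           # 2 = fixed, 3 = dynamic, 4 = differencing
            raise ValueError("Not a dynamic VHD")

        # For dynamic disks the footer's Data Offset field (bytes 16-23)
        # points at the dynamic disk header, whose cookie is "cxsparse".
        header_offset = struct.unpack(">Q", footer[16:24])[0]
        f.seek(header_offset)
        header = f.read(1024)
        if header[0:8] != b"cxsparse":
            raise ValueError("Dynamic disk header not found")

        max_bat_entries = struct.unpack(">I", header[28:32])[0]   # Max Table Entries
        block_size = struct.unpack(">I", header[32:36])[0]        # Block Size, in bytes
        return block_size, max_bat_entries

# Hypothetical path - any standard mode vDisk will do.
block_size, entries = dynamic_vhd_info(r"D:\vDisks\Win7Gold.vhd")
print(f"Block size: {block_size // (1024 * 1024)} MB, BAT entries: {entries:,}")
```

If those fields look like any other dynamic VHD on your network, that’s the point – there’s nothing PVS-specific hiding in the file itself; the PVS-specific bits live in the PVP file sitting next to it.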

So now I can safely recommend using dynamic VHDs with PVS – even in production environments!  Not only will you save a lot of time when doing things like vDisk updates, backups and copies, but you’ll also save a lot of money on storage costs by reclaiming all that white space!  And going with dynamic (versus the old best practice of fixed) could mean the difference between using extremely cheap local storage on a blade versus expensive drives on a SAN.  And most importantly, we don’t sacrifice performance when using dynamic vDisks because PVS is special.
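
To put some very rough (and entirely made-up) numbers on that white-space savings:

```python
# Rough, illustrative math only - every figure below is an assumption.
vdisk_logical_size_gb = 40   # the size a fixed VHD is always fully allocated to
actual_data_gb = 22          # what the OS image actually consumes
copies = 6                   # versions, backups and copies kept around per vDisk

fixed_footprint = vdisk_logical_size_gb * copies
dynamic_footprint = actual_data_gb * copies   # a dynamic VHD only grows to hold real data

print(f"Fixed:   {fixed_footprint} GB")       # 240 GB
print(f"Dynamic: {dynamic_footprint} GB")     # 132 GB
print(f"White space reclaimed: {fixed_footprint - dynamic_footprint} GB per vDisk")
```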

Remember, if we have a correctly designed PVS environment, then we’re caching the contents of the vDisk in memory and barely ever touching the disk or VHD file.  And the fact that we’re using standard mode images means we’re not actually writing to that VHD file.  And lastly, all of our PVS-specific info is contained in a separate properties file (PVP) – we simply follow the VHD specification in terms of how we store the OS image inside a dynamic VHD and we grow/expand the VHD according to that same MSFT spec as well.

I really hope this clears up any confusion (and I’ll make sure this “new best practice” is reflected in our documentation).  I’d like to personally thank Jarian Gibson, Joe Shonk and Thorsten Rood for bringing this to my attention at BriForum last summer (and questioning our best practices, frankly).  This should save our customers a lot of time and money as I said above…while getting the same performance we know, love and expect from fixed vDisks.

Cheers, Nick

Nick Rintalan, Senior Architect, Citrix Consulting