We’ve already “turbo-charged” your storage throughput with the new CiRwOtD feature (if you don’t know that acronym, you better ask somebody!). But what about improving PVS merge, boot and streaming times?
That’s exactly what I wanted to share with you today – partly because this config can improve streaming performance, but also because this is a “hidden” or undocumented feature of PVS 7.7.
Before I get down to brass tacks, I just wanted to give a shout-out to the PVS team. PVS 7.7 was released a couple of weeks ago along with XenApp/XenDesktop 7.7 and I don’t think it got the hype it deserved – there are some fantastic things we’ve included in this release. My favorite features are probably a real PoSh interface and the ability to do in-place upgrades, both of which we’ve honestly been begging for, for a long time now. But my other 2 favorite things are more performance-related – VHDX support and multiple read IO threads. Dane Young has already done an amazing job at documenting some of the VHDX goodness and how it can improve things like merge operations (it’s a must-read!), so I’ll focus on the latter in this article.
Turbo-Charging PVS Streaming
We snuck something into the 7.7 code that can really improve target device streaming. And it’s not like streaming performance was poor before, it’s just better now. Why? The key PVS driver, which takes over after the bootstrap, is now optimized for multi-core systems. In other words, the PVS driver that does the streaming in the steady-state is multi-threaded and can handle multiple read operations concurrently.
The beauty of this new feature is you literally have to do nothing to turn it on – it’s “enabled” by default if you will, but there are 2 requirements:
- You need to use the 7.7 PVS driver. All that means is you need to update your target devices to 7.7. It’s important to note that you do NOT need to re-create your BDM ISOs or partitions to take advantage of this (I point this out because I had a mistake in the first version of this article, saying that you did need to use the 7.7 bootstrap – keep reading for why that is).
- You need to make sure your 7.7-based target devices have at least 2 vCPUs or cores assigned. One note on this – even if you specify 4 or 16 vCPUs for your target devices, we’ll only allocate 2 read threads max on the PVS side.
So, if you are using 7.7 target devices and specify 2 or more CPUs, PVS will automatically leverage 2 threads for read operations.
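To make the two requirements concrete, here is a minimal Python sketch of the rule as described above. The function `expected_read_threads` is a hypothetical helper written purely for illustration; it is not part of PVS or its PoSh interface, and it assumes the target version is given as a simple "major.minor" string.

```python
def expected_read_threads(target_version: str, vcpus: int) -> int:
    """Return the number of PVS read threads expected for a target device.

    Hypothetical helper that only encodes the rule from this article:
    a 7.7 target with 2 or more vCPUs gets 2 read threads, anything
    else gets 1.
    """
    major, minor = (int(part) for part in target_version.split(".")[:2])
    # Assumption: versions later than 7.7 behave like 7.7.
    # The article itself only covers 7.7.
    has_multithreaded_driver = (major, minor) >= (7, 7)
    if has_multithreaded_driver and vcpus >= 2:
        return 2  # capped at 2, even with 4 or 16 vCPUs
    return 1


# Walk through a few target configurations.
for version, vcpus in [("7.6", 4), ("7.7", 1), ("7.7", 2), ("7.7", 16)]:
    print(version, vcpus, expected_read_threads(version, vcpus))
```

Only the last two configurations satisfy both requirements, so only they come back with 2 read threads.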
Impact & Results
I’ve run a few tests already and the results are impressive. Before (PVS 7.6 with a single CPU), it took about 1 minute 20 seconds to boot and completely log in a target device with 100% of GPOs processed; it now takes just under 1 minute (59 seconds on average). And all I did was update my targets to 7.7 and specify 2 vCPUs. And it’s 27% faster.
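If you want to run the same arithmetic on your own timings, here is a small Python sketch. The helper name `percent_reduction` is made up for this example, and the 80-second input is only the rounded "about 1 minute 20 seconds" figure, so the result lands near the quoted 27% rather than exactly on it.

```python
def percent_reduction(before_seconds: float, after_seconds: float) -> float:
    """Percentage of elapsed time saved going from 'before' to 'after'."""
    return (before_seconds - after_seconds) * 100 / before_seconds


before = 80  # "about 1 minute 20 seconds" on PVS 7.6 with a single CPU (rounded)
after = 59   # average on PVS 7.7 with 2 vCPUs

print(f"Boot plus login time reduced by about {percent_reduction(before, after):.0f}%")
```

Plug in your own before and after averages (ideally from several runs each) to compare against these numbers.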
One other thing I need to clarify since folks were starting to comment about it – you do NOT need to do the ‘msconfig’ tweak to seemingly increase CPUs for the boot cycle. And you’ll see more performance gains in the steady-state versus boot. Why? Because the bootstrap that is initially used by the BIOS cannot take advantage of multiple cores – only the PVS driver that takes over after the bootstrap can leverage multiple cores and benefit from this new feature.
My former colleague, Dan Allen (now at Bromium, but still a PVS fanboy; we’re also collaborating on a new whitepaper at Team VRC to show the impact of web browsing), also ran a test in his own lab and found his load times improved by 25%. Your mileage will obviously vary, but these initial numbers seem to be pretty consistent. And this is one of those no-brainer things to implement (without any downsides or side effects) simply by upgrading to 7.7, if you so choose.
If you are able to do some comparative tests like we’ve done (whether it’s this new feature or VHDX-related), I’d love to hear about your performance gains in the Comments section below. Hope this trick helps you in your travels.
Lead Architect, Citrix Consulting Services (CCS)