This is the second blog in a two-part series responding to the remaining unanswered questions from my Citrix, Microsoft, HP: Best Practices for Scaling Virtual Desktops webinar on June 17, 2010. I covered the first half of the questions in XenDesktop Scalability Q&A Part 1. Questions with similar content have been combined into a single question, and others have been lightly edited to correct typos.

Q: Can I get a podcast of this audio?
A: Yes. You can replay the webinar by clicking here, though registration may be required.

Q: This lab infrastructure looks quite impressive but what would be the price tag on it? What would the initial investment be?
A: Unfortunately, I do not have an answer for this question. I did ask that question while I was at HP, and they told me that I would have to contact my HP sales rep to get a quote. I will say that HP, Microsoft, and Citrix have recently teamed up to help you jumpstart your VDI implementation, including discounted licenses and implementation services. You can check out the Activate and Ignite offers by clicking here. Also, HP has released a whitepaper on building a single-rack VDI implementation that supports up to 1,000 users. You can learn more about this configuration by downloading the Whitepaper – HP Reference Configuration for Citrix XenDesktop directly from HP’s website.

Q: What are the advantages of Citrix Provisioning Server vs. other provisioning solutions?
A: I cannot speak for other provisioning solutions, so I will just share the advantages of Citrix Provisioning Services (PVS). The primary advantage of Provisioning Services is the ability to save storage costs. With PVS, you create a single desktop image that is then streamed over the network to guests that boot via PXE. The guests can be diskless, though for best performance we usually recommend a 2-3 GB local disk to function as a write cache. The storage savings come from only needing a write-cache drive, which is a fraction of the size of a full VHD. For instance, a normal virtual desktop deployment of 100 desktops, each with a 20GB Windows 7 image, would require about 2000GB of costly SAN storage. With PVS, the storage can be reduced to a single 20GB Windows 7 image plus 100 write-cache disks of 2-3GB each, or about 320GB, which could be local or SAN storage.
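
To make that arithmetic concrete, here is a minimal sketch of the storage comparison in Python, using the figures from the example above:

```python
# Storage comparison from the example above: 100 desktops, a 20GB
# Windows 7 image, and a 3GB write-cache disk per desktop.
desktops = 100
image_gb = 20        # size of the shared Windows 7 vDisk image
write_cache_gb = 3   # per-desktop write cache (2-3GB recommended)

full_clones_gb = desktops * image_gb             # 100 x 20 = 2000GB
pvs_gb = image_gb + desktops * write_cache_gb    # 20 + 300 = 320GB

savings = 1 - pvs_gb / full_clones_gb
print(f"Full clones: {full_clones_gb}GB, PVS: {pvs_gb}GB "
      f"({savings:.0%} less storage)")
```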

The second advantage I see is the ability to upgrade or revert with only a simple reboot of the guest. Since a single image can be used across multiple virtual desktops, updates need only be performed in one location, and virtual desktops automatically pick up the most current image upon reboot. If for some reason that image fails to perform, the administrator can easily revert to a previously working image and reboot the guest. Factor in the reduced overhead on the host’s CPU and disk subsystems (the disk reads now come over the network), and the solution improves guest performance significantly.

Q: What metrics do you monitor to identify bottlenecks within Provisioning Server? Are the metrics the same between physical and virtual Provisioning Server?
A: Really good question. Normally, you monitor the same metrics you would for any other server, since the bottleneck could be at the CPU, RAM, disk, or network level. That said, you will find that the network is by far the most utilized resource, so watching the Bytes Sent/sec performance counter is where I start. If your write cache is on the server side (instead of the client side), then you will also need to watch Bytes Received/sec, Processor % Idle Time, and Current Disk Queue Length. From the test results I have seen, I would say the second most utilized resource is the CPU, so watch that one if you do not have a beefy box.

To give you an idea, for the 3500-desktop run we had three BL460c blades, each with 48GB RAM, dual Nehalem processors, and 4 Gb/s of network bandwidth. Network utilization reached 3 Gb/s when booting up all 3500 desktops. The processors were at 50% idle. I had only one vDisk image, so the current disk queue length averaged slightly over 0 and the server reported 45GB of available RAM.
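
If you want to capture those counters over time rather than watching Performance Monitor, here is a rough sketch that drives the built-in Windows typeperf utility from Python; the counter instances, sample interval, and sample count are assumptions to adjust for your environment:

```python
# Sample the PVS counters discussed above every 5 seconds, 60 times,
# and write the results to a CSV file for later review.
import subprocess

counters = [
    r"\Network Interface(*)\Bytes Sent/sec",
    r"\Network Interface(*)\Bytes Received/sec",   # server-side write cache
    r"\Processor(_Total)\% Idle Time",
    r"\PhysicalDisk(_Total)\Current Disk Queue Length",
]

subprocess.run(
    ["typeperf", *counters, "-si", "5", "-sc", "60", "-f", "CSV",
     "-o", "pvs_counters.csv"],
    check=True,
)
```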

Q: What metrics do you monitor to identify bottlenecks within the Desktop Delivery Controller?
A: As stated earlier, bottlenecks can appear in any resource, so I watch key indicators across the board. That said, the resource most likely to be the bottleneck on the Desktop Delivery Controller is the CPU. Watch the Processor % Idle Time and the % Processor Time of the Citrix services, such as CdsPoolMgr on the Pool Master and both CDSController and IMASrv on all the member servers. Also watch Context Switches/sec and Processor Queue Length for general processor health.
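
The same approach works for the controller-side counters; the Citrix process instance names below are assumptions, so verify the exact executable names on your Pool Master and member servers:

```python
# Sample DDC health counters with typeperf; the process instance names
# (CdsPoolMgr, CdsController, ImaSrv) may differ on your build.
import subprocess

counters = [
    r"\Processor(_Total)\% Idle Time",
    r"\Process(CdsPoolMgr)\% Processor Time",      # Pool Master only
    r"\Process(CdsController)\% Processor Time",
    r"\Process(ImaSrv)\% Processor Time",
    r"\System\Context Switches/sec",
    r"\System\Processor Queue Length",
]

subprocess.run(
    ["typeperf", *counters, "-si", "5", "-sc", "60", "-f", "CSV",
     "-o", "ddc_counters.csv"],
    check=True,
)
```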

Q: Can you provide the formulas used in the IOPS calculator spreadsheet?
A: No, but I am planning on releasing a copy of the spreadsheet in Q3 so watch my website for updates.

Q: What is the expected completion date for the whitepaper?
A: Currently the draft is circulating for review. Provided the reviews are completed in a timely manner and the document doesn’t have any major flaws, the plan is to release it by the end of Q3.

Q: When will offline virtual desktops be available?
A: Technically speaking, you can have offline virtual desktops today with XenClient, which is free. You can download it here.

Q: What method should I use for sizing XenDesktop?
A: In a general sense, Citrix and other vendors provide sizing guidelines, and some vendors, such as HP, even have sizing tools available for download to help you find the best hardware fit for your needs. That said, I believe the best method is to run a pilot with your power users. If you are serious about VDI, build a small pilot environment. The results of the pilot will give you an idea of the performance expectations for your virtual desktops as well as a taste of how VDI will fit into your environment. Then correlate your performance data with published VDI data on the same hardware, and from there you can extrapolate what hardware you will need.

For instance, say you build Windows 7 desktops on a BL460c with 64GB RAM and Nehalem processors and choose Hyper-V as your hypervisor. At the end of the pilot you may determine that the optimal number of users is 50. If published results show that the optimal number of users on the same BL460c configuration running a medium LoginVSI workload is 75, then you know your workload requires about 50% more capacity per user than the medium LoginVSI workload. At this point, you can take the LoginVSI medium workload results on any hardware or hypervisor and scale them down by that factor (divide by 1.5) to estimate your number on that same configuration.
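
Here is a minimal sketch of that extrapolation; the pilot and published densities are the example figures above, and the published result of 90 users at the end is a hypothetical number for illustration:

```python
# Scale published LoginVSI medium results to your measured workload.
pilot_users = 50       # optimal users/server measured in your pilot
published_users = 75   # LoginVSI medium result on the same configuration

# Each of your users is ~1.5x heavier than a LoginVSI medium user.
workload_factor = published_users / pilot_users

def estimate_density(published_density: float) -> float:
    """Adjust a published LoginVSI medium density to your workload."""
    return published_density / workload_factor

# e.g., a hypothetical published result of 90 users on other hardware
print(f"Estimated users/server: {estimate_density(90):.0f}")   # ~60
```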

Q: Without seeing lab results yet, what is the largest number of virtualized desktops I can expect to run concurrently? What is the real-world experience vs. the lab?
A: Since this question does not specify the bounds of the concurrency, I will presume it refers to concurrent desktops on a single server. Essentially, the number of concurrent desktops you can run on a single server depends primarily on your user workload. Generally speaking, the LoginVSI medium workload we use in the lab is purely for comparative purposes and is unlikely to represent your actual user workload. What LoginVSI does do is give you an idea of what the hardware is capable of supporting in a specific configuration. I have seen the number of desktops for a single server range from 30 to over 120, with the limiting factor usually being available RAM or processor capacity.
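
As a back-of-the-envelope check on those two limiting factors, here is a minimal sketch; every figure in it is an assumption for illustration, not a recommendation:

```python
# Estimate single-server density from the RAM and CPU limits above.
host_ram_gb = 48
hypervisor_overhead_gb = 4   # reserved for the hypervisor/parent partition
vm_ram_gb = 1.5              # per Windows 7 desktop

physical_cores = 8
desktops_per_core = 8        # derive this ratio from your own pilot

ram_limit = int((host_ram_gb - hypervisor_overhead_gb) / vm_ram_gb)  # 29
cpu_limit = physical_cores * desktops_per_core                       # 64

# The smaller of the two limits is your expected density.
print(f"Estimated density: {min(ram_limit, cpu_limit)} desktops/server")
```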

Q: Would one expect to see the same results regardless of the hypervisor being used, or will that make a difference? Would we expect to see similar performance/numbers if we used vSphere 4 or XenServer Enterprise/Platinum instead of Hyper-V?
A: I believe I answered this one during the webinar, but I will reiterate my position. I believe hypervisors are becoming more of a commodity with respect to performance and scalability, with the key differentiators being features such as migration, snapshots, etc. Each hypervisor has its areas of focus, and some may perform better than others in certain configurations. Overall, I would venture to say that I expect similar scalability numbers (within about a 10% tolerance) on the same physical hardware when comparing hypervisors.

Q: In Active Directory, the more global groups an ID is associated with, the more the ID’s token bloats. Today there is a limit on the size of the token, which also limits access to the server farm, and Microsoft doesn’t have any way of correcting this issue. Do you know of a way to get around it?
A: Unfortunately, no. Microsoft Kerberos tokens have a maximum limit of 1024 SIDs (less in some circumstances, depending on the length of your AD domain’s FQDN), and when that limit is exceeded, the user cannot log on because all group memberships cannot be evaluated. The only solution at this point is to reduce the number of group memberships. Microsoft is aware of this issue and has provided guidance and best practices to help limit the exposure. You can download their guide from the Microsoft website, but I cannot provide a better solution at this time.
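
If you want to audit which users are approaching the limit, here is a rough sketch using the third-party ldap3 Python library; the domain controller, credentials, and user DN are placeholders, and note that tokenGroups is a constructed attribute that must be read with a base-scoped search on the user object itself:

```python
# Count the group SIDs a user's token will carry via the tokenGroups
# attribute. The server, account, and DN below are hypothetical.
from ldap3 import Server, Connection, NTLM, BASE

server = Server("dc01.example.com")
conn = Connection(server, user="EXAMPLE\\audituser", password="secret",
                  authentication=NTLM, auto_bind=True)

user_dn = "CN=Jane Doe,OU=Staff,DC=example,DC=com"
conn.search(user_dn, "(objectClass=*)", search_scope=BASE,
            attributes=["tokenGroups"])

sid_count = len(conn.entries[0]["tokenGroups"].values)
print(f"{user_dn} resolves to {sid_count} group SIDs")  # trouble near 1024
```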

That concludes the unanswered questions from my webinar. I hope you find the answers helpful. If I misunderstood your question, feel free to clarify it as a comment below.

If you found this information useful and would like to be notified of future blog posts, please follow me on Twitter @pwilson98 or visit my XenDesktop on Microsoft website.