When the Xen Project took flight and the community took up the challenge of developing the industry’s most scalable, secure, and high-performance Type 1 hypervisor, I got to see for myself the tremendous benefits of community-based development – complete freedom to innovate and to openly discuss any issue, from architecture to security, performance, and best coding practice. Since every line of code is publicly available, there’s no point pretending that it is beautiful if it is not, secure if it isn’t, or that it performs well if it doesn’t. Moreover, the community development style is dispassionate, and based on a simple philosophy: “best code rules”.

In line with our commitment to open development comes a commitment to open publication of research and other investigative results that use Xen. Early in the Xen project we found that our friends at VMware were watching the xen-devel mailing list and kindly alerting the press whenever a bug report crossed the wire. Their thesis: clearly, if the community was working to fix a bug, Xen could not possibly be enterprise class. The community’s response to this sort of attack was to highlight the benefits of having a large number of developers and many different sets of eyes on the code, because bugs get found quicker and the code base moves forward faster. We reeled in VMware’s five-year market advantage in under two years. We also noticed that our contributors took much greater care to develop high-quality, extensively tested code before submitting it. And of course, Xen continues to move forward fast and to benefit from research at universities around the world.

In the spirit of openness then, I want to be the first to publish an interesting (embarrassing?) performance comparison between hypervisors done by Bernie Hannon’s team at Citrix that shows one area in which Citrix XenServer has room to improve… relative to Hyper-V R2… for Windows 7 virtual desktops. Of course, acknowledging this is tantamount to a commitment to fixing it, so you should also view this as a statement of intent. The XenServer roadmap includes a powerful set of optimizations for desktop virtualization, and our rapidly growing VDI-focused team has them in hand for “Midnight Ride” and its successor, “Cowley”.

Before getting to the results, here is the “caveat emptor”:

  • These results have not yet been replicated on a broad set of hardware, and
  • They are not intended to be conclusive or definitive – you can’t argue that they represent “truth” for real-world use cases, or even generalize them to mean anything more than that for this particular test, these were the results. Moreover, Microsoft has now announced SP1 for Hyper-V, and XenServer “Midnight Ride” is in public beta, so the results are of transient interest only. We will test the latest & greatest from all vendors and publish the results soon.
  • Finally, since these results come from Citrix, and not an independent benchmarking team with no potential conflict of interest, you might want to simply disregard them until such time as those independent benchmarks are available. Indeed you ought to treat any vendor self-publicized benchmarks with a degree of caution.


The goals of the test were to:

  • Compare single-server VM density for XenDesktop 4 using several hypervisors, including Hyper-V R2, XenServer 5.5, and others. In every case, all vendor-recommended optimizations for maximum performance and guest density were enabled. In our results we present XenServer 5.5, Hyper-V R2, and the best “other” hypervisor result.
  • Determine the maximum density of useful desktop VMs per host and per CPU core

The results are XenDesktop 4 specific. Since XenDesktop (uniquely) is hypervisor independent, it is possible to make this evaluation; clearly, the VM density using VMware View on XenServer or Hyper-V would be a big round zero, so if you’re after View performance, you need to look elsewhere.


  • We used two types of desktop VMs: Windows XP (1 vCPU/512 MB) and Windows 7 (1 vCPU/1.0 GB). Our test configuration is (we believe) identical to that of Project VRC, with Login VSI 2.1 as the workload generator.
  • Host server: Dell PowerEdge R710, 2-socket quad-core Xeon X5570 at 2.93 GHz, 72 GB RAM (the maximum with 4 GB DDR3 DIMMs), Hyperthreading enabled (16 logical CPUs).
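Before looking at the measured results, it helps to have a back-of-envelope ceiling in mind for this host. The sketch below computes a purely memory-bound upper limit on VM density from the host and guest specs above; the hypervisor overhead figure is my own illustrative assumption, and the benchmark of course measures the real number of *useful* desktops, which is lower than any such ceiling.

```python
# Illustrative memory-bound density ceilings for the test host
# (Dell R710: 2 x quad-core X5570, 72 GB RAM).
# These are theoretical upper bounds only; the benchmark measures
# the actual number of useful, responsive desktops.

HOST_RAM_GB = 72
PHYSICAL_CORES = 8           # 2 sockets x 4 cores
HYPERVISOR_OVERHEAD_GB = 4   # assumed reserve for hypervisor/management

def memory_bound_density(vm_ram_gb):
    """Max VMs per host if memory were the only constraint."""
    return int((HOST_RAM_GB - HYPERVISOR_OVERHEAD_GB) // vm_ram_gb)

win7 = memory_bound_density(1.0)    # Windows 7: 1 vCPU / 1.0 GB
winxp = memory_bound_density(0.5)   # Windows XP: 1 vCPU / 512 MB

print(f"Win7 ceiling: {win7} VMs/host ({win7 / PHYSICAL_CORES:.1f} per core)")
print(f"WinXP ceiling: {winxp} VMs/host ({winxp / PHYSICAL_CORES:.1f} per core)")
```

With these assumptions the Windows 7 ceiling works out to 68 VMs per host (8.5 per core) and the XP ceiling to 136 (17 per core), which makes clear why CPU and I/O, not just RAM, end up governing the measured numbers.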

The test setup looks like this:


Well, it turns out that of all the hypervisors we have tested, Hyper-V R2 does best for Windows 7 VM density on this benchmark. Congratulations to the Hyper-V team for a fine job on this workload. XenServer 5.5 peaks at about 10% below Hyper-V, and all other hypervisors perform less well than XenServer. The graph also shows the best-performing “other” hypervisor, which underperforms both XenServer 5.5 and Hyper-V R2. We then plugged these numbers into the XenDesktop 4 ROI tool and found that Hyper-V would offer the customer a savings of about $225,000 over 3 years for a 5,000-seat XenDesktop deployment, compared to the best other competitive hypervisor (XenServer and Hyper-V do not compete here: both are free).
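To put that ROI-tool figure in per-seat terms, here is the simple arithmetic on the numbers quoted above (the ROI tool itself models far more than this division):

```python
# Per-seat view of the 3-year savings figure quoted above.
savings_total = 225_000   # 3-year savings from the XenDesktop 4 ROI tool
seats = 5_000
years = 3

per_seat = savings_total / seats        # savings per seat over 3 years
per_seat_per_year = per_seat / years    # savings per seat per year

print(f"${per_seat:.0f}/seat over {years} years, "
      f"${per_seat_per_year:.0f}/seat/year")
```

That is $45 per seat over three years, or $15 per seat per year: modest per desktop, but material at deployment scale.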

For Windows XP guests, Hyper-V R2 doesn’t do such a fabulous job. I’ve spoken to Jeff Woolsey, PM for Hyper-V, who acknowledges this readily: XP has a relatively short remaining lifetime, and Microsoft’s focus is on Windows Server workloads and on Windows 7 as the new client OS. Perhaps as importantly, the current maximum supported number of VMs per core on Hyper-V R2 is 8, whereas for XenServer in this configuration Citrix supports 16 VMs per core. Here, XenServer beats all other hypervisors in terms of useful desktop VM density. The best of the unnamed hypervisors is second, followed by Hyper-V. Again using the ROI calculator, we see that XenServer offers a savings of about $160,000 over 3 years for a 5,000-seat deployment, compared to the best competitive hypervisor.
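The support caps mentioned above matter on their own, regardless of what the hardware could sustain. A quick sketch of how those per-core limits bound density on this 8-core host:

```python
# How the vendors' *supported* VMs-per-core limits cap density on the
# 8-core R710 test host, independent of measured performance.
PHYSICAL_CORES = 8  # 2 sockets x 4 cores

supported_vms_per_core = {"Hyper-V R2": 8, "XenServer 5.5": 16}

caps = {name: per_core * PHYSICAL_CORES
        for name, per_core in supported_vms_per_core.items()}

for name, cap in caps.items():
    print(f"{name}: supported ceiling of {cap} VMs on this host")
```

So even before benchmarking, Hyper-V R2’s support statement caps this host at 64 XP desktops, while XenServer’s caps it at 128.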

Finally, Project VRC has recently published results comparing Hyper-V R2, vSphere 4, and XenServer 5.5 for virtualized Windows Terminal Server workloads. They found that Hyper-V was a whisker ahead of XenServer, and that both were substantially ahead of vSphere 4. Again, kudos to Microsoft. The results clearly stung the VMware folk, who rushed out a patch that they claim fixes their TS performance issues; they have yet to disclose whether it affects their VDI performance, and we have yet to test it.

The Bottom Line

  • First, it is clear that the industry’s two leading hypervisors for these benchmarks, for both TS and VDI, are Hyper-V R2 and XenServer 5.5. We should expect further gains with Hyper-V R2 SP1 and XenServer “Midnight Ride”.
  • Second, we need Project VRC to independently test the latest and greatest version of each product and to give the industry clear guidance.
  • Third, we in the XenServer camp will redouble our efforts to outperform all hypervisors.
  • Fourth, there is a key point that you may have missed: each hypervisor we tested was optimally tuned using the vendor’s guidelines. This means that none of the vendor-specific features purported to optimize VM density per server delivered any benefit for this particular benchmark. While both Hyper-V R2 SP1 and XenServer “Midnight Ride” support dynamic memory optimizations, this set of benchmarks measures the maximum number of useful, active desktops that the platform can support. Memory over-commit is of no use here, and in general it must be managed with great care to avoid poor performance. Independent of vendor, treat all claims about memory optimization with a pinch of salt. It is easy for an airline to over-sell the seats on an aircraft – but what matters is how many passengers actually get to fly.

Finally, it is crucial to realize that VM density is but one dimension of the complete TCO equation for VDI-based desktop virtualization, and but one parameter in the choice of “best” overall performance. Scalability (including of management), hypervisor cost, choice of storage architecture, virtual infrastructure platform scalability and manageability, and the end-to-end lifecycle cost of user desktops – including the ability to leverage existing management tools and skill sets – all play a key role in establishing TCO. The industry’s benchmarks need to evolve to test a broader set of parameters, and you should test different vendor solutions and arrive at your own conclusions. Insist that your vendor provide independent third-party validation of their claims of superior performance or density.