Today Dutch Meyer of UBC and Jake Wires of the Citrix XenServer storage team in Vancouver submitted our implementation of the Microsoft VHD virtual hard disk format to the Xen community for inclusion in the open source code base. So, if you want to write applications that read, write, and process VMs in VHDs, you now have everything you need. The software is licensed under the BSD license.

Why are we doing this? 

  • First, the various Xen implementations from the Linux vendors vary wildly in their support for virtual hard disk images and in the performance of their implementations. We have yet to see a good implementation of VHD from any Linux vendor. Cluttering users’ storage with raw image files, without the built-in snapshotting, cloning and similar capabilities that are fundamental primitives in any production virtualization environment, is just a bad idea.
  • Second, since the majority of VMs will be in the VHD format in future, we want to enable the ISV ecosystem to adopt the format and quickly deliver a rich set of add-on capabilities that allow users to be more productive in their virtual environments.   VHD is more than just a VM format used by Hyper-V – it’s a delivery format from Microsoft for future versions of Windows.  The format is documented publicly and the specification is available under the Microsoft Open Specification Promise program.   
  • Third, with Xen as the dominant hypervisor in use in the world’s largest clouds, we want to enable cloud operators to benefit from our optimized implementation of the VHD format to accelerate their progress towards hosting Windows in their clouds. Being an optimistic chap, and noting VMware’s sudden warming to open source, I presume there is a non-zero chance that they will pick up our VHD code, realizing that VMDK will at some point go the way of the dinosaurs.
  • Finally, the code also supports QCOW, which means it should be easy to adopt for Linux distros that have been living in a parallel universe without VHD support. Hopefully the QCOW team will implement VHD as another supported format within QCOW, which would be extremely powerful.
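Since the VHD specification is public, it is easy to get a feel for what a tool that processes VHDs has to do. What follows is only an illustrative sketch in Python, not the code submitted to Xen. It reads the 512-byte footer that the specification places at the end of every VHD file and checks the "conectix" cookie and the one's-complement checksum. The function names (`footer_checksum`, `parse_footer`, `read_footer`) are made up for this example, and only a few of the footer fields are returned.

```python
import os
import struct
import uuid

FOOTER_SIZE = 512
COOKIE = b"conectix"
# Big-endian footer layout as described in the public VHD specification:
# cookie, features, format version, data offset, timestamp, creator app,
# creator version, creator host OS, original size, current size,
# geometry (cylinders, heads, sectors), disk type, checksum, unique id,
# saved state, reserved padding.
FOOTER_FORMAT = ">8sIIQI4sI4sQQHBBII16sB427s"
CHECKSUM_OFFSET = 64
DISK_TYPES = {2: "fixed", 3: "dynamic", 4: "differencing"}


def footer_checksum(raw: bytes) -> int:
    """One's complement of the byte sum, with the checksum field zeroed."""
    zeroed = raw[:CHECKSUM_OFFSET] + b"\x00" * 4 + raw[CHECKSUM_OFFSET + 4:]
    return ~sum(zeroed) & 0xFFFFFFFF


def parse_footer(raw: bytes) -> dict:
    """Validate a 512-byte footer and return a few of its fields."""
    if len(raw) != FOOTER_SIZE:
        raise ValueError("a VHD footer is exactly 512 bytes")
    (cookie, _features, version, data_offset, _timestamp, creator_app,
     _creator_version, _creator_os, original_size, current_size,
     cylinders, heads, sectors, disk_type, checksum, unique_id,
     _saved_state, _reserved) = struct.unpack(FOOTER_FORMAT, raw)
    if cookie != COOKIE:
        raise ValueError("missing 'conectix' cookie: not a VHD footer")
    if checksum != footer_checksum(raw):
        raise ValueError("footer checksum mismatch")
    return {
        "version": version,
        "data_offset": data_offset,
        "creator_app": creator_app,
        "original_size": original_size,
        "current_size": current_size,
        "geometry": (cylinders, heads, sectors),
        "disk_type": DISK_TYPES.get(disk_type, "unknown"),
        "unique_id": uuid.UUID(bytes=unique_id),
    }


def read_footer(path: str) -> dict:
    """Read and parse the footer stored in the last 512 bytes of a file."""
    with open(path, "rb") as f:
        f.seek(-FOOTER_SIZE, os.SEEK_END)
        return parse_footer(f.read(FOOTER_SIZE))
```

Calling `read_footer("disk.vhd")` on an image (the file name here is just a placeholder) returns a dictionary whose `disk_type` entry tells you whether the image is fixed, dynamic or differencing, which is the first thing any tool needs to know before touching the data.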

The release notes follow.

These patches contain a completely rewritten blktap implementation and are an open source release of what Citrix intends to use in future releases of XenServer.
They also contain Citrix’s implementation of the VHD image format.

VHD is what XenServer uses to store file-based images, and this code is considerably more robust and efficient than the qcow implementation that is in the tree today. 
Benefits of blktap2 over the old version of blktap:
* Isolation from xenstore: blktap devices are now created directly on the Linux dom0 command line, rather than being spawned in response to XenStore events. This is handy for debugging, makes blktap generally easier to work with, and is a step toward a generic user-level block device implementation that is not Xen-specific.
* Improved tapdisk infrastructure: simpler request forwarding, new request scheduler, request merging, more efficient use of AIO.
* Improved tapdisk error handling and memory management. No allocations on the block data path, and IO retry logic to protect guests from transient block device failures. This has been tested and is known to work in weird environments such as NFS soft mounts.
* Pause and snapshot of live virtual disks (see xmsnap script).
* VHD support. The VHD code in this release has been rigorously tested, and represents a very mature implementation of the VHD image format.
* No more duplication of mechanism with blkback. The blktap kernel module has changed dramatically from the original blktap. Blkback is now always used to talk to Xen guests; blktap just presents a Linux gendisk that blkback can export. This is done while preserving the zero-copy data path from domU to physical device.
These patches deprecate the old blktap code, which can hopefully be removed from the tree completely at some point in the future.
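
For readers unfamiliar with the "request merging" mentioned in the tapdisk improvements above, here is a toy Python sketch of the general idea: block requests that touch adjacent sectors are coalesced so that fewer, larger IOs are issued. This is not how tapdisk implements it; the `Request` type and `merge_requests` function are hypothetical names invented for the illustration.

```python
from dataclasses import dataclass


@dataclass
class Request:
    sector: int  # first sector touched by the request
    count: int   # number of sectors


def merge_requests(requests):
    """Coalesce requests whose sector ranges are exactly adjacent."""
    merged = []
    for req in sorted(requests, key=lambda r: r.sector):
        last = merged[-1] if merged else None
        if last is not None and last.sector + last.count == req.sector:
            # req starts exactly where the previous one ends: extend it.
            last.count += req.count
        else:
            # Append a copy so the caller's objects are never modified.
            merged.append(Request(req.sector, req.count))
    return merged
```

Given three requests starting at sectors 8, 0 and 32, each 8 sectors long, the first two are contiguous and collapse into a single 16-sector request, while the third is left alone.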