By now you’ve probably heard about XenServer 7 and its cool new features. However, this release is more than just its feature list – we’ve also been investing in the foundations to make XenServer 7 a platform on which we can deliver even more innovation, features and enhancements over time.
Let’s take a look at a few of the platform and architectural changes we’ve made under the covers of XenServer 7.
But before we start, let's take a look at XenServer's architecture for a bit of orientation. The Xen Project hypervisor is a "type 1" hypervisor that runs directly on the hardware. On top of this we run a privileged Linux virtual machine, called "domain 0" and based on CentOS, that runs the management toolstack and API, contains the drivers for network and storage I/O, and hosts a whole range of other supporting services.
So what’s changed?
Domain 0 Linux version
One significant change in XenServer 7 is the upgrade of the domain 0 environment from CentOS 5.10 to CentOS 7.2.1511. In itself, the domain 0 Linux environment does not deliver any customer-visible features; however, because it hosts most of the on-server components of XenServer, it is hugely important to the operation of those components and to our ability to add new features that depend upon them. By moving to an up-to-date, but stable and enterprise-grade, Linux platform we are well placed to support XenServer 7 well into the future, benefiting from bug fixes and improvements coming from the CentOS project and its upstream providers and open-source communities.
The updated Linux platform also creates opportunities for future features and enhancements that would have been difficult or impossible on the older CentOS 5 platform. Coupled with XenServer's existing Linux 3.10 kernel, the new platform gives us more options to leverage Linux filesystems, such as ext4, as the foundation for future XenServer storage features.
You may be wondering why the move to CentOS 7 didn't automatically give XenServer 7 various features and mechanisms that CentOS 7 happens to have, such as SELinux or the XFS filesystem. This is because the domain 0 Linux platform is a component of the XenServer system (rather than XenServer being an application that runs on top of Linux) and is therefore tightly integrated into the overall system. When we upgrade components of the system, our first priority is to ensure a like-for-like upgrade which preserves the integrity, functionality, quality, performance and security of the system. If the new component comes with new mechanisms that didn't exist in the older version, we carefully consider if and how we can integrate those mechanisms into the XenServer system; for example, moving from ext3 to XFS would require careful consideration of XenServer upgrade and rollback use-cases and therefore isn't a transparent change. These additional integrations may come in later releases than the underlying component upgrade.
With the release of XenServer 7.0, we mostly use CentOS 7 as a like-for-like replacement for CentOS 5, with new mechanisms in the newer version being candidates for integration and exploitation in subsequent XenServer releases. As with many of the foundational changes we'll talk about in this blog, I'd expect to see more of the value of CentOS 7 surface in customer-visible features and enhancements over the next few XenServer releases.
Disk partition layout
Previous versions of XenServer used a disk partitioning layout consisting of a 4GB primary partition (which hosts the domain 0 Linux environment and the XenServer stack that sits inside it), a 4GB backup partition to enable roll-back after an upgrade, and the remainder of the disk used as a storage repository for VM disk images. Pre-installed OEM systems may have an additional OEM partition as well.
Although 4GB is more than enough for the XenServer software, the same partition is also used for third-party add-on "supplemental packs", as well as for all the host logs and temporary staging for hotfix files. Even with log rotation and compression there have been cases of the filesystem filling up due to the volume of log files. Before XenServer 7 we mitigated this with a mechanism to cap the total volume of log files (in addition to the per-file rotation) and by placing log files on a separate volume in space borrowed from the local storage repository. In XenServer 7 we decided it was time to move to larger partitions and make the log partition part of the default configuration. This means that we now have:
- 18GB XenServer host control domain (dom0) partition
- 18GB backup partition
- 4GB logs partition
- 1GB swap partition (just in case – we try to avoid using swap)
- 5GB UEFI boot partition
However, the traditional 4GB layout can still be used in cases where the primary disk is smaller than the required 46GB. When older XenServer hosts are upgraded to XenServer 7, the host will be repartitioned if the local storage repository is empty (as will often be the case if shared storage is used). See the installation guide for more details.
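As a rough sketch, the installer's size decision can be thought of as follows. The partition sizes come from the list above; the function name and fallback logic are illustrative simplifications, not the installer's actual code.

```python
GIB = 1024 ** 3

# Default XenServer 7 partition layout, per the list above.
NEW_LAYOUT = {
    "dom0":   18 * GIB,   # control domain
    "backup": 18 * GIB,   # roll-back after upgrade
    "logs":    4 * GIB,
    "swap":    1 * GIB,
    "boot":    5 * GIB,   # UEFI boot
}

def choose_layout(disk_bytes):
    """Fall back to the traditional 4GB layout when the primary disk is
    smaller than the space the new layout needs (46GB in total)."""
    return "new" if disk_bytes >= sum(NEW_LAYOUT.values()) else "legacy"
```

Summing the layout gives the 46GB minimum mentioned above; anything smaller keeps the old scheme.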
Xen Project hypervisor
XenServer 7 saw us upgrade from Xen 4.4 (in XenServer 6.5) to Xen 4.6, also incorporating the new content from Xen 4.5. This brought a number of useful mechanisms, fixes and enhancements, including the virtual memory event subsystem and support for introspection using Intel EPT and AMD RVI, which form part of XenServer's Direct Inspect APIs allowing entirely agentless anti-malware VM introspection. The upgrade also provided a new framework for managing which CPU features are exposed to VMs, a prerequisite for enabling VMs to take advantage of the CPU's advanced instructions, such as AVX2, while enabling down-levelling of feature sets to permit VMs to be moved between different CPU generations.
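The core idea behind down-levelling can be sketched very simply: expose to VMs only the intersection of every pool member's feature bits, so a VM never depends on an instruction some host lacks. The feature names below are made up for illustration; the real masks are CPUID leaf bitmaps managed by Xen.

```python
# Hypothetical feature bits; real masks are CPUID leaf bitmaps managed by Xen.
AVX2, SSE42, RDRAND = 1 << 0, 1 << 1, 1 << 2

host_features = [
    AVX2 | SSE42 | RDRAND,   # newer host CPU
    SSE42 | RDRAND,          # older host CPU without AVX2
]

def pool_feature_mask(hosts):
    """Down-level to the intersection of every host's features, so a VM
    started anywhere in the pool can migrate to any other host."""
    mask = hosts[0]
    for features in hosts[1:]:
        mask &= features     # drop anything some host doesn't support
    return mask
```

In this sketch the pool mask loses AVX2 because the older host lacks it, which is exactly the trade-off down-levelling makes in exchange for migration freedom.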
As with the domain 0 CentOS 7 upgrade, there are a number of new mechanisms in the upgraded Xen hypervisor that are not used by XenServer today but provide foundations and opportunities for future features. Some areas of particular interest are enhancements to Xen Security Modules (XSM), a mandatory access control framework similar in style to SELinux, albeit at the hypervisor layer rather than the OS layer; support for vTPM 2.0, potentially allowing a chain of trust from hardware TXT/TPM boot right through to trusted boot of virtual machines; and the PVH virtualization mode, a step towards unifying Xen's two modes, PV and HVM, to reduce complexity and gain the benefits of both modes at the same time.
Xen 4.5 and 4.6 also brought a number of performance enhancements – keep an eye on http://xenserver.org/ for a series of blog posts from XenServer performance lead Jonathan Davies which will dig into a number of performance and scalability improvements in XenServer 7.
Active Directory integration
Since Active Directory support was introduced in XenServer 5.5 we've used the Likewise tools to interface with AD servers. In XenServer 7 we upgraded to the more recent PowerBroker Identity Services (PBIS) packages, which offer better support for complex group structures and better credential caching, along with a number of other compatibility, performance, scalability and stability enhancements.
Use of the IOMMU
XenServer's physical I/O device drivers run in the domain 0 kernel and often need to allocate memory to be used in DMA transactions to hardware devices such as RAID controllers or network interfaces. Having a hypervisor adds a layer of indirection in memory mapping (compared to a bare-metal operating system), and therefore domain 0's view of memory is different from the physical device's view of memory. For single memory pages this is easily handled with a simple translation; however, for DMAs of more than a page in size the physical memory region may not be contiguous, and therefore a single translation of the starting address is not possible. The simplest solution is to pass all DMA transactions through memory obtained from a pool with a 1-to-1 mapping to host physical memory. This pool is known as the software I/O translation lookaside buffer (SWIOTLB), or bounce buffer. However, this adds a memory copy to every DMA, increasing latency and lowering performance, and it risks DMA failures if the SWIOTLB becomes exhausted or fragmented.
XenServer 7 avoids these deficiencies by making use, where possible, of the hardware's IOMMU, a device that allows physical devices to access memory using virtual addresses rather than physical addresses, and therefore using the same memory mapping as domain 0. This means that even if the physical memory for the DMA is non-contiguous, the device can DMA to and from it as if it were contiguous, using exactly the address it was given by the domain 0 device driver. This avoids the SWIOTLB memory copy and the pool exhaustion problem, leading to better performance and reliability.
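The two data paths described above can be contrasted in a toy model. Everything here is a deliberate simplification for illustration: "pages" are byte strings, the "IOMMU" is a dictionary of translation entries, and none of this reflects real driver code.

```python
PAGE = 4096  # page size in bytes

def dma_with_swiotlb(scattered_pages, bounce_pool):
    """Bounce-buffer path: copy non-contiguous pages into one contiguous
    region so the device can DMA from a single physical range. This costs
    a memory copy per transfer and can fail if the pool runs out."""
    if len(scattered_pages) * PAGE > len(bounce_pool):
        raise MemoryError("SWIOTLB exhausted")   # the failure mode noted above
    contiguous = b"".join(scattered_pages)       # the extra memory copy
    bounce_pool[:len(contiguous)] = contiguous
    return bounce_pool[:len(contiguous)]

def dma_with_iommu(scattered_pages, iommu_map):
    """IOMMU path: program one translation entry per page so the device
    sees the scattered pages as one contiguous virtual range; no copy."""
    io_virt_base = 0x100000                      # arbitrary device-visible base
    for i, page in enumerate(scattered_pages):
        iommu_map[io_virt_base + i * PAGE] = page
    # The device now reads the original pages in place via the mappings.
    return [iommu_map[io_virt_base + i * PAGE] for i in range(len(scattered_pages))]
```

The SWIOTLB path touches every byte twice (once for the copy, once for the DMA) and has a hard capacity limit, while the IOMMU path only writes translation entries, which is why the hardware approach wins on both performance and reliability.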
We’ve been busy making changes to both our storage and network I/O data paths to increase performance. See Jonathan Davies’ performance blog series for more on this.
Toolstack and API
The XenServer toolstack and API are largely implemented by the XAPI daemon: XAPI is the API endpoint, and almost all API implementations are handled directly by it. XenServer 7 introduces an internal API extension framework that enables new API calls to be added with their implementations handled by separate executables within the XenServer domain 0 environment. This is designed to enable a more modular, extensible toolstack where new functionality can be added without rebuilding XAPI itself.
Other notable changes under the covers include:
- The Open vSwitch (OVS) was upgraded to version 2.3.2, a stable and proven release
- Use of Linux cgroups to manage resources within domain 0 to provide better responsiveness under load
- Lots of internal refactoring of the XAPI toolstack to make it easier to add new capabilities
In conclusion, there is a lot more cool stuff under the covers of XenServer 7 than you might think. This is one of the biggest and best XenServer releases we've ever made, and it will form the foundation for a stream of new features, innovations and enhancements over many releases. Stay tuned for more…