Storage Optimized VDI

By Andrew Dent

Challenged on the Storage Front

A recent Forrester report claimed that enterprise data has grown 47% over the last two years. Although much of this data is unstructured and not performance sensitive, a large share of it is stored on expensive tier-one storage systems, and the cost of provisioning such storage is excessive. Additionally, the long life span of storage arrays, combined with poor data-growth forecasting, means customers typically purchase much more storage than they need. The result is a need in the market for storage solutions that are optimized for the workload they host, and that are easy to deploy so overprovisioning can be avoided.

Such issues are particularly pronounced in VDI. When a customer comes to consider VDI, they often become entangled in complicated storage discussions, due to the storage-intensive nature of VDI. It is important when embarking on a VDI discussion that the customer, and specifically the storage architects, are aware of the various storage options available for the implementation. Proper consideration here may have dramatic implications for storage costs, and can affect the outcome of the opportunity.

In this article we conduct a qualitative exploration of the storage options on the market today. It should help you have more enlightened storage discussions with customers, and potentially lead to more optimized storage designs.

When choosing VDI storage, there are two primary choices: local direct-attached storage (DAS) or shared storage.

DAS

The least expensive and easiest VDI storage option to configure is direct-attached storage (DAS). The main advantage of DAS is that the hypervisor can communicate directly with the storage, so network bandwidth limitations and latency do not constrain storage communications. Another advantage is that with direct-attached storage, one host's activity cannot affect another host's disk I/O. In a shared storage environment, all of the host servers must share disk resources; if one host happens to carry an unusually heavy workload, its tasks can rob other hosts of disk I/O. This is not a problem when each host has its own storage.

Despite the benefits of direct-attached storage for VDI, it has a significant resilience drawback: DAS offers no provision for failover. If the host server fails, any storage devices attached to that host become inaccessible. Many variants of DAS are available, including hard-disk drives (HDDs), SSDs and add-in flash modules that appear to the OS as a hard drive. DAS is best suited to non-persistent environments.

Shared storage

The preferred method for providing storage to virtual desktops is shared storage. In this architecture, each virtualization host connects to a centralized storage pool where the individual virtual hard disk files reside. Because all hosts are connected to a centralized storage pool, the infrastructure is protected against host server failure. If a host fails, its workload can be moved to a different host within the cluster.

The most common form of shared storage today is the SAN. Many VDI pilots, and even medium-sized installations, are deployed on excess general-purpose SAN capacity. This is problematic because VDI workloads have a very different performance profile from most server workloads. VDI storage challenges are, at their root, a by-product of the Microsoft Windows operating system: the OS was designed to leverage a dedicated, low-latency local hard drive able to handle heavy disk-write I/O. When this model is deployed concurrently on a traditional SAN, performance bottlenecks emerge, and there is no effective way to guarantee QoS for the initial deployment, nor any way to scale it. These bottlenecks have traditionally been managed by over-allocating resources, using large numbers of spinning disks to service the I/O. But such over-allocation is CAPEX heavy, increasing the time to ROI.
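
To see why the spinning-disk approach is CAPEX heavy, consider a rough back-of-the-envelope sizing sketch in Python. Every figure below is an illustrative assumption (per-desktop IOPS, read/write mix, RAID penalty), not vendor data:

    import math

    # Back-of-the-envelope spindle count for a hypothetical 500-seat deployment.
    desktops = 500
    iops_per_desktop = 15        # assumed steady-state average; boot storms are far higher
    write_ratio = 0.8            # VDI traffic is typically write-heavy
    raid10_write_penalty = 2     # each logical write costs two disk writes in RAID 10
    iops_per_15k_disk = 180      # rough figure for a single 15K RPM spindle

    backend_iops = desktops * iops_per_desktop * (
        (1 - write_ratio) + write_ratio * raid10_write_penalty)
    print(math.ceil(backend_iops / iops_per_15k_disk), "spindles needed")   # 75

Seventy-five spindles purchased purely to absorb I/O, regardless of how much capacity is actually needed, is exactly the over-allocation the paragraph above describes.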

Storage architects need to think beyond this “over-provisioned SAN” approach when considering VDI. Selecting the right storage medium for a VDI deployment can radically alter the time to ROI, and a growing number of options on the market enable organizations to achieve faster time-to-ROI for VDI deployments.

I/O Acceleration Software

I/O acceleration software is designed to slide into any new or existing storage deployment and reduce the load on the back-end storage array. This type of solution can be very helpful when an organization has already made a large investment in a specific storage array that was not originally intended for VDI. By offloading the higher I/O requirements to an I/O acceleration software solution, organizations can further leverage their current storage and keep costs under control.

An example of this kind of I/O optimization is improving the way virtual desktops use storage. Virtual desktop images duplicate a great deal of information, so there is an opportunity to consolidate this duplicate data in real time, before it is written. This reduces both I/O and the storage capacity consumed.
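
To make the idea concrete, here is a minimal, illustrative sketch of inline block deduplication in Python. The class and figures are hypothetical, intended only to show the mechanism that products in this space implement far more efficiently:

    import hashlib

    class InlineDedupStore:
        """Toy content-addressed block store: identical blocks are written once."""

        def __init__(self, block_size=4096):
            self.block_size = block_size
            self.blocks = {}        # fingerprint -> block data (the "physical" store)
            self.writes_saved = 0   # back-end writes avoided by deduplication

        def write(self, data):
            """Split data into blocks; store each unique block once; return fingerprints."""
            fingerprints = []
            for i in range(0, len(data), self.block_size):
                block = data[i:i + self.block_size]
                fp = hashlib.sha256(block).hexdigest()
                if fp in self.blocks:
                    self.writes_saved += 1    # duplicate: no back-end I/O needed
                else:
                    self.blocks[fp] = block   # unique: exactly one back-end write
                fingerprints.append(fp)
            return fingerprints

    # Two desktops writing largely identical OS blocks
    store = InlineDedupStore()
    os_image = b"\x90" * (4096 * 100)   # 100 identical OS blocks
    store.write(os_image)               # desktop 1 boots
    store.write(os_image)               # desktop 2 boots: every write deduplicated
    print(store.writes_saved, "of 200 writes avoided;", len(store.blocks), "block stored")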

Products in this space include Citrix IntelliCache (XenServer only), Atlantis ILIO and Microsoft Cluster Shared Volumes.

SSD

To mitigate the I/O limitations of traditional hard-disk technology, many vendors have used solid-state drives (SSDs) or flash memory to service workloads with high I/O requirements. SSDs have no moving parts and are not limited by the mechanical aspects of a traditional hard disk's operation, and they are superior in terms of both I/O and data-transfer rates. This performance, however, comes at a cost. The price premium has been mitigated somewhat through the combined use of SSDs and HDDs in hybrid storage arrays (see below). Capacity can also be a challenge with SSDs. With Citrix Provisioning Services (PVS), only the write cache lands on the SSD, which keeps the capacity requirement viable; but with Machine Creation Services (MCS), if you have multiple master images, space management can be difficult.
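
As a rough illustration of why the PVS write-cache model keeps SSD capacity manageable, here is a trivial sizing calculation. The per-desktop cache size and headroom factor are assumptions for the sketch, not Citrix guidance:

    # Illustrative SSD capacity sizing for a PVS-style write cache (figures assumed).
    desktops_per_host = 100
    write_cache_gb_each = 5      # assumed per-desktop write cache; varies with workload
    headroom = 1.3               # assumed margin for growth and flash overprovisioning
    print(desktops_per_host * write_cache_gb_each * headroom, "GB of SSD per host")  # 650.0

A few hundred gigabytes of SSD per host is attainable; holding multiple full master images plus deltas, as MCS may require, is a much harder capacity problem.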

Hybrid Storage Arrays

Hybrid storage players sell arrays that combine flash-based SSDs, RAM, and traditional rotating hard disks to create a high-end storage platform. By doing so, hybrid storage vendors can provide customers with storage solutions that boast the capacity advantages of traditional SAN storage, with I/O performance levels that are often orders of magnitude greater than what a similar number of traditional hard disks could achieve by themselves.

Hybrid storage arrays also include the storage software services commonly found in enterprise arrays:

  • Data protection and replication
  • Clustering capabilities for scale-out configurations
  • Capacity optimization, such as data deduplication and compression
  • Reporting and predictive analytics

Hybrid storage arrays are ideally suited to single-purpose solutions such as VDI. Many of these products can be integrated within a SAN fabric, but most are relegated to DAS or deployed as a separate storage fabric, creating specialized storage dedicated to servicing the needs of the VDI deployment. This simplifies matters, because customers do not have to involve their existing storage teams to implement the solution; typically one vendor is responsible for the storage, so organizations have one number to call.

Vendors taking the hybrid storage approach include GreenBytes, Nimble and Whiptail.

Storage and Convergence

As interest in hybrid storage arrays has increased, extensions of this model are beginning to appear in the market. Storage is increasingly being considered as another component of the ‘converged infrastructure’ space, and devices are appearing that package both storage and compute in one form factor, optimized for VDI deployment.

A high-profile player in this space is Nutanix, which converges compute and storage in a single appliance purpose-built for high-performance virtualization. A single Nutanix node contains two six-core Xeon 5600 processors, with eight cores allocated to run hypervisors and virtual machines, and four cores allocated to run the Nutanix virtual storage controller. This controller virtualizes a pool of Intel and Fusion-io solid-state disks and Seagate SATA drives, and presents virtual machines with block and file I/O access to data spread across these disks inside the cluster. A single Nutanix node can support over 20,000 IOPS in a 2U form factor - perfect for the high I/O generated by VDI workloads. Another interesting play here is FlexPod, a reference architecture for server, storage and networking components that are pre-tested and validated to work together as an integrated infrastructure stack. The stack consists of products from multiple vendors and is sold by NetApp Inc.
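
A quick, illustrative calculation shows what that 20,000 IOPS figure means for desktop density. The per-desktop IOPS number is an assumption, and in practice CPU and RAM limits bind well before I/O does:

    # Rough I/O-bound density estimate for a converged 2U node (figures assumed).
    node_iops = 20_000           # per-node figure cited above
    iops_per_desktop = 15        # assumed steady-state average per desktop
    # In reality, CPU and RAM cap density far sooner than this I/O ceiling.
    print(node_iops // iops_per_desktop, "desktops per 2U node, if I/O were the only limit")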

Convergence is still an emerging space. There is great interest in it, as it promises to reduce costs and speed deployments, but there are a number of barriers to widespread adoption. Purchasing of compute and storage is currently siloed within IT departments, which can complicate joint sales. Additionally, at this early stage in the market, the technical and operational benefits of this approach in reducing long-term operational expense are not clear.

Software Defined Storage (SDS)

Storage solutions, like other infrastructure, are increasingly being moved into software. This allows the purchase of software products that can transform commodity storage hardware into virtual storage arrays, serving as an alternative to rigid, single-purpose storage appliances. In SDS environments, applications, developers, and other business stakeholders can define their storage requirements and have those requests fulfilled without needing to know anything about the underlying storage hardware infrastructure. SDS is a key component of the software-defined data center, and it will need to integrate with other pieces, such as software-defined networking (SDN), to accelerate access to IT resources.

Though most perceive storage products as highly specialized proprietary hardware appliances, the vast majority of storage appliances on the market are essentially commodity x86 server hardware running the vendor's proprietary software stack (as per our Nutanix example). Ultimately, as software-only storage options become more mature and storage professionals accept commodity hardware as a viable alternative to integrated storage appliances, we will get to a point where storage functionality becomes just another application run on servers. In this scenario, file (NAS), block (SAN), and object storage will be created on demand using commodity CPU, RAM, hard drives, and flash resources.

The key components of an SDS solution are:

  • Automated provisioning: By provisioning storage through application programming interfaces (APIs), applications can automatically request the resources they require. This type of provisioning has been popular in public cloud storage environments such as Amazon S3, but it has yet to become prevalent in the enterprise. A number of storage players, including EMC, HP, and NetApp, are building API-based provisioning into their storage offerings (see the sketch after this list).
  • Virtualization: Virtualization aggregates heterogeneous storage into shared pools. This addresses the problem of spare storage capacity currently trapped in siloed storage solutions. All the major storage vendors, including EMC, Hitachi Data Systems (HDS), HP, and IBM, have integrated storage virtualization into their product portfolios.
  • QoS (quality of service): Most existing storage systems cannot provide performance isolation, so QoS for multiple applications sharing a single storage infrastructure cannot be guaranteed. Vendors such as IBM and SolidFire provide storage QoS capabilities to control the amount of I/O and throughput that individual applications can consume. Microsoft addresses this area with Storage Spaces, which can leverage SSDs for automated tiering and/or write-back caching.
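
To make the automated-provisioning idea concrete, here is a minimal sketch against Amazon S3 (named above) using the AWS SDK for Python, boto3. The bucket name and object key are hypothetical placeholders:

    # Minimal sketch of API-driven storage provisioning against Amazon S3,
    # using the AWS SDK for Python (boto3). Assumes AWS credentials are
    # configured in the environment; the bucket name is a placeholder.
    import boto3

    s3 = boto3.client("s3")

    # "Provision" capacity on demand: no LUN carving, zoning, or change tickets.
    # (Outside us-east-1, a CreateBucketConfiguration argument is also required.)
    s3.create_bucket(Bucket="vdi-user-profiles-example")

    # The application consumes the storage immediately through the same API.
    s3.put_object(Bucket="vdi-user-profiles-example",
                  Key="profiles/user01.dat",
                  Body=b"profile data")

This is the experience SDS aims to bring inside the enterprise: the requester never sees, or cares about, the hardware underneath.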

Many analysts see software-defined storage as the future of storage; however, more maturity is needed in this market before every organization will be able to deploy these types of solutions. The longer term benefits of automated provisioning, virtualization, and QoS are compelling for the VDI use case.

Let's Go!

In this article we have taken a look at advancements in storage. In the near term, I/O acceleration solutions and hybrid storage arrays are very useful systems that reduce the cost of VDI deployments, and these trends look set to continue with the arrival of converged storage and software-defined storage. As with many complicated solutions, the devil is in the detail when it comes to bottom-line costs, and it is impossible to point to any one storage solution as the way to reduce a customer's storage costs. But on every VDI opportunity, it is important that customers understand the drivers of storage costs, and are aware of the solutions in the market that have the potential to reduce them.

Like any solution, it's important to remember that storage cost will not be the only driver. Our traditional storage partners - NetApp, HP, Dell and IBM - all have established reference architectures with sales and support networks to back them up. Many of the newer storage vendors have yet to reach this level of maturity, and such issues should be considered within the overall context of the opportunity.