I have been wondering about the need for fault-tolerant architectures, specifically with regard to the servers hosting the virtual desktops. I definitely believe the supporting infrastructure should be fault-tolerant, but what about the myriad servers running just a hypervisor and hosted desktops?
Here are my two premises:
1. Virtual desktop adoption right now is primarily driven by cost savings, but most architectures required to support virtual desktops are expensive.
2. Fault tolerance increases the architectural cost, so it makes sense to design fault tolerance into an architecture only when either the impact or the risk of failure is high.
In some situations, such as when the user has access to only a single dedicated or assigned desktop, it makes sense that the desktop would need to be highly available. The impact of a user not having a desktop available would be high. However, let's consider the case of a desktop pool where a user could be assigned any one of 1,000 desktops. In this situation, if a single desktop becomes unavailable, the user could simply reconnect and get the next available desktop. Similar to a car rental agency, as long as the extra capacity exceeds the number of unavailable resources, no user goes without access.
In a physical environment, when my desktop crashes, I may shout a few choice words about losing my unsaved work and then reboot. I am usually up and running again within five minutes. In a virtual environment, the end-user experience is similar. When the virtual desktop becomes unavailable, the user makes statements about lost work, reconnects to the desktop pool and then logs back in, usually within five minutes. The impact of the lost desktop is minimal because extra capacity is available to handle the temporary loss of resources.
One obvious difference between a physical workstation crashing and the loss of a virtual desktop in the scenario above is that a failed hardware component in a virtual environment would impact all the desktops on a host, not just a single user. So the trick is to calculate the impact of losing a host server or two and compensate by adding capacity. In most cases that additional capacity will be less expensive than incorporating fault-tolerant server components into every server.
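To make that calculation concrete, here is a rough sketch in Python. It assumes a pooled deployment where every host runs the same number of desktops, and it compares two options: buying a few spare commodity hosts versus paying a fault-tolerance premium on every host. The function names, the per-host cost and the premium are all hypothetical figures chosen for illustration, not numbers from a real deployment.

```python
import math


def hosts_required(peak_users, desktops_per_host, host_failures_tolerated):
    """Hosts needed so every peak user still gets a desktop
    after the given number of hosts has failed."""
    base_hosts = math.ceil(peak_users / desktops_per_host)
    return base_hosts + host_failures_tolerated


def cost_comparison(peak_users, desktops_per_host, host_failures_tolerated,
                    commodity_host_cost, fault_tolerant_premium):
    """Return (cost with spare commodity hosts,
               cost with fault-tolerant components in every host)."""
    base_hosts = math.ceil(peak_users / desktops_per_host)
    spare_capacity_cost = (base_hosts + host_failures_tolerated) * commodity_host_cost
    fault_tolerant_cost = base_hosts * (commodity_host_cost + fault_tolerant_premium)
    return spare_capacity_cost, fault_tolerant_cost


# Hypothetical inputs: 1,000 pooled users, 50 desktops per host,
# and enough headroom to lose two hosts at once.
spare, fault_tolerant = cost_comparison(
    peak_users=1000,
    desktops_per_host=50,
    host_failures_tolerated=2,
    commodity_host_cost=10_000,     # assumed price per plain host
    fault_tolerant_premium=3_000,   # assumed extra cost per hardened host
)
print(hosts_required(1000, 50, 2), spare, fault_tolerant)
```

With these made-up numbers, two spare hosts cost less than hardening all twenty. The balance shifts as the premium shrinks or as the number of simultaneous host failures you want to survive grows, so it is worth plugging in your own figures.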
Since I believe the community is always wiser than a single person, in this case me, I thought I would solicit some feedback on my thoughts. Of course, my wisdom does not extend to figuring out how to get the poll to show up in my blog, so if you would like to vote or view the results of the poll question shown below, you will need to click here. If you have an opinion not covered by the poll, please add a comment to my blog.
As always, if you found this blog useful and would like to be notified of future blogs, follow me on Twitter @pwilson98.