I have a simple question: Do you know your currently running server workloads? It does not matter whether they are physical or virtual workloads; my guess is that you probably do not know all of them, maybe not even half of them.
Why am I asking this question? Today, many organizations try to move away from server sprawl and under-utilization, where systems rarely use more than 5% to 10% of their computing resources. Virtualizing workloads dramatically changes this scenario and adds many benefits beyond the ones you typically hear in connection with server virtualization: cost reduction, less server hardware, flexibility, and greener IT. This all sounds nice, so we start the virtualization project immediately, often without a proper overview of the workloads that are planned to be consolidated.
If a virtualization project is not properly planned, it will not succeed: merely converting physical workloads to virtual workloads does not solve these challenges and does not deliver the benefits mentioned above. Just imagine the following:
- Missing proof: Virtualizing servers saves money, since consolidating several virtual servers on a single physical server produces direct savings in many ways. However, without data and metrics, these savings are hard to prove to stakeholders and management.
- Bad candidates for virtualization: Some workloads are not suitable for virtualization, such as resource-hungry databases (usually with high memory and/or CPU usage), which may slow down an entire physical server and impact the other VMs running on the same machine. You could run such a workload as the only VM on a physical server, but then you gain nothing.
- Hardware failures: One of the most obvious risks of virtualizing workloads is a failure of the server hardware running the hypervisor. In a traditional environment, a server failure impacts only the service running on that physical server. In a virtual environment, however, a server failure can bring down all VMs that reside on it. Not having sufficient redundancy in the virtual environment will significantly affect your production environment.
So, what is the best way to create a consolidation plan? Besides the typical aspects of a project plan for the analysis phase, considering goals, success criteria, milestones, and timelines, a tool to gather details about your workloads is helpful. One recommended tool is Novell PlateSpin Recon, which not only lets you assess your current workloads, but also supports capacity planning once you have rolled out your virtual environment.
Starting a server consolidation project with an assessment of the current workloads is recommended: it provides the data and metrics for proof, helps determine ideal virtualization candidates, and feeds into a workload matrix for workload distribution and redundancy planning. The performance metrics of your workloads are therefore the key areas to look at:
- CPU utilization
- Memory utilization
- Disk I/O utilization
- Network I/O utilization
These metrics should ideally be captured over a time frame of 30 days. This ensures that you not only have data on business-hour utilization, but also cover periods of downtime as well as specific peak times such as end-of-month reporting, which usually hits backend systems. When reviewing the captured data, look into the following aspects rather than just the averaged values:
- How much utilization do workloads have during business hours? 80%-90%?
- How much is the utilization outside business hours?
- Is there any specific time frame where the peak goes beyond the usual utilization?
The reason for this is to avoid being misled by averaged values: low utilization outside business hours lowers the overall average, which can disguise highly utilized workloads that may not be good virtualization candidates.
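As a quick illustration of this effect, the following sketch uses made-up hourly CPU samples (not PlateSpin Recon output) to show how an overall average can hide a workload that is actually saturated during business hours:

```python
# Hypothetical hourly CPU utilization samples (%) for one workload over a day:
# near-idle overnight, heavy during business hours. All numbers are invented
# for illustration.
samples = [5] * 14 + [85, 90, 88, 92, 87, 86, 89, 91, 84, 90]  # 24 hourly values

average = sum(samples) / len(samples)
business_hours = samples[14:]
business_avg = sum(business_hours) / len(business_hours)
peak = max(samples)

print(f"Overall average: {average:.1f}%")            # looks moderate
print(f"Business-hours average: {business_avg:.1f}%")  # reveals the real load
print(f"Peak: {peak}%")
```

The overall average comes out below 40%, which would suggest a comfortable candidate, while the business-hours figures show a workload that would likely struggle when sharing a host.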
Server hardware specifications
Besides the performance metrics, it is crucial to capture the hardware specifications of the physical servers being monitored with Novell PlateSpin Recon, such as server model, CPU speed, number of CPUs, number of disks, and network cards. This information lets you map the captured performance metrics to the hardware and gives you the decision base required for determining virtualization candidates. To make this clearer, assume you have captured performance metrics showing a server at 90% CPU utilization. Looking into the server hardware details, you determine that it is a dual-processor system with a CPU speed of 1.2 GHz. Migrating this workload to a virtual machine with two virtual CPUs on the newest CPU generation, such as Intel Nehalem, would make it a good virtualization candidate.
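A rough back-of-the-envelope way to reason about such a mapping is to scale the observed load by clock speed. This is only a first-order approximation (per-clock throughput differs considerably between CPU generations), and the 2.93 GHz figure below is an assumed Nehalem-era clock, not a measured value:

```python
# Sketch: translate observed CPU load on old hardware into an estimated load
# on a newer host. Clock-speed scaling is a crude approximation; real per-cycle
# throughput differs between CPU generations.
old_cpus, old_ghz, observed_util = 2, 1.2, 0.90
new_vcpus, new_ghz = 2, 2.93  # assumed Nehalem-era core clock (illustrative)

demand_ghz = old_cpus * old_ghz * observed_util      # 2.16 GHz-equivalent of work
estimated_util = demand_ghz / (new_vcpus * new_ghz)  # fraction of the new vCPUs

print(f"Estimated utilization on the new host: {estimated_util:.0%}")
```

Even under this conservative estimate, the 90% load on the old hardware shrinks to well under half of two modern vCPUs, supporting the "good candidate" verdict.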
Additional assessment areas
Many virtualization projects are driven by the cost-saving aspect. Therefore, besides performance metrics and server hardware details, the assessment can also consider TCO values (power usage, depreciation cycle) and forecasted environment growth. This gives stakeholders and management the information they are looking for.
Outcome of assessment
After capturing all required data, analyzing it, and putting together two to three possible consolidation scenarios, you will be able to define the consolidation plan for your virtualization environment. This plan should be part of the assessment report presented to get buy-in from stakeholders and management, since it delivers the information they need. The report should condense the captured data into diagrams: you will have a lot of numbers, and diagrams explain findings much better (like the one below).
The assessment report should provide the following details at the minimum:
- Number of assessed workloads and how many out of these are virtualization candidates
- Consolidation ratio (number of virtualization candidates / number of physical server hosts)
- Percentage of server count reduction
- Percentage of rack units reduction
- Percentage of annual energy cost reduction (if possible)
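With the captured data in hand, these report figures are simple arithmetic. The following sketch uses hypothetical assessment numbers (and assumes 2U chassis on both sides for the rack-unit figure) to show how they are derived:

```python
# Hypothetical assessment numbers, invented for illustration.
assessed = 100      # workloads assessed
candidates = 80     # of which are virtualization candidates
target_hosts = 10   # physical hosts planned to run the hypervisor

consolidation_ratio = candidates / target_hosts                  # e.g. 8:1
server_reduction = (candidates - target_hosts) / candidates      # fewer servers

# Rack units: assume 2U per old server and 2U per new host (an assumption;
# use your real chassis sizes here).
old_ru = candidates * 2
new_ru = target_hosts * 2
ru_reduction = (old_ru - new_ru) / old_ru

print(f"Consolidation ratio: {consolidation_ratio:.0f}:1")
print(f"Server count reduction: {server_reduction:.0%}")
print(f"Rack unit reduction: {ru_reduction:.0%}")
```

Swapping in the real numbers from the assessment data turns this into the headline figures for the report.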
Also, to simplify the next step of such a project, which is usually a proof of concept or a design of the planned consolidation scenario, the report should include a workload distribution matrix. This matrix has nothing to do with The Architect from the movie, but provides an overview of how the virtualization candidates should be distributed to achieve an optimally utilized and balanced virtualization infrastructure.
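A simple way to sketch such a distribution is a greedy placement: take the candidates from largest to smallest and always assign the next one to the currently least-loaded host. The workload names and load percentages below are invented for illustration:

```python
# Greedy balanced placement of virtualization candidates onto hosts.
# Workload names and load figures are made up; they are not real metrics.
workloads = {"db01": 40, "web01": 15, "web02": 15,
             "app01": 25, "mail01": 20, "file01": 10}
hosts = {"host1": [], "host2": []}

# Place the largest workloads first, each on the host with the lowest total load.
for name, load in sorted(workloads.items(), key=lambda kv: -kv[1]):
    target = min(hosts, key=lambda h: sum(workloads[w] for w in hosts[h]))
    hosts[target].append(name)

for host, placed in hosts.items():
    total = sum(workloads[w] for w in placed)
    print(f"{host}: {placed} -> {total}% aggregate load")
```

A real distribution matrix would balance memory, disk I/O, and network I/O as well, and keep redundancy constraints in mind (for example, not placing both nodes of a clustered service on the same host), but the balancing idea is the same.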
I hope this does not look complicated. In my own experience conducting this kind of assessment at customer sites with the help of Novell PlateSpin Recon, I was able to pull out these insights easily and deliver the information required to make the right decision. It is not just a decision to move to a virtualized infrastructure, but also about ensuring that the virtualization succeeds in meeting business and technical requirements. Therefore, it is important to know your environment.
If you would like to get more insights into this process, we will have a Hands-on Learning Lab “Virtualizing workloads efficiently with XenServer” at Citrix Synergy that will demo the usage of Novell PlateSpin Recon in its latest version.
Senior Architect, Worldwide Technical Readiness
Follow me on Twitter: @TarkanK