The much-anticipated Intel Nehalem platform is now available from leading server OEMs. That is, it is finally available in servers, rather than only in the new Mac Pro machines announced a month ago. Nehalem is the codename for an Intel processor micro-architecture, the successor to the Intel Core micro-architecture; its server parts are now known as the Intel Xeon 5500 series. The first processor released with the Nehalem architecture was the desktop Core i7, last November. The first system to use Nehalem-based Xeon 5500 series processors was the Mac Pro workstation announced in March 2009, but glancing around the various server vendors, I see that Sun, Dell, HP and IBM all have compelling offerings raring to go.

Also today, the new free XenServer virtual infrastructure platform is available for download from over 250 partner sites worldwide. The response to our decision to change our go-to-market strategy for XenServer has been tremendous, and the list of partners who have volunteered to host downloads is a testament to the incredible interest in the product.

The coincidental timing of the two announcements couldn’t be better. Intel is on record stating that a single eight-core Nehalem server can replace nine single-core servers. If you run free XenServer on that server and fully utilize the available resources, you can easily double the number of servers you can replace. At the same time Nehalem, by virtue of its new micro-architecture and 45nm process, reduces system power consumption by about 20 percent. So one new server can probably replace a rack full of legacy systems.
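
To make that arithmetic concrete, here is a rough back-of-the-envelope sketch. The 9:1 ratio, the doubling and the 20 percent figure come from the paragraph above; the function names are mine, and the power model (the new server draws 80 percent of what one legacy server draws) is only an assumption for illustration, not a measured result.

```python
# Figures quoted in the post.
INTEL_CONSOLIDATION_RATIO = 9   # single-core servers replaced by one Nehalem server
VIRTUALIZATION_FACTOR = 2       # "easily double" when running XenServer
POWER_REDUCTION = 0.20          # "about 20 percent" lower system power


def servers_replaced(ratio=INTEL_CONSOLIDATION_RATIO, factor=VIRTUALIZATION_FACTOR):
    """Number of legacy servers one new server stands in for."""
    return ratio * factor


def relative_power(replaced, reduction=POWER_REDUCTION):
    """Power of the new server as a fraction of the legacy fleet's power.

    Assumption (not from the post): the new server draws (1 - reduction)
    of what ONE legacy server draws, and all legacy servers draw the same.
    """
    return (1 - reduction) / replaced


if __name__ == "__main__":
    n = servers_replaced()
    print(f"legacy servers replaced: {n}")
    print(f"power vs. legacy fleet:  {relative_power(n):.1%}")
```

Under those assumptions, one new server stands in for 18 legacy machines, which is roughly where the "rack full" estimate comes from.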

Bernie Hannon in our performance lab has been running performance tests on Nehalem, pitting the Intel® Xeon® E5570 against the Intel® Xeon® E5405 (Harpertown) using XenServer. His tests simulate a Microsoft SQL Server 2008 transaction processing workload and measure the I/O capacity of the configuration using SQLIO. His full results will be published this week and I’ll link to them. He has a blog post out today too, which discusses some of the same results as I have below. In the interest of minimizing redundancy, check his post for the authoritative performance results.

SQLIO allows us to benchmark disk read and write performance for the host server configurations used in the DBHammer tests. We tested disk reads and writes using two common I/O sizes, 8K (random) and 64K (sequential), since both are typically present in SQL Server workloads depending on how the user has optimized for I/O. By their nature, random reads and writes are not very efficient and are performed in smaller increments (8K) to minimize I/O request servicing latency. Sequential reads and writes, on the other hand, are more efficient, incur less latency and can therefore be performed in larger (64K) chunks. Most users optimize their SQL Server environments to minimize the number of random reads and writes.
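
A quick sketch of why the block size matters: throughput is simply I/O operations per second multiplied by the size of each operation. The IOPS values in the example are made up purely for illustration and are not results from our tests.

```python
def throughput_mb_per_s(iops: float, block_kb: int) -> float:
    """Convert an I/O rate and block size (in KB) into MB/s."""
    return iops * block_kb / 1024


if __name__ == "__main__":
    # Hypothetical numbers, NOT measurements from the tests described here.
    random_8k = throughput_mb_per_s(iops=5000, block_kb=8)
    sequential_64k = throughput_mb_per_s(iops=2000, block_kb=64)
    print(f"8K random:      {random_8k:.1f} MB/s")
    print(f"64K sequential: {sequential_64k:.1f} MB/s")
```

Even at a lower operation rate, the larger sequential transfers move more data per second, which is why workloads tuned toward sequential I/O tend to get more out of the same disks.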

We simulated a transaction processing workload with a 10 million record database, against which we used DBHammer to generate typical SQL Server 2008 transactional client workloads. We started with 200 simultaneous clients, and then steadily increased the number of simultaneously active clients in the DBHammer workload. Each client workload test ran for 30 minutes, with measurements beginning at the 10-minute mark and lasting for 20 minutes. We measured maximum transactions per second every fifteen seconds, and added client workload in increments of 200 clients until the average CPU utilization of the system under test reached 90%.
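
The ramp procedure described above can be sketched roughly as follows. This is not the actual test harness: `measure_cpu` is a hypothetical callback standing in for "run one 30-minute test at this client count and return the average CPU utilization", and `max_clients` is a safety cap I added for the sketch.

```python
from typing import Callable, List, Tuple


def samples_per_run(run_minutes: int = 30, warmup_minutes: int = 10,
                    interval_seconds: int = 15) -> int:
    """Number of TPS samples taken in the measured part of one run."""
    return (run_minutes - warmup_minutes) * 60 // interval_seconds


def ramp_clients(measure_cpu: Callable[[int], float], start: int = 200,
                 step: int = 200, cpu_limit: float = 90.0,
                 max_clients: int = 10_000) -> List[Tuple[int, float]]:
    """Step the client count up until average CPU utilization hits the limit.

    measure_cpu is a hypothetical stand-in for one complete test run.
    """
    levels = []
    clients = start
    while clients <= max_clients:
        cpu = measure_cpu(clients)
        levels.append((clients, cpu))
        if cpu >= cpu_limit:
            break
        clients += step
    return levels


if __name__ == "__main__":
    # Toy linear CPU model for demonstration only, not measured data.
    for clients, cpu in ramp_clients(lambda c: c / 20):
        print(f"{clients:5d} clients -> {cpu:.0f}% CPU")
    print("samples per run:", samples_per_run())
```

With a 20-minute measurement window and a sample every fifteen seconds, each client level yields 80 TPS samples.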

To summarize the results: we found that the Intel® Nehalem Xeon® E5500-class CPU shows a remarkable performance gain over the Xeon® E5400, offering an average speedup of about 53%. The Xeon® E5405 system reached peak utilization with about 1,000 clients, and CPU utilization of 95% was reached at the 1,600 client workload level, with a maximum of 13,708 TPS. The graph below summarizes the results. Bernie’s full results are here, but you may need a login to get them. In these results the system appears to be substantially I/O bottlenecked (the system is spending much of its time processing I/O on behalf of the guest(s)). I’m looking forward to getting some test results for Nehalem platforms using IOV-enhanced 10Gb/s NICs, work that is currently in flight with our friends at Solarflare. More results soon.