Benchmarking vGPU… Seeing is believing: a cautionary tale of frame rates!

For a serious CAD or GPU-intensive 3-D graphical application, I’m coming to the conclusion that a gaming benchmark like Unigine, whilst useful for evaluating GPU-sharing technologies such as vGPU and vSGA, needs to be understood and used in conjunction with other benchmarks. Unigine Tropics is not a particularly GPU-intensive application and as such is probably best suited to benchmarking for VDI / knowledge workers.

I’ve been blogging about some of the factors to consider when benchmarking rich graphical or CAD applications, and about tools to help you do so, but with each benchmark or evaluation I seem to learn more about the subtleties of the technologies and the factors customers need to understand to avoid being bamboozled by salesmen.

Spot the difference

Unigine’s benchmarks are really busy! There is an awful lot going on, and quite fast: flying pirate ships, oceans, clouds, palm trees and dragons. If you look at a single run and the frame rate, you might well miss subtleties. If you are evaluating vSGA versus vGPU or similar, it can be really insightful to set them up running side by side and play spot the difference; screenshots make this easier. It was only after chatting to our solutions lab guys and some OpenGL experts that I appreciated the tricks these benchmarks can play.
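You can also make spot-the-difference less subjective by diffing a pair of screenshots programmatically. Here’s a minimal Python sketch using Pillow; the file names are placeholders of mine, and it assumes both screenshots were captured at the same resolution at roughly the same point in the run:

```python
# Minimal screenshot-diff sketch (assumes Pillow is installed:
# pip install Pillow). File names below are hypothetical.
from PIL import Image, ImageChops

vgpu = Image.open("vgpu_tropics.png").convert("RGB")
other = Image.open("competitor_tropics.png").convert("RGB")

# Per-pixel absolute difference; identical images give an all-black result.
diff = ImageChops.difference(vgpu, other)

# Bounding box of the non-black region; None means pixel-identical.
print("Differing region:", diff.getbbox())

# Save the difference image so regions like the shadows stand out visually.
diff.save("tropics_diff.png")
```

It’s a blunt instrument (any two GPUs will rasterise slightly differently), but a whole shadow rendered flat black lights up very obviously in the difference image.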

Consider this image of roughly the same part of Unigine’s Tropics benchmark: on the left is vGPU on XenServer, and on the right a competitor’s GPU-sharing technology relying on their own drivers.

Look at the shadows. vGPU on the left has shadows that contain subtle colours; this is known as soft shadowing, where shadows reflect an object’s colour. On the right, though, the shadows are flat and all black, without the object’s original colour, i.e. the benchmark code is following a completely different code path and API set for some of the rendering.

Apparently there are a few likely reasons for this happening:

  1. The right-hand solution is aware that it is becoming resource-limited and has switched to a simpler render to compensate.
  2. The right-hand solution is defaulting to simpler code paths for subtler features to bump up the frame-rate score given by the benchmark: render less and you get a higher frame rate.
  3. Soft shadowing is a fairly recent feature; it’s possible the solution on the right and its vendor simply haven’t implemented the latest OpenGL/DirectX APIs.

It’s not an effect that would prevent you doing a lot of CAD modelling, but most organisations want to use the best rendering they can for marketing brochures, and as the screenshot above shows, it would be really disappointing and frustrating to be the rendering guy trying to explain to your boss why the shadows on your product brochure just didn’t look cutting-edge.

There were other effects that seemed to follow different code paths that I’m not sure I’d have spotted, especially on a single-screen demo at a busy user conference, and certainly not without comparing side by side. Below, the quality of the ocean in vGPU (again on the left) just seems a lot better and more realistic:

A lot of benchmarks have no user interface, or are run automated with only the frame rate recorded. This particular investigation has been a cautionary tale in not just counting the frame rate but in looking at what is within the frames themselves. I’ve seen quite a few synthetic benchmarks that replay the OpenGL/DirectX call footprint of particular applications, but now I’m seriously questioning whether I can trust the frames-per-second (fps) metrics they rely on without viewing the contents of those frames. I was rather unnerved to find vendor drivers can fall back to completely different code paths in this way. Seeing really is believing, in this case!
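For automated runs, the same idea extends to whole frame sequences: capture lossless frame dumps from both solutions and flag any pair that diverges badly, rather than trusting the fps counter alone. A rough Python sketch of the principle, where the directories, file-name pattern, frame count and threshold are all hypothetical and would need tuning:

```python
# Sketch: compare two captured frame sequences frame by frame and flag
# any pair whose mean per-pixel difference hints at a different render
# path. Assumes numbered lossless dumps (frame_0001.png, ...) from both
# solutions; paths, frame count and threshold are placeholders.
import numpy as np
from PIL import Image

THRESHOLD = 10.0  # mean absolute difference on a 0-255 scale; tune to taste

for i in range(1, 101):
    a = np.asarray(Image.open(f"vgpu/frame_{i:04d}.png").convert("RGB"),
                   dtype=np.float32)
    b = np.asarray(Image.open(f"other/frame_{i:04d}.png").convert("RGB"),
                   dtype=np.float32)
    mad = np.abs(a - b).mean()
    if mad > THRESHOLD:
        print(f"frame {i:04d}: mean difference {mad:.1f} - inspect the contents!")
```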

Other thoughts on Unigine Tropics

This is a benchmark that seems to be heavy on CPU usage and more suited to VDI or perhaps PLM users than serious CAD users. As such, I think that to make a fair assessment of the scalability of a solution you really do need to benchmark both CPU and GPU; I’ve blogged about tools you can use to do this.
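Purpose-built tools are the right answer here, but as a rough illustration of the principle, this Python sketch samples CPU utilisation via psutil and GPU utilisation via NVIDIA’s nvidia-smi while a benchmark runs. It assumes an NVIDIA GPU with nvidia-smi on the PATH and psutil installed; the one-second interval and sixty-sample window are arbitrary choices of mine:

```python
# Rough CPU + GPU utilisation logger (assumes an NVIDIA GPU with
# nvidia-smi on the PATH, and psutil installed: pip install psutil).
import subprocess

import psutil

def gpu_utilisation() -> int:
    # Ask nvidia-smi for GPU utilisation as a bare number.
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=utilization.gpu",
         "--format=csv,noheader,nounits"],
        text=True,
    )
    return int(out.strip().splitlines()[0])

# Sample roughly once a second for about a minute while the benchmark runs.
for _ in range(60):
    cpu = psutil.cpu_percent(interval=1.0)  # blocks for the sample window
    print(f"CPU {cpu:5.1f}%  GPU {gpu_utilisation():3d}%")
```

If GPU utilisation sits low while CPU is pegged, the benchmark is telling you more about the processor than the graphics card.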

If I were evaluating a particular solution to buy, though, I would also be playing spot the difference and asking the vendor to turn on the subtle features, both to find out whether the frame rate is being artificially inflated and to see whether the solution actually supports the latest technologies used by these highly visual applications properly.

Unigine Heaven is a bit more intensive on GPU usage, but still has a gaming-application-like footprint. Gunnar Berger from Gartner has detailed and YouTube’d some investigations into vSGA compared to vGPU using Unigine Heaven, which you can view via the links on this page.

For CAD users it is probably a good idea to investigate modelling-oriented benchmarks (e.g. Redway3d or others) that reflect the use of a geometric model and kernel. Gaming benchmarks generally don’t reflect the workloads of HLR (hidden-line rendering), Boolean intersections or B-geometry parameterisation common in most CAD packages such as Dassault SolidWorks, Ansys Workbench or Siemens PLM’s Solid Edge.

Thanks

My latest insight into GPU benchmarking came from a lot of fruitful discussions with, and technical education from, our solutions lab and technical marketing team (very tolerant people, willing to let a CAD blogger hang around and annoy them!), who have blogged about the different GPU-sharing solutions available for XenDesktop on both XenServer and vSphere, and also about how you can automate such tests (just make sure you look at the frame contents too!). The images in this blog were results from some of their tests. I expect to see some very interesting and much more substantial data and reference articles published by them soon; I’ve just touched on a couple of things that, as a CAD engineer, I thought were interesting. Do watch out for dark shadows though!