NVIDIA GPU accelerator cards have been a popular option for high-end VDI use cases for many years. These cards provide the ability to virtualize graphics-intensive workloads on standard hypervisor hosts without dedicating hardware to a single individual user, allowing VDI architects to spread the cost of what is relatively expensive equipment across many users.

Back in 2015 I wrote a blog post about how to integrate NVIDIA cards with App Layering that included both a recipe for handling the installation and a utility to make it possible to modify the vSphere VMX settings for the cards manually or dynamically. The good news is that now you don’t need a utility, and the process to deploy the drivers is easy if you follow a few defined steps.

GPU Architecture Summary

First, I wanted to cover some basics about the virtual GPU architecture for anyone who is new to the technology. The testing I have done in my lab uses a vSphere Hypervisor, so I will concentrate on options available for vSphere customers.

vSphere supports several ways to use NVIDIA graphics cards to provide GPU functionality to virtual machines. These include:

  • Virtual Shared Graphics Acceleration (vSGA)
  • Virtual Dedicated Graphics Acceleration (vDGA)
  • NVIDIA Virtual GPU (vGPU)

vSGA

This provides the ability to share NVIDIA GPUs among many virtual desktops. An NVIDIA driver is installed on the hypervisor, and the desktops use a proprietary VMware-developed driver to access the shared GPU. This option supports only up to DirectX 9 and OpenGL 2.1. At this point, I think everyone deploying shared graphics on vSphere implements vGPU technology because it provides better performance and support for newer graphics libraries.

vDGA

This is a hardware passthrough mode where the GPU is not shared but is accessed directly by the virtual machine, which runs the native NVIDIA graphics driver and attaches directly to the GPU. This option is very expensive if used for virtual desktops because each GPU card can support only a very limited number of desktops. It’s more viable for shared Citrix Virtual Apps session hosts because the GPU can then be shared by all users on the session host. This option supports the latest versions of DirectX and OpenGL and should deliver the graphical performance of a high-end workstation.

vGPU

vGPU has many of the benefits of vDGA but can also share the NVIDIA GPUs. Here, you install an NVIDIA VIB driver on the hypervisor and an NVIDIA driver in the virtual machine. vGPU supports DirectX 11 and 12, 2D, OpenCL 1.2, and OpenGL 4.6. See the NVIDIA virtual GPU software documentation for more details.

NVIDIA supports different GPU profiles for each type of GPU card. The profiles change the size of the frame buffer anywhere from 1 GB to 16 GB, which, in turn, determines the number of shared GPU sessions a card will support. Different cards support different numbers of sessions, from two to 64 per card. Of course, as you increase the number of sessions supported, you decrease the performance of each session.

One thing that I found out when writing this blog is that NVIDIA now supports vMotion with GPU-enabled desktops when using vGPU on ESXi 6.7 or later with Virtual GPU Software version 7 or later. This is great because now servers can be put into maintenance mode without having to ask all the current users to log out first. The current vGPU software release is 12.x; see the NVIDIA Virtual GPU Software User Guide for more details.

Before I go further, I want to point out a few good resources on using NVIDIA and Citrix together that I found when researching the topic. The first is our HDX 3D Pro documentation for single-session and multi-session, which is very good. The other is a blog post by my colleague Mayunk Jain that covers VMware’s vSGA technology and how HDX 3D Pro integrates with NVIDIA vGPU to provide a great graphics solution.

Hypervisor Integration

Setting up NVIDIA GPU cards on vSphere hosts is outside the scope of this post, but at a high level, the card(s) must be installed in the host, and a “virtual GPU manager” software driver must be installed on the host. For vSphere, this is delivered as a VIB that you copy to your host and then install. After installing, you must set the graphics device setting on each GPU to “Shared Direct” mode for vGPU. You can find this setting under the host’s Configure > Hardware > Graphics section, as shown below.

You must edit each GPU and set the mode as shown below.

More information is available at https://kb.vmware.com/s/article/2033434.

Please note that if you check the “Restart X Org server” option, the host does not need to be rebooted for the mode change to take effect.
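For reference, the same host-side steps can also be done from the ESXi shell. The sketch below is only a rough outline: the datastore path and VIB file name are placeholders for your actual NVIDIA Virtual GPU Manager download, and “Shared Direct” in the vSphere Client corresponds to the SharedPassthru default graphics type on the command line.

# Install the NVIDIA Virtual GPU Manager VIB (host in maintenance mode), then reboot the host
esxcli software vib install -v /vmfs/volumes/datastore1/NVIDIA-vGPU-Manager.vib

# Set the default graphics type to Shared Direct (SharedPassthru) and restart the Xorg service
esxcli graphics host set --default-type SharedPassthru
/etc/init.d/xorg restart

# Confirm the default graphics type is now SharedPassthru
esxcli graphics host get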

Recipe for NVIDIA Driver Installation

(If you are in a hurry, I’d suggest you check out this shorter overview of the App Layering Recipe for NVIDIA GPU. For a little more detail on both the process and the recipe, keep reading.)

Now, on to the specifics of how to create a layer for the NVIDIA Windows Drivers. The first thing I always do when researching the best way to handle a new recipe is to make sure I understand how an application install works outside of App Layering.  For something like GPU cards, where there are hardware interactions on both the host and virtual machine level, this is even more important.

After installing my GPU card and configuring the host software, I created a new full-clone Windows 10 virtual desktop by deploying a virtual machine using an MCS connector in App Layering. I then reset the domain account for the desktop by removing it from the domain and adding it back in. I had to do this because the platform layer was joined to the domain when it was created, so the deployed VM thinks it already has an AD computer account even though it doesn’t.

The next step is to edit the virtual machine settings to add an NVIDIA GPU. In the settings, click “Add New Device,” then select “Shared PCI Device.”

Then select the desired vGPU profile, choosing the one that matches your license type. The profile suffix is a, b, or q for Virtual Apps, Virtual PC, or Virtual Workstation, respectively, and the number (1, 2, 4, or 8) indicates how many GB of GPU frame buffer each machine will be provisioned with. For example, a profile ending in 2q provides a 2 GB frame buffer under a virtual workstation license.

You will have to reserve all the VM’s memory when using vGPU; in the VM’s memory settings, check “Reserve all guest memory (All locked).”

Once the VM is started (now that the GPU is assigned), you can run a command on the host to see if the VM has attached a GPU. Connect to the host using ssh and run:

nvidia-smi

You will then see something like what’s below in the red box, which shows that this VM has attached a GPU.

Now we can install the NVIDIA Windows drivers. Before you do, make sure Remote Desktop has been enabled on the virtual machine. Why? After the NVIDIA drivers are installed, the VMware Remote Console will no longer function. So, edit the remote settings in the System control panel to “allow remote connections to this computer.” I also recommend taking a snapshot of the virtual machine before installing the drivers, just in case you need to fall back and start over. Then install the drivers, reboot, and test.
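If you prefer to script the Remote Desktop change rather than click through the System control panel, these standard Windows commands (run from an elevated prompt before the driver install) flip the same settings. The firewall rule group name assumes an English-language Windows install.

rem Allow incoming Remote Desktop connections
reg add "HKLM\SYSTEM\CurrentControlSet\Control\Terminal Server" /v fDenyTSConnections /t REG_DWORD /d 0 /f

rem Enable the built-in Remote Desktop firewall rules
netsh advfirewall firewall set rule group="remote desktop" new enable=Yes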

I always test two ways. First, I go to the control panel and open the NVIDIA Control Panel. You should see a spinning NVIDIA icon as shown below.

Then I go to YouTube and find a high-resolution video to play and make sure it’s smooth. When I am sure the desktop is working properly, I delete the snapshot that I created “just in case.”
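As an extra check, the NVIDIA Windows driver also installs nvidia-smi in the guest, so you can confirm the driver sees the vGPU from a command prompt. This assumes nvidia-smi.exe is on the path; recent drivers place it in System32, while older releases put it under C:\Program Files\NVIDIA Corporation\NVSMI.

rem Show the vGPU the guest driver has attached
nvidia-smi

rem Dump the full query output and look for the licensing section
nvidia-smi -q | findstr /i license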

With the current vGPU software, you will also want to define your vGPU license servers, both primary and secondary.
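You can set these on the NVIDIA Control Panel’s Manage License page, or push them through the licensing registry values described in the NVIDIA Virtual GPU Client Licensing User Guide. The sketch below is only illustrative: the server names are placeholders, and you should confirm the exact value names, types, port, and FeatureType settings against the licensing guide for your vGPU software release.

rem Primary and backup license servers (placeholder names); verify value names/types in the NVIDIA licensing guide
reg add "HKLM\SOFTWARE\NVIDIA Corporation\Global\GridLicensing" /v ServerAddress /t REG_SZ /d licsrv1.example.com /f
reg add "HKLM\SOFTWARE\NVIDIA Corporation\Global\GridLicensing" /v BackupServerAddress /t REG_SZ /d licsrv2.example.com /f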

Create a Template

The great thing about creating the test virtual machine is that we can now also use it to make a template for the App Layering packaging and publishing connectors used when working with GPU cards. To make the template, clone the virtual machine you made to test NVIDIA, then remove and delete its hard disk, because App Layering connector templates do not use one.

Create a Connector

Once you’ve created the template, create an App Layering vSphere connector that uses it. Use Offload Compositing; a cache is not required because this connector will not be used that often.

Create a Layer

The first thing I did when starting this process was to search “Citrix App Layering NVIDIA.” I found an important CTX article about a configuration required to make the NVIDIA Desktop Manager and Control Panel work. See https://support.citrix.com/article/CTX241448 for details.

My hope, as always, was to use an app layer for the NVIDIA drivers, even though I know from our discussion groups that many have used platform layers for them. An app layer would mean less overall rework because platform layers must be updated much more often than app layers. So, I first tried creating an app layer using the new connector. However, when I published an image that included the app layer, the NVIDIA Desktop Manager and Control Panel would not open, and it did not seem like the GPU was working. So, I switched to a platform layer.

Modified Platform Layer

My first step was to add a version to my Win10 platform layer. That worked for me because I am just testing this out. However, assuming not all of your desktops will be NVIDIA-enabled, you will likely need to create a platform layer for NVIDIA desktops that is separate from the one for your non-NVIDIA desktops. You could keep separate versions of a single layer for both, but it gets confusing to keep track of them that way.

Remember to use the new connector so that the packaging machine will have the NVIDIA Shared PCI device and profile defined. When the packaging machine boots, make sure RDP is configured to allow access. Remember that packaging machines are not joined to the domain, so uncheck the Network Level Authentication (NLA) box in the Remote Desktop settings, as shown in the red box below.
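That checkbox maps to a registry value, so if you prefer, you can flip the same NLA setting from an elevated command prompt on the packaging machine (for example, through the VMware console before the drivers go in):

rem Disable Network Level Authentication on the RDP-Tcp listener
reg add "HKLM\SYSTEM\CurrentControlSet\Control\Terminal Server\WinStations\RDP-Tcp" /v UserAuthentication /t REG_DWORD /d 0 /f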

Next, find the IP address of the packaging machine so you can connect to it over RDP after you install the NVIDIA Windows drivers and reboot. Store the NVIDIA driver installer on a share and launch it on the packaging machine. The installer unpacks the software locally and then runs the setup.

I used the Express Installation.

Remember, once the drivers are installed and the packaging machine is rebooted, you can no longer access the VMware console; you must use RDP instead. Reconnect to the packaging machine using its IP address in Remote Desktop Connection.

Then run Shutdown for Finalize.

Wait! There’s Also an App Layer Required

Remember earlier when I pointed out that there was a CTX article about adding the NVIDIA paths to the AlwaysOnBoot registry value? We still need to do that, but not in a platform layer. We must create an app layer called something like “NVIDIA Persistence” and add the registry values there. After the packaging machine boots, log on and run regedit to create the AlwaysOnBoot multi-string value in HKLM\System\CurrentControlSet\Services\unifltr. Then add the NVIDIA program paths within the multi-string value, as shown below.

This will exclude those folders from being included on the App Layering writable volume of the virtual machine. After creating the key, run Shutdown for Finalize.
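If you would rather script the registry change than use regedit, a command like the following creates the same multi-string value (run it on the packaging machine before finalizing). The two NVIDIA paths here are examples only; use the exact program paths listed in CTX241448 and shown in the screenshot above.

rem Create the AlwaysOnBoot multi-string value with the NVIDIA program paths (\0 separates the entries)
reg add "HKLM\SYSTEM\CurrentControlSet\Services\unifltr" /v AlwaysOnBoot /t REG_MULTI_SZ /d "C:\Program Files\NVIDIA Corporation\0C:\Program Files (x86)\NVIDIA Corporation" /f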

Publish an Image

Publish a master image using an MCS or PVS connector that includes the new platform layer and NVIDIA Persistence layer. For the MCS connector, make sure to use the same VMware template you used for the packaging connector. This ensures that all the hardware for the NVIDIA card is configured and that things like the BIOS time and time zone are set properly on the master image. For PVS you would also use the same VMware template, but you would do that in the XenDesktop Setup Wizard when you create the PVS targets.

Takeaways

I hope this post is helpful for those of you who need to integrate NVIDIA vGPU with Citrix App Layering. To recap the discussion:

  1. It’s easy to integrate NVIDIA, Citrix App Layering, and Citrix Virtual Apps and Desktops.
  2. Install the NVIDIA Windows Drivers in a Platform Layer.
  3. Remember that after the NVIDIA Windows drivers are installed, you can no longer use the VMware Remote Console on the virtual machine, so make sure RDP is working.
  4. Create an “NVIDIA Persistence” layer to define the required AlwaysOnBoot settings to exclude NVIDIA folders from the App Layering writable volume.
  5. I tested and was glad to see that the NVIDIA profile used in the vSphere template can be changed at any time on any virtual machine and the new profile will work fine. So, there is no rework required to support different NVIDIA profiles.