One failure you may have encountered with the v1.0 release of the Linux VDA is a dialog entitled “HDX Session Validation Failure” that appears immediately after session launch. Your session is then forcibly logged out after 30 seconds.
While this problem may still occur in the v1.1 release, I am happy to report that the Linux VDA development team made improvements to minimise its occurrence. I’d also like to take the opportunity to explain why this failure occurs and what you can do to rectify the situation if you see it.
Why session validation?
A fundamental part of session launch in a XenDesktop deployment is the process called Brokering. This is performed by the Broker, which is responsible for negotiating launch requests with worker machines. The Broker communicates with the Broker Agent on workers using the Citrix Brokering Protocol (CBP). It selects workers to handle the incoming ICA connections for desktop and application launches based on the worker’s readiness to fulfil the launch request. To ensure the launch is secure, CBP requires the Broker to associate users with their specific sessions. This user-to-session association is verified by having the Broker Agent impersonate each user that is logged on.
For the Linux VDA, the Broker Agent needs access to the Kerberos credentials of the user for impersonation during session validation. This requires that the system environment be configured to cache Kerberos credentials in a simple flat file format. If the Broker Agent cannot access the cache file for the user of a session, the “HDX Session Validation Failure” dialog is shown immediately after session launch, and the session is terminated 30 seconds later.
Resolving session validation failures
There are a few diagnostic checks you can perform to pinpoint the cause of session validation failure. The checks I have outlined below are for a RHEL7 machine joined to AD with Winbind. You will find that the checks to perform for Centrify DirectControl or Quest are similar.
Diagnosing your session environment
I recommend running the following commands in an HDX session that has failed session validation, before it is forcibly logged out. Given the 30-second timeout, I usually take a screenshot of the results.
- List the cached Kerberos tickets for your user account by running:
- Search /tmp for possible credential cache files belonging to your user account by running:
- Check if the KRB5CCNAME environment variable has been set by running:
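The three checks above can also be bundled into one helper so that a single command captures everything within the 30 seconds. This is only a sketch: the function name `collect_krb_diag` is hypothetical, and it assumes file-based caches follow the common `/tmp/krb5cc_*` naming.

```shell
# Sketch: gather the three diagnostics in one go.
# Assumptions: klist (MIT Kerberos) is on the PATH, and file-based
# credential caches live in /tmp with the common krb5cc_ prefix.
collect_krb_diag() {
    echo "== klist =="
    if command -v klist >/dev/null 2>&1; then
        klist 2>&1 || true          # keep going even if no cache is found
    else
        echo "klist not installed"
    fi

    echo "== credential cache files in /tmp owned by uid $(id -u) =="
    find /tmp -maxdepth 1 -name 'krb5cc_*' -user "$(id -u)" \
        -exec ls -l {} \; 2>/dev/null || true

    echo "== KRB5CCNAME =="
    echo "${KRB5CCNAME:-<not set>}"
}
```

Running `collect_krb_diag > ~/krb-diag.txt 2>&1` keeps the results in your home directory after the forced logout, which is less fiddly than a screenshot.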
If the timeout proves to be too short, you can run the same commands for the same user account over Secure Shell (SSH) or another remote login program. With the exception of the KRB5CCNAME environment variable, the results should be the same as for HDX.
Analysing the results
Check for expired tickets
Let’s start by looking at the output from “klist”. Here is the output from my working RHEL7 machine. If “klist” has successfully displayed the ticket cache, check the expiry time of your ticket.
If your ticket has expired, Winbind has failed to refresh your ticket cache. In this case:
- Check that /etc/samba/smb.conf has:
- Delete the existing ticket cache file:
Often this is sufficient to fix the problem. The cache file will be created again on the next login.
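If you would rather not read expiry times by eye, MIT Kerberos `klist` has a silent mode, `klist -s`, which prints nothing and exits non-zero when the cache cannot be read or has expired. A minimal sketch built on that flag follows; the function name `ticket_status` is hypothetical.

```shell
# Sketch: report the ticket cache state using the exit status of "klist -s".
# Assumption: the MIT Kerberos klist is installed (the default on RHEL7).
ticket_status() {
    if ! command -v klist >/dev/null 2>&1; then
        echo "unknown: klist not found"
    elif klist -s; then
        echo "valid"                 # cache readable and not expired
    else
        echo "missing or expired"    # klist -s cannot tell these two apart
    fi
}
```

Because `klist -s` does not distinguish a missing cache from an expired one, a "missing or expired" result still needs a plain `klist` to see which case you are in.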
Check for missing ticket cache
With session validation failure, I often see “klist” fail with an error message saying that the cache file could not be found.
This error is generally caused by incorrect Kerberos or AD integration settings on your machine. I’ll address this in greater detail later in this blog.
Check for an inaccessible ticket cache
On rare occasions “klist” fails because the user does not have permission to access their own cache file.
This error can be caused by an administrator running:
This creates a cache file belonging to root for the specified user. To rule out this error, confirm that the cache file belongs to the user. This is the reason for running “ls” earlier.
Delete the ticket cache file if you find it belongs to another user. This is usually sufficient to correct the problem and the cache file will be created again on the next login.
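The ownership check can be scripted as well. The sketch below assumes GNU `stat` (as shipped with RHEL7), and the function name `cache_owned_by_me` is hypothetical.

```shell
# Sketch: succeed only if the given cache file belongs to the current user.
# Assumption: GNU coreutils stat, which supports "-c %u" (numeric owner id).
# Exit status: 0 = owned by me, 1 = owned by someone else, 2 = no such file.
cache_owned_by_me() {
    [ -f "$1" ] || return 2
    [ "$(stat -c %u "$1")" = "$(id -u)" ]
}
```

For example, `cache_owned_by_me "/tmp/krb5cc_$(id -u)" || echo "check ownership"` flags a cache left behind by root, assuming the cache uses that common filename format.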
Check whether the KRB5CCNAME environment variable is set
The KRB5CCNAME environment variable is usually set by a Pluggable Authentication Modules (PAM) “aware” program on successful login by a user. This environment variable should refer to the ticket cache file of the user:
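The value has the form `TYPE:residual`, and for a file cache it typically looks like `FILE:/tmp/krb5cc_<uid>`. As a rough sketch, the two parts can be split with plain shell parameter expansion; the helper names `ccache_type` and `ccache_path` are hypothetical, and a value with no prefix is treated as FILE, which is how MIT Kerberos interprets it.

```shell
# Sketch: split a KRB5CCNAME value into its cache type and residual part.
# Assumption: a value without a "TYPE:" prefix is a plain file path.
ccache_type() {
    case "$1" in
        "")  echo "unset" ;;
        *:*) echo "${1%%:*}" ;;      # text before the first colon
        *)   echo "FILE" ;;
    esac
}

ccache_path() {
    case "$1" in
        *:*) echo "${1#*:}" ;;       # text after the first colon
        *)   echo "$1" ;;
    esac
}
```

Calling `ccache_type "$KRB5CCNAME"` in a failing session quickly shows whether you have a FILE cache or something else, such as KEYRING.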
The Linux VDA is PAM “aware” and retrieves all PAM related environment variables on a successful login. This includes the KRB5CCNAME environment variable. However, there are cases where PAM will fail to set the KRB5CCNAME environment variable after the user has been successfully authenticated. This occurs when the clock skew between the Linux VDA machine and the AD server becomes excessive.
Here is an excerpt from pam_winbind.conf which mentions the clock skew in the description of the “krb5_auth” option:
pam_winbind can authenticate using Kerberos when winbindd is talking to an Active Directory domain controller. Kerberos authentication must be enabled with this parameter. When Kerberos authentication can not succeed (e.g. due to clock skew), winbindd will fallback to samlogon authentication over MSRPC. When this parameter is used in conjunction with winbind refresh tickets, winbind will keep your Ticket Granting Ticket (TGT) uptodate by refreshing it whenever necessary. Defaults to “no”.
With the fallback to “samlogon” authentication, the Kerberos credential cache file becomes irrelevant and therefore there is no need to set the KRB5CCNAME environment variable. The v1.1 release of the Broker Agent relies on this variable to find the ticket cache file. Without the variable the Broker Agent will resort to opening the cache file assuming it has a well-known filename format. This is not always successful.
Generally, if your machine suffers from clock skew, the Broker Agent will not register with the Broker. However, I have seen rare cases where registration has succeeded in spite of clock skew. If you suspect you have encountered this, you can confirm that Winbind is failing Kerberos authentication by enabling debug logging for PAM. Modify the /etc/pam.d/password-auth file (which is included by /etc/pam.d/ctxhdx) by appending “debug” to the “auth” line for the “pam_winbind.so” module:
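A minimal sketch of that edit, assuming the line in question starts with `auth` and names `pam_winbind.so`, and that GNU sed is available. The function name `add_pam_debug` is hypothetical. It writes the modified file to standard output rather than editing in place, so you can review the result before touching the real PAM configuration.

```shell
# Sketch: print the given PAM file with "debug" appended to the
# auth line for pam_winbind.so. Lines that already carry the option
# are left alone, so running it twice is harmless.
# Assumption: GNU sed (for -E and the character classes used here).
add_pam_debug() {
    sed -E '/^auth[[:space:]].*pam_winbind\.so/{
/[[:space:]]debug([[:space:]]|$)/!s/[[:space:]]*$/ debug/
}' "$1"
}
```

Comparing the output with the original, for example `add_pam_debug /etc/pam.d/password-auth | diff /etc/pam.d/password-auth -`, should show exactly one changed line.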
When PAM fails to set the KRB5CCNAME environment variable due to clock skew, you will see a message as follows in either /var/log/secure or /var/log/messages:
Refer to the “Configure Clock Synchronisation” and “Fix Time Synchronisation” subsections of the install guide to resolve this clock skew issue.
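For a sense of what “excessive” means: Kerberos tolerates a skew of 5 minutes (300 seconds) by default. The sketch below compares two epoch timestamps against that tolerance. The function name `skew_exceeds` is hypothetical, and how you obtain the domain controller's time is deliberately left open.

```shell
# Sketch: succeed if two epoch timestamps differ by more than the tolerance.
# $1 = local time, $2 = reference time (e.g. from the AD server),
# $3 = tolerance in seconds (defaults to the Kerberos default of 300).
skew_exceeds() {
    diff=$(( $1 - $2 ))
    if [ "$diff" -lt 0 ]; then
        diff=$(( 0 - diff ))         # absolute value
    fi
    [ "$diff" -gt "${3:-300}" ]
}
```

With the reference time in a variable, `skew_exceeds "$(date +%s)" "$dc_epoch" && echo "clock skew too large"` gives a quick yes or no.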
Earlier I mentioned that missing ticket cache files are often the result of incorrect Kerberos or AD integration settings on the Linux VDA machine. Besides reviewing the install guide instructions, I recommend that you run Linux XDPing to identify the misconfigured settings. The culprit is usually a misconfigured KRB5CCNAME type setting. This will be highlighted by XDPing as shown below:
If this error occurs you will need to check a few settings, especially the KRB5CCNAME type for Kerberos and the AD integration tool configured on your Linux VDA machine. On my Winbind RHEL7 setup I usually perform the following:
- Ensure that Kerberos tickets are verified using both the secrets TDB file and system keytab. This involves the “kerberos method” configuration setting in /etc/samba/smb.conf.
As shown above, the “authconfig” tool generated this section of the file. You will find that this includes the “kerberos method” setting. I recommend moving “kerberos method” out of the generated section as I have done to avoid inadvertently changing it when running “authconfig” again.
- Ensure the “krb5_auth” and “krb5_ccache_type” configuration settings are correct in /etc/security/pam_winbind.conf. In particular:
  - “krb5_auth” should be set to “yes”.
  - “krb5_ccache_type” should request a “FILE” krb5 credential cache type. On newer Linux platforms capable of using kernel keyrings to implement credential caches, this setting may be defaulted to “KEYRING” by the vendor.
- Ensure the “default_ccache_name” configuration setting in /etc/krb5.conf does not specify a credential cache type that conflicts with the one in /etc/security/pam_winbind.conf. I recommend either deleting this setting if it is specified or making it match pam_winbind.conf.
- Finally, remember to restart the Winbind service if you have made any configuration changes:
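Before restarting, the settings in the list above can be spot-checked with a small helper. This is a sketch only: `check_setting` is a hypothetical name, it does a simple case-sensitive `name = value` match, and it uses the last uncommented occurrence of the setting. On a systemd host such as RHEL7, the restart itself would typically be `systemctl restart winbind`.

```shell
# Sketch: verify that a "name = value" setting in a config file has the
# expected value. Commented-out lines (starting with ; or #) are ignored
# because the setting name must be the first non-blank text on the line.
# $1 = config file, $2 = setting name, $3 = expected value
check_setting() {
    actual=$(sed -n -E \
        "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*(.*[^[:space:]])[[:space:]]*\$/\1/p" \
        "$1" | tail -n 1)
    if [ "$actual" = "$3" ]; then
        echo "OK   $2 = $actual"
    else
        echo "FAIL $2 = ${actual:-<unset>} (expected $3)"
        return 1
    fi
}

# Example calls for the settings discussed above:
#   check_setting /etc/samba/smb.conf "kerberos method" "secrets and keytab"
#   check_setting /etc/security/pam_winbind.conf krb5_auth yes
#   check_setting /etc/security/pam_winbind.conf krb5_ccache_type FILE
```

A FAIL line for “krb5_ccache_type” showing KEYRING is the vendor default mentioned above and is worth fixing first.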
Well, that’s all for troubleshooting session validation failures in the Linux VDA. I trust you now have some insight into resolving validation failures if you ever encounter them in the wild.
To read more from the Linux Virtual Desktop Team, be sure to check out all of our posts here.