At ExtraHop, our engineers and solutions architects work with many Citrix customers, helping them to troubleshoot and tune their growing desktop and application virtualization deployments.
When we’re assessing the performance of a Citrix deployment, we always come back to five key metrics that characterize the performance of the Citrix environment. We call these metrics “The Five Ls”:
- Launches
- Logon Times
- Load Times
- Latency
- ChanneL (Yes, including Channels is cheating, but mnemonics are worth it!)
In this blog post, we’ll discuss how the Five Ls impact the performance and delivery of your Citrix applications and what actions you should take to improve performance in those areas. Special thanks to Ken Avram who originally compiled this content!
1. Launches
A Citrix Launch is simply a user starting a Citrix session. By analyzing data in flight, including ICA communications, the ExtraHop platform breaks down launches into the following categories:
- Normal launches
- Session Sharing (the reuse of an existing session when the second application lives on the same XenApp server as the first application launched via that session)
- Slow launches (those taking longer than 30 seconds, although this threshold can be changed)
Questions you can answer:
- Which users are experiencing slow launches?
- Which servers are producing slow launches?
- Which applications are producing slow launches?
- Is there a specific geo-location (subnet) that is experiencing slow launches?
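The categories and questions above can be sketched as a small aggregation. This is only an illustration, not how the ExtraHop platform works internally: the `Launch` record, its field names, and the sample data are hypothetical, and treating the three categories as mutually exclusive is an assumption. Only the 30-second default comes from the text.

```python
from collections import Counter
from dataclasses import dataclass

SLOW_LAUNCH_SECONDS = 30.0  # default threshold from the text; adjust as needed


@dataclass
class Launch:
    # Hypothetical record; the field names are assumptions for illustration.
    user: str
    server: str
    application: str
    subnet: str
    duration_seconds: float
    shared_session: bool = False


def classify(launch, threshold=SLOW_LAUNCH_SECONDS):
    """Return 'session_sharing', 'slow', or 'normal' for one launch.

    Assumption: a launch that reuses a session is reported as session
    sharing regardless of its duration.
    """
    if launch.shared_session:
        return "session_sharing"
    if launch.duration_seconds > threshold:
        return "slow"
    return "normal"


def slow_launch_counts(launches, key, threshold=SLOW_LAUNCH_SECONDS):
    """Count slow launches grouped by an attribute such as 'user' or 'subnet'."""
    return Counter(
        getattr(item, key)
        for item in launches
        if classify(item, threshold) == "slow"
    )


launches = [
    Launch("alice", "xa01", "Outlook", "10.1.0.0/24", 12.0),
    Launch("bob", "xa02", "SAP", "10.2.0.0/24", 45.0),
    Launch("carol", "xa02", "SAP", "10.2.0.0/24", 38.5),
    Launch("alice", "xa01", "Excel", "10.1.0.0/24", 2.0, shared_session=True),
]

for key in ("user", "server", "application", "subnet"):
    print(key, dict(slow_launch_counts(launches, key)))
```

Grouping the same slow launches by user, server, application, and subnet in turn is what lets you tell a single bad server apart from a site-wide problem.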
2. Load Times
The Citrix Load Time metric shows the time from when a user clicks on (launches) an application until the ICA Server presents the application to the user for use. You can use this metric in conjunction with Logon Times (see below) to narrow down the culprit.
Possible causes for poor load time performance:
- Roaming profiles – Profiles can be stored on a server that is slow or overloaded. The first thing to check is CPU and memory. If these seem normal, then disk latency could be the issue. If the profile is stored on a local disk, the best practice is to run perfmon (a built-in Windows utility) and look at the disk queue counter. Any disk queue over two indicates a disk bottleneck. If the storage is remote, check the IOPS load on the remote storage. Also, be aware that, by default, Windows only allows 50 UNC handles to be open at one time; if more than that try to access the UNC, it will queue up the requests. This can show up during boot storms, such as when everyone logs on at the same time in the morning. There is a registry setting that can increase this limit. Some of this can be remediated by using a professional profile manager; Citrix includes its own User Profile Manager that works much better than the built-in Microsoft one.
- Redirected folders – It is important to check the event viewer of the XenApp/XenDesktop machine for any errors with Redirected Folders. Bad policies that deal with redirected folders can cause a lot of heartburn and timeouts. Redirected folders are used to speed up logon times by storing documents on a “share” rather than copying them over and back between logons and logoffs. If this is not configured correctly, it can really slow down logon times. As stated above, professional profile managers can assume this role, as can Citrix User Profile Manager.
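The disk queue rule of thumb above is easy to apply to a series of samples. The sketch below assumes you have already collected disk queue length readings as plain numbers (for example, copied out of perfmon); the function name and the sample values are hypothetical.

```python
DISK_QUEUE_LIMIT = 2.0  # rule of thumb from the text: a queue over two is suspect


def disk_queue_report(samples, limit=DISK_QUEUE_LIMIT):
    """Summarize disk queue length samples against the limit."""
    if not samples:
        raise ValueError("no samples provided")
    over = [value for value in samples if value > limit]
    return {
        "samples": len(samples),
        "over_limit": len(over),
        "peak": max(samples),
        "bottleneck_suspected": len(over) > 0,
    }


# Hypothetical readings taken during a morning logon storm.
samples = [0.4, 1.1, 3.7, 2.0, 5.2]
print(disk_queue_report(samples))
```

A single spike may be harmless, so in practice you would probably look at how many samples exceed the limit rather than at the peak alone.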
3. Logon Times
The Citrix Logon Time metric refers specifically to the authentication portion of a launch. Slow logon times are likely due to Active Directory or DNS issues.
- Active Directory – Several Active Directory issues can affect logon times. When a logon commences, it queries Active Directory for a logon server defined in Active Directory Sites and Services. If this is configured incorrectly, you may get a logon server clear across the country (or the world, if you have remote sites that far away) that has terrible latency, so having this configured correctly is crucial. Also, check the event logs on Active Directory controllers for replication errors. If some controllers show these errors and you happen to hit one of them as a logon controller, the logon itself may fail. These errors need to be fixed immediately so that logons are consistent.
- Active Directory Group Policy – This is one of the most misunderstood components of Active Directory and is usually fraught with issues (precedence processing, overrides, competing policies, policies unable to load, etc.). These issues can be difficult to find, which is why it is recommended to have a completely separate test environment in order to ferret them out before going into production. Two tools for diagnosing this are the RSoP (Resultant Set of Policy) tool and the group policy wizard. These tools can be used to verify group policy issues but will only be useful to admins who understand how group policy works in the first place.
- DNS – DNS can be the cause of many common Citrix logon issues since so many processes depend on DNS for resolution. If your DNS isn’t working at 100% you’ll see tons of red herrings.
- DHCP – This subsystem is often ignored and undermanaged, and it may be to blame for some Citrix logon problems.
- LDAP – Failed LDAP authentication can cause Citrix logon issues, too. This is a good place to check if you’re having Citrix logon troubles.
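Since so many logon steps depend on name resolution, a quick way to rule DNS in or out is to time lookups for the hosts a logon touches. The sketch below uses Python's standard `socket.getaddrinfo`; the 0.5-second limit and the host list are assumptions for illustration, and `localhost` is only a placeholder for your own domain controllers and Citrix servers.

```python
import socket
import time


def time_lookup(hostname, resolver=socket.getaddrinfo):
    """Return (elapsed_seconds, error); error is None when resolution succeeds."""
    start = time.perf_counter()
    try:
        resolver(hostname, None)
        error = None
    except socket.gaierror as exc:
        error = str(exc)
    return time.perf_counter() - start, error


def slow_or_failed(hostnames, limit_seconds=0.5, resolver=socket.getaddrinfo):
    """Return the hostnames whose lookup failed or exceeded limit_seconds."""
    problems = []
    for name in hostnames:
        elapsed, error = time_lookup(name, resolver)
        if error is not None or elapsed > limit_seconds:
            problems.append((name, elapsed, error))
    return problems


# Placeholder host; replace with the names your logon path depends on.
print(slow_or_failed(["localhost"]))
```

Run it from the XenApp/XenDesktop machine itself, since that is the resolver path the logon actually uses.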
4. Latency
There are two types of latency that you should be tracking, and both are monitored by the ExtraHop platform:
- Network Latency: This is reported by observing a specific ICA packet from the client that contains latency information. This measure of latency is calculated by the Citrix solution.
- Client Latency: This is reported by observing a packet from the client on the End-User Experience Monitoring (EUEM) virtual channel reporting the result of a single ICA round-trip measurement. This is only reported if the EUEM beacon is turned on. In many environments, EUEM will not be enabled.
For practical purposes, you should focus on Network Latency as this will be the one reported to Citrix Director as the user experience. Latency is one of those issues whose cause is quite hard to narrow down because of all the interdependencies that are involved with network transport. Here, we attempt to list the dependencies that are known to plague Citrix environments most often. (Many of these issues require you to “prove the negative” and show that the root cause is not the Citrix environment.)
Possible causes for latency:
- Bad network switch or bad switch configuration
- Bad cable
- Switch ports left to auto-negotiate instead of being set to a fixed speed
- Sites and Services incorrectly configured (see above under Active Directory)
- Users running large print jobs (a Citrix policy can be configured to throttle print jobs at remote sites)
- Users using bandwidth-intensive applications (YouTube, watching History Channel, etc.)
- Users copying large files, especially remote users
- Citrix Policies not tuned for remote users
- Applications running slowly. This is not a latency issue but can appear to be; the user gets the impression that the network is slow. Poor application performance is a whole different conversation but could be caused by a machine that is starved for CPU or memory.
- Mismatched networks, e.g. 100Mb going to 1Gb or vice versa.
- IOPS-starved backend storage. This will be especially apparent when using XenDesktop, because it requires many more backend storage resources than XenApp. This is not a latency issue but can appear as such.
5. Channels
“Channels” refers to Citrix Virtual Channels. By observing activities on the Citrix channel, you can derive a wealth of information about what users (such as Dr. Ken Pickles) are doing. Things to look for:
- Which channels take up the most network bandwidth?
- Screen updates are going to take a majority of bandwidth
- Printing is typically the second highest bandwidth consumer
- Audio usually comes up third on the overall bandwidth consumption scale
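To answer the bandwidth question for your own environment, you can total the bytes observed per channel and rank them. The sketch below assumes per-channel byte counts are already available as `(channel, bytes)` pairs; the channel labels and numbers are made up for illustration and are not real Citrix channel identifiers or measurements.

```python
from collections import defaultdict


def rank_channels(records):
    """Total bytes per channel and return (channel, bytes, share), largest first."""
    totals = defaultdict(int)
    for channel, byte_count in records:
        totals[channel] += byte_count
    grand_total = sum(totals.values())
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return [
        (name, count, count / grand_total if grand_total else 0.0)
        for name, count in ranked
    ]


# Hypothetical observations: (channel label, bytes seen in one interval).
records = [
    ("screen", 700_000),
    ("printing", 150_000),
    ("audio", 90_000),
    ("screen", 300_000),
    ("printing", 100_000),
    ("clipboard", 10_000),
]

for name, count, share in rank_channels(records):
    print(f"{name:10s} {count:>10d} bytes  {share:6.1%}")
```

If the ranking in your data departs sharply from the usual order of screen updates, printing, then audio, that channel is a good place to start looking.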
Other Areas to Watch
Besides the Five Ls, you will also want to keep an eye on the following areas when troubleshooting and tuning your Citrix environment:
- XML Broker – The Web Interface/StoreFront uses the XML broker to enumerate applications that the user is allowed to access. Some installations have multiple XML brokers, but most Citrix installations only need two for redundancy. XML brokers depend on IIS, so if a patch interferes with IIS, the XML broker will not function correctly.
- Provisioning Servers – Provisioning servers are used to provision additional Citrix servers as needed in the environment. This relies heavily on network resources more than anything to deliver these servers and their states over a very consistent network. Provisioning Servers are usually deployed in a cluster of two but can be up to four for redundancy and resiliency.
- CGP – Common Gateway Protocol includes Session Reliability and is based on the SOCKS proxy protocol. This add-on to the ICA channel was made popular by wireless devices. As wireless devices “roam” from access point to access point, they can “hold” the connection rather than dropping it. With plain ICA, if a connection is momentarily dropped you will see a “connection dropped” message and have to log on to the Citrix session again. CGP uses port 2598.
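When session reliability appears not to work, one basic check is whether the CGP port is reachable from the client side at all. The sketch below is a plain TCP connect test using Python's standard library; the helper name is hypothetical and `127.0.0.1` is a placeholder for one of your Citrix servers. A successful connect only shows that something is listening on the port, not that CGP itself is healthy.

```python
import socket

CGP_PORT = 2598  # port named in the text


def port_reachable(host, port=CGP_PORT, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Placeholder address; substitute a XenApp/XenDesktop host.
print(port_reachable("127.0.0.1"))
```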
Want to learn more about how to use wire data analytics in troubleshooting? Download our Citrix troubleshooting guide to learn how ExtraHop can help.