One of the most substantial additions to Citrix XenDesktop 7 Director is the Dashboard. The Director Dashboard was conceptualized to give the administrator a single-pane view of the current status of a site. In the past, an administrator asked to report on the status of a site had to click through a number of delivery group pages in Citrix Studio to find out something as simple as the total number of sessions connected in a site. Citrix Director’s Dashboard lets administrators see a real-time graphical representation of the health and usage of the site they are monitoring. Administrators can view data from the past hour on user connection failures and machine failures for both VDI (Desktop OS) and RDS (Server OS), as well as the number of connected sessions and the average logon duration of all connections that occurred in the last hour. The Dashboard also shows the status of the hosting infrastructure configured for the site and the state of the services running on the controllers that are part of the site.
When an administrator logs in and a single site is configured to be monitored by Director, the administrator is shown the Dashboard of that site. If Director is configured to manage more than one site, the Dashboard for a site is shown once that site is selected from the site drop-down on the landing page. The Dashboard can be accessed from any other page by clicking the Dashboard button on the Navigation panel.
The Dashboard comprises 3 sections:
- Failure panels – 3 categories of failures: User connection failures, Desktop OS machine failures and Server OS machine failures.
- Sessions Connected and Average Logon Duration graphs.
- Infrastructure panel – comprises the Host and Delivery Controller sections, which show the status of the hosting servers configured for the site and of the delivery controllers that make up the site, respectively.
Note: All graphs show data up to the last completed minute, except the Logon Duration panel, where the data runs up to 3 minutes in the past to allow for collection of data for completed logons.
There are three buttons on the Failure Summary bar of the Dashboard view, one for each of the failure categories. These buttons can be used to slide out the panels and view the graphs. A panel automatically slides out when the first failure in the last 60 minutes occurs and remains open as long as at least one failure has occurred in the last hour. The panels stay open even when the failures are fixed, but the admin can click the buttons in the Failure Summary bar to collapse them. If the admin opens a particular panel manually, the panel will not slide back in unless the button is pressed again or the entire page is refreshed.
Each of the panels is divided into 4 sections:
- Total number of failures – The number of failures of the particular type in the last hour.
- Failure Type breakdown table – Shows the distribution of the failures into the different categories that can occur for the particular failure type.
- Delivery Group breakdown table – Shows the distribution of the failures across the delivery groups that are valid for the failure category. (User connection failures occur for both Desktop and Server OS delivery groups, while machine failures for Desktop and Server OS machines occur for Desktop and Server OS delivery groups respectively.) Note that the table is automatically sorted in descending order of the number of failures per delivery group.
- Failure graph – Shows the distribution of failures over every minute of the last hour. Below each graph is a “View Historical Trend” link, which takes the admin to the Trends page to see past trends for this kind of failure.
If a row in the Failure Type breakdown table is selected, the graph changes to show the data for just that type of failure. If a row in the Delivery Group breakdown table is selected, the graph shows all types of failures for that delivery group.
If the row contains a nonzero value, the administrator can click this value to navigate to the Filters view with the corresponding search criteria selected.
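The two breakdown tables and the row-selection behavior described above can be illustrated with a minimal Python sketch. This is not how Director is implemented; the record layout, the delivery group names, and the helper names `breakdown` and `select_row` are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical failure records: (failure_type, delivery_group).
failures = [
    ("Unregistered", "Win7-Pooled"),
    ("Failed to Start", "Win7-Pooled"),
    ("Unregistered", "Win8-Static"),
    ("Unregistered", "Win7-Pooled"),
]

def breakdown(records, key_index):
    """Count failures per key, sorted in descending order of count."""
    counts = Counter(record[key_index] for record in records)
    # Ties are broken alphabetically so the ordering is deterministic.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

def select_row(records, key_index, value):
    """Keep only the failures that match the selected table row."""
    return [record for record in records if record[key_index] == value]

by_type = breakdown(failures, 0)    # Failure Type breakdown table
by_group = breakdown(failures, 1)   # Delivery Group breakdown table
selected = select_row(failures, 1, "Win7-Pooled")  # data behind the graph for one row
```

Selecting a row simply narrows the set of failures that feeds the graph, which is why the graph can switch between one failure type and one delivery group without any other change.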
Let’s delve into each of the panels’ contents a little more.
User Connection Failures:
The User Connection Failures panel, as the name suggests, reports all failures that occur while attempting to start a session. The graph is a bar graph because a connection failure is an event that occurs at an instant. These events are counted and tracked in the chart on a per-minute basis: each data point is the total number of connection failures in that minute. The number on the left denotes the total connection failures that have occurred over the last hour.
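As a rough illustration of the per-minute counting described above, the following Python sketch buckets failure timestamps into 60 one-minute bars. The function name `failures_per_minute` and the sample timestamps are hypothetical, and the choice to exclude the current, incomplete minute is an assumption based on the note that graphs show data up to the last completed minute.

```python
from collections import Counter
from datetime import datetime, timedelta

def failures_per_minute(timestamps, now):
    """Return 60 per-minute counts, oldest first, ending at the last completed minute."""
    end = now.replace(second=0, microsecond=0)  # start of the current (incomplete) minute
    start = end - timedelta(minutes=60)
    counts = Counter(
        ts.replace(second=0, microsecond=0)     # truncate each event to its minute
        for ts in timestamps
        if start <= ts < end
    )
    return [counts[start + timedelta(minutes=i)] for i in range(60)]

now = datetime(2013, 10, 1, 12, 0, 30)
events = [
    datetime(2013, 10, 1, 11, 59, 5),
    datetime(2013, 10, 1, 11, 59, 40),
    datetime(2013, 10, 1, 11, 30, 0),
    datetime(2013, 10, 1, 12, 0, 10),   # current minute, not yet complete
    datetime(2013, 10, 1, 10, 59, 59),  # older than the 60-minute window
]
bars = failures_per_minute(events, now)
total = sum(bars)  # the number shown on the left of the panel
```

Each entry in `bars` corresponds to one bar in the graph, and their sum corresponds to the hourly total.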
This failure category can be broken down into 5 types of failures:
- Client Connection Failures – Failures due to the inability of the client side to complete the session connection. For example, connection timed out, server was not reachable.
- Configuration Errors – Failures due to configuration done by administrators such as putting the Delivery Group or a particular machine in Maintenance mode.
- Machine Failures – Failures caused by the failure of the machine that is to launch the session. The details are discussed under the Failed Desktop OS and Failed Server OS machine failure types below.
- Unavailable Capacity – Failures due to the configured capacity of a particular delivery group having been completely consumed. For example, too many users are logged into a Server OS delivery group, or a user accesses a Pooled Random delivery group after all the machines in the delivery group have been assigned to other users.
- Unavailable Licenses – Failures when the delivery controller is unable to acquire a license from the license server to launch the session.
See the screenshot above.
Failed Desktop OS Machines
The Failed Desktop OS Machines panel shows the machines in a failure state for the various delivery groups. The number on the left denotes the number of machines in a failed state at the date/time shown below the number. The graph here is a line graph, as it allows the administrator to track the number of machines that were in a failure state over the last hour. Each data point is the total number of machines in a failed state at that point in time (each minute).
This failure category has 3 types of failures:
- Failed to Start – Failures due to a guest machine being unable to start, for example because a disk was detached when attempting to boot or the hosting server reported that the VM could not be booted up.
- Stuck on Boot – Failures due to the guest operating system being unable to boot up fully, even though the machine itself has started. For example, the OS hits a BSOD during boot or is unable to locate the boot partition.
- Unregistered – The machine is unregistered from the Delivery Controller, which can occur due to loss of network connectivity between the two, the clocks on the two being out of sync, or the Desktop Service not running on the desktop.
See the screenshot above.
Failed Server OS Machines
The Failed Server OS Machines panel, just like the Failed Desktop OS Machines panel, shows the failures of RDS machines in the various delivery groups. The number on the left denotes the number of machines in a failed state at the date/time shown below the number. It also has a line graph associated with it, just like the Failed Desktop OS Machines panel above.
This failure category has 4 types of failures:
These are the 3 failure types described under Failed Desktop OS Machines above, plus a 4th type: Maximum Load – Failures due to the RDS machines being over their configured load limit, for example too many sessions, or CPU or memory usage having crossed the threshold specified for the delivery group. (These limits can be adjusted from Citrix Studio.)
Remember that clicking any of the non-zero values in any of the tables described above takes the administrator to the Filters page, where each of the failures is listed and the appropriate filter is already applied based on the value that was selected.
The Sessions Connected graph shows the distribution of the number of concurrent active sessions over all delivery groups in the entire site. Each data point represents the number of active sessions in that minute (this includes RDP sessions as well as console sessions that are active on the machines being monitored by the site).
Average Logon Duration
The Average Logon Duration panel shows the average session logon duration over the last hour as a number, along with a graph of the average session logon time per minute for the last 60 minutes. The chart lags by three minutes so that it only displays completed logons. The number on the left denotes the Average Logon Duration over the last hour. Each data point on the graph is the average for one minute (total time taken by all logons that completed in that minute / number of logons completed during that minute). The graph also shows the number of sessions that completed logon in that minute. See the panel in the above screenshot.
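The averaging described above can be sketched in a few lines of Python. The sample data and the function names `per_minute_average` and `overall_average` are hypothetical, and computing the hourly figure as total logon time divided by the number of logons in the hour is an assumption, not a statement of how Director calculates it.

```python
from collections import defaultdict

# Hypothetical sample: (minute_index, logon_duration_seconds) for completed logons.
logons = [(0, 20.0), (0, 40.0), (1, 30.0), (5, 10.0)]

def per_minute_average(samples):
    """Map each minute to (average duration, number of logons completed in that minute)."""
    totals = defaultdict(lambda: [0.0, 0])
    for minute, duration in samples:
        totals[minute][0] += duration
        totals[minute][1] += 1
    return {minute: (total / count, count) for minute, (total, count) in totals.items()}

def overall_average(samples):
    """Average duration across all completed logons in the window (assumed formula)."""
    if not samples:
        return 0.0
    return sum(duration for _, duration in samples) / len(samples)

per_minute = per_minute_average(logons)  # points on the graph
hourly = overall_average(logons)         # number on the left of the panel
```

Minutes in which no logon completed have no entry in `per_minute`, which avoids dividing by zero for empty minutes.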
For more information about logon duration read the blog post: Director Logon Duration Explained
The Infrastructure panel gives information on the status of the infrastructure of the site being monitored and is made up of 2 sections: Host and Delivery Controller.
The Host table displays the status of the hypervisors that are configured to host VMs of the site, along with any alerts that have been triggered on those hypervisors. (Note: Hypervisor alerts set on Hyper-V servers are not currently supported.)
The 4 types of alerts that are shown in this section are:
- CPU Usage alert – Triggered when CPU usage on a Hypervisor (or one or more of the hosts in a pool) is above a configured threshold.
- Memory Usage alert – Triggered when Memory usage on the Hypervisor (or one or more of the hosts in a pool) is above a configured threshold.
- Network Usage alert – Triggered when Network usage on the Hypervisor (or one or more of the hosts in a pool) is above a configured threshold.
- Disk Usage alert – Triggered when Disk throughput on a storage volume is above a configured threshold.
When an alert is triggered, it is shown in the Status column of the table, as in the screenshot above. The administrator can then click the alert text to open the dialog shown on the right of the above screenshot and see which host generated the alert.
Note: The alert will continue to be displayed until the next alert check time occurs. To learn how to set and manage the alerts, read the blog post: Hypervisor Alerts in Citrix Director
The Delivery Controller table displays the status of the Delivery Controllers that make up the site. For each Delivery Controller, the status of the services running on it is also displayed.
The columns in the table are:
- Delivery Controller: Hostname of the Delivery Controller.
- Status: Status of the Delivery Controller, Online or Offline. Offline means that either the Director server is unable to reach the Delivery Controller or the Broker Service on the Delivery Controller is not running. (The screenshot below shows the state of the panel when the controller is offline.)
- Services: Shows the number of core services that are currently not available, including Citrix AD Identity Service, Broker Service, Central Configuration Service, Hosting Unit Service, Configuration Logging Service, Delegated Administration Service, Machine Creation Service and Monitor Service. Just like the alerts in the Hosts table, the administrator can click the alert text to open a pop-up displaying the name of the service, the time the service failed and the location of that service.
- Site Database: Indicates whether the site database is connected or not (i.e. the Delivery Controller is unable to contact the Site Database, there is an issue with the database configuration, or there is a version mismatch between the database and the service).
- License Server: Indicates whether the License Server configured for the site can be connected to or not (i.e. the Controller is unable to contact the License Server; if they are running on the same machine, the service may be stopped).
- Configuration Logging Database: Indicates whether the Configuration Logging Database is connected or not (i.e. the Citrix Configuration Logging Service on the controller is not running).
- Monitoring Database: Indicates whether the Monitoring Services Database is connected or not (i.e. the Delivery controller is unable to contact the Monitoring Services DB or the Citrix Monitoring Service on the controller is not running).
Note: This blog post applies to the Citrix Director that is part of XenDesktop versions 7 and 7.1.