Reports in Citrix SCOM Management Packs use the Operations Manager data warehouse database (OperationsManagerDW) as their data source. You might need to adjust how long that data is retained to suit a specific environment. OperationsManagerDW can be one of the largest databases in a deployment, so retaining its data, especially redundant data, can be very costly.
Today, let’s talk about how configuring aggregation retention times in OperationsManagerDW can remove unnecessary data and help SCOM maintain performance.
OperationsManagerDW is a Microsoft SQL Server database that stores monitoring and alerting data for historical purposes. Its most important function is reporting, which is possible because the data is kept for a long time. Performance and event data can be stored in both the operational and data warehouse databases, but only the data warehouse database retains it for extended periods. Custom data is stored only in the data warehouse database.
Data aggregation and grooming
Citrix SCOM Management Packs first store raw data in OperationsManagerDW, by default every 5 minutes. Raw data is then processed and stored as aggregated hourly and daily data. Hourly aggregation takes all the samples from the last hour, calculates the minimum, maximum, average and standard deviation values and then stores them in a separate hourly aggregation table. Daily aggregation also takes raw samples, but from the last 24 hours.
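To make the hourly aggregation step concrete, here is a minimal Python sketch of the four values calculated from one hour of raw samples. The sample values and the `aggregate` function are hypothetical, and the use of population standard deviation is an assumption; the exact formula SCOM uses is not stated here.

```python
from statistics import mean, pstdev

def aggregate(samples):
    """Collapse raw samples into the four values an aggregation row stores."""
    return {
        "min": min(samples),
        "max": max(samples),
        "avg": mean(samples),
        # Assumption: population standard deviation; SCOM's formula may differ.
        "stddev": pstdev(samples),
    }

# Twelve hypothetical CPU usage samples: one hour at a 5-minute interval.
cpu_samples = [10, 20, 30, 40, 50, 60, 10, 20, 30, 40, 50, 60]
hourly = aggregate(cpu_samples)
print(hourly)
```

A daily aggregation would apply the same idea to the raw samples from the last 24 hours, that is, 288 samples at a 5-minute interval.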
SCOM stores collected data in datasets. Each dataset can contain different types of data (for example, performance, alert, event, state) at different granularities (raw, hourly, daily). Citrix SCOM Management Pack for XenApp and XenDesktop also contains custom datasets that you can use for machine and session monitoring. This section explains how you can configure custom aggregation.
Raw performance data, used for various reports, is stored for 10 days by default. You might want to keep it longer in case aggregations are failing, or get rid of it earlier because of the number of raw samples and the related storage cost. In either case, you can change the retention period with the following procedures.
Here is a table of the default dataset retention times for XenApp and XenDesktop’s custom data aggregation that is used for machine and session performance:
|Data set|Aggregation type|Days to keep|
|MPXAXD Machine DataSet|Raw data|10|
|MPXAXD Machine DataSet|Hourly aggregations|400|
|MPXAXD Machine DataSet|Daily aggregations|400|
|MPXAXD Session DataSet|Raw data|10|
|MPXAXD Session DataSet|Hourly aggregations|400|
|MPXAXD Session DataSet|Daily aggregations|400|
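As a rough illustration of what these defaults mean, the following Python sketch counts how many rows one metric on one monitored object would keep at steady state. It assumes exactly one row per 5-minute sample, per hour, and per day; the names are hypothetical, and row counts are only a proxy for storage because row sizes differ between tables.

```python
SAMPLE_INTERVAL_MINUTES = 5  # default raw collection interval

# Default "days to keep" values for the custom datasets.
RETENTION_DAYS = {"raw": 10, "hourly": 400, "daily": 400}

# Rows written per day for one metric on one object (assumption: one row
# per collection interval or aggregation period).
ROWS_PER_DAY = {
    "raw": 24 * 60 // SAMPLE_INTERVAL_MINUTES,
    "hourly": 24,
    "daily": 1,
}

def retained_rows(retention_days):
    """Rows kept at steady state, once grooming removes data past its age limit."""
    return {kind: ROWS_PER_DAY[kind] * days for kind, days in retention_days.items()}

rows = retained_rows(RETENTION_DAYS)
print(rows)  # {'raw': 2880, 'hourly': 9600, 'daily': 400}
```

Under these assumptions, hourly aggregations account for most of the retained rows, even though raw data is written far more often.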
Running this query on OperationsManagerDW retrieves all the available datasets:
select * from Dataset
order by DatasetDefaultName
The most common datasets are Alert, Event, Performance, and State.
To retrieve all the available datasets together with the period for which the data is stored, run the following query:
select ds.DatasetId, ds.DatasetDefaultName, sda.AggregationTypeId, sda.MaxDataAgeDays
from StandardDatasetAggregation sda
inner join DataSet as ds on ds.DatasetId = sda.DatasetId
The resulting columns are:
- DatasetId: GUID of the dataset that you are going to use to determine which entry to configure
- DatasetDefaultName: Display name of the data set
- AggregationTypeId: Type of aggregation (explained in the following table)
- MaxDataAgeDays: Period (in days) for which data is retained for the specific dataset aggregation
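|AggregationTypeId|Aggregation type|
|0|Raw data|
|20|Hourly aggregations|
|30|Daily aggregations|
|100+|Our custom raw data|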
You can change the MaxDataAgeDays value for all dataset aggregations that have a type ID of 0, 20 or 30. Entries with a type ID of 100 or more are not used for aggregation, so their MaxDataAgeDays value only determines how long the data is retained. To change the grooming settings for selected datasets, go to the Object Explorer pane, navigate to OperationsManagerDW > Tables > dbo.StandardDatasetAggregation, and select Edit table. Look up the exact DatasetId in the results of the previous query and set MaxDataAgeDays to the value you want. Make sure to follow some general rules, depending on the actual configuration of your environment:
- Keep daily aggregations for at least as many days as hourly aggregations
- Don’t modify settings if you don’t really need to
- Do not modify the Configuration dataset
- Do not modify the performance RAW dataset unless it is absolutely necessary
- Keep data retention aligned with your reporting requirements
- Hourly datasets use 24 times more space than daily, so use daily datasets for longer periods
- You can groom events to less than the default 100 days if you do not use them in reports
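The same change can also be expressed as an UPDATE statement. The Python sketch below runs it against an in-memory SQLite table that stands in for dbo.StandardDatasetAggregation, so it can be tried without touching a real data warehouse. The stand-in table has only the three columns discussed here, the dataset ID is made up, the `set_retention` helper is hypothetical, and the mapping of type IDs 0, 20 and 30 to raw, hourly and daily is an assumption. Treat it as an illustration of the rule that daily retention should not be shorter than hourly retention, not as a supported procedure.

```python
import sqlite3

RAW, HOURLY, DAILY = 0, 20, 30  # assumed meaning of the aggregation type IDs

def set_retention(conn, dataset_id, aggregation_type_id, days):
    """Set MaxDataAgeDays for one dataset aggregation, keeping daily >= hourly."""
    current = dict(conn.execute(
        "SELECT AggregationTypeId, MaxDataAgeDays FROM StandardDatasetAggregation "
        "WHERE DatasetId = ?", (dataset_id,)))
    if aggregation_type_id not in current:
        raise KeyError("no such dataset aggregation")
    proposed = {**current, aggregation_type_id: days}
    if HOURLY in proposed and DAILY in proposed and proposed[DAILY] < proposed[HOURLY]:
        raise ValueError("daily retention must not be shorter than hourly retention")
    conn.execute(
        "UPDATE StandardDatasetAggregation SET MaxDataAgeDays = ? "
        "WHERE DatasetId = ? AND AggregationTypeId = ?",
        (days, dataset_id, aggregation_type_id))
    conn.commit()

# In-memory stand-in for OperationsManagerDW with a hypothetical dataset ID.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE StandardDatasetAggregation "
             "(DatasetId TEXT, AggregationTypeId INTEGER, MaxDataAgeDays INTEGER)")
dataset_id = "00000000-0000-0000-0000-000000000001"
conn.executemany(
    "INSERT INTO StandardDatasetAggregation VALUES (?, ?, ?)",
    [(dataset_id, RAW, 10), (dataset_id, HOURLY, 400), (dataset_id, DAILY, 400)])

# Lower hourly retention from 400 to 100 days; daily stays at 400.
set_retention(conn, dataset_id, HOURLY, 100)
```

Filtering on both DatasetId and AggregationTypeId matters, because each dataset has one row per aggregation type and an UPDATE on DatasetId alone would change all of them.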
The rest of this blog post presents data about the storage consumed by the different custom datasets with data retention set to 100 days and 400 days. In the following examples, we lower only the hourly aggregation retention to 100 days, because it has the biggest impact on the database size.
The machine dataset
The machine dataset is used to store machine discovery and machine performance in Machine, MachinePerformanceDaily, MachinePerformanceHourly and MachinePerformanceRaw tables. Raw performance is collected every 5 minutes with 3 different custom rules. There are different metrics that are collected, for example, CPU usage, memory usage, network read/write, disk read/write, and so on.
This example estimates how much storage space machine data occupies for 200 server OS machines. At first glance, daily aggregated data takes the least space. Over a longer period, raw data has less of an impact, because it gets groomed (entries are deleted in parallel with aggregation). Decreasing the hourly data age can make a significant difference in large environments, if you do not require detailed performance reports for more than 100 days.
The session dataset
The session dataset contains data about the sessions that were opened on applications or desktops. Session discovery data is inserted for each session that is started. The agent checks for new sessions every 5 minutes and updates the data if it finds any, or if any existing sessions have been logged off. Session performance is collected every 5 minutes, but only if there are sessions running.
The session dataset contains the following tables:
|Session performance raw data|10|
|Session performance hourly aggregations|400|
|Session performance daily aggregations|400|
|Connections and logons|400|
Let’s take as an example an environment with an average of 5,000 concurrent application sessions per day and 5,000 desktop sessions per day, with an average session time of 8 hours per day. The graph shows how much space the session dataset occupies in OperationsManagerDW. After a few days, growth becomes linear until day 400, after which the database size remains approximately the same. The size increase per day is around 125 MB without grooming.
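To get a feel for why the session dataset depends so strongly on the number of sessions, the following Python sketch estimates the number of raw session performance collections per day for this example. It assumes every session is sampled once per 5-minute interval for its whole duration; the function name is hypothetical, and the result is a count of collections, not a storage size.

```python
SAMPLE_INTERVAL_MINUTES = 5  # session performance collection interval

def raw_samples_per_day(sessions_per_day, avg_session_hours):
    """Collections per day, assuming one per session per interval."""
    samples_per_session = avg_session_hours * 60 // SAMPLE_INTERVAL_MINUTES
    return sessions_per_day * samples_per_session

app_samples = raw_samples_per_day(5000, 8)      # application sessions
desktop_samples = raw_samples_per_day(5000, 8)  # desktop sessions
total_samples = app_samples + desktop_samples
print(total_samples)  # 960000
```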
The session dataset depends mostly on the number of sessions. In large environments, keeping performance data may become a costly policy—this is why collecting session performance data is optional. If you wish to collect session performance data, you need to install the product’s Machine Agent. Changing hourly aggregation retention time to 100 days drops the dataset size to approximately 2,800 MB, which is 2 GB less than before.
The performance dataset
The performance dataset is the default SCOM dataset which is used to store performance data. This dataset has PerfDaily, PerfHourly and PerfRaw tables by default. There are around 40 rules in Citrix SCOM Management Pack for XenApp and XenDesktop that collect different performance data for logons, sessions, server load, and so on, and add it to the performance dataset. Rules collect performance data every 5 minutes and aggregate hourly and daily. This is an example for 1 site with 3 controllers, 20 server OS and 20 desktop OS delivery groups.
All examples in this post use either the default or a lowered hourly retention time. In summary, the Citrix SCOM Management Pack datasets require approximately 6,750 MB with the default settings in the environment described in the examples. Changing the hourly retention time to 100 days for all of the custom datasets would save at least 3 GB.