With the addition of Zones in XenApp/XenDesktop 7.7, I wanted to dig a little more into the impact of latency on brokering performance.

Both Craig and William have talked about zones already, but I’m going to dive into just one area: brokering performance with latency.

For the majority of end users, enumerating and launching resources is something they’ll do every day. With the addition of zones, we’re allowing users to be on higher latency links, so long as there’s a local broker.

With this additional latency, there will, inevitably, be an impact on end user experience. For the majority of work that users will do, they’ll see the expected slowness that’s linked to round-trips between the satellite brokers and the SQL database.

However, for launching apps, there is a pain point in actually brokering sessions. This pain point is due to the need to pick the lowest-loaded VDA on which to launch an app. This occurs within a database transaction and needs a snapshot of all the current loads on the VDAs within the Delivery Group. To achieve this, a lock is taken out on all the workers in the Delivery Group, which stops other users from taking the same locks (i.e. it causes serialization). It also waits on and blocks out worker state changes (e.g. session events).

With low latency, the delay between taking the locks and releasing them is very small. However, as latency increases, so does the time the locks are held, and so the time to broker sessions increases.
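
To make the serialization effect more concrete, here's a rough, illustrative model (not the actual broker code): if the load-balancing lock is held for a few database round trips, the maximum brokering rate scales inversely with RTT, and every launch queued behind the lock waits its turn. The three-round-trip figure below is purely an assumption for illustration.

```python
# Rough model of serialized brokering: while the load-balancing lock is held,
# other launch requests queue behind it, so throughput is capped by how long
# one transaction holds the lock. The "3 round trips under lock" figure is an
# illustrative assumption, not a measured value.
def rough_broker_ceiling(rtt_s, round_trips_under_lock=3):
    """Approximate brokering requests/s when the lock is the bottleneck."""
    return 1.0 / (rtt_s * round_trips_under_lock)

for rtt_ms in (10, 45, 90, 160, 250):
    print(f"{rtt_ms:>3}ms RTT -> roughly {rough_broker_ceiling(rtt_ms / 1000):.1f} requests/s")
```

With that assumption the estimates happen to land close to the measured brokering rates in the higher-latency tables below (roughly 3.7/s at 90ms, 2.1/s at 160ms and 1.3/s at 250ms), while at 10ms other costs dominate and the ceiling isn't reached.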

To back this up, we’ve looked at a variety of latencies and launch rates. The latencies are the round-trip times (RTTs) and were based on Verizon IP Latency Statistics. Note that most real-world RTTs are lower than the maximum values listed, but we wanted to make sure that we were testing with some useful RTTs.

Round-trip times of 10ms cover most intra-country delays. 45ms covers North America, Europe and Intra-Japan; 90ms covers Trans-Atlantic; 160ms covers Trans-Pacific, Latin America and Asia Pacific; finally, 250ms covers EMEA to Asia Pacific.

We tested with a variety of concurrent requests, covering values from 12 to 60 in increments of 12.

Note: the VDA sessions are simulated, as the testing is focused on the impact of latency on the broker. For this testing, there are 57 VDAs within one Delivery Group. Each test attempted to launch 10,000 users.

10ms RTT results
| Concurrent Requests | 12 | 24 | 36 | 48 | 60 |
|---|---|---|---|---|---|
| Average Response Time (s) | 0.9 | 1.4 | 1.6 | 2.1 | 2.6 |
| Brokering Requests per second | 14 | 17.8 | 22.9 | 23.2 | 22.9 |
| Errors (%) | 0 | 0 | 0 | 0 | 0 |
| Time to launch 10k users | 11m57s | 9m24s | 7m16s | 7m11s | 7m17s |

As expected, 10ms is fast enough to handle the loads placed upon the system. No errors were seen, and this was the fastest configuration for launching users. At the maximum launch rate of 60 concurrent requests, average response times were 2.6s, and launching all 10k users took a little over 7 minutes.
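
As a rough cross-check on how the rows in these tables relate, throughput is approximately the number of concurrent requests divided by the average response time, and total launch time is the user count divided by that throughput. A quick sketch using the 60-request column above (the figures are the measured ones; the arithmetic is just a back-of-the-envelope check):

```python
# Sanity check on the 10ms table: concurrency / average response time gives
# the brokering rate, and users / rate gives the total launch time.
concurrency = 60            # concurrent brokering requests
avg_response_s = 2.6        # average response time (s) from the 10ms table
users = 10_000

rate = concurrency / avg_response_s      # ~23.1 req/s (table reports 22.9)
total_minutes = users / rate / 60        # ~7.2 minutes (table reports 7m17s)
print(f"~{rate:.1f} requests/s, ~{total_minutes:.1f} minutes for 10k users")
```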

45ms RTT results
| Concurrent Requests | 12 | 24 | 36 | 48 | 60 |
|---|---|---|---|---|---|
| Average Response Time (s) | 1.7 | 3.1 | 4.3 | 6.4 | 7.3 |
| Brokering Requests per second | 7.1 | 7.8 | 8.4 | 7.5 | 8.2 |
| Errors (%) | 0 | 0 | 0 | 0.01 | 0.01 |
| Time to launch 10k users | 23m28s | 21m19s | 19m51s | 22m15s | 20m19s |

With 45ms, results were still good; at the very highest launch rates, 1 or 2 users out of 10,000 saw an error. Note: the impact of serialization can be seen in the response times, which increase from 1.7s to 7.3s to broker a session. Total time to broker 10k users was 20-23 minutes.

90ms RTT results
| Concurrent Requests | 12 | 24 | 36 | 48 | 60 |
|---|---|---|---|---|---|
| Average Response Time (s) | 2.9 | 6.4 | 9.5 | 12.9 | 16.2 |
| Brokering Requests per second | 4.1 | 3.7 | 3.8 | 3.7 | 3.7 |
| Errors (%) | 0 | 0 | 0 | 0.01 | 0.01 |
| Time to launch 10k users | 40m30s | 44m29s | 44m11s | 44m55s | 45m04s |

Again, the 90ms results saw few errors. However, the impact of transacting over latency becomes more obvious: users saw an acceptable average time of 2.9s to broker a session with 12 concurrent requests, increasing to a likely unacceptable 16.2s with 60 concurrent requests. In this case, it’s actually more advantageous to broker users at a lower rate. Logging all 10k users on took 40-45 minutes.

160ms RTT results
| Concurrent Requests | 12 | 24 | 36 | 48 | 60 |
|---|---|---|---|---|---|
| Average Response Time (s) | 5.7 | 11.4 | 17.3 | 23.2 | 28.0 |
| Brokering Requests per second | 2.1 | 2.1 | 2.1 | 2.1 | 2.1 |
| Errors (%) | 0 | 0 | 0.12 | 4.0 | 17.7 |
| Time to launch 10k users | 1h19m0s | 1h19m27s | 1h19m55s | 1h20m26s | N/A |

With 160ms, we start to see significant errors at higher launch rates: 4% errors at 48 requests and 17.7% at 60 requests, along with response times approaching 30s. However, up to 36 requests the error rate stays around 0.1%, with an average brokering time of about 17s. Note: it’s hard to judge the launch time for 60 requests, as a 17% failure rate is hard to factor in.

With this latency, we’d recommend not exceeding 24 concurrent requests. The size of the site may also be a factor: logging 1k users in would take ~8m, scaling up to 1h20m for 10k users. As such, we wouldn’t recommend a large site with this level of latency to the database.
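
To apply the same sizing logic to your own user counts, divide the expected logon-storm size by the brokering rate you can sustain at your latency. A minimal helper sketch, using the ~2.1 requests/s observed at 160ms as an example input:

```python
def estimated_launch_minutes(users, brokering_rate_per_s):
    """Estimate how long a logon storm takes at a sustained brokering rate."""
    return users / brokering_rate_per_s / 60

# At 160ms RTT the tables above show roughly 2.1 brokering requests per second.
print(estimated_launch_minutes(1_000, 2.1))    # ~7.9 minutes (~8m)
print(estimated_launch_minutes(10_000, 2.1))   # ~79.4 minutes (~1h20m)
```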

250ms RTT results
| Concurrent Requests | 12 | 24 | 36 | 48 | 60 |
|---|---|---|---|---|---|
| Average Response Time (s) | 9.3 | 15.4 | 26.7 | N/A | N/A |
| Brokering Requests per second | 1.3 | 1.6 | 1.3 | N/A | N/A |
| Errors (%) | 0 | 0 | 4.6 | 42.8 | 99.0 |
| Time to launch 10k users | 2h08m33s | 1h46m52s | 2h03m46s | N/A | N/A |

With such high latency, a large number of timeouts occurred at the higher concurrent launch rates. At 48 requests, 42.8% of requests failed, and at 60 requests timeouts were so common that the site would be unusable, with 99% of requests failing. This rendered the other data unhelpful, as the average response times reflect only the few successful requests.

The only acceptable launch rates were 12 and 24 requests. It would be hard to recommend deploying a large site with this level of latency: logging 1k users in took 13m with 12 concurrent requests and 11m with 24, and 10k users would take up to 2h8m.

Throttling requests

If you do need to work with high latency, and find that too many timeouts occur, a registry key was added in XenApp/XenDesktop 7.7 that limits the broker to a fixed number of concurrent brokering requests. Any request above the limit is answered with a response asking StoreFront to retry after a few seconds. This helps back off requests and, thus, reduces lock queuing. However, some users may end up seeing extended launch times if they’re unlucky and their requests are repeatedly backed off.

The key is a DWORD and should be stored in:

HKLM\Software\Citrix\DesktopServer\ThrottledRequestAddressMaxConcurrentTransactions

If the key doesn’t exist, then no limit is applied to brokering requests. Note: the key is per DDC, so the total number of requests hitting the SQL server needs to be split amongst the remote DDCs.
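
For reference, here's one way the value could be created from Python on a Delivery Controller (run elevated, Windows only). This is only a sketch based on the path above, assuming the final path component is the DWORD value name under the HKLM\Software\Citrix\DesktopServer key; the limit of 24 is an arbitrary example, not a recommendation.

```python
# Sketch: create the per-DDC brokering throttle described above.
# Assumes the DWORD value ThrottledRequestAddressMaxConcurrentTransactions
# lives under HKLM\Software\Citrix\DesktopServer; 24 is an example limit only.
import winreg

KEY_PATH = r"Software\Citrix\DesktopServer"
VALUE_NAME = "ThrottledRequestAddressMaxConcurrentTransactions"
LIMIT = 24  # example value; size it per DDC so the SQL server isn't overloaded

with winreg.CreateKeyEx(winreg.HKEY_LOCAL_MACHINE, KEY_PATH, 0,
                        winreg.KEY_SET_VALUE) as key:
    winreg.SetValueEx(key, VALUE_NAME, 0, winreg.REG_DWORD, LIMIT)
```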

Summary

Brokering does work over latency, but the latency needs to be considered for sizing a remote zone. If a zone is large, it may still be desirable to keep a database local to that zone. If the zone is small, using a remote zone may work well and might also reduce management cost without impacting on the end user experience.

Note that we recommend your zones have less than 250ms RTT; beyond that, you should consider setting up separate sites.