NetScaler is a high-performance Application Delivery Controller (ADC) designed to scale to millions of client TCP connections. In version 9.x, we introduced a technique called Web 2.0 Push to scale the performance of Comet applications. To serve large numbers of simultaneous users, Comet applications need a large server infrastructure that holds mostly idle user connections. With the Push feature, these idle connections can be offloaded to the NetScaler, freeing up server resources to run the application logic.

Update: Added some clarifications based on feedback plus updated the more information section with additional links.

What is a Comet application?

This definition from Wikipedia suffices (paraphrased):

Comet is a web application model in which a long-held HTTP request allows a server to send updates to the client without the browser explicitly requesting them. Comet is also known by several other names, including Ajax Push, Reverse Ajax, HTTP Streaming, and HTTP server push. Comet applications typically use Ajax with long polling to detect new information on the server.

In long polling, the server holds on to an HTTP request until it has new data to send (or a timeout expires). On receiving a response, the client immediately sends a new request, which is again held at the server. Other ways of achieving this include streaming HTTP chunks over a single request or never terminating a response (no Content-Length or chunked encoding).
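The long-polling loop described above can be sketched in a few lines. This is a minimal client-side sketch, not NetScaler code; the `fetch` callable stands in for a blocking HTTP request to a hypothetical updates endpoint, and the simulated responses are made up for illustration:

```python
def long_poll(fetch, handle, max_rounds=3):
    """Repeatedly issue a blocking request; the server holds each
    request open until it has data or its timeout expires."""
    for _ in range(max_rounds):
        update = fetch()          # blocks until data arrives or timeout
        if update is not None:    # timeout: just re-poll immediately
            handle(update)        # new data: process, then re-poll

# Simulated server: the first two polls time out, the third returns data.
responses = iter([None, None, "new-message"])
received = []
long_poll(lambda: next(responses), received.append)
# received now holds the one delivered update
```

The key point the sketch illustrates is that from the server's perspective there is always one outstanding (mostly idle) request per user, which is exactly the load the Push feature offloads.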

Problem – Too many idle long-lived connections on the server

To handle these types of applications, the web server must scale to a large number of simultaneous client connections (one per active user). Most of these connections are idle and long-lived. Memory usage per connection in most application servers ranges from 50-100KB, which works out to roughly 20K connections per GB of RAM (best case).

Memory usage per TCP connection on the NetScaler is roughly 0.5-1KB, so it easily manages 1M connections per GB of memory. On platforms like the MPX17000 with 32GB of RAM, this works out to around 35M simultaneous TCP connections. The NetScaler appliance is thus ideally suited to hold idle user connections and let the server spend its resources on the actual application logic.
Update: The 35M TCP connection number is for regular HTTP load balancing on nCore builds. Push connections are processed on a single core (in 9.2 nCore builds) and scale to roughly 2M concurrent connections on an MPX17000.
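The capacity arithmetic above is straightforward; the per-connection figures are taken from the text and are best-case estimates:

```python
GB = 1024 * 1024  # KB in a GB

# Application server: ~50KB per idle connection (best case from the text)
server_conns_per_gb = GB // 50        # roughly 20K connections per GB

# NetScaler: ~1KB per TCP connection (upper end of the 0.5-1KB range)
netscaler_conns_per_gb = GB // 1      # roughly 1M connections per GB
```

The two-orders-of-magnitude gap is what makes offloading idle connections worthwhile.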

How does NetScaler Push work?

NetScaler, in essence, is a TCP connection proxy. It terminates an incoming client connection and creates a new TCP connection to the server and forwards the client’s request on that connection.

Step 1: NetScaler inserts custom HTTP header X-NS-PUSHVSERVER

In the request forwarded to the server, NetScaler inserts a custom HTTP header,

X-NS-PUSHVSERVER: 10.217.6.64_80

This is the destination (IP 10.217.6.64, port 80) to which the server must send asynchronous update messages for this client. In some applications, the server sending the update need not be the same as the server that received the original request.
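A back-end application that wants to send updates later must read this header and recover the push vserver's address. Here is a minimal sketch; the header name comes from the example above, while the parsing logic is simply an assumption based on the IP_port format shown:

```python
def parse_push_vserver(header_value):
    """Split the 'IP_port' value of X-NS-PUSHVSERVER into (ip, port)."""
    ip, _, port = header_value.rpartition("_")
    return ip, int(port)

ip, port = parse_push_vserver("10.217.6.64_80")
# ip is the push vserver address, port the TCP port to POST updates to
```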

Step 2: Server decides if request needs to be deferred and sends appropriate response

If the server has no immediate data to send to the client (i.e., it wants to defer the request), it sends an HTTP response to the NetScaler with two additional headers:

  • X-NS-DEFERRABLE (the value YES indicates this request can be deferred)
  • A custom HTTP header (NSSERVERLABEL below) whose value serves as a label identifying the user. The value is opaque to the NetScaler and could be a user ID or any other application-specific information.

An example server response is shown below:

HTTP/1.0 200 OK
Server: TinyHTTPProxy/0.2.1 Python/2.5.1
Cache-Control: no-store, no-cache, must-revalidate
Pragma: no-cache
Content-Type: application/x-amr
Connection: Closed
X-NS-DEFERRABLE: YES
NSSERVERLABEL: 16318370962850900588694
Content-Length: 0

If the request is deferrable, the NetScaler decouples the client and server connections. The server connection can be reused for other requests, if desired. The client connection is placed in a deferred state, waiting for updates.

If the server does have data to send to the client, it can optionally treat this as a normal request and send the response without the custom headers. This works for long-polling scenarios.
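A server-side handler can produce the deferral response shown above with a few lines of code. This sketch builds the header block only; the label value is a placeholder generated by the application, not something NetScaler mandates:

```python
def deferrable_response(label):
    """Build a zero-length HTTP response that tells the NetScaler to
    park the client connection under the given opaque label."""
    headers = [
        "HTTP/1.0 200 OK",
        "Cache-Control: no-store, no-cache, must-revalidate",
        "Pragma: no-cache",
        "X-NS-DEFERRABLE: YES",           # this request can be deferred
        "NSSERVERLABEL: %s" % label,      # opaque label identifying the user
        "Content-Length: 0",              # no body: nothing to send yet
    ]
    return "\r\n".join(headers) + "\r\n\r\n"

resp = deferrable_response("16318370962850900588694")
```

Because the response carries no body, the server frees its connection immediately while the client keeps waiting on the NetScaler.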

Step 3: Server sends asynchronous updates to the Push Vserver

When the server has an update for a client, it sends a POST request to the NetScaler using a REST interface. The user label is part of the URL, and the MSG_END parameter indicates whether the update completes the message. The body of the request is forwarded as-is onto the client connection. The NetScaler responds to the POST with a success or failure status in the body. Full details of the REST interface are available here or in the NetScaler Traffic Management guide.

POST /CLIENT/V10/16318370962850900588694?MSG_END=1 HTTP/1.1
Host: 10.217.6.64
Accept-Encoding: identity
Content-Length: 722

<722 bytes of update data>

When the server is done with the request, it sets the MSG_END parameter to 1, as in the example above; this terminates the client request. MSG_END=0 indicates that more data for the same request will follow.
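Sending an update thus amounts to a plain HTTP POST. The sketch below builds the raw request shown above; the URL layout (/CLIENT/V10/label?MSG_END=n) is taken from the example, while the helper function itself and the sample body are illustrative:

```python
def build_update_request(host, label, body, msg_end=1):
    """Build the raw POST that delivers `body` to the deferred client
    identified by `label`, via the push vserver at `host`."""
    data = body.encode("utf-8")
    head = (
        "POST /CLIENT/V10/%s?MSG_END=%d HTTP/1.1\r\n"
        "Host: %s\r\n"
        "Content-Length: %d\r\n"
        "\r\n" % (label, msg_end, host, len(data))
    )
    return head.encode("ascii") + data

req = build_update_request("10.217.6.64", "16318370962850900588694", "hello")
# `req` can be written to a TCP socket opened to the push vserver
```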

Advantages of using NetScaler Push

Normally, the server must hold one idle connection per simultaneous user. To handle, say, 100,000 users at 5,000 connections per server, this would need 20 servers with relatively poor utilization. Using Push allows better utilization of server resources for running application logic, or fewer servers to serve the same number of active users. Reducing the number of servers also reduces operational costs by saving on management as well as power consumption.

The NetScaler is already deployed in the network infrastructure as a load balancer. The Push feature is available on all NetScaler appliances (including the virtual NetScaler VPX appliance) in the Enterprise and Platinum editions.

More Information

NetScaler Push REST API
Push feature – configuration steps
NetScaler Push – Java reference implementation
NetScaler Traffic Management Guide
NetScaler product documentation 9.1