While Citrix Provisioning Server is not really my area of expertise, my fellow consultants working in that field face the challenge of creating architecture designs for this component. In a PVS deployment, provisioned workloads – physical or virtual – query DHCP and are then directed to a TFTP server that offers a boot image, a file of usually a couple of megabytes. If this part of the puzzle fails, e.g. because the TFTP service is not available, a provisioned workload cannot start. Thus, any reasonable system design should provide a way to make TFTP highly available.

(Thanks to Marcus Jäger for inspiring this solution…)

The method of choice for adding scalability and availability to almost arbitrary TCP/IP services is – of course – Citrix NetScaler. Typically, NetScaler terminates a connection on layer 4 or 7 and creates a new context to the backend. However, NetScaler has no TFTP-specific load balancing service/vserver type for termination on L7. Also, UDP ports change during the exchange, which makes the standard UDP type inapplicable. http://support.citrix.com/article/CTX116337 describes how to load balance TFTP on layer 3 by essentially exchanging the destination IP of the client request and NAT’ing on the way back. To make this work, the TFTP servers need to route all outbound traffic through the NetScaler. This can be undesirable because it interferes with the L3 design and imposes NAT on all non-TFTP traffic originating from the TFTP servers as well. In the following, we’ll describe an alternative.

How it works using DSR

DSR or Direct-Server-Return is a method where NetScaler resides only on the path from client to server but not on the path from server to client. The following depicts the test setup:

[Diagram: test setup showing the client network, a router, and the server network with a one-armed NetScaler (VIP .20) and two TFTP servers (.30 and .31)]
There is a client network and a server network separated by a router. On the server network, there is a one-armed NetScaler and two TFTP servers. The NetScaler has a VIP running on the .20 of the server network. The TFTP servers have the IP addresses .30 and .31. Also, both TFTP servers have the .20 as a loopback address, i.e. an address the servers accept packets for without answering ARP requests for it. The VIP is set up for ANY protocol and ANY port and uses MAC based forwarding, i.e. while the services are referenced by their IP addresses, the actual forwarding is done by changing the MAC address only.
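On FreeBSD, for example, such a non-ARPed VIP comes down to a single loopback alias (addresses as in the full config further below; the persistent rc.conf variant follows in the setup section):

ifconfig lo0 alias 192.0.2.20 netmask 255.255.255.255

Since the address lives on lo0, the server never answers ARP requests for it on the LAN interface, so both backends can carry the VIP without an address conflict.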

Let’s assume the client is requesting a file from the virtual TFTP instance. The greatly simplified packet exchange would look like this:

[Diagram: simplified packet exchange from the client via router and NetScaler to tftp1, with the replies returning directly to the client]
The client sends a UDP packet to port 69 of the (virtual) server instance, using a random high port as the source. Since the destination is not on the same network segment, the client encapsulates the IP packet in an L2 frame destined for router 1, which is the gateway to the server network. The MAC address of the router was derived using ARP. The router has a direct connection to the server segment and uses ARP to determine the MAC address of the VIP, which is that of the NetScaler. Consequently, after decrementing the IP TTL by 1, it encapsulates the IP packet in an L2 frame from itself to the NetScaler.

The NetScaler identifies the frame as belonging to the ANY/ANY vserver and does its load balancing magic. In our example, it selects service tftp1 as the recipient of the packet. It resolves the backend’s MAC address using ARP and encapsulates the packet in an L2 frame originating from the NS and destined for tftp1’s network interface. The MAC forwarding mode lets the NetScaler leave the destination IP untouched, which it would have rewritten otherwise.

tftp1 accepts the packet since it has a loopback address identical to the VIP of the NetScaler and hands it over to the TFTP server. With the standard FreeBSD TFTP implementation running under inetd, which is the one I have set up the test with, the TFTP daemon creates reply packets originating from the server’s real IP, not the loopback. Following the protocol’s logic, they are destined for the random high port that the request packet originated from. Subsequently, the TFTP client deals with the real server IP directly.
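This behaviour is easy to observe from the client side. Here is a minimal sketch of a raw TFTP read request in Python (the VIP is taken from the setup above; the file name bootfile is just a placeholder for something present in /tftpboot):

import socket
import struct

# TFTP read request (RRQ): opcode 1, then filename and transfer mode,
# each NUL-terminated (RFC 1350)
rrq = struct.pack("!H", 1) + b"bootfile\x00octet\x00"

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.settimeout(5.0)
sock.sendto(rrq, ("192.0.2.20", 69))  # first and only packet to the VIP

# The reply (DATA, opcode 3, or an ERROR) arrives from the real server
# IP (.30 or .31) and a freshly chosen high port, not from the VIP
data, peer = sock.recvfrom(4 + 512)
opcode, block = struct.unpack("!HH", data[:4])
print("reply from %s:%d, opcode %d, block %d, %d payload bytes"
      % (peer[0], peer[1], opcode, block, len(data) - 4))

Because the socket is not connected to the VIP, it happily accepts the reply from the real server address, mirroring what the TFTP client does in the exchange described above.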

Isn’t this cool? The NS’s flexibility, combined with the simplicity of TFTP, provides load balancing for TFTP by modifying only the first packet of the exchange. Simplicity is power, isn’t it?

How to set it up (the gory details)

For the test, I have used two FreeBSD 8.1 ‘minimal’ instances running on XenServer. This should be equally feasible with Windows or any other *NIX. However, I am quite sure that no other OS lets you store the complete configuration as densely as this.

The instances are set up by adding the following lines to /etc/rc.conf:

hostname="bsd1"
ifconfig_re0="inet 192.0.2.30 netmask 255.255.255.0"
ifconfig_lo0_alias0="inet 192.0.2.20 netmask 255.255.255.255"
inetd_enable="YES"
gateway_enable="YES"
static_routes="net2"
route_net2="-net 198.51.100.0/24 192.0.2.254"

bsd2 has the same setup with the exception of having the .31 IP address.

Also, I have uncommented the tftp line in /etc/inetd.conf

tftp dgram udp wait root /usr/libexec/tftpd tftpd -l -s /tftpboot

Restarting – though uncool and unnecessary for a BSD guy – takes care of applying the config.
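For the record, the same can be done without a reboot using the stock rc.d scripts (a sketch; note that restarting netif briefly reconfigures all interfaces):

/etc/rc.d/netif restart
/etc/rc.d/routing restart
/etc/rc.d/inetd restart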

The fun part is configuring the NetScaler

add ns ip 192.0.2.20 255.255.255.0 -type VIP

creates the VIP, ARP being enabled implicitly.

add route 0.0.0.0 0.0.0.0 192.0.2.254

takes care of the routing.

add server 192.0.2.30 192.0.2.30
add server 192.0.2.31 192.0.2.31
add service tftp1 192.0.2.30 ANY * -gslb NONE -maxClient 0 -maxReq 0 -cip DISABLED -usip YES -useproxyport NO -sp OFF -cltTimeout 120 -svrTimeout 120 -CKA YES -TCPB YES -CMP NO
add service tftp2 192.0.2.31 ANY * -gslb NONE -maxClient 0 -maxReq 0 -cip DISABLED -usip YES -useproxyport NO -sp OFF -cltTimeout 120 -svrTimeout 120 -CKA YES -TCPB YES -CMP NO
add lb vserver tftp ANY 192.0.2.20 * -persistenceType SOURCEIP -timeout 100 -lbMethod ROUNDROBIN -m MAC -connfailover STATELESS -cltTimeout 120
bind lb vserver tftp tftp1
bind lb vserver tftp tftp2

takes care of the rest: it defines the two servers by their IPs and the two services as ANY/* with USIP enabled. The vserver is bound to the VIP, likewise as ANY/*. Also, the mode is set to MAC and connfailover to STATELESS. I recommend setting the persistence to SOURCEIP in case the server answers from the loopback address and the client therefore talks to the VIP for the whole exchange. For the load balancing method, I recommend round robin because the exchanges are short-lived and there is not much of a connection to balance on.
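To verify the result, the usual show commands help (service and vserver names as defined above):

show lb vserver tftp
show service tftp1
show service tftp2

Note that by default the services are only health-checked with a basic ping-style monitor, which proves the host is alive but not that tftpd answers; more on that in the outlook below.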

Architectural discussion

The downsides of this architecture compared to CTX116337 are

  1. The backend servers have to be on the same LAN as the NetScaler.
  2. The backend servers have to have a loopback address that is equal to the service VIP.

Its advantages are

  1. There is no need for the NS to be on the reverse path, which helps performance – the NS can never become a bottleneck (not that this is likely). Also, there is no need to change the default gateway on the backends.
  2. The same servers can perform other functions without “tunneling” their traffic through the NetScaler.

Outlook

Well, as you might have noticed, there’s a missing piece when it comes to high availability: a suitable service monitor. Fortunately, I have already written a custom NetScaler monitor for this, but that’s a story for another blog entry.

Even after working with NetScaler for almost 4 years, the product’s flexibility still offers new things to discover. NetScaler most definitely rocks – if XenApp and XenDesktop are the heart of the delivery center, then NetScaler is its artery. Now go out and load-balance TFTP! The default gateway is no excuse any more!