I got a post from “Anonymous” on one of my previous BLOGs a while back. Making symbolic reference to the locations used in the BLOG series, the poster, whom we will call “Ann Nonymous”, stated that she was getting routed to Sydney even though she was very close to the San Francisco datacenter. So we exchanged some emails, had a chat, and got to the bottom of the apparent mis-route issue.
Actually, there were a couple of other similar comments – on the same theme – that were sent to me. So I thought I’d write up this little BLOG that addresses “Application Proximity” based upon Ms. Nonymous’ experience.
Ms. Nonymous Meets Miss Route
It seemed that Ms. Nonymous’ ultimate issue was that her request was routed to the closest datacenter by GSLB. But then, once inside the GSLB allocated SSLVPN, the internal corporate network routed the actual user traffic to the closest instance of the application which was actually hosted in another site.
From the user perspective, this appeared to be a GSLB mis-route (aka “Miss Route”).
This leads to a very important aspect of designing a GSLB architecture and flow. It is related to the earlier BLOG in which I discussed “Understand Your Network”. A corollary of this is the need to understand the applications and the data.
Understand Your Applications
In the incident referenced above, Ms. Nonymous did detect that she was ultimately not directed to the anticipated location. Unbeknownst to her, she was initially directed to the appropriate SSLVPN portal, but was then routed through the corporate network to an instance of the application that was not close.
Why does this happen?
While commoditized application servers are likely to exist in every site, some high-end applications and their associated data may not be easily replicated. This may be due to complexities of synchronization, licensing, or other factors.
An example of this might be a database-driven HR application that exists in only two of the four datacenters. Therefore user requests for that application must be forwarded to one of the two sites. The following diagram shows this:
We, as architects, must consider these implications when designing a GSLB system since the appropriate application-level routing can be performed outside of or from within the corporate network.
What this really boils down to is that important GSLB architectural decisions must be made based upon where the applications and data exist, and who should carry the burden of network traffic.
Who Should Carry the Load?
In the example as shown, the corporate network is responsible for carrying user requests and response data across the corporate intranet. The IT group must therefore ensure that the capacities and performance exist to handle this load.
Alternatively, the GSLB infrastructure can be segmented such that, in the HR application example, the burden of communications rests upon the user. In the above diagram, that would mean that the user must connect to either the Miami or the Sydney site directly, through the ISP. The selection is made by GSLB, of course. This configuration offloads the corporate intranet.
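To make the HR application example concrete, here is a minimal sketch of application-aware site selection. It is not how any particular GSLB product works; it only illustrates the idea that the candidate sites are first narrowed to those hosting the application, and proximity is applied afterwards. The coordinates, the fourth site ("London"), and the names SITES, APP_SITES, and pick_site are all assumptions made for illustration, and straight-line distance stands in for whatever proximity metric a real deployment would use.

```python
from math import radians, sin, cos, asin, sqrt

# Hypothetical site coordinates (latitude, longitude).
# "London" is an invented fourth site, used only for illustration.
SITES = {
    "San Francisco": (37.77, -122.42),
    "Miami": (25.76, -80.19),
    "Sydney": (-33.87, 151.21),
    "London": (51.51, -0.13),
}

# Hypothetical application placement: the HR application exists in
# only two of the four sites, while the portal exists everywhere.
APP_SITES = {
    "hr": {"Miami", "Sydney"},
    "portal": set(SITES),
}


def distance_km(a, b):
    """Great-circle distance between two (lat, lon) points (haversine)."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371 * asin(sqrt(h))


def pick_site(user_location, app):
    """Pick the closest site among those that actually host the application."""
    candidates = APP_SITES[app]
    return min(candidates, key=lambda s: distance_km(user_location, SITES[s]))


# A user just across the bay from the San Francisco datacenter.
user = (37.80, -122.27)
print(pick_site(user, "portal"))
print(pick_site(user, "hr"))
```

With these made-up coordinates, the portal request stays in San Francisco while the HR request goes to another site, which is exactly the kind of result that can look like a mis-route from the user's chair.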
If routing via the corporate intranet is chosen, there are many ways in which to achieve such routing. One of the common methodologies is to use load balancing services such as those offered by the Citrix NetScaler appliance or virtual machine. Load balancers typically allow the specification of different distribution algorithms based upon least requests, fastest response time, and others.
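The two distribution algorithms mentioned above can be sketched in a few lines. This is a toy model and not NetScaler configuration or its API; the Backend class, its fields, and the sample numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    """Hypothetical view of one application instance as seen by a load balancer."""
    name: str
    active_requests: int
    avg_response_ms: float


def least_requests(backends):
    """Choose the backend currently serving the fewest requests."""
    return min(backends, key=lambda b: b.active_requests)


def fastest_response(backends):
    """Choose the backend with the lowest observed average response time."""
    return min(backends, key=lambda b: b.avg_response_ms)


# Invented sample figures for the two sites that host the HR application.
backends = [
    Backend("miami-hr", active_requests=12, avg_response_ms=80.0),
    Backend("sydney-hr", active_requests=3, avg_response_ms=210.0),
]
print(least_requests(backends).name)
print(fastest_response(backends).name)
```

Note that the two algorithms disagree on these sample figures: a least-requests policy picks the lightly loaded but distant instance, which is one way a user can land far from home.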
But remember that this, unlike GSLB, involves the routing of user traffic, and not DNS traffic! In this example however, Ms. Nonymous’ requests may also end up in Sydney, because that is a location in which the application exists.
Furthermore, as with GSLB, load balancers may be configured to use complex application-level health checks to ensure that any requests are sent to only healthy systems.
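As a rough illustration of the health check idea, the sketch below filters a list of backends through an application-level probe before any distribution algorithm runs. The probe is passed in as a function because what counts as "healthy" is application specific; the function name healthy_backends and the fake probe are assumptions, not part of any product.

```python
def healthy_backends(backends, probe):
    """Return only the backends whose application-level probe succeeds."""
    healthy = []
    for backend in backends:
        try:
            if probe(backend):
                healthy.append(backend)
        except Exception:
            # A probe that raises (timeout, refused connection, bad reply)
            # is treated the same as a failed check.
            pass
    return healthy


def fake_probe(backend):
    """Stand-in for a real check such as logging in and running a test query."""
    if backend == "sydney-hr":
        raise TimeoutError("no response from application")
    return backend != "london-hr"


print(healthy_backends(["miami-hr", "sydney-hr", "london-hr"], fake_probe))
```

In a real deployment the probe would exercise the application itself, not just test whether the server answers on a port.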
Alternatively, load balancing or request distribution methodologies native to the application or infrastructure may also be used. An example of this as provided by XenDesktop is pictured to the left.
Once a location or application instance is assigned, application dependence upon user interaction continuity must also be considered by the GSLB architect. Persistency is a process in which users continue to be directed to the same resource (typically location or server) as originally assigned. This, too, is application driven. Sophisticated applications will typically require persistence until the user finishes all in-flight workloads.
Persistency options may be specified in the inter-datacenter GSLB infrastructure, as part of the intra-datacenter load balancing, or both. If the datacenter based application load balancing has been implemented with persistency, architects must consider implementing it as part of the GSLB design as well. This can be especially important if multiple FQDNs are in use.
Lastly, when defining GSLB routing policies, consider the location of the user data. Where are network based home directories located? Are they replicated? If so, and there are no other application dependencies, then GSLB may be permitted to direct user requests to any site. Again, GSLB architects must weigh the user benefits of user data migration or cross site access against the corporate network overhead implications.
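Persistency can be pictured as a small lookup table that remembers which site each client was given. The sketch below assumes a simple idle timeout, with a client identifier as the key; the class name PersistenceTable and its parameters are hypothetical, and real products offer several other persistence methods.

```python
import time


class PersistenceTable:
    """Remember the site assigned to each client until an idle timeout expires."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock          # injectable so the behaviour can be tested
        self._entries = {}          # client_id -> (site, last_seen)

    def assign(self, client_id, choose_site):
        """Return the remembered site, or call choose_site() for a new one."""
        now = self.clock()
        entry = self._entries.get(client_id)
        if entry is not None and now - entry[1] < self.timeout_s:
            site = entry[0]         # still persistent: reuse the earlier choice
        else:
            site = choose_site()    # first visit or expired: pick again
        self._entries[client_id] = (site, now)  # refresh the idle timer
        return site


table = PersistenceTable(timeout_s=60)
first = table.assign("ann", lambda: "Miami")
second = table.assign("ann", lambda: "Sydney")  # ignored while the entry is live
print(first, second)
```

The same idea applies at both levels: GSLB would key the table on something like the requesting resolver or client address, while the datacenter load balancer would key it on the user session.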
In the end, the defining factor in datacenter GSLB distribution configurations will be the availability and/or equivalence of the applications and their associated data in each of the GSLB defined sites.
In a pure disaster recovery configuration, however, replication of critical data is a prerequisite. This simplifies the GSLB options.
Again, this BLOG came about as a result of email comments submitted. Please feel free to send or post your comments. I welcome them.
And don’t forget to follow all of us at: Ask the Architect