The Berlin-based Heinrich Hertz Institute recently published a press release announcing that its scientists were able to transfer 10.2 Terabit/s over a single-wavelength fibre optic transmission, which is equivalent to transferring the content of 240 DVDs in one second (read the release here: http://www.idw-online.de/pages/de/news412526).
While this is cutting-edge science and it will take years until products using this technology hit the market, it made me think about how it will change our IT world.
I was just about to write that we could centralize everything regardless of its bandwidth requirements, knowing that this would be somewhat unrealistic, as developers always find ways of utilizing all the bandwidth available (it’s the same with CPU and memory). But I thought: don’t be too pessimistic, imagine bandwidth is almost unlimited. And right at the moment I was writing down the first couple of words, a colleague of mine (Holger, who also contributes to this blog) made a comment about it. He said: “It would be nice if they could decrease the latency, but that would require increasing the speed of light.”
I thought, nice joke, ha ha… but actually he was somewhat right. At least for user-facing interactive applications, such as published applications/desktops using ICA/HDX connections, VoIP, or apps where data has to be loaded from the backend for every transaction, latency is very important, and the user experience degrades the more latency you have. This is because the more latency there is, the longer users have to wait for application or session responses. The best example is telephony, where it is easy to imagine having to wait one second until the other caller hears you and another second until you hear him/her answering. Such calls are no fun… But believe it or not, in many customer conversations latency is, was, and (I assume) will be, even in the terabit era, an underestimated topic.
So let me elaborate a little on latency. To my (limited) knowledge there are three main causes of latency, but only one of them is a case where a Tbit/s network could help.
1. Physics: As pointed out earlier, network transmissions, regardless of whether they use copper or fibre connections, are limited by the speed of light. In a vacuum, light travels at roughly 300,000 km per second (186,000 miles/s). Within a fibre optic cable, light travels approx. 30% slower, which means 210,000 km/s or 210 km/ms. So a TCP packet transferred via an ideal terrestrial connection would travel roughly 38 ms to get from Zurich to Miami (approx. 8,000 km), i.e. an RTT of 76 ms, and a high-bandwidth connection does not change anything about that.
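This propagation limit is easy to check with a couple of lines (the 210 km/ms figure and the 8,000 km Zurich–Miami distance are the approximate values used above):

```python
# Propagation delay over fibre: light travels ~30% slower than in vacuum,
# i.e. roughly 210,000 km/s or 210 km per millisecond.
SPEED_IN_FIBRE_KM_PER_MS = 210.0

def propagation_delay_ms(distance_km: float) -> float:
    """One-way propagation delay in milliseconds over an ideal fibre path."""
    return distance_km / SPEED_IN_FIBRE_KM_PER_MS

one_way = propagation_delay_ms(8000)  # Zurich -> Miami, ideal straight path
print(f"one-way: {one_way:.0f} ms, RTT: {2 * one_way:.0f} ms")
# -> one-way: 38 ms, RTT: 76 ms
```

No matter how many terabits the link carries, this floor does not move; bandwidth and propagation delay are independent quantities.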
2. Network routing / components: Unfortunately, you typically don’t see a straight point-to-point physical connection between you and the server you’re connecting to. So in my Zurich -> Miami case the network packets are routed as follows:
My computer -> DSL Modem -> Swiss ISP -> Amsterdam -> London -> New York -> Miami -> local ISP -> The server
…in total 18 hops.
On the one hand, this increases the distance the packets need to travel from 8,000 km to approx. 10,000 km, which raises the ideal packet travel time to roughly 47 ms or 94 ms (RTT). On the other hand, a lot more latency is introduced by the sheer number of network components, as each of these needs a certain processing time before it forwards a packet to the next hop. So in total my latency sums up to 75 ms or 150 ms (RTT), meaning routing and component processing account for almost 40% of the total. A terabit connection is not likely to change this, except in the case that one or more parts of the route are congested, which leads to the last point.
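The routed path can be modelled roughly as propagation plus a fixed processing delay per hop. The 1.5 ms per-hop value below is an assumption on my part, chosen so that the total lands near the 75 ms I measured; real per-hop delays vary with the device and its load:

```python
# One-way latency = propagation over the routed (longer) path
# plus a per-hop processing delay at every network component.
SPEED_IN_FIBRE_KM_PER_MS = 210.0

def one_way_latency_ms(distance_km: float, hops: int,
                       per_hop_delay_ms: float) -> float:
    propagation = distance_km / SPEED_IN_FIBRE_KM_PER_MS  # ~47.6 ms for 10,000 km
    processing = hops * per_hop_delay_ms                  # 18 hops add up quickly
    return propagation + processing

# 10,000 km routed distance, 18 hops, assumed 1.5 ms processing per hop
latency = one_way_latency_ms(distance_km=10000, hops=18, per_hop_delay_ms=1.5)
print(f"one-way: {latency:.0f} ms, RTT: {2 * latency:.0f} ms")
```

Note that the hop-processing term scales with the number of components, not with link bandwidth, which is why a faster pipe alone does not remove it.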
3. Congestion: Congestion can be a root cause of the latency you experience on LAN or WAN network connections. The reason is that network packets are queued up at a router or switch if the next network segment or a network component itself is fully utilized (congested). You won’t see a queue while a network segment operates just at its limit, but the queue begins building up the moment a single additional packet needs to be sent across the wire, or the line quality degrades for whatever reason. The longer the queue, the longer the packets have to wait and the higher the latency will be. Fortunately, this is an issue where 10-terabit network connections can help, as from today’s point of view it is unlikely that these become congested (I know, we also thought 640 KB of RAM would be enough for everyone…). But don’t be afraid: there are also other tactics to cope with congestion in today’s gigabit world. These tactics are based on two different approaches:
- Prioritize: An easy and very common way is to implement Quality of Service (QoS). This allows you to dedicate a share of a network connection’s capacity to a certain service (e.g. ICA traffic) or to prioritize a certain service based on network ports (e.g. TCP 1494) or destination, which allows these packets to bypass the queue without delay. Configuring QoS is a best practice for XenApp / XenDesktop infrastructures. Further information can be found here: http://en.wikipedia.org/wiki/Quality_of_service
- Cache and fake: Certain network devices, such as Citrix Branch Repeater, allow caching of network traffic. So in case a second user in a certain branch looks at the same PowerPoint presentation or SAP menu within an ICA session as a user did earlier that day, the actual network data does not have to cross the wire again, as it is still cached from the first occurrence. To do this, these devices continuously scan the incoming network traffic, cache it locally and match new packets against the ones already cached. In case of a match, not the network packet itself but a cache pointer is sent across the WAN connection, and the Repeater sitting in the branch reads the relevant packet from its cache and sends it to the local client, which reduces the network traffic. On an end-to-end byte stream level, this process is completely transparent. Further information can be found here: http://www.citrix.com/branchrepeater/overview or http://community.citrix.com/x/sAGICQ (a blog about a special use case).
But components other than the Repeater offer caching functionality as well. A true master of caching and faking is the Citrix Online Plug-in (aka ICA Client), which has a variety of techniques built in to decrease the data traffic sent across the wire and to improve the user experience on high-latency connections. Further information can be found here: http://hdx.citrix.com/hdx-internals (see section 2D Graphics).
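The interplay of queueing and prioritization can be sketched with a toy single-link model. The one-packet-per-tick link and the packet names are purely illustrative, not any real QoS implementation:

```python
# Toy congestion model: a link drains one packet per tick, so every packet
# ahead of you in the queue costs one tick of waiting. A prioritized packet
# (think ICA on TCP 1494 with QoS configured) jumps to the front.
from collections import deque

queue = deque()

def enqueue(packet: str, prioritized: bool = False) -> None:
    if prioritized:
        queue.appendleft(packet)  # QoS: bypass the bulk queue
    else:
        queue.append(packet)

def queueing_delay_ticks() -> int:
    # A newly arriving bulk packet waits behind everything already queued.
    return len(queue)

for i in range(50):               # a burst of bulk traffic congests the link
    enqueue(f"bulk-{i}")

print("next bulk packet would wait:", queueing_delay_ticks(), "ticks")
enqueue("ica-session", prioritized=True)
print("ICA packet waits:", queue.index("ica-session"), "ticks")
```

The point to take away: latency under congestion is a property of the queue, so either the queue must never build up (more bandwidth) or important traffic must not stand in it (QoS).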
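The “cache and fake” idea can likewise be sketched as a shared chunk cache keyed by hashes. The chunk size, SHA-256 hashing and pointer format below are my own illustrative choices, not Branch Repeater’s actual protocol:

```python
# WAN deduplication sketch: both ends keep a chunk cache keyed by hash.
# A chunk that crossed the wire before is replaced by its 32-byte digest
# ("cache pointer"); the branch-side box expands it from its local cache.
import hashlib

CHUNK = 1024  # illustrative chunk size in bytes

def sender_encode(data: bytes, cache: set) -> list:
    wire = []
    for i in range(0, len(data), CHUNK):
        chunk = data[i:i + CHUNK]
        digest = hashlib.sha256(chunk).digest()
        if digest in cache:
            wire.append(("ptr", digest))   # 32 bytes instead of up to 1024
        else:
            cache.add(digest)
            wire.append(("raw", chunk))
    return wire

def receiver_decode(wire: list, cache: dict) -> bytes:
    out = bytearray()
    for kind, payload in wire:
        if kind == "raw":
            cache[hashlib.sha256(payload).digest()] = payload
            out += payload
        else:
            out += cache[payload]          # read the chunk from the local cache
    return bytes(out)

sender_cache, receiver_cache = set(), {}
doc = b"slide deck" * 500                  # the same PowerPoint content, twice
first = sender_encode(doc, sender_cache)
second = sender_encode(doc, sender_cache)  # second user, same branch
assert receiver_decode(first, receiver_cache) == doc
assert receiver_decode(second, receiver_cache) == doc
print("raw chunks on second transfer:",
      sum(1 for kind, _ in second if kind == "raw"))
# -> raw chunks on second transfer: 0
```

The second transfer consists entirely of pointers, which is exactly why the end-to-end byte stream stays identical while the WAN traffic shrinks.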
Besides knowing what can cause latency from a general point of view, it is also important to constantly keep an eye on the latency within the company’s network using proper, automated monitoring. Proper in this context means using a short sampling interval and the right metrics. In many customer networks I’ve seen 5-minute sample intervals whose values were automatically averaged with data from the last 60 minutes. That way you get a general trend for the level of network congestion, but you will never see whether there are traffic peaks that fully utilize a connection and introduce latency. As you typically cannot monitor all routers (prio 1) or switch ports (prio 2) within a company network (the monitoring data alone would saturate the network), you might want to use different metrics for monitoring. Examples are average queue length, packet retransmissions or packet drops, which can be indicators of congested connections.
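A quick synthetic illustration of why long averaging intervals hide exactly the peaks that matter (all numbers below are made up for the example):

```python
# A link that is quiet most of the time but saturated for 30 seconds
# looks perfectly healthy in a 5-minute average.
SAMPLES_PER_MIN = 12                             # one sample every 5 seconds
utilisation = [10.0] * (5 * SAMPLES_PER_MIN)     # percent: mostly a quiet link
for i in range(6):                               # 30-second burst at 100%
    utilisation[30 + i] = 100.0

five_min_avg = sum(utilisation) / len(utilisation)
peak = max(utilisation)
print(f"5-minute average: {five_min_avg:.0f}%")  # looks harmless
print(f"5-second peak:    {peak:.0f}%")          # queueing, latency!
```

During those 30 seconds every packet on the link sat in a queue, yet the averaged graph reports a link loaded below 20%; this is why short intervals, or queue/drop/retransmission counters, are the better signal.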
A particular focus in terms of monitoring should be on the first and last hop of WAN connections. These are the up-/downlinks into enterprise or ISP MPLS clouds and typically represent a networking bottleneck. While the backbone connections within the cloud are usually built with plenty of bandwidth reserves (this is also where we will see terabit connections first), the last mile can be challenging. For these connections, cost is also a particularly important topic, which might be covered in a dedicated blog by Holger at a later point in time.
So all in all, terabit networks are coming, but they will not be a solution to every networking problem, just as gigabit networks have not been. Besides pure bandwidth considerations, it is very important to keep an eye on latency and to try to minimize it wherever possible.