Let’s hear it for latency

Let me be the first to admit test labs have gotten something very basic very wrong: We put too much emphasis on throughput because it’s simple and sexy. What could be cooler than seeing how fast the latest network widget runs?

Plenty, as it turns out. In our recent test of 10G Ethernet switches, there were big differences in throughput. The paradox is that the faster the network, the less important throughput becomes.

Throughput is meaningful, but only when a network is heavily loaded. Another metric, delay, is meaningful for all traffic, all the time, on all networks.

Devices that add high delay slow your network regardless of whether it runs at 1 per cent utilization or at 100 per cent. Delay doesn’t have to creep into the hundreds of milliseconds before applications start conking out.

Delay with Gigabit or 10G Ethernet interfaces is usually in the dozens to hundreds of microseconds. When I write about results like this, I usually add boilerplate text saying, “applications don’t suffer until delay reaches into the dozens of milliseconds.”

Sorry, but I was wrong. Delays of just a few dozen microseconds can degrade performance dramatically on Gigabit and 10G Ethernet networks. The reason is TCP windowing.

In TCP, a transmitter sends only a limited amount of data before the receiver must send an acknowledgement. Windows usually include multiple packets, but if the transmitter doesn’t get acknowledgements within a set time, all the packets must be retransmitted.

Because at least 80 per cent of Internet traffic uses TCP, retransmissions can have a severe effect on application performance.

Here’s where delay comes into the picture. Let’s say we’re using Force10 Networks Inc.’s E1200 switch, receiving 1,518-byte Ethernet frames on a 10G Ethernet interface. Let’s further assume network utilization is light. At 10 per cent utilization, we would receive 81,274 frames per second, or roughly one frame every 12 microseconds.
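For readers who want to check the frame-rate figure, here is a minimal Python sketch of the arithmetic. It assumes a 10G Ethernet link and counts the 20 bytes of preamble and interframe gap that accompany every Ethernet frame on the wire; the function names are illustrative and not part of any test tool.

```python
# Per-frame overhead on the wire: 8-byte preamble/SFD + 12-byte minimum interframe gap.
PREAMBLE_AND_GAP_BYTES = 20


def frames_per_second(link_bps, frame_bytes, utilization):
    """Frames per second on a link running at the given fraction of line rate."""
    bits_on_wire = (frame_bytes + PREAMBLE_AND_GAP_BYTES) * 8
    return link_bps * utilization / bits_on_wire


def frame_interval_us(link_bps, frame_bytes, utilization):
    """Average time between frame arrivals, in microseconds."""
    return 1e6 / frames_per_second(link_bps, frame_bytes, utilization)


# The column's example: 1,518-byte frames, 10G link, 10 per cent utilization.
fps = frames_per_second(10e9, 1518, 0.10)
interval = frame_interval_us(10e9, 1518, 0.10)
print(f"{fps:,.0f} frames/s, one every {interval:.1f} microseconds")
```

Under these assumptions the sketch gives about 81,274 frames per second and an interval of about 12.3 microseconds, which the column rounds to 12.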

In Windows 2000 and XP, the default TCP window size is 16K bytes – meaning no more than 11 frames can be outstanding without an acknowledgement. For 11 frames at 12 microseconds each, any delay of 132 microseconds or more would cause retransmissions.

Two switches in our 10G Ethernet test – from Avaya and Force10 – delayed each 1,518-byte frame by more than 40 microseconds. Eleven delays like that add up to more than 440 microseconds, and we’re into TCP retransmissions.

Force10 recently retested its own switch with new software. The company says delay for large frames is now 23 microseconds, roughly half of the value we measured. That’s high enough to cause retransmissions at 10G rates, but at Gigabit rates TCP retransmissions won’t be a problem. Keep in mind that window sizes change dynamically; the larger the window, the greater the impact of a little latency.

It’s possible to determine the effect of TCP windowing on any network. All you need are three values: frame length, TCP window size and network utilization. What does delay look like on your network?
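As a rough illustration, the calculation can be sketched in Python. This follows the simplified model used in this column only; it does not model real TCP retransmission timers. The link rate is a fourth input the column takes as given, and the 58-byte header allowance (Ethernet, IP and TCP headers without options) is an assumption chosen so that a 16K-byte window works out to 11 full-size frames. All function names are hypothetical.

```python
WIRE_OVERHEAD_BYTES = 20  # preamble/SFD (8) + minimum interframe gap (12)
HEADER_BYTES = 58         # assumption: 18 Ethernet + 20 IP + 20 TCP, no options


def delay_budget_us(link_bps, frame_bytes, window_bytes, utilization):
    """Return (frames per window, delay budget in microseconds)."""
    wire_bits = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    interval_us = wire_bits / (link_bps * utilization) * 1e6
    # The TCP window counts payload bytes, so subtract the headers from each frame.
    frames = window_bytes // (frame_bytes - HEADER_BYTES)
    return frames, frames * interval_us


def exceeds_budget(per_frame_delay_us, link_bps, frame_bytes=1518,
                   window_bytes=16 * 1024, utilization=0.10):
    """True if the delay summed over one window reaches the budget."""
    frames, budget_us = delay_budget_us(link_bps, frame_bytes,
                                        window_bytes, utilization)
    return frames * per_frame_delay_us >= budget_us


# Compare a 23-microsecond per-frame delay at 10G and at Gigabit rates.
for name, rate in (("10G", 10e9), ("Gigabit", 1e9)):
    frames, budget = delay_budget_us(rate, 1518, 16 * 1024, 0.10)
    print(f"{name}: {frames} frames per window, budget {budget:.0f} us, "
          f"23 us per frame exceeds it: {exceeds_budget(23, rate)}")
```

With unrounded frame intervals the 10G budget comes out near 135 microseconds rather than the 132 quoted above, which uses a rounded 12-microsecond interval. At Gigabit rates the budget is ten times larger, which is why a 23-microsecond delay passes there and fails at 10G.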

Newman is president of Network Test in Westlake Village, Calif., an independent benchmarking and network design consultancy. He can be reached at dnewman@networktest.com.