What we need to know about Net traffic

Determining how fast the Internet is growing is almost a parlour game among pundits these days. Part of the reason is simple practicality: Companies whose livelihoods depend on supplying or using Internet infrastructure want to understand growth trends so they can plan their investments and their own growth accurately.

But there’s a broader scientific issue as well. The Internet has become a critical part of our economic and political landscape, yet we don’t really understand how it works. Sure, we know how the protocols themselves operate — but only recently, for example, did folks determine that Internet traffic follows long-tailed distributions rather than (as previously assumed) Poisson distributions.

For those who’ve been out of school for a while, the difference is this: Poisson distributions describe counts of random events that occur independently of each other, and at high volumes they take on the familiar “bell-shaped curve.” Telecom engineers model traditional telephony traffic (aka voice calls) as Poisson distributions, which makes sense: my decision to call my mother in Corpus Christi is independent of your decision to call your stockbroker in New York. Critically, Poisson traffic gets smoother relative to its average as the volumes get bigger, so engineering a network to handle billions of calls is fairly straightforward.
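To put numbers on that smoothing effect, here is a minimal Python sketch. It is an illustration under stated assumptions, not a telephony model: the function names (`poisson_count`, `relative_spread`) and the call rates are invented for the example, and calls are simulated as arrivals separated by independent, exponentially distributed gaps.

```python
import math
import random
import statistics

def poisson_count(rate, rng):
    """Count arrivals in one unit of time when the gaps between arrivals
    are independent and exponentially distributed (a Poisson process)."""
    elapsed = rng.expovariate(rate)
    count = 0
    while elapsed < 1.0:
        count += 1
        elapsed += rng.expovariate(rate)
    return count

def relative_spread(rate, intervals=400, seed=1):
    """Standard deviation of the per-interval count divided by its mean."""
    rng = random.Random(seed)
    counts = [poisson_count(rate, rng) for _ in range(intervals)]
    return statistics.pstdev(counts) / statistics.mean(counts)

if __name__ == "__main__":
    # Hypothetical call rates per unit of time, chosen only for illustration.
    for rate in (10, 100, 1000):
        simulated = relative_spread(rate)
        theory = 1 / math.sqrt(rate)  # Poisson: std / mean = 1 / sqrt(mean)
        print(f"rate={rate:5d}  simulated={simulated:.3f}  theory={theory:.3f}")
```

For a Poisson count the standard deviation is the square root of the mean, so the spread relative to the mean shrinks as one over the square root of the volume. That is the sense in which more calls mean smoother traffic.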

Long-tailed traffic, in contrast, is characterized by a considerable degree of dependence, which also makes sense: If your machine has requested data from my server, there’s a strong dependence between the first packet I send you and subsequent ones, since they’re all part of the same data file. Because of this dependence, traffic tends to cluster into bursts, and those bursts don’t smooth out as volumes get larger.
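One way to get a feel for what a long tail means is to ask how much of the total volume is carried by the largest one per cent of transfers. The sketch below is a rough illustration only: it compares exponentially distributed sizes with Pareto-distributed sizes, and the Pareto shape of 1.2, the sample size, and the names `top_share` and `sample_sizes` are all assumptions made for the example. It shows the heavy tail in transfer sizes, not the packet-level dependence itself.

```python
import random

def top_share(sizes, fraction=0.01):
    """Share of total volume carried by the largest `fraction` of transfers."""
    ordered = sorted(sizes, reverse=True)
    top_n = max(1, int(len(ordered) * fraction))
    return sum(ordered[:top_n]) / sum(ordered)

def sample_sizes(kind, n=100_000, seed=7):
    """Draw n hypothetical transfer sizes (arbitrary units)."""
    rng = random.Random(seed)
    if kind == "light":
        # Light-tailed baseline: exponential sizes with mean 1.
        return [rng.expovariate(1.0) for _ in range(n)]
    if kind == "heavy":
        # Heavy-tailed case: Pareto sizes; shape 1.2 is an assumed value.
        return [rng.paretovariate(1.2) for _ in range(n)]
    raise ValueError(f"unknown kind: {kind}")

if __name__ == "__main__":
    for kind in ("light", "heavy"):
        share = top_share(sample_sizes(kind))
        print(f"{kind}-tailed sizes: top 1% of transfers carry {share:.0%} of volume")
```

With light-tailed sizes the biggest transfers carry only a small slice of the total. With heavy-tailed sizes a handful of very large transfers accounts for a large share, which is why a single long download can dominate a link for a while.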

And yes, there’s a practical point to this little detour into probability distributions: network engineering is fundamentally different if you assume long-tailed rather than Poisson distributions. Buffers need to be bigger, servers need to be more powerful, and planning needs to include more extreme traffic-burst scenarios.
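The buffer point can be sketched with a toy single-server queue. This is a simplified illustration, not a model of any real router: jobs arrive at random in both cases, the link runs at an assumed 70 per cent load, and only the job-size distribution changes (exponential versus Pareto with an assumed shape of 1.5, both scaled to the same average). The names `peak_backlog`, `light_tailed` and `heavy_tailed` are hypothetical.

```python
import random

def peak_backlog(service_sampler, load=0.7, jobs=200_000, seed=3):
    """Largest waiting time seen by any job in a single-server FIFO queue.

    Uses the Lindley recursion: the next job's wait equals the current wait
    plus the current job's service time minus the gap until the next
    arrival, floored at zero.
    """
    rng = random.Random(seed)
    wait = 0.0
    peak = 0.0
    for _ in range(jobs):
        service = service_sampler(rng)   # mean service time is 1.0
        gap = rng.expovariate(load)      # mean gap between arrivals is 1 / load
        wait = max(0.0, wait + service - gap)
        peak = max(peak, wait)
    return peak

def light_tailed(rng):
    """Exponential job sizes with mean 1.0."""
    return rng.expovariate(1.0)

def heavy_tailed(rng, alpha=1.5):
    """Pareto job sizes rescaled to mean 1.0 (alpha is an assumed shape)."""
    # paretovariate(alpha) has mean alpha / (alpha - 1); rescale to mean 1.0.
    return rng.paretovariate(alpha) * (alpha - 1) / alpha

if __name__ == "__main__":
    print("peak backlog, light-tailed sizes:", round(peak_backlog(light_tailed), 1))
    print("peak backlog, heavy-tailed sizes:", round(peak_backlog(heavy_tailed), 1))
```

Both runs carry the same average load, yet the heavy-tailed run should show a far larger peak backlog. A buffer sized for the light-tailed case would overflow in the heavy-tailed one.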

Why didn’t we know this until recently? Because there’s no system for monitoring and measuring traffic across the ’Net as a whole.

Individual carriers look inside their own networks, and peering providers (such as Equinix) examine characteristics of traffic that crosses peering points. But although many individual researchers and research institutions are attempting to monitor Internet traffic, none has a panoramic view.

Which brings us to an even more basic point: Nobody knows precisely how fast the ’Net is growing. The best guess, according to experts who attended a conference with me recently, is somewhere between 75 per cent and 100 per cent per year. But that could be off by as much as 2X (that is, it could be as low as 50 per cent or as high as 200 per cent). One of the foremost researchers in the field, the University of Minnesota’s Dr. Andrew Odlyzko, estimates it’s between 50 per cent and 60 per cent. The folks at Equinix figure more like 75 per cent. And my own firm recently conducted a study modeling Internet demand at 100 per cent. But the truth is, nobody really knows.
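To see why the spread between those estimates matters for planning, it helps to compound them. The short Python sketch below assumes a constant annual growth rate, which is a simplification, and uses the rates quoted above. The function names and the five-year horizon are chosen only for illustration.

```python
import math

def growth_multiple(annual_rate, years):
    """Traffic multiple after `years` of compounding at `annual_rate` (1.0 = 100%)."""
    return (1.0 + annual_rate) ** years

def doubling_time_years(annual_rate):
    """Years needed for traffic to double at a constant annual growth rate."""
    return math.log(2.0) / math.log(1.0 + annual_rate)

if __name__ == "__main__":
    # Growth rates quoted in the text; the five-year horizon is an assumption.
    for label, rate in (("50%", 0.5), ("75%", 0.75), ("100%", 1.0), ("200%", 2.0)):
        doubling = doubling_time_years(rate)
        multiple = growth_multiple(rate, 5)
        print(f"{label:>4} per year: doubles in {doubling:.2f} years, "
              f"{multiple:.1f}x after 5 years")
```

At 50 per cent a year, traffic grows about 7.6-fold in five years. At 100 per cent it grows 32-fold, and at 200 per cent it grows 243-fold. A network planned around the low estimate would be badly undersized if the high one turned out to be right.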

The bottom line: We need to get serious about studying the ’Net. It’s way too important to leave to chance.
