Of the many things we can be thankful for, one is the knowledge that we won’t be using our entire IT budget just to feed the network beast. Where a first-generation Gigabit Ethernet card could cost several thousand dollars, CDW’s home page (as I write) displays a brand-name network interface card for US$34.95. But, like most things, there are some trade-offs. We must ask ourselves: Does quality sometimes fall along with price?
As I look back on the last two years, it just seems to me that we encounter more “bad cards,” “bad ports” and “bad modules” than, say, five years ago. Given the wholly unscientific nature of my observations, I decided to conduct some brief research on how network vendors were dealing with mean time between failures (MTBF) of their devices.
And as “cheap” as network gear is, the real expense isn’t in the capital cost of a new NIC or a switch; it is in the lost business and user productivity that occur when the device fails. Add to that the time to isolate the failure and source a new piece of gear if, for example, you don’t happen to have a spare switch on hand.
I think most people running small- to midsize-business (SMB) and enterprise networks would echo NASA’s Gene Kranz (former director of mission operations at the Johnson Space Center in Texas) and say, “Failure is not an option.” But getting there may not be easy.
I recall years ago, when IBM Corp. offered some early network products on its PC platform, hearing the representative boldly state that the MTBF was 10 years. This on a box that had only been shipping for a year or so. But he was serious — “Yup, 10 years and we are in the second year of the test.” At that time, at least, a part of the process was actually to run a set of devices in the vendor lab to parallel what customers were doing.
Most often, though, MTBF is predicted using industry-accepted formulas produced by organizations such as Bellcore/Telcordia, which estimate the likelihood that any part in the assembly will fail. It is anyone’s guess how vendors treat MTBF internally, but from the outside looking in — by way of Web sites and spec sheets — it is a mixed bag indeed.
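To give a feel for how these parts-count predictions work, here is a minimal sketch of the underlying arithmetic. The component names and FIT values below are illustrative assumptions, not data from any standard or vendor; the only real content is the rule that series failure rates add, and that MTBF is the reciprocal of the total rate.

```python
# Sketch of the parts-count reliability prediction idea behind
# Bellcore/Telcordia-style MTBF figures.
# 1 FIT = 1 failure per 10^9 device-hours; for components in series,
# the failure rates simply add.

# Illustrative, made-up failure rates (assumptions for this sketch):
ASSUMED_FITS = {
    "ASIC": 150.0,
    "DRAM": 40.0,
    "power supply": 300.0,
    "connectors": 25.0,
}

def predicted_mtbf_hours(fits_by_part):
    """System MTBF = 1e9 / (sum of part failure rates in FITs)."""
    total_fits = sum(fits_by_part.values())
    return 1e9 / total_fits

mtbf = predicted_mtbf_hours(ASSUMED_FITS)
print(f"Predicted MTBF: {mtbf:,.0f} hours (~{mtbf / 8760:.1f} years)")
```

The point to notice is that one weak component (here, the hypothetical power supply) dominates the sum, which is why a single cheap part can drag down the predicted reliability of an otherwise solid design.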
Only the so-called “industrial Ethernet” vendors and providers of service-provider-class gear seem to push MTBF. With most enterprise and SMB products, that information is either buried or often simply invisible. But SMB and enterprise buyers can ill afford to have infrastructure elements simply stop working.
A random check of Cisco showed that its Catalyst 2970 has an MTBF calculated to be 163,000 hours — more than 18 years. A 3Com 3800 is listed at 184,000 hours — or 21 years. (Cisco’s GigaStack GBIC is expected to fail once every 500 years!) Extreme’s Summit switches are rated at greater than 50,000 hours. (This list is obviously not comprehensive.)
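It is worth being careful about what those big numbers actually mean. Under the constant-failure-rate (exponential) model these predictions typically assume, an 18-year MTBF does not mean each unit lasts 18 years; it implies roughly a 5% chance that any given unit fails within a year. A quick back-of-the-envelope conversion, using the figures quoted above:

```python
import math

# Reading vendor MTBF figures under the usual constant-failure-rate
# (exponential) assumption.
HOURS_PER_YEAR = 8760

def mtbf_years(mtbf_hours):
    """Convert an MTBF quoted in hours to years."""
    return mtbf_hours / HOURS_PER_YEAR

def annual_failure_probability(mtbf_hours):
    """P(failure within one year) = 1 - exp(-8760 / MTBF)."""
    return 1.0 - math.exp(-HOURS_PER_YEAR / mtbf_hours)

for name, hours in [("Catalyst 2970", 163_000), ("3Com 3800", 184_000)]:
    print(f"{name}: {mtbf_years(hours):.1f} years, "
          f"~{annual_failure_probability(hours):.1%} annual failure chance")
```

Scale that up and the operational picture changes: a fleet of 100 switches with a 163,000-hour MTBF would still be expected to produce about five failures a year, which is exactly the downtime exposure this column is worried about.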
Yet, for some other brand-name vendors, there was no MTBF information listed at all. It would be unfair to suggest any sinister motives for such lack of information, but it does invite questions.
IT architects have a right to this kind of information, which vendors once provided as a matter of course, and would be well advised to start requesting it as part of any bid. After all, there is nothing like unanticipated downtime to wreck your ROI calculations.
Tolly is president of The Tolly Group Inc., a strategic consulting and independent testing company in Boca Raton, Fla. He can be reached at firstname.lastname@example.org.