Put your IT eggs in different baskets

The terrorist attacks on the U.S. last September fundamentally changed the way some IT managers think about disaster recovery.

“It’s no longer a matter of planning what to do should fire or flooding prevent access to buildings,” said Bob Fucito, vice-president of crisis management and business continuity at investment banking firm BNP Paribas. Today, businesses have to prepare for the ultimate security risk: what to do when people and buildings are intentionally targeted and destroyed.

Fucito should know. His duties include managing disaster recovery for Paris-based BNP Paribas’ North American operations. And he said he’s thankful that his company’s executives supported the creation of a disaster recovery plan that emphasizes distribution of IT resources – two years before the Sept. 11 attacks. The company had to evacuate its New York City building after the attacks, but Fucito said having two separate data centres and a contract with a hot-site recovery provider put BNP Paribas in a better position to continue doing business.

BNP Paribas isn’t alone in thinking that having IT resources in one building or on a single network isn’t a good idea. Other major organizations, such as The Boeing Co., United Air Lines Inc., the Chicago Board of Trade and the U.S. Postal Service, try to mitigate the risk to IT resources by distributing data, applications and network infrastructure. They also have redundant communications links at the ready.

All of those organizations have the same goal: to quickly recover or even seamlessly continue doing business when disaster strikes. But they have different ways to accomplish it. Here are four approaches that major companies are using to stay prepared.

Redundancy and multiple routes

UAL Loyalty Services Inc. in Schaumburg, Ill., an online customer service unit of United Air Lines parent UAL Corp., is installing duplicate systems at two company-owned and -operated data centres. Both are in the Chicago area, said Igor Rafalovsky, director of networking and security, but the facilities are geographically separated.

A metropolitan-area network capable of gigabit speeds, known as a GigaMAN, connects the two centres, Rafalovsky said. Moreover, each data centre is connected over T3 lines running to separate Private Network Access Points (P-NAPs), which are Internet backbone connection points owned and operated by Internap Network Services Corp. in Seattle.

And even at the P-NAPs, traffic going to and from the two UAL data centres runs across multiple Internet backbones from different providers, such as Sprint Corp., WorldCom Inc. and others. A P-NAP may have up to six or eight backbone providers online and available at any given time.

Both UAL data centres host Web servers, applications and databases. Disk storage is synchronized in real time over the GigaMAN, and both data centres are online all the time. “In the case of a catastrophic failure of one data centre, the other one just picks up the traffic, in many cases without interruption…or manual intervention,” Rafalovsky said.
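For readers who want to see the idea in miniature, the following is a rough sketch of an active-active arrangement like the one Rafalovsky describes: both sites take traffic, and when one is marked down the other absorbs all of it. It is not UAL's implementation. The class names, the site labels "centre-a" and "centre-b", and the round-robin policy are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class DataCentre:
    name: str             # hypothetical label, not taken from the article
    healthy: bool = True  # would be set by a health check in a real system


class ActiveActiveRouter:
    """Spread requests across every healthy site and skip any site marked down."""

    def __init__(self, sites):
        self.sites = list(sites)
        self._counter = count()

    def route(self):
        # Only sites currently marked healthy are eligible for traffic.
        live = [s for s in self.sites if s.healthy]
        if not live:
            raise RuntimeError("no data centre available")
        # Simple round-robin over the live sites (an assumed policy).
        return live[next(self._counter) % len(live)].name


def demo():
    a = DataCentre("centre-a")
    b = DataCentre("centre-b")
    router = ActiveActiveRouter([a, b])
    before = [router.route() for _ in range(4)]
    a.healthy = False  # simulate a catastrophic failure of one site
    after = [router.route() for _ in range(4)]
    return before, after
```

Calling demo() returns the routing decisions before and after the simulated failure: traffic alternates between the two sites while both are up, then goes entirely to the surviving site with no change on the caller's side.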

Outsourced hot sites

When BNP Paribas IT employees evacuated their building in New York in response to the terrorist attacks, they moved to the company’s other data centre in New Jersey to continue operations. Even so, Fucito said his firm also has a contract with New York-based SchlumbergerSema to provide off-site hot sites.

Hot sites duplicate the mission-critical parts of a company’s IT systems in secure buildings miles away from the primary sites. IT workers can go to hot sites to initiate recovery or simply resume work.

John Kersley, SchlumbergerSema’s vice-president of business recovery, describes how it works: A corporate customer configures its own data centres to automatically mirror data and applications to the appropriate hot-site recovery centre (or centres). That company’s IT employees are assigned physical positions (desks and workstations) at a specific centre and instructed on how to get there if there’s a crisis. When the company’s workers are in place at the recovery centre, it becomes a matter of patching the data through to the off-site desktops.

Hot sites are especially appealing to financial services organizations like BNP Paribas and the Board of Trade Clearing Corp., the clearinghouse for the Chicago Board of Trade, which has a hot-site contract with SunGard Data Systems Inc. in Wayne, Pa.

The concept also has value for major retailers. For example, Leeds, England-based ASDA Group Ltd. – a chain of food and clothing superstores owned by Wal-Mart Stores Inc. in Bentonville, Ark. – has an agreement with SchlumbergerSema to send select members of its IT staff to a global business recovery centre if a disaster closes ASDA’s own IT facilities.

SunGard and SchlumbergerSema say the trend is toward using hot sites for disaster recovery. But Damian Walch, vice-president of consulting at T-Systems Inc. in Lisle, Ill., sees the trend heading in the opposite direction.

“Companies are looking at internalizing their disaster recovery systems and moving away from hot-site providers,” Walch said. However, he acknowledges that the hot-site idea won’t go away anytime soon and that disaster recovery strategies often involve a blend of approaches.

In fact, extremely large and diverse organizations, particularly those using mainframes in addition to PC servers, foster redundancy through a mix of multiple in-house data centres and mirrored hot sites.

Chicago-based Boeing, for example, has to consider the specific needs of business units and the communication challenges that come with having a multitude of far-flung locations.

“Distributed hot-site contracts tend to be more expensive with mainframe environments. We try to consolidate and centralize IT but also avoid the risk of too many megacentres . . . by having geographic separation [of IT facilities],” said Steve Guzek, Boeing’s program manager for disaster recovery.

Guzek maintains that focusing on networks is the key to eliminating single points of failure.