Disaster-proofing storage systems

When it comes to data storage, redundancy is never superfluous.

With the right amount of redundancy, you can make sure access to critical data is never denied. What you need is a tiered strategy comprising multiple network paths to storage systems and copies of data locally and off-site.

Making storage always available starts locally because human errors, not disasters, cause 90 percent of outages, says Steve Duplessie, senior analyst with Enterprise Storage Group Inc. The first step toward eliminating local problems is clustering servers so a single server outage won’t cut off all data access. Similarly, connecting storage devices to a redundant network will provide multiple data paths. That can be accomplished using network-attached storage (NAS) devices or a storage-area network (SAN).

You also want to store the local data in more than one place, which means replicating data on multiple storage servers, disk arrays, NAS appliances or storage subsystems on a SAN.

The purest form of replication is synchronous mirroring, in which you write the same data to two disks, says Jon Toigo, independent consultant and author of the book Disaster Recovery Planning. Synchronous means the write operation has to complete on both copies before the application driving the transaction will move to the next step. If either copy hangs, the transaction won’t complete. Because it has a low tolerance for latency, synchronous mirroring typically is performed only locally.
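To make the "both copies must complete" rule concrete, here is a minimal sketch in Python. It assumes two ordinary files stand in for the two disks; the function name mirrored_write is hypothetical and does not come from any vendor's product.

```python
import os


def mirrored_write(paths, data: bytes) -> None:
    """Append data to every replica and return only when all copies are on disk.

    Hypothetical helper: plain files stand in for the mirrored disks.
    """
    for path in paths:
        with open(path, "ab") as replica:
            replica.write(data)
            replica.flush()
            # Block until the operating system reports this copy as durable.
            os.fsync(replica.fileno())
    # Reaching this point means every copy completed. If any write or fsync
    # hangs or raises, the caller never gets control back normally, which is
    # the synchronous behavior described above.
```

The caller cannot move to its next step until the slowest copy finishes, which is why latency matters so much. A real mirroring layer would also have to handle the case where one copy succeeds and the other fails; this sketch simply lets the error propagate.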

If your environment requires something less than pure synchronous mirroring, you have plenty of choices. Vendors such as Quantum Corp. and Network Appliance Inc. have mirroring software for their NAS devices, where only changed data is sent back and forth to keep latency to a minimum. Other vendors, such as Ultera Systems Inc., employ a SCSI controller to write data to primary and secondary drives. Some SAN switches employ a similar approach, writing the same data to multiple targets, Toigo says.
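The idea of sending only changed data can be sketched as a block-by-block comparison. The Python below is an illustration under stated assumptions, not how Quantum, Network Appliance or any other vendor implements it: the 4,096-byte block size and the names changed_blocks and apply_changes are hypothetical.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical block size chosen for illustration


def block_digests(data: bytes, block_size: int = BLOCK_SIZE):
    """Return one SHA-256 digest per fixed-size block of data."""
    return [
        hashlib.sha256(data[i:i + block_size]).digest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old: bytes, new: bytes, block_size: int = BLOCK_SIZE):
    """Return {block_index: block_bytes} for blocks of new that differ from old."""
    old_digests = block_digests(old, block_size)
    changes = {}
    for index, start in enumerate(range(0, len(new), block_size)):
        block = new[start:start + block_size]
        is_new = index >= len(old_digests)
        if is_new or hashlib.sha256(block).digest() != old_digests[index]:
            changes[index] = block
    return changes


def apply_changes(replica: bytes, changes, new_length: int,
                  block_size: int = BLOCK_SIZE) -> bytes:
    """Bring the replica up to date using only the changed blocks."""
    # Resize the replica to the new length, padding with zeros if it grew.
    buf = bytearray(replica[:new_length].ljust(new_length, b"\0"))
    for index, block in changes.items():
        start = index * block_size
        buf[start:start + len(block)] = block
    return bytes(buf)
```

Only the dictionary returned by changed_blocks has to cross the network, so a one-byte edit to a large file costs one block of traffic rather than a full copy.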

Protecting against wholesale disasters – what Toigo calls “smoke and rubble” scenarios – requires replicating data to a remote storage device. Here again, a number of tried-and-true techniques can apply. One is point-in-time copy, where snapshots of data on one storage system are replicated to another system at designated time intervals. The frequency of the snapshots depends on how much data you reasonably can afford to lose if the worst happens and how much storage capacity you are willing to dedicate to the snapshots. In some cases, a daily snapshot will suffice. In others, snapshots are needed every few minutes.

Another technique is remote copy, which amounts to writing data to local and remote storage devices at the same time. However, latency imposed by distance creates a lag between the time data is written locally and remotely. That time represents your window of vulnerability to having the two copies out of sync, Toigo says. If you’re dealing with data that doesn’t change often, such as customer profile information, the lag is probably not an issue, but it can be for transactions that change by the second.
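The window of vulnerability can be pictured as a queue of writes that have landed locally but have not yet arrived at the remote site. The following Python sketch is a toy model only; the class AsyncReplicator and its methods are hypothetical, and network delay is simulated by delivering queued writes on demand.

```python
from collections import deque


class AsyncReplicator:
    """Toy model of remote copy over a slow link (hypothetical names)."""

    def __init__(self):
        self.local = []            # writes committed at the primary site
        self.remote = []           # writes that have arrived at the remote site
        self._in_flight = deque()  # written locally, still crossing the link

    def write(self, record):
        """Commit locally right away; the remote copy catches up later."""
        self.local.append(record)
        self._in_flight.append(record)

    def deliver(self, count=1):
        """Simulate the link delivering up to count queued writes, in order."""
        for _ in range(min(count, len(self._in_flight))):
            self.remote.append(self._in_flight.popleft())

    def at_risk(self):
        """Writes that would be lost if the primary site failed right now."""
        return list(self._in_flight)
```

For data that rarely changes, at_risk() is usually empty. For a busy transaction system, it is seldom empty, which is the exposure Toigo describes.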

Most major storage hardware vendors have products that perform one or more forms of data replication, such as EMC Corp.’s Symmetrix Remote Data Facility. In general, these products only work with the vendor’s own storage gear.

This compatibility issue is one reason tape is still the storage workhorse, Toigo says. Tape is portable and flexible. It can be used to back up or restore to any system, he says.

But Duplessie says that flexible software-based data replication tools are emerging as alternatives to proprietary hardware approaches. These alternatives, such as NSI Software’s Double-Take, allow data replication from any type of storage system to any other, over any type of IP network connection, he says.

Maharam Fabric Corp., a wholesale textile distributor in Hauppauge, N.Y., chose software-based data replication as part of its disaster-proofing strategy. It is using SteelEye Technology Inc.’s LifeKeeper software to replicate data from its main operations center to a satellite office, says Sal DeAugustino, network administration manager for the company.

As a result of the Sept. 11 terrorist attacks, Maharam Fabric’s Park Avenue office in New York lost network connectivity and couldn’t access data stored in Hauppauge. After that, the company decided to assess the effect of a full-fledged disaster striking the Hauppauge facility, DeAugustino says.

Addressing the issue meant examining how the company functions day to day and determining what data it simply can’t live without – an 8G-byte enterprise resource planning (ERP) system used to track all aspects of Maharam Fabric’s business, from the cutting and shipping of fabric to accounting and purchasing.

The company’s new data protection plan includes keeping a copy of the ERP database on each of a pair of clustered Compaq ProLiant servers. Data is replicated between them using the LifeKeeper software, providing an exact copy of all data locally. Maharam Fabric also performs a point-in-time backup of its database, sending any changed data to an auxiliary server located in its New York City office every 30 minutes. The company figured it could re-create the six to 12 transactions – new or updated fabric orders – that might be lost during that window, DeAugustino says.
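The trade-off between copy interval and data at risk can be worked through with the figures in this account. The Python sketch below assumes transactions arrive at a steady rate, which the article does not state, and the function names are hypothetical.

```python
def implied_rate_per_hour(lost_transactions: float, window_minutes: float) -> float:
    """Transactions per hour implied by a loss estimate for a given window."""
    return lost_transactions * 60 / window_minutes


def worst_case_loss(rate_per_hour: float, interval_minutes: float) -> float:
    """Transactions exposed if a failure strikes just before the next copy."""
    return rate_per_hour * interval_minutes / 60


# Six to 12 transactions in a 30-minute window, as reported above.
low_rate = implied_rate_per_hour(6, 30)    # 12.0 per hour
high_rate = implied_rate_per_hour(12, 30)  # 24.0 per hour

# Assumption: the same steady rate applied to a once-a-day copy (1,440 minutes).
daily_low = worst_case_loss(low_rate, 1440)    # 288.0
daily_high = worst_case_loss(high_rate, 1440)  # 576.0
```

Under that steady-rate assumption, moving from a 30-minute interval to a daily one would raise the exposure from a handful of orders to several hundred, which is the kind of estimate a company can weigh against the cost of more frequent copies.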

The SteelEye setup was far more cost-effective than systems from storage hardware providers, whose prices were out of reach, he says. The company paid about US$75,000 for the two Compaq servers, a Linux operating system and SteelEye software.

Prices vary dramatically, depending on what you’re trying to do. “There are various choices at every aspect, from cheap to ungodly expensive,” Duplessie says.

In all, Maharam Fabric spends about 25 percent of its IT budget on storage, DeAugustino says, far less than the nearly 60 percent that Toigo says is the norm. But users don’t have much choice.

“Data is an irreplaceable asset, so has to be protected,” Toigo says. “The only way to do it is through redundancy.”