How to bounce backup bloopers

CBL Data Recovery Technologies Inc., a provider of computer data recovery and business continuity services, recently released a list of its top ten true “bloopers” from misguided recovery efforts, drawn from about 1,500 data-loss projects.

“Truth, as the saying goes, is stranger than fiction,” said Bill Margeson, president of Markham, Ont.-based CBL. “Our recent analysis has shown that in a frantic effort to retrieve data, computer users under data-loss stress can make the situation even worse.”

The identities of those who committed the bloopers were omitted because, Margeson said, what happened to them could happen to anyone.

One company bought a UNIX system and put about 300 workers in place to manage it. Backups were done daily, but no one thought to put in a system to which the data could be restored.

Another organization bought an IBM system, but not from IBM. Instead of following set configuration procedures, the manager decided to configure the system uniquely. As a result, when something went wrong, it was impossible to recreate the configuration.

Margeson said that in a crisis people tend to do irrational things.

For example, an office of a civil engineering firm was destroyed by floods. The owners sent 17 soaked disks from three RAID arrays to the data recovery lab in bags, but for some unknown reason, someone had frozen the bags before shipping. As the disks thawed, even more damage was done.

He said companies need a crisis response plan because employees who aren’t trained in crisis management are the ones who make these mistakes.

“Don’t panic,” he said. “When they get to us the problem is magnified. Assume the data is gone and send the media to an independent party.”

While CBL, on average, recovers 83 per cent of lost data, Margeson said it is important that companies take a targeted approach and be selective about which data is most important to retrieve.

Margeson said that companies also need to take precautions to prevent data loss. This includes having a good backup system with three parts: an engineered backup, an executive backup and a disaster response backup.

The engineered backup backs up data daily, while the executive backup backs up data monthly. The disaster response backup contains everything, and should be kept off-site.
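As a rough illustration only, the three tiers Margeson describes could be written down as a small schedule. The sketch below is in Python, and every name in it (BackupTier, TIERS, tiers_due) is hypothetical. It assumes "monthly" means every 30 days and that the daily and monthly copies stay on-site. The article gives neither a schedule for the disaster response backup nor a location for the other two, so those values are placeholders.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional


@dataclass(frozen=True)
class BackupTier:
    name: str
    interval_days: Optional[int]  # None = no schedule stated in the article
    offsite: bool


# Hypothetical encoding of the three tiers described above.
TIERS = [
    BackupTier("engineered", 1, offsite=False),   # daily; location assumed
    BackupTier("executive", 30, offsite=False),   # "monthly" assumed to be 30 days
    BackupTier("disaster_response", None, offsite=True),  # full copy, kept off-site
]


def tiers_due(last_run: Dict[str, date], today: date) -> List[str]:
    """Return the names of scheduled tiers whose interval has elapsed."""
    due = []
    for tier in TIERS:
        if tier.interval_days is None:
            continue  # no interval given, so nothing to schedule here
        last = last_run.get(tier.name)
        if last is None or (today - last).days >= tier.interval_days:
            due.append(tier.name)
    return due
```

Calling tiers_due with the date of each tier's last run returns the tiers that should run today. A tier that has never run is always reported as due.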

In another example cited by CBL, a regional ambulance monitoring system suffered a serious disk failure. At that point the organization discovered its automated backup wasn’t running – a tape had jammed in the drive and no one had noticed for 14 months.
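A jammed tape that goes unnoticed for 14 months is the kind of failure a simple freshness check can catch. The sketch below is one possible approach in Python, not anything CBL describes: it looks at the newest file in a backup directory and reports whether it is older than an allowed age. The function names and the two-day default threshold are assumptions made for illustration.

```python
import os
from datetime import datetime, timedelta
from typing import Optional


def newest_backup_time(backup_dir: str) -> Optional[datetime]:
    """Return the modification time of the newest file in backup_dir, or None if empty."""
    times = [
        entry.stat().st_mtime
        for entry in os.scandir(backup_dir)
        if entry.is_file()
    ]
    return datetime.fromtimestamp(max(times)) if times else None


def is_stale(last_success: Optional[datetime],
             now: datetime,
             max_age: timedelta = timedelta(days=2)) -> bool:
    """True if there is no backup at all, or the newest one is older than max_age."""
    if last_success is None:
        return True
    return (now - last_success) > max_age
```

A scheduled job could call is_stale(newest_backup_time(path), datetime.now()) once a day and alert someone when it returns True, so a silent failure surfaces in days rather than months.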

Warning signals of an impending hard-drive crash include new repetitive ticking noises, grinding, clunking and a long initialization period, Margeson said.

Sometimes, there are no warning signals. Bruno Cywinski, president of Markham, Ont.-based Tricom Graphics Inc., had no warning that the company’s system was going to crash. His small graphic design firm was working with an extremely large program, about 50 hours into an important project, when the system stopped responding.

Cywinski said Margeson came in person to Tricom and took the hard drive back to CBL. The next morning, Cywinski’s team was back working.

“We were not missing anything,” he said. “It’s an invaluable service. I don’t know what we would have done…if he couldn’t have recovered that data.”

Now, Tricom backs up its data weekly and burns it all onto CDs that Cywinski keeps at his home.

“We’re cognizant of threats,” he said.

However, according to CBL’s analysis, it’s the people, not the computers, that are the problem. About 15 per cent of all unplanned downtime occurs because of human error.

“It happens,” said Margeson. “Even the best guys on a bad day can push a button and get something wrong.”