Taking a Lesson from World Events

Recent world events – particularly September 11th – should have taught IT organizations of all sizes an extremely important lesson: always expect the unexpected.

Information is the genetic material of business. Except for people, information is the only irreplaceable asset. And since information is growing at such an alarming rate, its availability and integrity have to be protected from innumerable calamities such as human error, severe weather, disruptions of electrical or communications networks, natural disasters, and – as we have learned only too well – terrorism.

It’s critical, therefore, for every organization to equip itself for both planned and unplanned events, or risk suffering heavy losses. How? By implementing a solid infrastructure that protects the organization’s information lifeblood from disaster.

Makes sense, right? Then why aren’t Canadian organizations doing it? According to a recent Ernst & Young LLP report (see CIO Canada, April, pg 8) entitled “The Fabric of Risk,” which surveyed 80 CEOs and CIOs at Canadian companies, 26 per cent of respondents admitted they don’t have a business continuity plan in place, 25 per cent have no computer disaster recovery plan, and 41 per cent have no overall crisis management plan at all. At the same time, 34 per cent of respondents cited computer system failure as the most significant risk to business continuity, ahead of other threats to business operations such as recession, commodity prices and exchange rates, natural and man-made disasters, and terrorism.

InformationWeek/PricewaterhouseCoopers Research, too, surveyed 250 IT and business managers in November 2001 on the topic of business continuity and, again, the results were staggering. Nearly six out of 10 managers surveyed contended that time constraints prevent them from instituting an effective business continuity plan.

Time! Forty per cent of companies expected it to take days or longer to bring records back online if disaster wiped out their company’s main data centre. While 14 per cent said they could cut over to a hot-backup or standby system instantly, seven per cent conceded it would take longer than a week – if they were able to get the data online at all.

Given these numbers, the chances are high that many organizations will fail the business continuance test should disaster strike.

The Times, They Are A-Changin’

In the mainframe environment 30 years ago, the issue of business continuance was quite simple: business hours ran from 8 a.m. to 6 p.m. and the backup window took all night. You saved your data to tape and that was about it.

We’ve come a long way since then. Just about every critical business process has moved to the network. Well, almost every process. Backup and recovery procedures are mostly still done offline.

But that’s changing, too. Go into any enterprise IT environment and there’s a wide variety of platforms, operating systems, databases and applications. More of these applications have to stay online, all the time – from online transaction processing to file-sharing applications such as computer-aided design.

Still, many customers don’t know how tightly integrated these databases are. Be forewarned: they’re dependent on one another and they rely on access to the same information.

Experience has shown that in times of outages or disasters, most organizations are fairly adept at obtaining and retrieving transactional information – visible, revenue-generating and customer-centric data that allows the company to function – because it’s usually the first to be backed up.

But collaborative information (such as email) and knowledge capital (data that allows companies to develop new products and services) are more difficult to reproduce once lost. The company’s inability to retrieve that data can have a negative impact on product development, customer service and even corporate image.

Valuable IT Lessons From 9/11

Think of 9/11 in IT terms and we learn valuable lessons:

Lesson 1: Distance is key.

On 9/11, people were unable to travel from their disaster recovery vault to the recovery site because of road and airport closures, clearly demonstrating that access to a second site can be restricted.

Lesson 2: Tape as a medium of recovery may not be effective.

On 9/11, access to tape was restricted or eliminated, and recovery was slow. Even when files could be accessed and restored from tape, many were found to be corrupted or unreliable.

Lesson 3: All applications are critical.

On 9/11, businesses found that proposals in process, agreements for trades, and the ability to document transactions and agreements were all contained only in their email systems. And if the content of other information assets is lost in underlying or tertiary applications, it can affect higher-order applications such as customer relationship management or enterprise resource planning.

Lesson 4: Inconsistent backup is no backup at all.

On 9/11, many businesses realized that different backup schedules and strategies for different applications mean that information necessary for broad-based business processes cannot be matched up or reassembled.

Lesson 5: People-dependent processes do not suffice.

On 9/11, IT systems that performed best were those that could automate the task of recovery and limit the need for human intervention and manual activities such as tape transport and loading. When employees are tired and over-stressed, they’re prone to errors, extending the recovery process significantly.

Lesson 6: Two sites may not be enough.

On 9/11, even those companies with a second site were left completely exposed following the disaster as business processes became dependent upon a single facility. Service providers were overwhelmed, and organizations faced the prospect of functioning below their set policy levels for protection and business continuance for an extended period of time.

Lesson 7: Trust nobody but yourself.

On 9/11, disaster recovery providers were met with a huge, unanticipated demand, and some couldn’t always deliver. That’s because they plan for only a percentage of their customers to require services simultaneously. When there’s a sudden, massive demand on their capabilities, their resources can be severely taxed.

Lesson 8: People are irreplaceable and so is information.

On 9/11, once the safety of personnel was ensured, information was the one asset businesses found they could not replace fast enough. And without it, the most diligent employees were hindered in their ability to re-establish business operations.

Taking Action

From a business perspective, September 11th illustrated that business continuity needs to be treated as part of the applications development and deployment process. Although the necessity for a comprehensive business continuity solution has now become a foregone conclusion amongst IT managers, many CIOs now want to know how they can deploy a solution that is an active asset for the company. The term “mission critical” is being redefined and broadened to include the applications and information necessary for the customer’s business to be backed up and running immediately. Email itself has become “mission critical”.

Personnel issues are also being reviewed. Backup teams are being put in place to relieve the first responders. IT managers are recognizing the long-term need to ensure the availability of fresh staff after extended hours of non-stop, highly pressurized work. Sometimes the hardest work will be done days or weeks after the initial disaster – and no one wants to overwork their already overtaxed IT professionals.

Dual Protection

Redundant systems for communications may be as vital as backup computing power and storage. A virtual private network or dial-in capacity is invaluable to allow workers dispersed by an emergency to work online from anywhere. Organizations should be looking at backup equipment such as satellite telephones, just-in-time electricity generators or on-call technical support to guarantee 24/7 access to information when disaster strikes.

Still, some organizations choose to invest in redundancy that’s never used until a disaster occurs. However, after recovery they face the difficulty of trying to switch operations back to their primary site. A server-centric recovery strategy also lacks cross-platform consistency across multiple hosts to enable a rapid restart when the system fails. The servers don’t know the relationships between the databases, and the backups aren’t synchronized.
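To make the synchronization problem concrete, here is a minimal Python sketch of the kind of check an IT team might run against its own backup records. It is an illustration only: the application names, timestamps, dependency map and 15-minute tolerance are all hypothetical, and the function is not drawn from any vendor's product.

```python
from datetime import datetime, timedelta


def find_inconsistent_backups(last_backup, dependencies, tolerance):
    """Return (app, dependency, gap) for every dependent pair whose most
    recent backups were taken further apart than the allowed tolerance."""
    mismatched = []
    for app, depends_on in dependencies.items():
        for other in depends_on:
            gap = abs(last_backup[app] - last_backup[other])
            if gap > tolerance:
                mismatched.append((app, other, gap))
    return mismatched


# Hypothetical inventory: when each application was last backed up.
last_backup = {
    "crm": datetime(2002, 4, 1, 2, 0),
    "orders_db": datetime(2002, 4, 1, 2, 5),
    "email": datetime(2002, 3, 31, 20, 0),
}

# Hypothetical dependency map: the CRM relies on order data and on email.
dependencies = {"crm": ["orders_db", "email"]}

issues = find_inconsistent_backups(
    last_backup, dependencies, tolerance=timedelta(minutes=15)
)
for app, other, gap in issues:
    print(f"{app} and {other} were backed up {gap} apart")
```

In this made-up inventory the CRM and order database copies are five minutes apart and pass, while the email copy is six hours older than the CRM copy and is flagged. A restore from those copies would leave the CRM pointing at messages the email system no longer holds.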

Remote mirroring capabilities and storage management software allow an organization to seamlessly run its operations across both sites or systems simultaneously from a single point of management. That way, there’s dual protection that is productive from the get-go, and there’s no waiting for lightning to strike to realize a return on investment.

According to John Pistilli, Canada Life’s Vice President of Operations and Enterprise Risk Management, “Data mirroring has become a much more effective means of recovery than the traditional tape backup system. We have reduced our backup time to initialize our mainframe environment from 30 hours to five hours. We are now able to copy our data very quickly, dramatically improving restore time and our disaster recovery capability.”

Making Business Continuance Part Of The Plan

CIOs should ask themselves the following questions to be sure their enterprise passes the disaster recovery test. What happens if our data facility is at ground zero and off limits for days? What if the IT people are out of reach? What if employees have to work remotely from home? What if it’s impossible to ship tapes or new equipment when airports, roads, bridges and tunnels are closed? What if recovery depends on a vendor response we can’t control?

“Business leaders need to understand that the other organizations they are connected to – those that enable their business – can also disable their business,” says Bill Demers, Canadian Leader for E-Business at Ernst & Young. “‘It won’t happen to me’ thinking needs to be replaced by, ‘It may happen to me and it will happen to one of my business partners.’”

Think disaster and remember this: don’t leave anything to chance.

Geoff Haydon is Managing Director at Toronto-based EMC Canada, a wholly owned subsidiary of EMC Corporation, a provider of information storage systems, software, networks and services. Mr. Haydon can be reached at haydon_geoff@emc.com.