Keeping systems up is not just IT

Disaster recovery planning (DRP) is no longer the sole responsibility of the IT department, an Ernst & Young LLP consultant told a room full of attendees during a recent seminar in Toronto.

“[The task] used to be called DRP and only dealt with recovery of computer systems, but it’s changed dramatically from recovery to availability,” said David Johnson, manager of security and technology solutions at Ernst & Young LLP in Toronto. “It must be a business-driven approach, not technology-driven.”

Not only is it crucial that networks are available 24 x 7, but any downtime or business interruptions must go unnoticed by clients, Johnson said.

Now, to reflect the wider scope of DRP, the term business continuity planning (BCP) is being applied. Johnson explained it as a plan that addresses any interruption in business operations that might harm an organization.

Whether the cause is a natural disaster such as the Quebec and Ontario ice storm of 1998, a flood in a computer room or, as on 9/11, the destruction of entire branches of companies, businesses must be able to devise a plan for restarting from scratch, or at least identify the minimum requirements for the business to run.

Johnson, who works in tandem with Telus Corp. to provide its managed workplace solutions, identified processes to develop a successful BCP. He said having such a plan should be mandatory, and added that for it to be successful, the process must be ongoing, permeate the entire organization and be endorsed by senior executives.

However, he said commitment from upper management is lacking in most organizations, and that many will complete the planning stage but go no further.

Companies must perform risk assessments to determine what vulnerabilities exist, such as where a computer network is open to hacking or whether all inventory would be lost if a warehouse were destroyed. Next, a loss scenario analysis must be conducted to identify the potential impact of each scenario on business operations and what is required to maintain minimum levels of operation. A strategy must then be developed to deal with each risk. This is the stage a lot of organizations never get past, Johnson said.

Next, a systems recovery plan must be established for recovering essential systems and data at alternate locations. Afterwards, guidelines for resuming business operations need to be set up.

“There’s a reluctance for business units to get involved here,” Johnson said. “But they have to get involved because they understand the process and will be the people executing the plan.”

The next step involves establishing a crisis management plan to help deal with bad publicity, as well as human resource issues such as cross-training employees and deciding who would replace executives in the event of widespread deaths.

Next, the plan must be validated and maintained – this includes testing, training and setting up agreements with service providers to obtain, for example, enough bandwidth to handle more traffic at a given location. But the process doesn’t end there.

“When you get to one end, you must go back to the beginning,” Johnson said. “You must continue to reassess risks and fine tune plans.”

Craig Richardson, assistant vice-president of Telus hosting and managed applications in Calgary, said Telus offers a variety of BCP solutions ranging from full network hosting to partial outsourcing. Telus runs three secure, climate-controlled data centres in Canada: two in Toronto and one in Calgary.

While BCP is a complicated process, the principle behind it is simple. Peter Pereira, CIO of Telus in Vancouver, compared it to something he told his nine-year-old:

“I will come and pick you up from school today, and if I can’t make it, then your mother will.”