Research in Motion Ltd. reported last month that software designed to optimize caching on its network triggered the widespread BlackBerry wireless e-mail service interruption on the night of April 23.

The outage lasted about 12 hours overnight and mainly affected BlackBerry users in North America, RIM and users reported. RIM said a fail-over system designed to limit the impact of such a problem also did not work as expected. The company apologized to its 8 million users. RIM added that security and capacity issues were not the cause of the outage.

“RIM has determined that the incident was triggered by the introduction of a new, non-critical system routine that was designed to provide better optimization of the system’s cache,” RIM officials said in a statement. “The system routine was expected to be non-impacting with respect to the real-time operation of the BlackBerry infrastructure, but the pretesting of the system routine proved to be insufficient,” the statement said.

The new system routine “produced an unexpected impact and triggered a compounding series of interaction errors between the system’s operational database and cache,” according to the statement. “After isolating the resulting database problem and unsuccessfully attempting to correct it, RIM began its fail-over process to a backup system.”

RIM described the backup system's shortcomings this way: “Although the backup system and fail-over process had been repeatedly and successfully tested previously, the fail-over process did not fully perform to RIM’s expectations in this situation and therefore caused further delay in restoring service and processing the resulting message queue.”

RIM also apologized and said it would bolster its testing, monitoring and recovery processes as a result of the problem.

“RIM apologizes to customers for inconvenience resulting from the service interruption. RIM’s root cause analysis and system enhancement process with respect to this incident is ongoing, and RIM has already identified certain aspects of its testing, monitoring and recovery processes that will be enhanced as a result of the incident and in order to prevent recurrence,” the statement said.
