Security and reliability aren

Ron Scott

18 years ago

Historically, reliability and security had two very separate communities of interest, but this may be changing.

A major cause of this separation was the view that “reliability” dealt with probabilistic events while “security” dealt with deliberate attacks. This view has changed with the recognition that there is a probabilistic nature to measuring security, due to inherent uncertainties such as a particular attacker’s knowledge and the time of the next successful attack. This does not mean that the probabilistic models for security will be the same as those used for reliability. However, it does focus attention on the need for measurement and estimation, as in the world of reliability.

The term “reliability” captures down-time due only to faults, while “availability” is the broadest measure of down-time and usually includes the impact of both faults and maintenance. If security failures become significant causes of down-time, they must also be included in such measurements. From an availability perspective, it makes no difference whether, for example, a network is out of action for two hours due to a security breach or a hardware failure. The need to include security failures in measures of availability will be determined by the significance of such failures relative to other causes. Our ability to include them will be determined by our competence at estimating their impact and probability of occurrence. There will be sceptics, but how else can we make judgements about how good our security is, particularly if it is a significant cause of failures? We cannot expect any community of interest not to be measured and held accountable, eventually.
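The point about counting security failures in availability can be made concrete with a little arithmetic. The following is only a minimal sketch: the function name, the cause labels and the down-time figures are hypothetical and chosen for illustration, and a real measurement would need agreed definitions of what counts as down-time.

```python
HOURS_PER_YEAR = 8760.0  # 365-day year; leap years ignored for simplicity


def availability(downtime_hours_by_cause, period_hours=HOURS_PER_YEAR):
    """Return the fraction of the period in service, counting every cause."""
    total_down = sum(downtime_hours_by_cause.values())
    if not 0.0 <= total_down <= period_hours:
        raise ValueError("total down-time must lie between 0 and the period length")
    return 1.0 - total_down / period_hours


# Hypothetical annual down-time in hours, for illustration only.
downtime = {"hardware_fault": 2.0, "maintenance": 4.0, "security_breach": 2.0}

with_security = availability(downtime)
without_security = availability(
    {cause: hours for cause, hours in downtime.items() if cause != "security_breach"}
)

print(f"availability counting security breaches: {with_security:.6f}")
print(f"availability ignoring security breaches: {without_security:.6f}")
```

In this sketch the cause label plays no part in the calculation: two hours lost to a breach reduce availability exactly as much as two hours lost to a hardware fault, so leaving breaches out simply overstates the figure.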

Another influence on reliability and security is the tendency towards more integration and convergence, which inherently oppose the principles of redundancy and diversity. This requires compensation in network design, which might not have been required before. Redundancy and diversity are widely used to improve reliability; however, they are often misunderstood and their value to security has been a source of much debate. Again there has been a meeting of minds over the last several years, at least in the research community, but misunderstandings remain in the real world of enterprise IT.

It is useful to consider the terminology. Redundancy is the replication of facilities. Diversity comes in two main flavours. The first is space diversity, which aims to improve on simple redundancy by increasing the distance between the redundant facilities; the larger the distance the better, at least with respect to physical problems. The second is design diversity, which aims to reduce the probability of all redundant elements failing due to the same design weakness. For example, running different software on redundant nodes could ensure different failure behaviours and also different susceptibilities to breaches of security.
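The value of design diversity can be illustrated with a simplified common-cause model, loosely in the spirit of the beta-factor approach used in reliability engineering. This is an assumption introduced here for illustration, not a model taken from the discussion above. The names `prob_all_fail`, `p_node` and `beta` are hypothetical: `p_node` is roughly the chance that a node fails in some period, and `beta` is the share of that chance attributed to a weakness common to all nodes.

```python
def prob_all_fail(p_node, beta, n_nodes=2):
    """Probability that every redundant node is down, under a simplified model.

    A common-cause event (probability beta * p_node) takes out all nodes at
    once; otherwise each node fails independently with probability
    (1 - beta) * p_node. This is an illustrative approximation only.
    """
    if not (0.0 <= p_node <= 1.0 and 0.0 <= beta <= 1.0):
        raise ValueError("p_node and beta must both lie in [0, 1]")
    if n_nodes < 1:
        raise ValueError("n_nodes must be at least 1")
    p_common = beta * p_node
    p_independent = (1.0 - beta) * p_node
    return p_common + (1.0 - p_common) * p_independent ** n_nodes


# Hypothetical numbers: identical designs share every weakness (beta = 1),
# while diverse designs are assumed to share only a small fraction of them.
identical = prob_all_fail(p_node=0.01, beta=1.0)
diverse = prob_all_fail(p_node=0.01, beta=0.1)

print(f"identical designs: {identical:.6f}")
print(f"diverse designs:   {diverse:.6f}")
```

Under these assumptions, plain redundancy with identical designs gives no protection against a shared weakness, whether it is triggered by a fault or exploited by an attacker, while lowering `beta` through diverse designs brings the joint failure probability closer to the independent case.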

The bottom line is that different threats may require different types of diversity, and that reliability requirements may look similar to, or different from, security requirements. Reliability and security need a coordinated approach, and the importance of both will increase as convergence creeps further into enterprise IT. Convergence has a tendency to reduce redundancy and diversity, and there is a cost to avoiding this effect. It is better to have an architecture in which both elements are harmonized.