As we move forward, corporate information will need to become more modular, less tied to applications, and freer to move. But as data mobility increases it becomes progressively harder to tighten the security reins using present methods and technological thinking. The solution is to embed security in the data itself.

According to the Privacy Rights Clearinghouse, a non-profit American advocacy organization, from mid-February to the end of August 2006, more than 300 American companies had security breaches in which private customer data was compromised.

More than 90 million data records were exposed. Even factoring in widespread overlap, more than one in ten Americans had his or her personal information compromised in less than seven months. While Canada's track record appears solid, there is no statistical evidence to suggest the numbers are substantially better north of the border.

Part of the problem is that corporations must increase data mobility in order to increase business agility and value creation. As data mobility increases, it becomes progressively harder to tighten the security reins using present methods and technological thinking. These growing pains are a direct result of the move from a systems-centric world to an application- and content-centric world. While application-centric databases are often the key to business success, they have also become a corporate Pandora's box – open them to the public and all hell breaks loose.

There is clearly a need for proper planning to ensure the increased mobility of data does not become the bane of thousands of Canadian IT security personnel.

It's all about data management
The ability to manage, leverage, and secure data is the foundation of a successful business, and truly smart businesses ensure they can derive greater value from their data and information. Semantic organizations are those that have made it an imperative to extract more and more meaning from data and information. They know that being able to make sense of the data makes cents. Though this has always been the case, historically data management has been embedded in systems and applications. As we move forward, information needs to become more modular, less tied to applications, and freer to move. Data mobility will become the basis of future IT implementations, where data resiliency and security are embedded in the data itself.

Today, data relationship methodology is not top of mind for most executives. For example, many companies store customer credit card information with privacy and security as a top priority, as they should. But often what is attached to it – less private, often transactional, data – is treated with similar reverence. Companies rarely parse out the elements of customer information that have a low degree of confidentiality. If I buy a book online, my profile and credit card information are private, though not to the same degree. To a lesser extent – the US Patriot Act notwithstanding – the name of the book ordered is private. The fact that it has been ordered and will be shipped to Montreal is strictly transactional information, with no associated confidentiality. Securing all of this data to the same standard is prohibitively costly and increasingly ineffective as a business model, since application security restrictions prevent the transactional data (often high-value data) from being shared with other applications.

Parsing out the various types of data sets is one way to enable data mobility and thus increase its value within an organization. Additionally, data-masking techniques such as delinking or anonymizing certain types of data mean it is no longer private information; it can be moved to a lower security categorization and shared with a variety of other applications, both internally and externally.
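One common one-way masking technique is to replace a private identifier with a keyed hash: the resulting pseudonym is stable, so records can still be clustered and analyzed, but it cannot be reversed to recover the original value. A minimal sketch, assuming the secret key is held in a vault outside the dataset; all field names and values are illustrative:

```python
import hashlib
import hmac

# Assumption: in practice this key would live in a managed secret store, not in code.
SECRET_KEY = b"replace-with-a-managed-secret"

def pseudonymize(value: str) -> str:
    """Replace a private identifier with a stable, irreversible pseudonym."""
    return hmac.new(SECRET_KEY, value.encode(), hashlib.sha256).hexdigest()[:16]

# A record whose social insurance number is masked before being shared.
record = {"sin": "123456789", "city": "Montreal", "purchase": "book"}
masked = {**record, "sin": pseudonymize(record["sin"])}
```

Because the same input always yields the same pseudonym, downstream applications can still group a customer's transactions without ever seeing the real identifier.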

Many organizations do this successfully. Everyone from medical organizations to customer loyalty programs has become quite adept at anonymizing data. The data, once cleansed of personal information, is repackaged and made available to other applications. The granddaddy of them all is Statistics Canada, which has been collecting private information about Canadians for decades. Its terabytes of personal information are anonymized, pseudonymized, clustered, categorized, and made available to companies and researchers coast to coast.

These types of data transformation are a move in the right direction in the management of business, customer, client, and personal information. But they have their limitations: once the data is anonymized or delinked, there may be no way of relinking it to its original owner. Some transformations are one-way only.

Intuitively, one might assume this isn't a problem. But many new business models are now focused on both content and content packaging. To be ultimately successful, the data traffic needs to flow both ways. From a security perspective, non-private data such as a billing address would ideally be stored separately from confidential and private data, like the name of the person and their credit card information, and rejoined when needed by an application or business process. The credit card information itself could be delinked, with the expiry date and name stored separately from the 16-digit number.
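The split-and-rejoin idea can be sketched with a linking token: the transactional half and the confidential half live in separate stores, joined only by a random key, and only an authorized process holding both stores can recombine them. A minimal sketch under those assumptions; the store and function names are illustrative:

```python
import secrets

transactional_store = {}   # low-sensitivity data, widely shareable
confidential_store = {}    # name and card data, tightly controlled

def split_order(name: str, card_number: str, city: str, item: str) -> str:
    """Store the confidential and transactional halves separately,
    joined only by a random token."""
    token = secrets.token_hex(8)
    confidential_store[token] = {"name": name, "card_number": card_number}
    transactional_store[token] = {"city": city, "item": item}
    return token

def rejoin(token: str) -> dict:
    """Recombine both halves; in practice this would be gated by access control."""
    return {**transactional_store[token], **confidential_store[token]}

t = split_order("Jane Doe", "4111111111111111", "Montreal", "book")
```

The transactional store can now be shared freely with other applications; the token is meaningless without access to the confidential store.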

Unfortunately most of today’s applications are not designed to handle data in this manner.

The increased traction of service-oriented architecture (SOA) is in part due to organizations wanting to overcome application interoperability issues. With SOA, application and business service models are designed to deliver against actual business requirements rather than some preconceived IT notion of what business services should be. The goal is to define architectures that support the use of business information, where business is built on content rather than technology. SOA infrastructure, with its associated applications, could theoretically be designed to allow for increased data division and recombination.

data relationships
Almost all transactional and organizational information has the potential to be a monetized product with associated values and costs. The key is to answer two strategic questions: what do these data relationships mean and who would be interested in knowing about them?

But before a company can even start to think about completely monetizing its data, it needs to understand data relationships. To do this, one must understand how specific data relationships alter value, security classification, and quality, as well as how this in turn impacts the cost of controlling data integrity. At some companies this process started with the establishment of corporate data rules that define all datasets and their metadata relationships; however, this process is nowhere near granular enough for 21st-century business.

Deeper understanding and more fine-grained control of data and data relationships is the basis of future IT security. Security has to move from being technology-centric to content-centric, and the way to do that is through data and information management.

For example, on its own a social insurance number is just nine random digits. Without context, its security level is quite low. But attach a name and date of birth, and its security classification jumps from low to confidential. Today, the traditional solution is to store all of the data together and accept the high security implications that come with it. Though encryption is effective for many kinds of confidential data, it is a costly and inefficient solution for highly mobile data.

If sensitive data could be broken apart and stored separately, associated risk and cost would diminish. At the same time its value would increase because more and more applications would have access to it. This scenario, of course, can only work as a solution if rejoining the data is seamless and the success rate is 100 per cent.

This is where advanced data tagging comes in, and with it essentially the ability to do something akin to reverse hashing – to join anonymous data back together to create value and identity.

Ideally data and information should be able to move up and down the security classification ladder. But in order to do this, applications need to understand the specific security ramifications of certain dataset combinations. Since the combinations could theoretically be limitless, the solution is to pass the workload of security awareness to the data itself.
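The escalation described above can be expressed as a small rule table that classifies a dataset by which fields appear together rather than by any single field. The rules and labels below are illustrative assumptions, not a real classification standard:

```python
# Each rule: (set of fields that, combined, trigger the classification).
# Rules are ordered from most to least restrictive.
RULES = [
    ({"sin", "name", "date_of_birth"}, "confidential"),
    ({"name", "card_number"}, "confidential"),
    ({"card_number"}, "restricted"),
]

def classify(fields: set) -> str:
    """Return the classification triggered by this combination of fields."""
    for combo, label in RULES:
        if combo <= fields:  # every field in the rule is present
            return label
    return "low"

# A SIN alone is low; combined with a name and date of birth it becomes confidential.
```

Because the classification is computed from the combination, data can move down the ladder when it is split apart and back up when it is rejoined.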

XML tagging drives data control
There has been a great deal of talk recently about the power of XML tagging and its ability to define data in an almost limitless fashion. Its goal is to empower data so it can tell the application what can and cannot be done with it.

This is done today, to some extent, within SOA and with the use of XML policy-driven firewalls. Corporations have already created policies and rules that can be used to filter out data that meets certain criteria. In the future, XML data tags could be used to oversee data relationships. When a date of birth is combined with the person's name, this new combination of data would not be able to pass through the firewall or be accessed by certain applications. On their own, however, each dataset could move freely. Dataset movement and access would only be restricted when certain combination thresholds are met. In essence, a credit card number could move freely without a name or expiry date attached. If an application tried to rejoin them without proper security clearance, it would be denied. In many respects this would be equivalent to establishing a security syntax and grammar for various datasets.
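A toy version of such a policy-driven filter can be sketched by inspecting which tags appear together in an XML message and blocking forbidden combinations. The tag names and policy below are illustrative assumptions, not any real firewall's rule syntax:

```python
import xml.etree.ElementTree as ET

# Illustrative policy: these tag combinations may not travel together.
FORBIDDEN_COMBINATIONS = [
    {"name", "date_of_birth"},
    {"name", "card_number"},
]

def may_pass(xml_message: str) -> bool:
    """Simulate an XML policy firewall: collect the tags present in a
    message and block it if any forbidden combination is complete."""
    tags = {child.tag for child in ET.fromstring(xml_message)}
    return not any(combo <= tags for combo in FORBIDDEN_COMBINATIONS)

# A card number alone passes; rejoined with a name, the message is blocked.
alone = "<record><card_number>4111111111111111</card_number></record>"
joined = ("<record><name>Jane Doe</name>"
          "<card_number>4111111111111111</card_number></record>")
```

The policy lives with the data's tags rather than with any one application, which is the essence of passing security awareness to the data itself.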

But before going down the path of retroactively tagging information, a company needs to standardize the terminology it uses to define information. Surprisingly, this is not as common as one would assume. Ask around to see how many different definitions your company has for its customers. At the enterprise level it is often a dozen or more. Though a single definition is generally an unattainable goal, minimizing the definitions to a few will help. When a new application is built, its customer definition would come from an existing one rather than a new one being created.

Some organizations, particularly those with vast digital assets, have gone so far as to create a new corporate position: chief taxonomist. The Walt Disney Company and Harvard University both have corporate nomenclature taxonomists. Part of their job is to create consistent terminology for documenting, tracking, and cataloguing all digital media and content. This work now encompasses meta-directory information as well as operational organizational entities and concepts.

Corporate executives must decide who is accountable for the quality, security, and taxonomy of the information used regularly to make business decisions. In some companies the CISO takes the lead on this; in others it is the information management group. Regardless, one can no longer expect the applications to be accountable for data quality. There must be specific individuals who have responsibility, ownership, and ultimately control over the costs and risks associated with the content. Since this is a corporate governance issue, and one with a great deal of responsibility, elevating it to the executive level is a good start. A chief privacy officer, chief security officer, and a chief information assets executive make an ideal governing triumvirate. They understand the implications of failure, but more importantly the benefits of success.

Down the line, almost all data has the potential to be stored anonymously, and confidential information leakage will be less of a risk and cost than it is today. But this will only happen once data quality and data security are truly seen as valuable corporate assets, indexed and managed with due consideration of their potential value and dangers.


In memoriam

At press time, we learned of the untimely death of Dr. Robert Garigue, the author of this article and a leading figure in information security in Canada. At the time of his passing, Dr. Garigue was Vice President for Information Integrity and Chief Security Executive for Bell Canada. Prior to this, he spent several years as the Chief Information Security Officer for the Bank of Montreal and was instrumental in the creation of several information management and security governance best practices. Dr. Garigue was the first chairperson of the National Public Sector CIO Council Sub-Committee on Information Protection. He also spent 23 years with the Department of National Defence and was the first Director of the Canadian Forces Strategic Network Vulnerability Analysis Centre. In this capacity, he was instrumental in developing the strategic framework on Information Warfare in the Canadian Forces. Dr. Garigue was also one of the Canadian delegates to the G8 on Security and Trust of the Internet Initiative. All of us at CIO Canada extend our condolences to his family and friends.
