The Strategic Value of Data Warehouses

Speed, flexibility and foresight are the primary characteristics that distinguish successful organizations in this information age.

Businesses today must have management processes in place to monitor and control the organization, while at the same time decentralizing decision making in order to react to competitive changes and take advantage of unexpected opportunities.

One central element that supports this balance between control and flexibility is shared knowledge. Such knowledge, derived from both internal and external data sources, is converted to information that can be readily interpreted.

If knowledge and intellectual capital are becoming the key drivers of competitive advantage, then the intelligent organization is the one that can modularize, standardize, and broadly share its knowledge, both internally and, in many cases, externally. These are the companies that continually transform their knowledge bases in the context of a changing business strategy.

Information systems play a role in creating and distributing that knowledge. Specifically, the data warehouse, a central repository of subject-oriented data originating from the company’s transaction systems and external data sources, becomes a critical information system. The successful implementation of a data warehouse can have a significant effect in fostering a culture of knowledge sharing.

While businesses’ quest for knowledge has never been greater, more than half of data warehousing efforts have failed to deliver. On average, companies spend $5 million implementing data warehouses. Of those, approximately 60% will say the effort did not provide the expected return on investment. Data warehouses, utilized correctly, can provide insight into customer behavior, product or campaign performance, profitability, cost structures and more. But, as in any well-grounded research effort, the data that is collected and organized must relate specifically to the area of knowledge to be attained. Peripheral areas of study or interest must be pursued later and in stages. Many organizations have missed the mark by not setting specific objectives for their data warehouse efforts that relate to the overall business strategy.


In Intellectual Capital: The New Wealth of Organizations, Thomas A. Stewart captures the essence of one of the biggest problems with data warehouses today. He writes: “Knowledge assets, like money or equipment, exist and are worth cultivating only in the context of strategy. You cannot define and manage intellectual assets unless you know what you are trying to do with them ... It’s important to separate trivial and transitory information from important intellectual assets, especially in an era of numbingly rapid change.”

Three key reasons for the failure of enterprise data warehouses are:

1. Their purpose is not defined and prioritized in the context of the overall strategy.

2. They attempt to collect all information, including that which may be “trivial or transitory,” or peripheral to the real need.

3. They take too long to deliver, so that the business has changed out from under them.

In other words, they collect first and understand needs later. Ideally, the data warehouse should be managed as a company asset. It should be continually reviewed in the context of the investment objective, and it should have high-level resources dedicated to managing it.

The problems found in data warehousing projects today are nearly ubiquitous:

    Required data is not collected or not accessible.

    Initial database scope was too broad – trying to contain too much information too soon.

    Not enough time was spent prototyping or understanding the real business needs in depth.

    Besides the initial project approval, senior management did not provide much direction in terms of priorities, resulting in a disconnection between the data needed and the data gathered.

Given the high rate of failure of data warehousing efforts, an alternative methodology is required. By revisiting the data warehouse in the context of specific business agendas, many organizations can reorient the data warehouse, establish targeted applications via the use of data marts and add value to the business.


Data alone is like inventory, which is expensive to maintain. Carrying excess inventory is inefficient and costly. Likewise, data alone is useless and costly to maintain until it is turned into a finished product in the form of delivered knowledge to the business users. As in manufacturing, design begins with an end product in mind.

Research is performed to determine which of a variety of raw materials results in the best final product. These raw materials (data transactions) are then used to generate components (subject-oriented data warehouses). The components are assembled into products (data marts), which are distributed via various means (Internet, intranet, reports, and PC applications). The products are then continually reviewed, enhanced, upgraded or discontinued.

With this in mind, five important steps in approaching data warehousing are outlined below:

1. Align the data warehouse with the business objectives

A data warehouse must be driven by a specific need. The most successful data warehouses are often developed in industries undergoing significant change. Examples include health care, as managed care increases; utilities, as deregulation is pending; and telecommunications, as long distance, local, cellular and cable services are merging.

In many of these companies, major business change is inevitable, and while the answers are not clear, the business direction is well articulated.

If a specific objective is outlined, then the data warehouse has an identified set of users and a specific set of business questions which it is designed to answer.

The effect is to raise the bar of knowledge and prepare the organization to react quickly. Hence, significant value has been added to the data warehouse commodity.

2. Create a business-driven information architecture

An important step in developing a data warehouse is the development of an information architecture plan that aligns the business objectives with the data that is required. Three essential deliverables from the architecture plan should emerge:

    What data to include in the warehouse, based on specific business relevance.

    What infrastructure changes must be made to support the data warehouse.

    What the delivery mechanisms will be.

The inclusion of data in the warehouse should be justified by its utility in supporting the objectives. This is also the point at which some “modeling” may be appropriate.

In typical marketing data warehouses, for example, it is commonly assumed that all demographic or other available attributes should be included in the warehouse, since it is difficult to determine which attributes correlate to certain purchasing behaviors.

However, a more efficient means to develop the warehouse would be to use some modeling tools on samples of data to determine those attributes most likely to influence the stated business goal. These should be the high-priority items for inclusion in the first phase.

This simplifies the database structures and greatly decreases the initial size of the warehouse. Benefit will be realized sooner and performance will not be inhibited. Other attributes can be added later as needed.
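The attribute screening described above can be sketched in a few lines. This is a minimal illustration, not a recommended modeling method: it assumes numeric attributes, a 0/1 purchase flag, and a simple correlation score as the stand-in for a "modeling tool". The customer records, attribute names and the 0.2 cutoff are all made up for the example.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input has no variance."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)

def rank_attributes(rows, attributes, target):
    """Rank attributes by absolute correlation with the target, strongest first."""
    target_values = [row[target] for row in rows]
    scores = {
        name: abs(pearson([row[name] for row in rows], target_values))
        for name in attributes
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical customer records (synthetic data, for illustration only).
rng = random.Random(0)
customers = []
for _ in range(5000):
    income = rng.uniform(20, 120)          # thousands per year
    age = rng.uniform(18, 80)
    household_size = rng.randint(1, 6)
    # In this synthetic data, only income influences the purchase probability.
    purchased = 1 if rng.random() < income / 150 else 0
    customers.append({"income": income, "age": age,
                      "household_size": household_size, "purchased": purchased})

# Model on a sample rather than on the full data set.
sample = rng.sample(customers, 500)
ranking = rank_attributes(sample, ["income", "age", "household_size"], "purchased")

# Attributes above the (assumed) cutoff are candidates for the first phase.
first_phase = [name for name, score in ranking if score >= 0.2]
```

Attributes that fall below the cutoff are not discarded for good; they are simply deferred to a later phase. In practice a richer technique than a single correlation score would likely be used.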

Similarly, organizations often rush to include all possible detail in the initial data warehouse implementation when it is perhaps unnecessary. By creating an information architecture plan, the trade-off of including certain pieces of data can be assessed.

Next, recommendations relative to modifying or enhancing aspects of the infrastructure can be evaluated.

Decisions about replacing obsolete transaction systems, those that inhibit the organization’s ability to perform or that no longer collect relevant or sufficient data, can be made soundly, based on the priorities of the information architecture. Likewise, decisions about expanding the communications infrastructure, to broaden intranet usage and to improve the capacity to support the implementation, can be based on a well-thought-out projection of the number of users, types of queries and frequency of use.

Finally, the information architecture plan will provide a basis for selecting the application software for delivery of the information from the warehouse. Typically, multiple products are required to meet the needs of a broad spectrum of users. Analysts need more robust tools to “model” the data, and some specialized tools, such as data mining tools, may be required. Other users, such as managers and executives, simply need a web interface to get access to basic information any time from anywhere.

3. Be focused – build the foundation in a modular fashion

Homebuilders do not lay down 20,000-square-foot foundations to support 3,000-square-foot houses. Nor do they build all the foundations in a new development and then complete the houses. They establish a plan that lays out the lots and placement of homes, determine the infrastructure requirements and build the housing development incrementally.

This is the approach that should be adopted in building data warehouses. Organizations that try to build the enterprise data warehouse to meet all users’ needs at once inevitably fail. The building and maintenance of an enterprise data warehouse is a time-consuming, difficult, and never-ending task. Planning the overall design for the enterprise data warehouse first, but building it in increments, puts more focus on the data marts or delivery mechanisms for high-priority business issues. Users derive business value quickly, which will fuel momentum for expanding the warehouse.

4. Create dynamic data marts

Data marts are focused applications utilizing a subset of the information from a data warehouse and embellishing that data by applying a rich set of business rules and logic to generate a targeted analysis. A data mart is not necessarily a “departmental” application. Churn analysis, a common application in telecommunications among other industries, is utilized not just by marketing, but by customer service, sales and finance. These organizations need to determine what causes churn, how much financial impact it has on the company, and how the sales and customer service areas may be able to prevent churn.
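As a rough illustration of this definition, the sketch below derives a small churn data mart from warehouse records by taking a subset and applying business rules. Everything specific here is an assumption made for the example: the Account fields, the "consumer" segment filter, and the 60-day inactivity and two-complaint thresholds are hypothetical rules, not ones taken from any real churn model.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Account:
    """Hypothetical warehouse record for one customer account."""
    account_id: str
    segment: str
    monthly_revenue: float
    last_activity: date
    open_complaints: int

def build_churn_mart(warehouse, as_of, inactive_days=60, complaint_limit=2):
    """Take the consumer subset of the warehouse and flag accounts at risk of churn."""
    mart = []
    for acct in warehouse:
        if acct.segment != "consumer":       # subset: only the data this analysis needs
            continue
        days_idle = (as_of - acct.last_activity).days
        # Business rule (assumed): long inactivity or repeated complaints signal churn risk.
        at_risk = days_idle > inactive_days or acct.open_complaints >= complaint_limit
        mart.append({
            "account_id": acct.account_id,
            "days_idle": days_idle,
            "at_risk": at_risk,
            "revenue_at_risk": acct.monthly_revenue if at_risk else 0.0,
        })
    return mart

# Made-up warehouse contents.
warehouse = [
    Account("A1", "consumer", 40.0, date(2024, 5, 28), 0),
    Account("A2", "consumer", 75.0, date(2024, 2, 10), 1),
    Account("A3", "business", 900.0, date(2024, 5, 30), 0),
    Account("A4", "consumer", 55.0, date(2024, 5, 20), 3),
]

mart = build_churn_mart(warehouse, as_of=date(2024, 6, 1))
at_risk_ids = [row["account_id"] for row in mart if row["at_risk"]]
total_revenue_at_risk = sum(row["revenue_at_risk"] for row in mart)
```

The same mart serves several groups: the risk flags point sales and customer service at accounts to contact, while the revenue figure gives finance a first estimate of the monthly impact. Because the rules are passed in as parameters, they can be revised as the business changes without rebuilding the warehouse underneath.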

In Knowledge Management in Inquiring Organizations, Yogesh Malhotra writes: “Knowledge management solutions characterized by memorization of ‘best practices’ may tend to define the assumptions that are embedded not only in information databases, but also in the organization’s strategy, reward systems and resource allocation systems. The hardwiring of such assumptions in organizational knowledge bases may lead to perceptual insensitivity of the organization to the changing environment. Institutionalization of ‘best practices’ by embedding them in information technology might facilitate efficient handling of routine, ‘linear,’ and predictable situations during stable or incrementally changing environments. However, when this change is discontinuous, there is a persistent need for continuous renewal of the basic premises underlying the ‘best practices’ stored in organizational knowledge bases.”

Business changes very quickly, and the underlying business rules in the data marts must also change in order to continue to provide value. In other words, the data mart of today must be transformed to look at different issues.

A common practice today is to build too much into a data mart application. Marketing applications are notorious for building dozens of data marts to try to measure the effect of demographics, payment history, or other factors on the propensity of a customer to purchase. Tremendous effort is expended to determine, largely by trial and error, which attributes are important. A typical justification is that since the users cannot articulate exactly what they want, the information systems group will provide everything.

An alternative, more efficient and expedient method would be to use data mining and statistical techniques to determine which attributes are likely to be predictors of purchasing behavior, and then build data marts to monitor those impacts. These pre-data mart analyses can be regularly reviewed to determine if new drivers have emerged or others have changed.
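The regular review mentioned here can be as simple as comparing predictor scores from one analysis cycle to the next. The sketch below assumes that each cycle already produces a strength score per attribute (for instance from a statistical or data mining run) and that a single cutoff separates drivers from non-drivers. The scores, attribute names and the 0.2 cutoff are hypothetical.

```python
def compare_drivers(previous, current, threshold=0.2):
    """Report which attributes became, remained, or stopped being drivers."""
    prev_drivers = {name for name, score in previous.items() if score >= threshold}
    curr_drivers = {name for name, score in current.items() if score >= threshold}
    return {
        "emerged": sorted(curr_drivers - prev_drivers),
        "retained": sorted(curr_drivers & prev_drivers),
        "dropped": sorted(prev_drivers - curr_drivers),
    }

# Hypothetical predictor-strength scores from two review cycles.
last_quarter = {"income": 0.41, "payment_history": 0.33, "age": 0.05}
this_quarter = {"income": 0.38, "payment_history": 0.12, "age": 0.07, "tenure": 0.29}

review = compare_drivers(last_quarter, this_quarter)
```

In this made-up case the review would suggest adding a data mart view for tenure and reconsidering the one built around payment history.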

Data marts, then, should be a series of targeted analytical applications, built on a solid data warehouse foundation, that evolve as the business changes. Their cost justification should be based on the assumption that they will require continual evolution and maintenance, or that their useful life is limited.

5. Manage expectations about data quality and compromise

Providing valuable information need not be an all-or-nothing proposition. Often, a sampling or polling of information is enough to provide insight to a user – an educated guess is often better than nothing.
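To make the "educated guess" concrete, the sketch below estimates a rate from a random sample and attaches a margin of error, so the user can see how rough the guess is. It assumes a simple random sample of 0/1 outcomes and uses the standard normal approximation for a proportion; the renewal data is synthetic and the 30% underlying rate is invented for the example.

```python
import random
from math import sqrt

def estimate_proportion(sample, z=1.96):
    """Estimate a proportion from 0/1 values, with a normal-approximation margin of error."""
    n = len(sample)
    p = sum(sample) / n
    margin = z * sqrt(p * (1 - p) / n)   # z = 1.96 corresponds to roughly 95% confidence
    return p, margin

rng = random.Random(42)

# Hypothetical population: 1 = customer renewed, 0 = did not (synthetic data).
population = [1 if rng.random() < 0.3 else 0 for _ in range(200_000)]

# Poll 1,000 records instead of processing all 200,000.
sample = rng.sample(population, 1_000)
p_hat, margin = estimate_proportion(sample)

# Computed here only to check the estimate; in practice this is the unknown.
true_rate = sum(population) / len(population)
```

Reporting the estimate together with its margin is one way to brief management on what the data can and cannot support today.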

Investigate ways to use data that is clean today and extrapolate or estimate for elements that still require cleansing. From the outset, management should be briefed on what data is available and where the shortcomings are. Prototypes can be an extremely valuable tool to assist in establishing a vision, explaining data integrity issues, gaining consensus and laying out a phased approach.

White paper provided by Decision Support Technology.