A reader wrote to tell me that he’s witnessed various enterprise information integration (EII) proponents (mostly vendors) alluding to the notion that EII is a replacement for data warehousing. I’ve also encountered articles and advertisements pushing the idea that EII technology somehow lessens or removes the need for organizations to use a data warehouse. Although EII is definitely an important enabling technology for BI, it is not a replacement for data warehousing. Rather, EII and data warehousing should be considered complementary technologies.
Enterprise Information Integration
EII is hardly a new development; several years ago it was typically referred to as “virtual database” or “virtual data broker” technology. EII functions by employing virtual schema builder, caching, and query optimization technology that allows organizations to obtain a federated view of information distributed across multiple data sources without first physically moving the data into a centralized data store, as is the case with a data warehouse.
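To make the federated idea concrete, here is a minimal sketch in Python. It is not how any particular EII product works; the two in-memory SQLite databases, the table names, and the `federated_order_totals` function are all hypothetical stand-ins for separate operational systems. The point it illustrates is that the query is answered by reading each source at request time, with nothing copied into a central store beforehand.

```python
import sqlite3

# Two independent "operational" sources (hypothetical), each its own database.
erp = sqlite3.connect(":memory:")
erp.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
erp.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 120.0), (2, 75.5), (1, 30.0)],
)

crm = sqlite3.connect(":memory:")
crm.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
crm.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "Acme"), (2, "Globex")],
)


def federated_order_totals():
    """Answer one question by querying both sources on demand.

    Nothing is persisted centrally; the combined view exists only
    for the duration of the request.
    """
    names = dict(crm.execute("SELECT id, name FROM customers"))
    totals = erp.execute(
        "SELECT customer_id, SUM(amount) FROM orders "
        "GROUP BY customer_id ORDER BY customer_id"
    )
    return [(names[cid], total) for cid, total in totals]


result = federated_order_totals()
print(result)
```

A real EII layer would add a virtual schema, caching, and a query optimizer on top of this basic pattern; the sketch shows only the on-demand read across sources.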
This ability to offer “on-the-fly” access to distributed data makes EII well suited for providing real-time reporting with ERP and other operational systems. Obviously, this capability is important for BI and other business performance measurement applications — especially for analyses that demand the most current information.
The EII model has a number of drawbacks, however. For one, relying solely on operational systems to generate a virtual or federated view of data typically does not provide the historical perspective that a data warehouse offers — a view that is critical for contrasting current measurements with historical analyses. This is because it is a common practice for companies to remove historical data from their operational systems before it can build up to the extent that it impedes transaction-processing performance.
EII also tends to suffer from data quality problems — to an even greater extent than the data warehouse model. Unlike data maintained in a data warehouse — which is extracted from operational sources and then standardized and cleansed — EII tools generate a virtual view of the data they assemble from various operational sources, which typically contain unscrubbed data. This is especially true when it comes to customer information systems — for example, one system’s database may indicate customer gender or age using a number (male “1,” adult “3,” child “1,” etc.); another may use a code (“M” or “F,” etc.).
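The kind of standardization a warehouse load step performs on such codes can be sketched as follows. The source names (`billing`, `crm`), the mapping tables, and the assumption that “2” means female in the numeric system are illustrative only; a virtual view built directly on the operational sources would see the raw, inconsistent codes unless similar logic were applied at query time.

```python
# Hypothetical per-source code tables. Only male "1" and the "M"/"F"
# codes come from the example in the text; the rest is assumed.
GENDER_MAPS = {
    "billing": {"1": "male", "2": "female"},
    "crm": {"M": "male", "F": "female"},
}


def standardize_gender(source, raw):
    """Map a source-specific gender code to one shared vocabulary."""
    code = str(raw).strip().upper()
    return GENDER_MAPS[source].get(code, "unknown")


print(standardize_gender("billing", 1))   # numeric code from one system
print(standardize_gender("crm", "m"))     # letter code from another
```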
Different business definitions are another problem. For instance, various departmental databases frequently use different definitions of “revenue” or account for the use of multiple currency measurements differently. In contrast, data warehouses are designed specifically to provide a platform for consolidating various data — acquired from different systems and lines of business — into a single database.
Enterprise Information Integration + Data Warehousing
On the other hand, combining EII and data warehousing does offer exciting possibilities. Used together, they let organizations extend their data warehousing architectures so that BI applications can access new and existing data sources, while taking advantage of the data warehouse’s transformation engine and (we hope) data cleansing functionality. End users can then access additional data not in the data warehouse, join it with warehouse master data, and receive it in the form of real-time reports, digital dashboards, scorecards, and other BI applications. In effect, the combination offers organizations a more complete view across disparate systems while at the same time extending data-access capabilities to a broader range of end users within the organization.
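As a rough illustration of joining live data with warehouse master data, the sketch below combines a current figure from an operational source with a cleansed name and historical average held in the warehouse. All names and numbers (`warehouse_customers`, `live_revenue`, `dashboard_rows`) are hypothetical, and plain dictionaries stand in for the two systems.

```python
# Cleansed master data and history, as a warehouse might hold it (hypothetical).
warehouse_customers = {
    1: {"name": "Acme Corp", "avg_monthly_revenue": 1000.0},
    2: {"name": "Globex Inc", "avg_monthly_revenue": 500.0},
}

# Month-to-date revenue read live from an operational system (hypothetical).
live_revenue = {1: 1250.0, 2: 400.0, 3: 90.0}


def dashboard_rows(live, master):
    """Join live figures to warehouse master data for a dashboard."""
    rows = []
    for customer_id, current in sorted(live.items()):
        record = master.get(customer_id)
        rows.append({
            # Customers not yet in the warehouse are flagged, not dropped.
            "customer": record["name"] if record else f"unmatched:{customer_id}",
            "current": current,
            "vs_history": (
                current - record["avg_monthly_revenue"] if record else None
            ),
        })
    return rows


rows = dashboard_rows(live_revenue, warehouse_customers)
for row in rows:
    print(row)
```

The live source supplies the current number; the warehouse supplies the standardized name and the historical baseline that the operational system alone would typically not retain.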
EII technology is best suited for providing real-time access and reporting with operational systems. Consequently, organizations should consider EII as but one component of their data integration infrastructure — a component designed to work in concert with the historical view afforded by the data warehouse.
When combined, the two technologies can provide a more complete view across an organization’s disparate systems while extending data-access and analysis capabilities to a broader range of applications and end users. For these reasons, I see the use of EII combined with data warehousing as especially attractive to organizations implementing business performance management applications.
— Curt Hall, Senior Consultant, Cutter Consortium