Integration of data is critical to the success of a data warehouse, according to Rob Armstrong.
Armstrong, director of technical marketing for San Diego-based Teradata, said that in order to bring a business together, organizations must bring together the data that holds their business.
“It’s the hard aspect of data warehousing. People will say, ‘Isn’t there an easier way to do this?’ There are vendors who will say that there is. ‘We’ll give you navigation software. You don’t need to keep data consistent. We’ll figure that out later with metadata.’ Well, it doesn’t work. It’s not navigation tools that are the problem, it’s consistency,” he said.
Armstrong is one of three co-authors of Secrets of the Best Data Warehouses in the World. He said this idea of data integration is one of the important points in the book.
“A two-carat diamond is not the same as a two-carat total weight diamond. If I have 20 little pieces and put them together on a ring so that they look like they’re one, I’ve broken the inherent bonds between the diamonds and lost all the value,” he said.
“Data warehousing is the same way. If I break my data up into little functional pieces and then pretend they’re one by putting software around them, I’ve broken the inherent bonds of the business function. And the inherent bonds in business are what drives the value,” he said.
Paul Thompson, product manager, software for MapInfo Canada in Toronto, agreed, noting that integrating data into a data mart gives a centralized view.
Thompson added that data hygiene is very important to his business. “You need clean data. People must consolidate their data to rid the system of unnecessary duplication.”
Armstrong said data redundancy creates data confusion.
“I can guarantee you it’s not redundant data all the time,” he said. “I have people who say, ‘Of course I have redundant data,’ and then I ask, does the data mean the same thing wherever it is located? They say, ‘Well, no, of course not.’ Then it’s not redundant.”
He added people are duplicating data because they are moving data to the processes, instead of bringing the processes to the data.
Armstrong and his co-authors (Rolf Hanusa, who implemented Southwestern Bell’s data warehouse, and Tom Coffing, the vice-president of Coffing Data Warehousing) identified three key messages they hope readers will take away from their book.
“The first message is that business is the purpose. The data warehouse is about solving business problems, not overcoming and making elaborate elegant technical solutions. If it doesn’t drive business, then it doesn’t really matter,” Armstrong said.
The second message is that the foundation businesses lay today determines the rate and distance of their success. “The wrong foundation can limit or (the right foundation can) completely free you.
“The third one is that once implemented, users need to be encouraged to embrace new capability and new insight as opposed to a report-generating type environment.”
A recent META Group study backs up much of what Armstrong is saying. The report, Factors to Address for Data Warehouse Success 2001, states that in the near term, organizations should strive to internalize a majority of the best-practice factors that make current data warehousing and analytic initiatives successful, including business alignment, end-user involvement, experienced staffing and external accessibility.
The report notes business-driven initiatives are far more likely to be successful.
In the opening of Secrets of the Best Data Warehouses…, CRM, business intelligence (BI) and data warehousing are all grouped into one “whatever you want to call it” category.
Armstrong said this is because if organizations really look at what CRM, data mining and business intelligence are, they will see that they are extensions of a very well-articulated data warehousing strategy.
“You cannot have a CRM application in the absence of a data warehouse. But people think they can buy a CRM package to solve all of their CRM problems. It will only solve a small sliver of what CRM really is.
“They will buy a contacts management package and call that CRM, not understanding that my inventory control process has more impact on my customers than how often I contact them,” he said.
Armstrong said he expects data warehouses to get bigger and faster and to go deeper. “People need to have data more accessible. They need it more frequently at a lower level so they can run those CRM and BI applications. There are a couple of things people are doing to make this happen. They are starting to question the operational process. Most operational processes were built in the era of the batch cycle. People were doing everything on nights or weekends.”
He said organizations put up artificial barriers to loading data into the warehouse by halting that process during the day, often for no known reason.
“So people are looking at the processes and asking how they can get the data to go from the creation point to the data warehouse more quickly.”
He cautioned people to find out the value of getting the information sooner. What would they do with it? What is the real response time between when an event happens and when users have access to the data to analyze and interrogate it?