Of all the hot areas in enterprise software technology, perhaps the hardest to master is predictive analytics. Still, it’s an increasingly important area to understand, for sellers and buyers/consumers alike, so let’s give it a shot.
First, we need a working definition. The meaning of the marketing buzzphrase “predictive analytics” is still mutating fairly rapidly. But in essence it’s a replacement phrase for “data mining” and roughly equates to “applications of machine learning and/or statistical analysis to business decisions.”
In most current and near-future applications, the business decision is some form of small-group marketing. (In the ideal case, the group size is one, and predictive analytics is used to make wholly individualized marketing offers.) Questions that predictive analytics attempts to answer include:
• Which of my customers are likely to churn?
• What kinds of offers will persuade my customers to stay or new customers to buy? Price? Service options?
• Which potential customers are likely to be highly profitable? Which are likely to commit fraud and actually cost me money? Which are likely to soon be threats to churn, causing me to make lowball bids to keep them?
• What should I show this surfer when I serve the next page?
The answers to these questions are then reflected in specific choices of call center scripts, direct-mail sublists, Web site personalization and the like.
Data used to answer such questions can come from a variety of sources. Most obviously, there’s transactional data recording what customers bought, how much they spent and so on.
Other customer contacts also yield data: call center logs, incoming e-mail (text data mining is red-hot) and any forms or surveys customers have filled out. Industries with loyalty programs, such as airlines and gaming, have huge amounts of additional data to mine. So do companies whose Web sites produce site logs.
Finally, vast amounts of third-party data can be added to the analytic mix. Indeed, credit bureaus maintain more than 1,000 columns of data on consumers that can be rented by anybody planning a marketing campaign.
The real complexity lies in the mathematical techniques used to answer predictive questions. Usually, the problem is formalized as one of classification or clustering. For example, “Divide prospects into two classes: those likely to commit fraud and those unlikely to.” Or, “Divide customers into no more than 10 groups, aligned according to which kind of marketing promotion they are most likely to respond to.” More precisely, an “answer” is an algorithm that will assign each customer or prospect to one of a limited number of buckets. The evidence used to construct this algorithm is data on previous customers and prospects. That evidence may include information as to which bucket they best fit into.
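To make the idea of "an algorithm that assigns each customer to a bucket" concrete, here is a minimal sketch in Python. It uses a nearest-centroid rule, chosen only because it is short; it is not one of the techniques this article says are used in practice. The feature names (monthly spend, support calls), the bucket labels and the sample records are all hypothetical.

```python
from math import dist

# Hypothetical evidence: (monthly_spend, support_calls) for past customers,
# each paired with the bucket that customer turned out to belong to.
HISTORY = [
    ((20.0, 5.0), "likely_to_churn"),
    ((25.0, 4.0), "likely_to_churn"),
    ((80.0, 1.0), "likely_to_stay"),
    ((90.0, 0.0), "likely_to_stay"),
]

def fit_centroids(history):
    """Average the feature vectors of the past customers in each bucket."""
    sums, counts = {}, {}
    for features, bucket in history:
        total = sums.setdefault(bucket, [0.0] * len(features))
        for i, value in enumerate(features):
            total[i] += value
        counts[bucket] = counts.get(bucket, 0) + 1
    return {b: tuple(v / counts[b] for v in total) for b, total in sums.items()}

def classify(centroids, features):
    """Assign a customer to the bucket whose average past customer is nearest."""
    return min(centroids, key=lambda b: dist(centroids[b], features))

centroids = fit_centroids(HISTORY)
print(classify(centroids, (30.0, 3.0)))
print(classify(centroids, (75.0, 1.0)))
```

The historical records play the role of the evidence described above: each one carries the bucket it best fits, and the fitted rule is then applied to customers whose bucket is not yet known.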
Finding such algorithms is hard and goes far beyond normal statistical methods. Techniques involved include neural networks, an alternative to neural networks called support-vector machines and some pretty sophisticated linear algebra. It’s rarely obvious at the outset what algorithm is best for a specific problem, and the best one is often a complex hierarchy of “elementary” algorithms that themselves are difficult to comprehend. (A couple of vendors claim that one-size-fits-all algorithms are right around the corner. Don’t believe them.) Consequently, professional statisticians almost always get involved early in the classification process.
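As a rough illustration of building one classifier out of several "elementary" ones, the sketch below combines three simple rules by majority vote. This is a flat combination, far simpler than the hierarchies described above, and every rule, field name and cutoff in it is a made-up assumption for the example.

```python
from collections import Counter

# Three hypothetical "elementary" rules; each votes on one prospect record.
def rule_mismatched_address(p):
    return "fraud" if p["billing_zip"] != p["shipping_zip"] else "ok"

def rule_large_first_order(p):
    return "fraud" if p["first_order_usd"] > 1000 else "ok"

def rule_new_account(p):
    return "fraud" if p["account_age_days"] < 2 else "ok"

RULES = [rule_mismatched_address, rule_large_first_order, rule_new_account]

def classify_by_vote(prospect, rules=RULES):
    """Return the label chosen by most rules (an odd rule count avoids ties)."""
    votes = Counter(rule(prospect) for rule in rules)
    return votes.most_common(1)[0][0]

suspicious = {"billing_zip": "10001", "shipping_zip": "94105",
              "first_order_usd": 1500, "account_age_days": 30}
ordinary = {"billing_zip": "10001", "shipping_zip": "10001",
            "first_order_usd": 50, "account_age_days": 1}
print(classify_by_vote(suspicious))
print(classify_by_vote(ordinary))
```

Even in this toy form, the combined answer can differ from what any single rule says, which hints at why the real, layered versions are hard to reason about.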
The good news is that once the statisticians have worked their magic, they produce a black box that works for a specific classification task: data in, customer ratings/groupings out. Mathematically adept marketing managers can use it to test hypotheses, plan campaigns and so on. Increasingly, such systems are used iteratively: plan and implement a quick campaign, observe the results, recalibrate the conclusions and try again. And they perform well enough to be used inline (i.e., in real time) to personalize Web pages and call-center scripts. (The key standard for inline analytics is Predictive Model Markup Language, or PMML.)
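The plan, observe, recalibrate cycle can be sketched as a small loop. Everything here is an assumption made for illustration: the scoring formula stands in for the black box, the "responded" field simulates results that a real campaign would only learn after making the offer, and the recalibration rule (raise the targeting cutoff when the response rate misses a goal) is just one simple choice.

```python
def score(customer):
    """Stand-in for the black box: data in, a 0-100 rating out (hypothetical)."""
    return min(100, customer["visits_last_month"] * 10)

def run_campaign(customers, threshold):
    """Make the offer to everyone rated at or above threshold; return the response rate."""
    targeted = [c for c in customers if score(c) >= threshold]
    if not targeted:
        return 0.0
    return sum(c["responded"] for c in targeted) / len(targeted)

def iterate(customers, threshold, rounds, target_rate=0.75, step=10):
    """Run quick campaigns, raising the cutoff whenever the response rate misses the goal."""
    history = []
    for _ in range(rounds):
        rate = run_campaign(customers, threshold)
        history.append((threshold, rate))
        if rate < target_rate:
            threshold = min(100, threshold + step)
    return history

# Simulated customers; "responded" would be observed, not known in advance.
CUSTOMERS = [
    {"visits_last_month": 1, "responded": False},
    {"visits_last_month": 2, "responded": False},
    {"visits_last_month": 4, "responded": False},
    {"visits_last_month": 6, "responded": True},
    {"visits_last_month": 8, "responded": True},
    {"visits_last_month": 9, "responded": True},
]

for threshold, rate in iterate(CUSTOMERS, threshold=20, rounds=3):
    print(threshold, rate)
```

In this simulation the first campaign falls short of the goal, the cutoff is raised once, and later rounds hold steady because the goal is met.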
Indeed, all of this works smoothly enough that in a few areas, predictive analytics apps are available from statistical software vendors such as SPSS and SAS Institute and from ERP/CRM vendors such as Oracle and SAP.