Big Data analysis the stuff of heroes


There’s a new kind of hero in the popular consciousness. Think TV shows like Numb3rs, CSI: Crime Scene Investigation or NCIS. Think Brad Pitt and Jonah Hill in the movie Moneyball.

“The passionate, rational problem-solver is emerging as a heroic archetype in popular culture,” said futurist and author Thornton May. “The hero of the next age will be the person who effectively manages Big Data.”

May’s presentation was a highlight of a Tuesday event, Conquer Big Data with Big Analytics, sponsored by SAS Institute Canada Inc. The event also featured Merv Adrian, Gartner Inc. research vice-president for information management; Bryan Harris, CTO of SAS’s VSTI division; and a panel discussion including Yves DeGuire and Tom Kari of Statistics Canada.

Given the volume of information we have at our disposal, May said, “there is nothing we cannot know with the tools we have today if we choose to be heroes.”

Linear thinking won’t get us there, May said. He pointed to the example of the shipping industry, which was moribund in the 1950s. The NS Savannah, a nuclear-powered cargo ship that entered service in 1962, was aimed at saving the industry. But a trucker from North Carolina, Malcom McLean, saved it with a completely different approach: containerized shipping. Today, containers carry 90 per cent of all global freight, and the cost has dropped to two per cent of 1950s rates.

Likewise, in the early days of television, programming was largely a radio show with a picture. That’s where we are with Big Data now, May said.

May said that in executive workshops around the world, he has identified two kinds of curve, along each of which companies range from laggards to best-in-breed: the curve we’re on, and the next one. We’ve ridden the existing data analytics curve as far as we can, he said, and it’s time to “curve-jump.”

May cited a real-world example of curve-jumping. On January 15, 2009, Chesley Sullenberger was at the controls of US Airways Flight 1549 when both engines failed. Instantly, his goal changed from managing a multi-million-dollar aircraft to saving 155 lives. He landed the plane safely in the Hudson River off New York City. Listening to the cockpit recording, May said, “not once did you hear him say, ‘Let’s run this by legal.’”

Gartner’s Adrian likened the problem of defining Big Data to that of the U.S. Supreme Court trying to define pornography: “I don’t know what it is, but I know it when I see it.”

In the early days of computing, data was defined in the program itself. Then came the database: applications no longer had to recreate the data. This schema was based on the assumption that data would always look the same.

“Those days are over,” Adrian said.

There’s a huge amount of information available through unstructured, social media sources, Harris said, but the vast majority is irrelevant. It’s easy for an organization to develop a hoarding problem.

“We can’t simply store data without evaluating it,” he said. “Hoarders cannot apply a value function on that stuff.”

Harris said running analytics on in-house data – reports, catalogues, wikis, e-mail and calendar items, for example – can discover the ontologies and taxonomies relevant to the organization. That, in turn, produces a template to constrain the stream of information from the Web that feeds an organization’s in-memory analytics.
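Harris didn’t describe SAS’s implementation, but the basic idea – mine in-house documents for the organization’s own vocabulary, then use that vocabulary to filter an incoming stream – can be sketched in a few lines of Python. Everything here (the stopword list, the term counts standing in for a real ontology, the relevance threshold) is illustrative, not SAS’s method:

```python
from collections import Counter
import re

# A tiny illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "for", "on", "is"}

def build_vocabulary(in_house_docs, top_n=50):
    """Collect the most frequent non-stopword terms from in-house documents.

    This crude term-frequency count stands in for the 'ontologies and
    taxonomies' Harris described discovering in reports, wikis and e-mail.
    """
    terms = Counter()
    for doc in in_house_docs:
        terms.update(t for t in re.findall(r"[a-z]+", doc.lower())
                     if t not in STOPWORDS)
    return {term for term, _ in terms.most_common(top_n)}

def is_relevant(message, vocabulary, min_hits=2):
    """Keep a streamed message only if it mentions enough in-house terms."""
    tokens = set(re.findall(r"[a-z]+", message.lower()))
    return len(tokens & vocabulary) >= min_hits

# Example: two in-house documents yield a vocabulary that admits a
# shipping-related message and rejects an unrelated one.
docs = ["quarterly shipping report for container freight volumes",
        "wiki page on container logistics and freight analytics"]
vocab = build_vocabulary(docs)
print(is_relevant("new container freight rates announced", vocab))  # True
print(is_relevant("celebrity gossip roundup", vocab))               # False
```

The point of the sketch is the direction of flow: the template is derived from data the organization already holds, so the filter reflects what the organization actually cares about rather than a generic keyword list.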

Adrian compared that sort of data to a telco’s dark fibre – fibre that’s already in the ground, but isn’t carrying traffic. “We have a lot of unactivated data in our organizations,” he said. Though we’re not using it now, someone will figure out how to use these “preserved conceptual constructs.”

The dark fibre analogy particularly resonated with StatsCan’s DeGuire.

“We have dark data,” he said. “Not only do we have new data coming in, our old data is becoming usable.”

The challenge, he said, is how to incorporate that until-recently dark data.

