When it comes to data processing, there’s an increasing demand for speed. Whether that means buying solid-state drives or doing everything in memory, getting the right hardware is the easy part. Measuring what “faster” actually means is another story.

It’s hard not to think that spinning disk is on its way out. Fusion-io Inc. is one of the companies leading the charge to put solid-state drives (SSDs) at the heart of every data centre. It has frequently cited what it calls the “industry inertia” around spinning disk and the wrongheaded approach some take in deploying SSDs in a network storage array. According to Gary Orenstein, vice-president of products at Fusion-io, enterprises that deploy SSDs this way are missing out on the benefits of the technology. Part of the reason is that they are measuring the wrong thing.

“There are a whole slew of tools that people can use to measure network latency and storage performance, but we actually encourage customers to measure application performance improvement,” says Orenstein, “because you could drive yourself crazy and end up pulling your hair out if you spent all your time measuring these lower level bits and pieces.”
Measuring latency involves testing how many I/O operations can be executed back and forth to a disk. But this gives a misleading picture of what’s going on.
“They’ll spend all this time trying to optimize for this workload generation tool and then they’ll say, ‘Well, it doesn’t seem like it’s such great performance.’
“And we say, ‘Well, just try it with your database or with a copy of your database.’ And then all of a sudden they’ll see a five times or a 10 times improvement in the database transactions per second. And that’s really what counts here.”
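As a rough illustration of what measuring at the application level can look like, the sketch below times committed database transactions rather than raw I/O operations. It is a minimal example under stated assumptions, not a method described by Fusion-io: it uses SQLite from the Python standard library as a stand-in for a real database, and the function name `measure_tps` and the `orders` table are hypothetical.

```python
import sqlite3
import time


def measure_tps(db_path, n_transactions=1000):
    """Return committed transactions per second for a simple insert workload.

    Assumption: a one-row insert followed by a commit stands in for a real
    application transaction. A real test would replay the actual workload.
    """
    conn = sqlite3.connect(db_path)
    # Hypothetical table used only for this illustration.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders (id INTEGER PRIMARY KEY, amount REAL)"
    )
    conn.commit()

    start = time.perf_counter()
    for i in range(n_transactions):
        conn.execute("INSERT INTO orders (amount) VALUES (?)", (float(i),))
        conn.commit()  # each commit is one complete transaction
    elapsed = time.perf_counter() - start

    conn.close()
    return n_transactions / elapsed
```

Running the same function against a copy of the database on each storage device (for example, one file on spinning disk and one on flash) gives two transactions-per-second figures that can be compared directly, which is closer to the measurement Orenstein describes than a synthetic I/O count.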
Andrew Reichmann, principal analyst at Forrester Research Inc., says the prospect of accurately measuring data processing speed can be daunting. “There’s buying the right tools, there’s designing the performance analysis reporting tool to show what you want to show and then understanding the information,” he says.
“It’s a complex science. It’s just not easy. The tools are expensive, they’re hard to use and they’re not that widely deployed.”
Among the few companies that do have the money to spend on such tools and the expertise to use them is a major Canadian bank. On the electronic trading floor, says one of the bank’s IT managers, who asked that his name and that of his company remain confidential, success or failure can be determined in milliseconds. Reacting to data on the market has to take place in fewer than 30 milliseconds, he says.
The bank has more than 120 tools to measure performance from two ends: the IT section, which maintains the system, and another section that analyzes the system’s financial impact.
The bank uses Fusion-io flash storage for I/O intensive operations, sending data to a structured database hosted on flash drives. Other banks, he says, do everything in-memory, but this requires a major overhaul of the entire system, which can be incredibly complex. Each one has its own secret mix to maintain its competitive advantage.
But this is an extreme example. What should the average company do, one that can tolerate latency over 30 milliseconds but nevertheless wants to speed up its systems? Orenstein says that in the absence of sophisticated tools to tell you what’s going on, a good plan followed by trial-and-error testing is your best bet.
First of all, he says, companies should “be very clear about their requirements. [Whether] the data is very read-oriented or write-oriented is an important question to ask.”
After that, they should go into their recovery point objective (RPO) and recovery time objective (RTO) requirements (“how much resiliency needs to be built into the solution”). Then they have to figure out, as best they can, what their own tolerance for latency is. Attaching SSDs to server CPUs will speed things up, but at the expense of making storage a “captive resource” for that particular server.
“That’s a hard question to answer,” he says. “But if a small boost is going to make a big difference, then you might want to stay with something closer to what you use now compared to, if this is a really extreme performance hog of an application, then you might need to build more for performance and worry less about ease of management or utilization.”
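One way to make the latency-tolerance question concrete during trial-and-error testing is to time an operation repeatedly and compare the results with a budget. The sketch below is illustrative only: the function names are hypothetical, the 30-millisecond default is borrowed from the bank example above, and judging by the 99th percentile rather than the average is an assumption, not something the sources quoted here prescribe.

```python
import statistics
import time


def latency_percentiles(operation, runs=200):
    """Time `operation` repeatedly and summarize latency in milliseconds."""
    samples_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        operation()
        samples_ms.append((time.perf_counter() - start) * 1000.0)

    # quantiles(n=100) returns 99 cut points; index 98 is the 99th percentile.
    # The inclusive method interpolates within the observed samples.
    cuts = statistics.quantiles(samples_ms, n=100, method="inclusive")
    return {
        "p50": statistics.median(samples_ms),
        "p99": cuts[98],
        "max": max(samples_ms),
    }


def within_budget(stats, budget_ms=30.0):
    """Assumption: the budget is judged against the 99th percentile."""
    return stats["p99"] < budget_ms
```

Passing in a function that performs one representative application request, such as a single database query, shows whether the current storage already meets the chosen budget or whether a more performance-oriented design is worth the loss in manageability.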