EMC uncorks CAS

Eyeing a growing market for technology that speeds the delivery of large files stored in colossal databases, EMC Corp. on Monday unveiled a new storage architecture and a new type of storage server at a launch event here headed by EMC chief executive officer Joe Tucci.

EMC introduced its CAS (content addressed storage) architecture, based on a new storage server called Centera. The Centera systems use a method called content addressing to tag what are commonly referred to as reference files – medical X-Rays, check images, video media – with the equivalent of a digital fingerprint, making it easier for the Centera storage server to retrieve them.
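
The fingerprinting idea can be illustrated with a minimal sketch. The article does not say how Centera computes its fingerprints or what its API looks like, so everything below is an assumption made for illustration: the `ContentAddressedStore` class, its `put` and `get` methods, and the choice of SHA-256 as the fingerprint are all hypothetical, not EMC's implementation.

```python
import hashlib


class ContentAddressedStore:
    """Toy in-memory store keyed by a hash of each object's bytes.

    Hypothetical illustration only; not Centera's actual API.
    """

    def __init__(self):
        self._objects = {}

    def put(self, data: bytes) -> str:
        # The address is derived from the content itself (SHA-256 is an
        # assumed stand-in for the "digital fingerprint").
        address = hashlib.sha256(data).hexdigest()
        # Identical content maps to the same address, so it is stored once.
        self._objects.setdefault(address, data)
        return address

    def get(self, address: str) -> bytes:
        data = self._objects[address]
        # Re-hashing on read confirms the object still matches its address.
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("stored object does not match its address")
        return data

    def __len__(self) -> int:
        return len(self._objects)


store = ContentAddressedStore()
check_image = b"scanned check image bytes"
address = store.put(check_image)
duplicate_address = store.put(check_image)
```

In this sketch the application keeps only the returned address rather than a file path, and storing the same bytes twice yields the same address without a second copy. That property is why the approach suits fixed content that never changes after it is written.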

“The idea is that CAS is a complementary storage platform to SANs (storage area networks) and NAS (network attached storage) optimized specifically around managing large amounts of fixed content,” said Barry Burke, the director of integrated solutions for networked storage platforms at EMC, based in Hopkinton, Mass.

Thirty-two rackable Centera storage server nodes fit within a six-foot rack, and the system scales in clusters to over 1120TB, according to EMC.

CAS technology and the Centera server platform position EMC to take advantage of what experts predict will be a booming market for unchanging “reference” data.

“It’s a whole new ball game, and the market opportunity is huge,” said Steve Duplessie, a senior analyst with the Enterprise Storage Group in Milford, Mass.

Reference files will make up more than half of all stored corporate and government data by late 2004, according to the Enterprise Storage Group.

The offspring of EMC’s 2001 acquisition of storage software company FilePool, CAS and Centera reflect a NAS lineage in that the storage system hangs off an Ethernet network and serves files directly to applications. A number of unique NAS-style startups such as Zambeel have recently appeared with similar ideas, and companies such as Digital Fountain and Interwoven have delivered reference data servers and data tagging tools, respectively.

But EMC’s size and influence could both disrupt and throttle the reference data storage market, as EMC’s CAS architecture requires application vendors to cooperate by writing their software to Centera’s API.

“(Centera) doesn’t provide a file system, and it’s not a database engine per se; you don’t deal with it with SQL, you actually write software to it using an API over an IP network,” explained Burke.

Because CAS and Centera networks improve application performance by giving applications faster access to reference data, application developers should be eager to partner with EMC on APIs, stimulating the growth of CAS networks, said Duplessie.

“An application today that manages check images off of a generic disk array has to be very smart. It has to optimize the way it writes things in order to be able to sort and retrieve them later,” explained Duplessie. “Centera adds a layer of intelligence right on the disk images themselves, and the application simply hands off to the Centera, and therefore does not need to be as ‘smart,’ or take up excess resources.”

Connected Corp., an enterprise PC life-cycle support company based in Framingham, Mass., has deployed a Centera server alongside the company’s EMC Symmetrix servers to assist in the storage, backup, and recovery needs of Connected’s thousands of clients, according to company representative Tom Hickman.

“The individual client files range in size from infinitesimal up to 4MB per archive, but we end up with about 20TB of data behind each server,” said Hickman, who added that Centera relieves Connected’s main storage servers of redundant file-serving tasks.

“There’s a lot of information being created that needs a better place to live,” said EMC’s Burke.

An entry-level Centera system with 16 server nodes and 10TB of storage starts at US$210,000, software included.