Start-up scouts for redundant data

Deepfile Corp. CEO Jeff Erramouspe said that half the files on a typical corporate network are unaccounted for, either because they are redundant or haven’t been accessed for a long time. Not only does this waste storage capacity, but it’s a difficult environment to manage and can leave a company exposed to security risks, he said.

Deepfile’s answer to the problem will come in the form of two appliances designed to help companies search for redundant or unused files and take action on them. The company expects to have both products, which will work with Windows NT/2000 file servers and network-attached storage devices, available as soon as April.

The first product, Auditor, crawls a file system, pulls back metadata on every file and saves it to a database on the appliance. Each file is given a unique signature, essentially a checksum, that lets the appliance identify identical files even if they have different names and locations. Auditor then reports the files, their locations and characteristics to the second product, Enforcer.
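The article does not say which checksum Auditor computes or how its database is laid out, so the following is only a rough sketch of the general technique of signature-based duplicate detection. It assumes a SHA-256 digest of each file's contents; the function names `file_signature` and `find_duplicates` are hypothetical and are not part of any Deepfile product.

```python
import hashlib
import os
from collections import defaultdict


def file_signature(path, chunk_size=1 << 20):
    """Return a SHA-256 hex digest of a file's contents (assumed checksum)."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large files are never loaded into memory whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Walk root and group file paths that share a content signature."""
    by_signature = defaultdict(list)
    for directory, _subdirs, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(directory, name)
            by_signature[file_signature(path)].append(path)
    # Keep only signatures that two or more files share.
    return {
        signature: sorted(paths)
        for signature, paths in by_signature.items()
        if len(paths) > 1
    }
```

Because the signature depends only on file contents, two copies with different names or in different directories land in the same group. A real scanner would likely compare file sizes first to avoid hashing every file, but that optimization is left out here.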

Based on rules the IT manager sets, Enforcer will cull duplicate files and directories and migrate older, still-useful files to less-expensive storage or tape.
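A rule-driven pass of this kind could be sketched as follows. Everything here is an assumption made for illustration: the `FileRecord` fields, the action names, the choice to keep the first copy encountered, and the one-year migration threshold are all hypothetical, since the article does not describe Enforcer's actual rule format.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class FileRecord:
    """Hypothetical metadata row, as a scanner might report it."""
    path: str
    signature: str
    last_accessed: datetime


def plan_actions(records, now, migrate_after=timedelta(days=365)):
    """Return (path, action) pairs; the threshold is an assumed rule."""
    seen_signatures = set()
    plan = []
    for record in records:
        if record.signature in seen_signatures:
            # A copy with this signature was already kept: cull this one.
            plan.append((record.path, "delete_duplicate"))
            continue
        seen_signatures.add(record.signature)
        if now - record.last_accessed > migrate_after:
            # Old but unique: move to less-expensive storage or tape.
            plan.append((record.path, "migrate"))
        else:
            plan.append((record.path, "keep"))
    return plan
```

The sketch only produces a plan and touches no files, which would let an administrator review the proposed actions before anything is deleted or moved.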

“From a business point-of-view, I was interested in knowing how our storage was being allocated between high-cost storage, medium-cost storage and our least-expensive, direct-attached storage,” said David Graham, director of IT operations for Web-based content management vendor Vignette Corp. “I wanted to make sure Vignette was using our storage resources to the highest and best use.”

Graham has installed a Network Appliance Inc. filer that has about eight terabytes of data and uses early units from Deepfile to monitor it.

“[Deepfile’s appliance] scans a very large file system in our case and provides detailed statistics about the makeup of that data,” said Darren Johnson, senior IT administrator for Vignette. “We knew we were filling up the file server but didn’t know what exactly made up the data…. With Deepfile, we found that as much as hundreds of gigabytes of data is duplicated.”

The products are implemented as 1U-high servers, which connect to the network via a 10/100/1000M bit/sec Ethernet port.

Jamie Gruener, a senior analyst with The Yankee Group, said Deepfile has elegantly combined technologies often found in separate products.

“They combine policy-based management with data management – it’s very simply ‘Capacity Planning and Management 101’ for file-oriented data,” he said.

Deepfile is similar to another young company called Arkivio in that it collects metadata in the same fashion, Gruener said. Deepfile differs in that it handles both Windows Common Internet File System and Unix Network File System data and provides more data management capabilities.

Deepfile Auditor and Enforcer have Web-based interfaces for local or remote management. Auditor is available now, starting at US$10,000 per year for the first two terabytes of data managed. Enforcer will be available in the second quarter, starting at US$20,000.