Data over the event horizon

Ever heard of black holes? These cosmic curiosities appear when a massive star with more than eight times the mass of our sun runs out of gas. Yep, once there’s no more nuclear fusion, the inexorable forces of gravity take over.

Now, depending on all sorts of cosmological factors – what it’s made of, how hot it is, whether any astronomers are watching – the ginormous mass might become a white dwarf: a dim, super-heavy object (one teaspoon of its material weighs about 6.5 tons). Pretty strange. Such is the nature of Sirius B, the companion of Sirius, the brightest star in the sky other than the sun (Sirius is also called Canicula, the Dog Star, Aschere or Alpha Canis Majoris).

Alternatively, it might become a neutron star. Neutron stars are the result of these honking big stars collapsing and then exploding. If what’s left is more than 1.4 times the mass of our sun, it collapses under its own gravitational force into what is one big neutron. Very strange and damn heavy: One teaspoonful weighs around 100 million tons.

But if what’s left after the explosion is more than about two solar masses, then the collapse keeps going and you wind up with what the chaps in white coats call a “singularity” – an infinitely dense mass that occupies zero volume. And around such a thing there is something called an “event horizon.”

Anything, even light, that crosses the event horizon is sucked into the black hole never to be seen again, and consequently nothing ever comes out of a black hole. Just about as strange as things get without delving into quantum mechanics…

I am telling you all this fascinating astronomical stuff because NASA has huge data archives that have crossed the IT equivalent of a black hole’s event horizon.

It turns out that much of NASA’s unbelievably huge hoard of data from its various programs is on magnetic tape, and that tape has a shelf life of no more than 10 years. The problem is that the labor required to move the old data onto new, more stable media is so vast that even if NASA could copy tapes at an impossibly fast rate, it would still take more than 10 years (already more than the remaining life of the tapes). And, as NASA is accumulating data at an incredible and exponentially increasing rate, the problem is not going to be solved.
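
To see why the arithmetic is so unforgiving, here is a rough back-of-the-envelope sketch in Python. Every figure in it (the backlog size, the copy rate, the growth rate) is a made-up illustration, not a NASA number, and the function name years_to_migrate is hypothetical. It simply counts how many years a copying operation would need to clear a tape backlog while new tapes keep arriving.

```python
def years_to_migrate(backlog_tapes, copy_rate_per_year,
                     new_tapes_per_year, growth, max_years=100):
    """Return the whole years needed to clear the backlog,
    or None if it is not cleared within max_years.

    All inputs are illustrative assumptions:
      backlog_tapes       tapes already waiting to be copied
      copy_rate_per_year  tapes that can be copied each year
      new_tapes_per_year  tapes added in the first year
      growth              yearly growth of the arrival rate (0.2 = 20%)
    """
    backlog = float(backlog_tapes)
    incoming = float(new_tapes_per_year)
    for year in range(1, max_years + 1):
        # New tapes arrive while old ones are copied off.
        backlog += incoming - copy_rate_per_year
        if backlog <= 0:
            return year
        # Arrivals compound, as in an exponentially growing archive.
        incoming *= 1 + growth
    return None


SHELF_LIFE_YEARS = 10  # the shelf life quoted in the column

# Hypothetical case 1: a fixed backlog and no new data at all.
static_case = years_to_migrate(1_000_000, 80_000, 0, 0.0)

# Hypothetical case 2: same backlog, plus arrivals growing 20% a year.
growing_case = years_to_migrate(1_000_000, 80_000, 50_000, 0.2)

print("static backlog:", static_case, "years; shelf life:", SHELF_LIFE_YEARS)
print("growing archive:", growing_case)
```

With these invented numbers, the static backlog alone takes longer to copy than a 10-year shelf life allows, and once new data arrives at a compounding rate the backlog is never cleared at all.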

Thus it is guaranteed that irreplaceable observational and experimental data will be lost forever (although it does raise the question of whether, given the huge number of tapes, anyone would have ever looked at the data anyway).

What struck me about this problem is that NASA is not alone (sounds like an “X-Files” episode). Out in the far reaches of the corporate world are hundreds of companies accumulating data at rates that approach and occasionally exceed NASA’s. What’s happening to this information?

For legal and operational reasons, data cannot be disposed of for periods that range up to decades. While I’m sure that many companies routinely clean their house of useless and obsolete data once the operational and legal requirements have passed, I’ll bet that a significant proportion is retained in perpetuity and mostly in obscurity.

Already we have countless acres of storage holding archived material that will never be used or even remembered, and we are adding to the pile at an exponentially increasing rate.

Is all this data doomed to languish in obscurity in vaults of rotting tapes? Will the rotting tapes be replaced in the fullness of time with dusty stacks of delaminating, cracking CD-Rs and DVD-Rs?

You should be concerned about the cost to your company in all this. The longer you take to get a handle on the problem, the more you will pay and keep paying for data that has quietly and irrevocably slipped over the storage event horizon.

Gibbs is a contributing editor at Network World (US). He is at