Seven tales of IT foul-ups – Part 2


True IT confession No. 4: What can Brown do for you?

Here’s one of those rare backup mishaps in which data did in fact get backed up. But what it got backed up to is where things went sour.

Twenty-seven years ago, David Guggenheim had just gotten his first “real job” as biological data manager at an environmental consulting firm in Southern California. At that time, the firm’s hardware consisted of a PDP-11 and a time-share IBM 360 mainframe in Los Angeles, accessed via dial-up.

“It was time to archive an important project from the IBM mainframe, so I cracked my knuckles and began pounding out the JCL [Job Control Language] necessary to write our data to tapes that would then be shipped to our office,” he says. “I submitted the job, satisfied that our data would be safely backed up.”


A few days later a UPS driver poked his head in the door at the firm’s office and shouted, “Is there a David Guggenheim here?”

The UPS truck was filled floor to ceiling with boxes, all of them addressed to Guggenheim. He opened the first one. It was full of punch cards. And so were all the rest of them.

“It was our data from the IBM mainframe,” he says. “To my horror, I realized that instead of specifying output to magnetic tape, I specified output to punch cards. I can’t remember my JCL very well any more, but as I recall, it was the difference between specifying ‘=0’ versus ‘=1.’ I was absolutely humiliated.”
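Guggenheim himself admits his memory of the exact parameter is hazy, so take this as a purely illustrative sketch rather than a reconstruction of his job. Dataset and DD names below are invented; the point is how small the gap between tape and punch output can be in JCL, where `SYSOUT=B` was the class traditionally routed to the card punch:

```jcl
//* Intended: archive the project dataset to magnetic tape
//* (dataset and DD names invented for illustration)
//ARCHIVE  DD  DSN=PROJ.BIO.ARCHIVE,UNIT=TAPE,DISP=(NEW,KEEP)
//*
//* A mistyped DD statement like this instead routes the output
//* to SYSOUT class B -- historically, the card punch
//ARCHIVE  DD  SYSOUT=B
```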

It gets worse. A few days after the entire staff got involved clearing enough floor space for the mountain of boxes, the bill arrived. The cost of a punch-card backup job was nearly $1,000 (and remember, we’re talking about 1982 dollars here).

“I had blown our budget out of the water, killed a forest, and still failed to back up our data onto tape,” says Guggenheim, who’s now Dr. David Guggenheim, president of 1planet 1ocean, and a senior fellow at The Ocean Foundation. “I’ve spent my career since then doing environmental work, so hopefully I paid penance for the dead trees.”

Lessons learned? 1. Little mistakes can cause huge problems, so keep checking until it hurts. 2. Immediately own up to your errors; humility is a great teacher. 3. Take the time to appreciate the humor of a colossal screw-up, says Guggenheim. “It does wonders for the sting.”


True IT confession No. 5: Unplug at your own risk
Back in the mid-’90s, Jan Aleman was interim IT manager for a major telecom company in the Netherlands. He was called in to replace a CTO who’d left under less-than-voluntary circumstances. Before the ex-CTO got canned, though, he’d ordered a $300,000 IBM failover system for the company’s mission-critical billing engine.

“A very good IBM salesman had sold them this overpriced hardware, assuring them that if the primary system failed it would roll over seamlessly to the secondary one,” says Aleman. “He said it was completely redundant, that nothing could go wrong. I said, ‘All right, let’s see if it actually works.'”

So Aleman yanked the power plug for the primary system out of the wall, right in front of the IBM salesman. All the company’s core systems went dark. The critical billing engine was down for the rest of the afternoon. The phone switches still worked, but nobody in the back office could get anything done.

Though the failover system was installed and running, nobody had bothered to test it. So the next thing Aleman did was institute biweekly tests of the system on weekends.

“I unplugged the company,” says Aleman, who is now CEO of Servoy, a developer of hybrid (SaaS and on-premises) software. “Needless to say, they were not very happy, but nothing bad ever happened to me. I’m still not sure how I managed to pull that off.”



Lessons learned? 1. Always test systems before you bet the company on them (repeat as needed). 2. Think twice before you yank that power cord.
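Lesson 1 is the kind of test that’s easy to automate today. Here’s a minimal sketch of a scheduled reachability probe in the spirit of the routine checks Aleman instituted; host names and ports would be your own, and this only checks that something answers on a TCP port, not that failover actually completes:

```python
import socket

def reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def pick_endpoint(primary: tuple, secondary: tuple) -> tuple:
    """Use the primary endpoint if it answers, else fall back to standby."""
    return primary if reachable(*primary) else secondary
```

Run from cron on a weekend, a probe like this at least tells you whether the standby is listening before you bet the billing engine on it.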

True IT confession No. 6: Never let another be the master of your domains
Back around 2003 or so, “Fred” (not his real name) was the IT manager for a regional cable company in the Midwest. At the time, the company had about 35,000 subscribers. To boost its business services, it decided to become a domain name reseller for Network Solutions.

As part of the transition to domain name sales, the company redirected all domain renewal notifications to a person in its business support unit. “We assumed only our customers’ domain notifications would go there, and not the company’s own domains,” Fred says. But as the saying goes, assuming makes asses out of everyone.

Sure enough, one night around 10 p.m. everything at the ISP stopped working: DNS, e-mail, the company’s own Web sites, and the sites it hosted for its business customers — all simply went poof.

The problem? The ISP had neglected to renew its own domains. The person in business support assumed Fred was also getting notified about the renewals (he wasn’t), and Fred assumed that since he wasn’t being notified, everything was hunky-dory (it wasn’t).

“By the time we diagnosed the problem — because you rarely think to check whether your own domain has expired — we had fallen out of the root servers and it took a full 24 hours before everything was restored,” says Fred.

Lessons learned? 1. Always have multiple people receiving important alerts. 2. Register your domains for 10 years and it will most likely be the next guy’s problem.
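Checking your own expiry date is also easy to script. Here’s a minimal sketch that parses the expiry out of a WHOIS response; the “Registry Expiry Date” label matches Verisign’s .com output, but other registries word the field differently, so the pattern is an assumption you’d adjust:

```python
import re
from datetime import datetime, timezone

def days_until_expiry(whois_text: str, now: datetime) -> int:
    """Count days until the expiry date found in a WHOIS response."""
    match = re.search(r"Registry Expiry Date:\s*(\S+)", whois_text)
    if match is None:
        raise ValueError("no expiry date found in WHOIS response")
    expiry = datetime.fromisoformat(match.group(1).replace("Z", "+00:00"))
    return (expiry - now).days

sample = "Registry Expiry Date: 2024-06-01T00:00:00Z"
check_time = datetime(2024, 5, 1, tzinfo=timezone.utc)
print(days_until_expiry(sample, check_time))  # 31
```

Wire the result into whatever alerts multiple people actually read, and lesson 1 takes care of itself.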

True IT confession No. 7: Don’t ask, don’t tell, and don’t let them make you take a polygraph

Four years ago “Paul” (not his real name), an independent data analyst in (yes) the Midwest, was working with a governmental client on a $20,000 analysis project. After two months of hard work, he delivered a preliminary draft to the client, then went off on a week-long business trip.

Before he left, Paul burned a disc with all the project data on it so that he could finish it up in the hotel during his trip. And as was his usual custom at the time, he deleted all 4GB of project data from his hard drive to free up space.

Then, of course, he lost the disc: $20,000 worth of work gone in a flash. What did he do? What any smart consultant would do: He billed the client for the entire project, in full. And promptly received a check.

“Six months went by and I didn’t hear back from the client,” says Paul. “I thought that was incredible, because I expected to receive comments and changes on the draft. A year went by, and nothing.”

Finally, two years after delivering the draft, the dreaded call finally came.

“‘Are you going to ever finish this project?’ I heard on the other end of the phone,” says Paul. “I said, ‘There’s no way that I can stand by that original data and recommendations, since two years have gone by. None of the information is valid anymore.’ Of course, I knew full well I could never provide any updated data or updated recommendations based on the original data. Fortunately, the client accepted that explanation and then proceeded to discuss what fees I’d need for some new work.”

In his defense, Paul says the preliminary draft was 95 percent complete, and the client told him they’d already implemented many of the recommendations he’d made.

These days, Paul is a self-proclaimed “data backup nut.”

“On any given day I have about 10 copies of all current project data, and can completely restore every project data file that I have worked on during the last three years within about five minutes,” he says. “I learned a hard lesson that I certainly won’t forget anytime soon.”
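Part of that discipline is never trusting a copy you haven’t verified — a disc that burns without errors can still be unreadable. As a minimal sketch (paths and the checksum choice are illustrative, not Paul’s actual setup):

```python
import hashlib
import shutil

def sha256_of(path: str) -> str:
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()

def backup_and_verify(src: str, dst: str) -> str:
    """Copy src to dst and refuse to trust the copy until checksums match."""
    shutil.copy2(src, dst)          # copy contents plus metadata
    checksum = sha256_of(src)
    if checksum != sha256_of(dst):  # a silently bad copy fails loudly here
        raise RuntimeError(f"backup of {src} failed verification")
    return checksum
```

Only after a verification like this passes — ideally on several independent copies — is it safe to think about deleting the original.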

Lessons learned? 1. You can never have enough backups. 2. It’s a good idea to also keep hard copies on hand, just in case. 3. If you do lose all your data before you’ve delivered the final product, try to make sure you’re working for the government at the time. They might never notice.