So Google was down today. I actually did not notice. I, like most people, have a very busy and demanding day job that doesn’t include Google mail as an essential service. Heck, even Google search I can do without now that Bing! is a “good” option, and getting better each day.

The notable point here, in my opinion, is how Google communicated: taking accountability, with full transparency to their customers.  This is significant, and I challenge you to consider the impact of what they have done this time from a trust and relationship perspective with their customers.  How do you handle such situations when something happens in your data center or application stack?

Google’s definition of a Down Time Period is 10 minutes.  Anything less (I guess) is not down time. This in itself is interesting, and a classic case of a measurement that is likely not very meaningful to most of its customers.  Certainly from an IT perspective, none of us would ever define such a metric without asking our customers first, right? But we’ll forgive Google this one (for now). They are a free service after all, so we’re lucky it’s up at all, one could argue…

But for Google’s paying customers (i.e. Google Apps), there is a guarantee of 99.9% uptime before they need to start paying back time to their customers, which means lost revenue. That means they are “done” for the month: with their 100+ minutes of downtime, they have far exceeded the ~45 minutes of downtime allowed in the month of September.
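For the curious, here is the back-of-the-envelope math behind that allowance, as a quick Python sketch. It assumes the 99.9% is measured over the calendar month (30 days for September) and that every minute of downtime counts. The function name and the 100-minute figure (the low end of “100+”) are mine, for illustration only; this is not Google’s actual SLA calculation.

```python
def allowed_downtime_minutes(days_in_month: int, uptime_target: float) -> float:
    """Minutes of downtime a monthly uptime target leaves room for."""
    total_minutes = days_in_month * 24 * 60
    return total_minutes * (1 - uptime_target)


# September has 30 days; 99.9% is the uptime guarantee quoted above.
september_budget = allowed_downtime_minutes(30, 0.999)

# Assumption: 100 minutes, the lower bound of the "100+ minutes" of downtime.
observed_downtime = 100

print(f"Allowed downtime: {september_budget:.1f} minutes")
print(f"Over budget: {observed_downtime > september_budget}")
```

By this rough math the monthly allowance is a bit over 43 minutes, so 100+ minutes is more than double the budget.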

When you notify your users that “services have been restored”, do you have a policy around transparency? Do you even measure “at fault” service disruptions separately? And if so, do you share this with your organization, business leaders, anyone? Do you hope no one asks?

It really becomes an issue of leadership and accountability.  So you know where I stand.

Now does that mean you need to communicate to everyone? Does that mean you release the name of whodunnit? Well, no, and (of course) no. But should there be accountability within the team, along with all the good learnings and acceptance of personal responsibility, if applicable? Of course.

So if you get into the elevator with your president and they ask you, “So what did happen with that outage or issue?”, what do you say? I know what I would say (IF I had to be asked; hopefully I would have had the opportunity to deliver the news prior to being approached, but that’s not always possible… or necessary):

“Well, we dropped the ball. The problem was on our end.  The individuals involved are sorry it happened, they understand where the failure points were, and we have corrective actions to put in place to help prevent this from happening again.”

That’s what Google did today, and from a brand and reputation perspective, I have to give them their props.

-Pedro


