Throwing out garbage with validity checks

There’s an ancient law of computing that goes something like this: the results of any calculation can be no better than the inputs that calculation uses — or more familiarly, “garbage in, garbage out.”

Everyone knows this law well. But if that’s the case, then why do so many of us do so little to keep garbage from accumulating in our databases? The math is scary: a typical worker in a large corporation might fill in a certain on-line form a dozen times each day, and a single field that fails to check for a preventable error can thus potentially introduce a dozen errors per day into your database. If you think that’s bad, multiply this by the number of people who make the same mistake, day in and day out.

Modern databases offer many tools for incorporating “validity checks” so you can catch errors before they ever reach your database. For example, it’s easy to define “typed” fields that will only accept a specific type of data (say, restricting numeric fields to integer values and text fields to letters). Similarly, a field that checks input data against lookup tables can restrict workers to entering a limited range of permitted values; in an address database, a lookup table that contains only 13 entries for the field “province” (don’t forget our three territories) can eliminate a whole class of data-entry errors.
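
As a rough illustration of these two kinds of check, here is a minimal Python sketch. The field names, the function names (check_quantity, check_province) and the use of two-letter postal abbreviations are assumptions made for the example; a real database would normally enforce the same rules with its own field types and lookup tables.

```python
# Hypothetical lookup table: two-letter codes for the 10 provinces
# and 3 territories, 13 entries in all.
PROVINCE_CODES = frozenset({
    "AB", "BC", "MB", "NB", "NL", "NS", "NT",
    "NU", "ON", "PE", "QC", "SK", "YT",
})


def check_quantity(raw):
    """Typed field: accept only whole, non-negative numbers."""
    text = raw.strip()
    if not (text.isascii() and text.isdigit()):
        raise ValueError(f"quantity must be a whole number, got {raw!r}")
    return int(text)


def check_province(raw):
    """Lookup field: accept only values found in the lookup table."""
    text = raw.strip().upper()
    if text not in PROVINCE_CODES:
        raise ValueError(f"unknown province or territory code: {raw!r}")
    return text
```

Calling check_quantity("12.5") or check_province("Ontario") raises an error at entry time, so the bad value never reaches the stored record.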

Better still, why make workers type data at all when they can choose it from a list? Pop-up menus, radio buttons, scrolling lists and various other forms of “pick lists” let the computer control the input while simultaneously making life easier for your workers. Lists work particularly well because they let workers focus on their job (picking the right value) rather than deciphering arcane computer conventions (“was that month first, or day?”). If you provide them with all the options, they no longer need to rely on faulty memory to retrieve the permitted values.

But lists are just the tip of the iceberg. Could you build a simple calculator into the database so the computer does the calculations for your workers? Could you use the input from one field to restrict the data in another field (for example, a city’s name determines which telephone area codes are possible)?
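
Both ideas can be sketched in a few lines of Python. Everything named here is hypothetical: the AREA_CODES_BY_CITY table is a small, incomplete sample, and the functions check_phone and line_total simply stand in for whatever your database's form builder provides.

```python
from decimal import Decimal

# Illustrative sample only; a real table would be complete and kept current.
AREA_CODES_BY_CITY = {
    "Montreal": ("514", "438"),
    "Toronto": ("416", "647"),
    "Ottawa": ("613", "343"),
}


def allowed_area_codes(city):
    """Return the area codes permitted for a city (empty if unknown)."""
    return AREA_CODES_BY_CITY.get(city.strip().title(), ())


def check_phone(city, phone):
    """Use the city field to constrain the phone field."""
    digits = "".join(ch for ch in phone if ch.isdigit())
    if len(digits) != 10:
        raise ValueError(f"expected a 10-digit number, got {phone!r}")
    codes = allowed_area_codes(city)
    # Unknown cities are not checked; known cities must match the table.
    if codes and digits[:3] not in codes:
        raise ValueError(f"area code {digits[:3]} is not valid for {city}")
    return digits


def line_total(quantity, unit_price):
    """Built-in calculator: the computer does the arithmetic."""
    return quantity * Decimal(unit_price)
```

With this in place, check_phone("Montreal", "(416) 555-0199") is rejected because 416 is not in the sample list for Montreal, and line_total(3, "19.99") works out the total so nobody has to reach for a calculator.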

Work with your fellow employees to understand how they enter data so you can determine the best way to present their choices; they won’t forget who’s responsible for their improved accuracy and speed, particularly around performance appraisal time. Of course, you’ll also earn your own manager’s gratitude once you’re no longer wasting time fixing preventable errors.

Best of all, you may start noticing unforeseen benefits for your corporation. For example, most of us desperately avoid hunting through reference books if we have someone we can call, assuming we don’t just guess and enter a “close enough” answer. If the information is available onscreen, right where it’s being used, there’s no guesswork, and we won’t waste time on the phone, waiting for overworked tech support staff to answer — time they could be spending on more complex problems. Everyone’s stress levels drop and their effectiveness goes up. And that has side-effects too: customer satisfaction inevitably improves as the error rate drops and response time improves.

It certainly takes time to build validity checking into data-entry forms. Under deadline pressures, that time may seem like an unreasonable investment, particularly since you’ll never prevent logic errors (“You mean Montreal isn’t in Ontario?”), but an hour spent bugproofing a form will be repaid with hours of time saved hunting down subtle — or not so subtle — errors in the data.

Hart is a translator, technical writer, editor and a senior member of the Society for Technical Communication. He lives in Pointe-Claire, Que., where he works for a forestry research institute.