The spammographer

Saying Raymond Chen is a developer is like saying that guy Ahab was a ship’s captain. Technically accurate, but contextually flimsy. For Chen is one of the most respected Windows developers (and speakers on development) at Microsoft. He blogs too, not only about the arcana of coding but also about, say, the movie version of Pride and Prejudice or the etymology of the phrase “traffic light.” Chen is also a certified pack rat. He has kept all of the junk e-mail (spam) he’s ever received, and charted it on a graph. Type “spam visual” into Google and his graph is the number-one hit.

CSO: Did you know your spam chart, which is quite pretty, would generate so much traffic?

Chen: I knew it would be big, but not as big as it got. I’m just some guy who needs some sunshine who decided to plot spam.

What compels someone to save seven years’ worth of spam?

I am a pack rat, and it’s easier to amass information than, say, Hummel figurines. Cheaper too. The spam was bugging me, so I started writing a keyword-based spam filter. Of course I needed a corpus of data to test my filter against.

A corpus. Of course.

I had already built a sort of prototype that read my e-mail for me, and this spam filter would obviously be a plug-in to that.

You built software to read your e-mail?

It decided by keywords what would be important for me to see and what I could ignore. It didn’t work that well.

Is there any software that you buy and don’t write yourself?

Personal finance software. I think I’ll stay away from writing tax-prep apps. The truth is, I’m a minimalist. I don’t install a lot of software. It baffles people that I’m technologically averse, but you know, people in the meat-packing industry are probably sausage-averse.

Back to the chart. It demonstrates the groundbreaking idea that spam has gotten worse over time. You trailblazer!

Right, to me charting the size of the messages was really interesting. You see more spam, but the average message gets bigger over time too.

You’ve wrestled spam trends to the ground. What’s next?

I’ve built some software to automatically sort the spam by whether it’s a 419 scam (i.e., the “Nigerian fortune” scam), a phishing scam or a virus. That way I’ll be able to see when, say, 419s really took off. That’s for Spam Chart 2.0.

You’re a minimalist pack rat. Hmm. Is it true you don’t ever throw away packaging?

I used to keep all the packing material from anything I bought. But I’ve learned to let that go. I still hoard cardboard boxes, but I’ve found an outlet for that. Whenever I learn a friend is moving, I give them some boxes. That allows me to keep hoarding boxes.

Hey, you could offer the boxes through direct e-mail marketing. You know, spam.

What would I pack them in, though?
