Tue, Jan 14, 2003 (anti-spam software)

Yesterday's find of the day: this is probably nothing new to anyone familiar with the spam industry, but while working on a different problem I noticed that spammers are embedding HTML comments in the middle of almost every word in some of the spam email I receive! (At some point, why don't these people look in the mirror and see that they're stooping so low, and doing something so wrong?)

One message I received, sent to my "unix" email account, looked like this in its raw form:

He<!--unix-->llo un<!--unix-->ix,
D<!--unix-->o y<!--unix-->ou wa<!--unix-->nt t<!--unix-->o
ma<!--unix-->ke mo<!--unix-->re mo<!--unix-->ney?

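Because anti-spam software often analyzes phrases in email messages to determine if they are spam, these HTML comments can work to "hide" those common phrases. A mail client that renders HTML never displays the comments, so the reader sees ordinary words, while a filter scanning the raw text sees only broken fragments. This has probably been going on for a while and I only just noticed, but it amazes me how far these people will go to try to shove spam down your throat.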
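To make the trick concrete, here is a rough Python sketch of a filter that just does a substring search on the raw message body. The phrase list and the function name are made up for illustration; this is not how my software (or any particular product) actually works.

```python
# Hypothetical phrase list, invented for this example.
SPAM_PHRASES = ["make more money"]


def naive_phrase_hits(raw_body):
    """Return the phrases found by a plain substring search of the raw body."""
    lowered = raw_body.lower()
    return [phrase for phrase in SPAM_PHRASES if phrase in lowered]


plain = "Do you want to make more money?"
obfuscated = (
    "D<!--unix-->o y<!--unix-->ou wa<!--unix-->nt t<!--unix-->o "
    "ma<!--unix-->ke mo<!--unix-->re mo<!--unix-->ney?"
)

print(naive_phrase_hits(plain))       # the phrase is found
print(naive_phrase_hits(obfuscated))  # nothing is found
```

The second call comes back empty because the comments break up every word the search is looking for, even though both messages read the same once rendered.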

This will work for them for a day or two, but let's just say they've given me the motivation to make some overdue changes to my anti-spam software. FWIW, these changes include stripping all HTML from messages during analysis, and adding regular expression support to the software.
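For the curious, those two changes might look something like the following Python sketch. Everything here is an assumption for illustration (the patterns, the function names, and the single test phrase), and the regex-based stripping is a quick approximation rather than a real HTML parser.

```python
import re

# Assumed patterns, chosen for illustration only.
COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)  # HTML comments
TAG_RE = re.compile(r"<[^>]+>")                    # any remaining tags
MONEY_RE = re.compile(r"\bmake\s+more\s+money\b", re.IGNORECASE)


def strip_html(raw_body):
    """Remove comments first, then tags, replacing both with nothing."""
    without_comments = COMMENT_RE.sub("", raw_body)
    return TAG_RE.sub("", without_comments)


def looks_like_spam(raw_body):
    """Run the phrase pattern against the stripped text, not the raw text."""
    return MONEY_RE.search(strip_html(raw_body)) is not None


sample = (
    "He<!--unix-->llo un<!--unix-->ix,\n"
    "D<!--unix-->o y<!--unix-->ou wa<!--unix-->nt t<!--unix-->o\n"
    "ma<!--unix-->ke mo<!--unix-->re mo<!--unix-->ney?"
)

print(strip_html(sample))
print(looks_like_spam(sample))
```

Replacing comments with an empty string (rather than a space) is what glues the word fragments back together. The `\s+` in the pattern lets the phrase match across line breaks or extra spaces, which a plain substring search would miss. One tradeoff: deleting tags outright can also join words that were separated only by a tag such as a line break.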