January 2nd, 2005

The Trouble with Comment Spam

Yesterday, the Ping entered manhood – practically on the eve of its fifth birthday. We got visited by a comment spammer.

The good news is that the damage wasn’t horrible. Empty comments everywhere. The bad news is that Ryan and I need to take some action on this, but we wanted to get the opinions and thoughts of regular Pingers. We’re going to have to do something in order to thwart these schmucks who want to tell us all about free online poker.

We’re kicking around a few concepts but want to hear yours. Possibilities include, but aren’t limited to, comment throttling (anyone can only comment every x minutes), email confirmation of your posts (it doesn’t go on the site until you confirm it via email), and full-on registration (you need to be a Ping member to post). Obviously any option involving registration could change the whole tenor of the site… that’s why we want to handle this with kid gloves.

What do you think?

Posted in Technology

FROM: Chris
DATE: Sunday January 2, 2005 -- 9:30:23 am
How about email confirmation for non-members? Those of us that post here daily don't want to have to confirm every time.

Another idea - add an extra input field like this guy did.

I don't think throttling will stop them. I've noticed that when I get hit there are often multiple IPs hitting the site with the same spam.

DATE: Sunday January 2, 2005 -- 9:31:59 am
Paul or Ryan - fix my comment - I must have not closed my anchor tag and it has hosed the Ping.

I've hosed the Ping, I'll never live this down.

FROM: Paul
DATE: Sunday January 2, 2005 -- 10:50:44 am
But now there's no proof, Chris, so your secret's safe with us. :)

Insofar as throttling goes, it'd be comment-wide throttling... not IP-based.
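
For the curious, here is a rough sketch of what comment-wide (not IP-based) throttling could look like. It is Python purely for illustration and is not the Ping's actual code; the class name, the `min_interval` parameter, and the swappable clock are all made up for this example.

```python
import time


class SiteWideThrottle:
    """Accept at most one comment, from anyone, per `min_interval` seconds."""

    def __init__(self, min_interval, clock=time.monotonic):
        self.min_interval = min_interval  # seconds between accepted comments
        self.clock = clock                # injectable so the logic can be tested
        self.last_accepted = None         # time of the last accepted comment

    def allow(self):
        """Return True and record the time if a comment may be posted now."""
        now = self.clock()
        too_soon = (
            self.last_accepted is not None
            and now - self.last_accepted < self.min_interval
        )
        if too_soon:
            return False
        self.last_accepted = now
        return True


if __name__ == "__main__":
    throttle = SiteWideThrottle(min_interval=60)
    print(throttle.allow())  # first comment goes through
    print(throttle.allow())  # an immediate second one is refused
```

Because the timer is shared by everyone, a spammer rotating through many IPs hits the same limit, but so does a regular who happens to post right after someone else.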

FROM: Dave Heinzel [E-Mail]
DATE: Sunday January 2, 2005 -- 11:01:02 am
The guestbook and comments on my site were plagued with the same problem, and I'll tell you what I did. I use PHP for my site, but I'm sure there's a way to do it in most languages.

When someone makes a comment, there's a regular expressions check to see how many links were in the comment (spammers only benefit from including links, or plain URLs). If there are more than 8 links, it denies posting the comment but does not return an error.

If there are more than 4 links and less than 8, it displays a helpful message that the user needs to cut back to under 4 links and provides an opportunity to repost the comment.

I have it send me an email every time a post is blocked, so I can test the efficiency. So far around 100 posts have been blocked, and not one time has the spammer modified the post to reduce the link count to get their message posted.

If you'd like some help with regular expressions, let me know. This is a problem that CAN be dealt with in a way so that your visitors will not have to jump through any extra hoops.
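
A minimal sketch of the link-count check described above, in Python rather than PHP. The regular expression, the function name, and the returned labels are assumptions made for illustration, not the code running on my site. The description leaves exactly 4 and exactly 8 links unspecified, so this sketch accepts 4 and asks for a repost at 8.

```python
import re

# Rough URL matcher: counts anything that starts with http:// or https://.
# A link written as an HTML anchor is counted through the URL in its href.
URL_RE = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)

ASK_ABOVE = 4   # more links than this: ask the poster to trim and repost
DROP_ABOVE = 8  # more links than this: silently discard the comment


def check_comment(text):
    """Return 'accept', 'ask' or 'drop' based on how many links the text has."""
    links = len(URL_RE.findall(text))
    if links > DROP_ABOVE:
        return "drop"    # no error shown, the comment just never appears
    if links > ASK_ABOVE:
        return "ask"     # show a message and let the poster try again
    return "accept"


if __name__ == "__main__":
    print(check_comment("Nice post, see http://example.com for more"))
```

Sending a notification email on every "drop" or "ask" result would be a separate step layered on top of this.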

DATE: Sunday January 2, 2005 -- 1:47:13 pm
I wouldn't mind being a member, but if we have to sign in every time it would suck. The nice thing about the Daily Ping is that it is all on the main page and you don't need to go to a different page to post. Throttling would work; I would put it at something like 6 a day. The link thing works too, or just have all comments with links sent to you before they are publicly posted. I mean, this is a once-in-5-years thing that happened, so it probably won't be that big of a deal in the future.

FROM: Maria
DATE: Sunday January 2, 2005 -- 2:53:05 pm
What if it had one of those things they use to prevent automatic signups on sites like Yahoo--don't know how hard it would be to implement, but if we had to type into the field what we saw in the box next to it, which would be a slightly warped version of, say, "6AgN" that would work, right?

FROM: Merle [E-Mail]
DATE: Sunday January 2, 2005 -- 4:21:24 pm
I had wondered about that. There seemed to be a whole lot of comments just yesterday...

The problem with registration or email confirmation is that you might not get some of the commentary you would otherwise have. I do not know if *I* would be posting here if I had to register first -- I might just lurk instead.

On the other hand, moderating really bites. It takes a lot of time.

Comment throttling is not bad. I've often thought that if ISPs just limited users from sending more than a hundred emails a day that spam would just shrivel up and die. Sure, mailing lists have to be taken into account -- but something like that should not be too bad. I cannot imagine needing to send more than a hundred emails in a day -- or to post more than a few comments an hour.

But it still won't stop everything. Some online things (sorry, I refuse to call them "blogs") I read have three or four spam-posts a day, often just with one link in them. Usually they are not egregious spams, but links to other online things of vaguely similar content... but it's still spam.

FROM: Cat [E-Mail]
DATE: Sunday January 2, 2005 -- 4:23:26 pm
Are you rolling your own here, or using a CMS? If it's yours, just changing the name of the comment script should do the trick for a good long time. Though I can tell you this will *not* work if you're using a common CMS.

Sadly, one of the most effective tools is closing comments on old posts. That would certainly change the tone of the place, though. The Ping is one of the few websites where comments on old posts can be relevant and entertaining.

What I've had to do on my sites is combination therapy--blacklist, forced preview, and screening.

There is also a tool that does what Maria suggested. I've considered this, and it may work. But I do know there's a spammer tool to defeat it; it's ingenious: they auto-submit the gif to porn sites where a human responds to it, then pull the response back to where they want to spam!

FROM: poo head
DATE: Monday January 3, 2005 -- 5:04:10 am
i didnt read ur posts

FROM: Paul
DATE: Monday January 3, 2005 -- 7:05:27 am
Poo head's invaluable comment would probably never happen with some sort of registration or confirmation. Think about it!

FROM: Dave Heinzel [E-Mail]
DATE: Monday January 3, 2005 -- 9:08:16 am
Another thing I forgot to mention is that I wrote an automatic URL parser for my guestbook, so all people have to do to create a link is to type a URL in plain text. If there is an "a href" tag in the post, it's rejected with a simple message to correct the problem.

Spamming robots (and people) usually have their code pre-written and most of the time it's posted automatically, so if the form gives an error when detecting an HTML link, that stops most spam with no problem.

Again, it's your site so do what you want, but there is a way to deal with this automatically on the server, not burdening yourself with a registration system and hampering would-be first-time posters.
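
Roughly, the two pieces fit together like the sketch below. It is written in Python for illustration, not taken from my guestbook; the function name, the regular expressions, and the wording of the rejection message are all assumptions.

```python
import html
import re

# Detects a hand-written HTML link such as <a href="...">.
ANCHOR_RE = re.compile(r"<a\s[^>]*href", re.IGNORECASE)
# Rough matcher for plain-text URLs typed into the comment.
URL_RE = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)


def render_comment(text):
    """Return (ok, result): the rendered HTML, or a message if the post is rejected."""
    if ANCHOR_RE.search(text):
        return False, "Please type URLs as plain text instead of using HTML links."

    parts = []
    pos = 0
    for match in URL_RE.finditer(text):
        # Escape the ordinary text before the URL so stray markup is shown as-is.
        parts.append(html.escape(text[pos:match.start()]))
        url = html.escape(match.group(0))
        parts.append('<a href="%s">%s</a>' % (url, url))
        pos = match.end()
    parts.append(html.escape(text[pos:]))
    return True, "".join(parts)


if __name__ == "__main__":
    print(render_comment("My site is http://example.com if you are curious"))
    print(render_comment('<a href="http://spam.example">free poker</a>'))
```

A real visitor who gets the message can just delete the tag and repost; a pre-written robot usually will not.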

FROM: Joseph
DATE: Monday January 3, 2005 -- 10:50:38 am
I'm not that computer savvy, so I can't offer any suggestions regarding technical aspects.

I doubt that having to sign in would negatively impact the site that much. I belong to another discussion group that I have to sign into to use, but once I sign in, I stay signed in.

Apparently, the moderator of that group, nevertheless, has to do quite a bit of regular screening to keep out spam and inappropriate comments.

FROM: Heather
DATE: Monday January 3, 2005 -- 11:42:04 am
I'm all for throttling spammers in the good ol' sense: THROTTLE 1 a (1) : to compress the throat of : CHOKE (2) : to kill by such action

FROM: Robert [E-Mail]
DATE: Monday January 3, 2005 -- 8:54:12 pm
Let us regular Pingers register because I'll be damned before I put up with another Robert posting!

FROM: jk
DATE: Monday January 3, 2005 -- 10:41:02 pm
I am not too well-versed in how it's actually done, but I know you guys are: my friend Tim's blog requires a password and it's saved on my computer via has certainly limited posting to that of just his close friends, so maybe this is too severe, but it is nice that I don't have to enter the password each time I post. I truly would not even mind if you had to use some sort of email confirmation because.....I love you guys. Sniff.

FROM: Dave Walls [E-Mail]
DATE: Monday January 3, 2005 -- 11:58:21 pm
I'm kinda for registering...Granted, not having registration leads to some good-old-fashioned Ping Battles (See the Michael Jackson Ping or the Freedom battles), or some classics filled with dumb-ass goodness (Gwen Stefani or Progeria anyone?), but overall, it starts to be a hassle to read through all the reeeeeally lame comments. Plus, comment spam is such a problem, I guess it's inevitable that some form of registration takes place.

jk: I love you guys. Sniff.

Awwww....someone needs a hug!!

FROM: Dave Heinzel
DATE: Tuesday January 4, 2005 -- 2:31:42 am
I wasn't going to comment on this anymore, but it's an important issue. There is the fundamental principle that if you give the general public any way to comment on your site (via registration or otherwise), there's no way to prevent certain comments from being made. You will always have people who make stupid comments.

The major reason comment spamming exists is to get people to visit their sites. For this reason, a comment with links to many external sites has a really good chance at being spam. A post with a few links could either be spam or a genuine post. If there's just one link, it's probably just fine.

Yes there are exceptions, but you have to realize that no system is perfect, and the spammers will always figure out a workaround. But spammers usually don't visit sites in person - it's done by spamming robots. So if you use a unique spam blocking method, the spam robots won't know how to figure out a workaround.

Since any method you use can be broken by spam robots eventually (with varying difficulty), there's not much sense in making your posters do a lot of extra work for very little added protection.

Your pings are the cake - comments are the icing. Other users' contributions to this site make it enjoyable.

I hate to sound all George-Bushy, but this reminds me of what he said about terrorists. If they cause us to change our way of living, they have won. Less dramatically, if the spammers cause this site to get locked down, it's a sad day for freedom of speech (or 'ease' of speech).

FROM: Paul
DATE: Tuesday January 4, 2005 -- 7:40:17 am
Dave, I do like your thoughts so far... but don't worry about the "stupid comments." We like all the comments.

No system is perfect, indeed. Ryan and I are discussing all the options in the Ping Conference Room right now, so stay tuned. The last thing either of us wants to do is "lock down" the Ping, honestly.

FROM: jk
DATE: Tuesday January 4, 2005 -- 9:49:07 am
I personally dislike the bad spelling more than the stoopid comments.

FROM: Ryan [E-Mail]
DATE: Tuesday January 4, 2005 -- 11:30:10 am
Me to.

FROM: Monica
DATE: Tuesday January 4, 2005 -- 3:07:31 pm
Dave--you forgot the Raven ones!

FROM: Dave Heinzel
DATE: Tuesday January 4, 2005 -- 3:32:02 pm
I wasn't familiar with the Raven posts. Even after checking the thread I'm not sure what you mean - at least if you're talking about spam or just dumb comments.

FROM: Michael S.
DATE: Thursday February 3, 2005 -- 3:15:24 pm
I personally wouldn't post if I had to register... I don't like registering for things much. The best idea I've seen is the four-digit confirmation code where the user must enter the characters he or she sees. This, added to the write-your-post-here area at the bottom of each page, would thwart rapid-fire spammers and still permit the user as much anonymity (sorry about the spelling) as they prefer.

FROM: Merle [E-Mail]
DATE: Saturday February 5, 2005 -- 2:48:07 pm
Michael S.: there's a sad problem with those "four-digit confirmation codes", or similar things.

I say "sad". Really it's a cool, impressive idea someone had. But it's sad to us.

Suppose you wanted to crack a whole bunch of those -- so you could spam entire forums. So you write a multi-phase spam bot. On the first phase, it loads the page with the image. Then it posts the image somewhere that a gullible human -- one of the billions out there -- will answer the question for it. Then it completes the original post, with that answer.

And who would be gullible enough to answer a question like that? Well, someone trying to access what they think is a porn site! Yes, along with the "are you really 18?" question they prompt you with one of those "type in the digits you see" questions.

Free porn sells. Well, no, but the site owner gets decent ad revenue plus they have cooler spam bots that can get through such questions (including things like signing up for Yahoo email accounts). And odds are the human is going to be pretty accurate.

Really. It's been done.

