Reputation on the web: puzzling persistence of comment spam

Reflections on the pingback and comment flood from 2 weeks ago. Most of the spam has been removed. In retrospect, three issues stand out:

  • WordPress should have stopped this in the first place. All of the track-backs point to the same website, and there are several for each post. That screams “spam”; this was not a subtle attacker trying to stay under the radar.
  • Removing the junk is tedious. Even in mass-edit mode, only 20 at a time are displayed and there is no option to “check all” before hitting the delete button.
  • Marking one comment as spam seems to have no effect on other comments from the same source. This is perhaps the most fundamental problem. Ordinary users do not switch from adding witty comments on one blog to hawking cheap printer cartridges on the next. If one track-back has been flagged as spam by the blog author, chances are that all the others from that source are spam too. They should have been removed automatically. In fact, if multiple unrelated blogs all flag the same source as spam, that is a strong hint that future comments from it need to be blocked.
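To make the last point concrete, here is a minimal sketch of the behavior I would have expected. This is not how WordPress works internally; the `Comment` record, the function names and the `min_blogs` threshold are all made up for illustration, and a "source" is assumed to be simply the host name of the linked site.

```python
from collections import defaultdict
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class Comment:
    # Hypothetical record; field names are illustrative, not WordPress's schema.
    comment_id: int
    post_id: int
    source_url: str  # site the track-back or comment links to


def source_of(comment: Comment) -> str:
    """Reduce a URL to its lower-cased host name, the unit reputation attaches to."""
    return (urlparse(comment.source_url).hostname or "").lower()


def purge_flagged_sources(comments, flagged_ids):
    """Split comments into (kept, removed), given the IDs the author marked as spam.

    Every comment that shares a source with a flagged comment is removed too.
    """
    flagged_ids = set(flagged_ids)
    bad_sources = {source_of(c) for c in comments if c.comment_id in flagged_ids}
    bad_sources.discard("")  # a comment without a usable host taints nothing else
    kept, removed = [], []
    for c in comments:
        if c.comment_id in flagged_ids or source_of(c) in bad_sources:
            removed.append(c)
        else:
            kept.append(c)
    return kept, removed


def sources_to_block(flags_by_blog, min_blogs=3):
    """Return the sources flagged by at least `min_blogs` distinct blogs.

    `flags_by_blog` maps a blog name to the set of sources it has flagged.
    The threshold of 3 is an arbitrary assumption.
    """
    blogs_per_source = defaultdict(set)
    for blog, sources in flags_by_blog.items():
        for source in sources:
            blogs_per_source[source].add(blog)
    return {s for s, blogs in blogs_per_source.items() if len(blogs) >= min_blogs}
```

With something like this in place, flagging a single track-back would have cleared the rest of the flood in one step, and a source flagged independently by several blogs could be refused before it ever posts.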

This is another case of the non-existence of online “reputation”. It’s as if actions by the same person have no connection to each other. There are no consequences to having a comment tagged as spam or even to being black-listed from a blog; miscreants are free to continue doing the same on a different post.

Lack of a strong identity system is often cited as the reason reputation has not taken hold. A persistent ID is required to attach a reputation. The ability to get a new ID and start from a clean slate when things go wrong is not good for accountability. (This is why black-listing email addresses was a pointless anti-spam feature to start with, at best window dressing dreamed up by email providers to comfort annoyed users. Email addresses are easily acquired or fabricated. Black-listing IP addresses or entire domains is more effective.)

But in this case all the comment spam pointed to the same source. WordPress logs the originating IP address for comments and links to a whois query, supposedly to trace spam back to its source. Detection and response capabilities are all well and good, but blocking is far more effective.
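Since the IP address is already being logged, blocking on it takes little extra work. The sketch below uses Python's standard `ipaddress` module; the `IpBlocklist` class and its method names are hypothetical and only show the idea of refusing comments from a known-bad address or network.

```python
import ipaddress


class IpBlocklist:
    """Hypothetical blocklist holding single addresses and whole networks."""

    def __init__(self):
        self._networks = []

    def block(self, cidr_or_ip: str) -> None:
        # A bare address becomes a one-host network; strict=False also accepts
        # input such as "203.0.113.7/24" and normalises it to the network.
        self._networks.append(ipaddress.ip_network(cidr_or_ip, strict=False))

    def is_blocked(self, ip: str) -> bool:
        addr = ipaddress.ip_address(ip)
        return any(addr in net for net in self._networks)


blocklist = IpBlocklist()
blocklist.block("203.0.113.7")      # one offending address
blocklist.block("198.51.100.0/24")  # an entire network

if blocklist.is_blocked("198.51.100.42"):
    print("comment rejected before it reaches the moderation queue")
```

The check would run before a comment is stored, so the blog author never sees the junk at all.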

