An End to Spam on the Maison...?
Posted on 2008/07/23 10:40:57 (July 2008).
I have finally got around to hooking Akismet into SBM.
SBM is the homegrown engine that drives the blogs here on the 'Maison. Akismet is a service which will check each comment posted against a big database of known spam and give a spam / no spam verdict.
Thanks to Simon Steele for making me aware of the existence of Akismet in the first place.
Thanks also to Fabio Zendhi Nagao for writing this article about a class he had written to use the Akismet API in ASP. I made use of this class in hooking SBM into Akismet. I did have a little bit of monkeying about to do, as his class was written in VBScript, and SBM is (slightly unusually) ASP, but using JavaScript. There was a bit of hackery to get JavaScript code to be able to use VBScript code, but I got there in the end, and once I had got over that hurdle the integration was painless.
There was already some very primitive spam checking in SBM, basically just looking for URLs and HTML in the comments. Following a simple experiment, recording a day's spam, it looks like this was actually catching about 98% of all spam comments already.
However, it transpires that the 'Maison is bombarded with almost one thousand spam comments every day. So even after the primitive filtering in SBM ditched the vast majority of these, there were still a highly irritating 10 or 20 a day trickling through.
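For illustration, a primitive check of the kind described above (looking for URLs and HTML in a comment) might look something like the sketch below. This is not SBM's actual code: the function name looksLikeSpam and both patterns are hypothetical, and a real filter would probably need tuning.

```javascript
// Hypothetical sketch, not SBM's actual code.
// Flags a comment as likely spam if it contains a URL or HTML markup.
function looksLikeSpam(commentText) {
  // Matches http://, https:// or www. followed by non-space characters.
  var urlPattern = /\b(?:https?:\/\/|www\.)\S+/i;
  // Matches anything resembling an opening or closing HTML tag.
  var htmlPattern = /<\/?[a-z][^>]*>/i;
  return urlPattern.test(commentText) || htmlPattern.test(commentText);
}
```

A check this blunt will also reject genuine comments that happen to include a link, which is part of the trade-off of such simple filtering.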
The downsides of using Akismet are that it will now take longer to post each comment (as internally SBM has to make a round trip to the Akismet server), and also there will inevitably be some false positives. I will try to make it clear when that happens, and Akismet does have a process for reporting "hams" (genuine messages falsely identified as spam), so we ought to be able to keep training it over time.
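To give a rough idea of what that round trip involves, here is a sketch that builds a request for Akismet's comment-check call and interprets the reply. It is not the class SBM uses. The helper names (buildCommentCheckRequest, interpretVerdict) and the shape of the comment object are made up for illustration, and the endpoint and field names are given as I understand the Akismet API, so check them against its documentation. The HTTP POST itself is left out.

```javascript
// Sketch only: helper names and the `comment` object shape are hypothetical.
// Builds the URL and form-encoded body for an Akismet comment-check request.
function buildCommentCheckRequest(apiKey, comment) {
  var fields = {
    blog: comment.blogUrl,        // front page URL of the blog
    user_ip: comment.userIp,      // IP address of the commenter
    user_agent: comment.userAgent,
    comment_type: "comment",
    comment_author: comment.author,
    comment_content: comment.text
  };
  var pairs = [];
  for (var name in fields) {
    // Skip fields the caller did not supply.
    if (fields.hasOwnProperty(name) && fields[name] != null) {
      pairs.push(encodeURIComponent(name) + "=" + encodeURIComponent(fields[name]));
    }
  }
  return {
    // Assumed endpoint form: the API key is part of the host name.
    url: "https://" + apiKey + ".rest.akismet.com/1.1/comment-check",
    // To be sent as application/x-www-form-urlencoded.
    body: pairs.join("&")
  };
}

// The reply body is expected to be the text "true" (spam) or "false" (not spam).
function interpretVerdict(responseText) {
  var verdict = String(responseText).replace(/^\s+|\s+$/g, "");
  if (verdict === "true") return "spam";
  if (verdict === "false") return "ok";
  return "unknown"; // anything else: treat as an error, not as a verdict
}
```

Treating an unexpected reply as "unknown" rather than as spam means a failed round trip need not throw away a genuine comment.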
Comment 1
ROCK ON! Very good implementations John! :D Can you give us more stats of how the spam is fought here at la maison?
Posted by Lox at 2008/07/23 12:39:41.
Comment 2
There goes any hope of turning the maison into a brokerage for industrial talc.
Posted by dsp at 2008/07/23 17:29:16.
Comment 3
Indeed, Lox. I agree. I would love to see some of the stats on how much spam assaults the board and how Dr. Hawkins fends off the attack.
Posted by Travis at 2008/07/24 15:08:41.
Comment 4
Travis: this only affects spam directed at our blogs, not the board.
I currently have the scripts storing every spam comment that is rejected, whether it be by the primitive spam detection in SBM, or by Akismet.
Since midday on the 22nd (almost three days ago) there have been 2600 rejected comments. Just 38 (about 1%) of those were not picked up by SBM's spam detection, and we had to resort to Akismet.
So it looks like we're fairly consistently getting almost 1000 a day, and 99% of those are caught by SBM alone.
Posted by John at 2008/07/25 09:13:47.