Introducing Snooker: Lightweight spam detection for blog comments

If you’ve been following the Magnificent Walrus dev logs, you’ll have read that I opted for the “Snook Algorithm” instead of adding an external dependency in the form of Akismet.

I’ve dubbed it the “Snook Algorithm” because, as far as I know, it doesn’t have an official name. It’s the points-based system Jonathan Snook uses (or perhaps used, at this point) for filtering spam on snook.ca.

Each comment starts with 0 points. Points are then awarded and deducted based on a variety of rules. If a comment’s final score is greater than or equal to 1, the comment is considered valid. If the final score is 0, the comment is considered worth moderating. If the final score is below 0, the comment is considered spam.
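To make the thresholds concrete, here is a minimal sketch of how a final score could map onto a status. The `Status` enum and the `status_for_score` function are hypothetical names invented for this illustration, and the scoring rules themselves are left out; this is not necessarily how Snooker does it internally.

```rust
/// Hypothetical status type for this sketch; the crate's own type may differ.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum Status {
    Valid,
    Moderate,
    Spam,
}

/// Maps a comment's final score onto a status using the thresholds above:
/// 1 or more is valid, exactly 0 needs moderation, anything negative is spam.
fn status_for_score(score: i32) -> Status {
    match score {
        s if s >= 1 => Status::Valid,
        0 => Status::Moderate,
        _ => Status::Spam,
    }
}
```

With this mapping, a comment that ends on 0 points is held for moderation, so a single extra point in either direction is enough to tip it to valid or to spam.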

The first beta of my implementation of the “Snook Algorithm” was released earlier today. You can get it via the usual platforms: crates.io and GitHub.

Example

Snooker gives the example comment below a score of -10, which classifies it as spam:

use snooker::{Comment, Snooker, Status};

let comment = Comment {
    author: Some("Johnny B. Goode".to_string()),
    url: Some("http://my-free-ebook.com".to_string()),
    body: String::from("
        <p>Nice post! Check out our free (for a limited time only) eBook
        <a href=\"http://my-free-ebook.com\">here</a> that's totally relevant</p>
    "),
    previously_accepted_for_email: None,
    previously_rejected_for_email: None,
    previous_comment_bodies: None,
};

let snooker_result = Snooker::new(comment);
assert_eq!(snooker_result.score, -10);
assert_eq!(snooker_result.status, Status::Spam);