PHP Spam Pre-Filter

I wrote this script many years ago and have published it elsewhere, but I thought I’d put it here since I implemented it on another site this morning.

It discards mail that was not sent from the contact page, and it does so without cookies (other than the session cookie) or idiotic CAPTCHAs. The form script creates a file named after the session ID in a tmp directory, and the processor script looks for that file.

It also filters out submissions that are too fast to be human.
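As a rough sketch of the idea, the comparisons the processor script makes further down can be collected into one function. The function name, the array keys, and the sample values are made up for illustration; only the 4-second threshold and the checks themselves come from the script.

```php
<?php
// Hypothetical helper (not part of the original script): returns true when a
// submission should be silently discarded.
function shouldDiscard(array $start, array $submit, int $minSeconds = 4): bool
{
    // Recompute the hash from the stored values, as the processor script does.
    $expectedHash = md5($start['time'] . $start['ip']);

    return $expectedHash !== $start['hash']                  // stored record was altered
        || $start['browser'] !== $submit['browser']          // user agent changed mid-session
        || ($submit['time'] - $start['time']) < $minSeconds; // filled in too fast to be human
}

// Made-up sample data: what the form page stored...
$start = [
    'time'    => 1700000000,
    'ip'      => '203.0.113.5',
    'browser' => 'ExampleBrowser/1.0',
];
$start['hash'] = md5($start['time'] . $start['ip']);

// ...and two submissions, one after 30 seconds and one after 1 second.
$human = ['time' => $start['time'] + 30, 'browser' => 'ExampleBrowser/1.0'];
$bot   = ['time' => $start['time'] + 1,  'browser' => 'ExampleBrowser/1.0'];

var_dump(shouldDiscard($start, $human)); // bool(false)
var_dump(shouldDiscard($start, $bot));   // bool(true)
```

Keeping the checks in one place like this makes it easy to try a different threshold without touching the rest of the processor.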

The spammer always lands on the success page. This is by design, so human spammers will think they succeeded and go bother someone else.

This example also includes samples of other filters that can be built into it just so the script makes sense.

This would go on the form page:

<?php
session_start();
date_default_timezone_set('America/New_York'); // replace with the server's time zone

$startTime = time(); // gets start time
$startBrowser = $_SERVER['HTTP_USER_AGENT'] ?? ''; // gets browser, if the client sent one
$startIP = $_SERVER['REMOTE_ADDR']; // gets IP address
$startHash = md5($startTime . $startIP); // creates a hash of the start time and IP
$startReferer = $_SERVER['HTTP_REFERER'] ?? ''; // gets referring URL, if available

// var_export() quotes and escapes each value, so request data such as the user agent cannot inject PHP into the file
$tempFileContent = "<?php "
                    . "\$startTime = " . var_export($startTime, true) . "; "
                    . "\$startBrowser = " . var_export($startBrowser, true) . "; "
                    . "\$startIP = " . var_export($startIP, true) . "; "
                    . "\$startHash = " . var_export($startHash, true) . "; "
                    . "\$startReferer = " . var_export($startReferer, true) . "; ?>";

$tempFileName = "/home/serverusername/tmp/" . session_id() . ".php"; // defines the path and file name
file_put_contents($tempFileName, $tempFileContent); // writes the file

/* housekeeping for abandoned contact visits */
$oldFiles = glob('/home/serverusername/tmp/*.php') ?: []; // get all file names ending with .php (empty list on error)
foreach ($oldFiles as $file) {
    $ageInHours = (time() - filemtime($file)) / (60 * 60); // file age in hours
    if (is_file($file) && $ageInHours > 96) { // checks if file is more than 96 hours old
        unlink($file); // deletes the file
    }
}
?>
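The temporary file holds values taken from the request (the user agent and the referer), and the processor later runs it with include. One alternative, sketched here and not what the script above does, is to store the same record as JSON so it is only ever read as data. The function names saveFormRecord() and loadFormRecord(), the .json extension, and the sample values are assumptions made up for this example.

```php
<?php
// Hypothetical helpers (assumptions, not from the original script).

function saveFormRecord(string $dir, string $sessionId, array $record): bool
{
    // basename() keeps path separators out of the file name.
    $path = rtrim($dir, '/') . '/' . basename($sessionId) . '.json';
    return file_put_contents($path, json_encode($record), LOCK_EX) !== false;
}

function loadFormRecord(string $dir, string $sessionId): ?array
{
    $path = rtrim($dir, '/') . '/' . basename($sessionId) . '.json';
    if (!is_file($path)) {
        return null; // no record: the form page was never loaded in this session
    }
    $record = json_decode(file_get_contents($path), true);
    return is_array($record) ? $record : null; // null if the file is not valid JSON
}

// Usage with a throwaway directory and a made-up session ID.
$dir = sys_get_temp_dir();
$id  = 'demo' . bin2hex(random_bytes(8));

saveFormRecord($dir, $id, [
    'startTime'    => time(),
    'startBrowser' => 'Evil"; system("id"); //', // stays inert: it is never executed
    'startIP'      => '203.0.113.5',
]);

$record = loadFormRecord($dir, $id);
echo $record['startBrowser'], "\n"; // prints the string exactly as it was stored

unlink($dir . '/' . $id . '.json'); // clean up the demo file
```

In the processor, a null return from loadFormRecord() would play the same role as the file_exists() test: no record means the sender never loaded the form.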

And in the processing script:

<?php
session_start();
date_default_timezone_set('America/New_York'); // replace with the server's time zone
$submitTime = time(); // gets submission time
$submitBrowser = $_SERVER['HTTP_USER_AGENT'] ?? ''; // gets the current browser, if the client sent one
$submitIP = $_SERVER['REMOTE_ADDR']; // gets the current IP address
$submitReferer = $_SERVER['HTTP_REFERER'] ?? ''; // gets the URL of the page that sent the form data, if available

$checkFile = "/home/serverusername/tmp/" . session_id() . ".php";

/* discards the message and lands sender on the success page if $checkFile is not found */
if (!file_exists($checkFile)) {
    print "<meta http-equiv=\"refresh\" content=\"0;URL=https://example.tld/success.php\">";
    die;
}

include("$checkFile"); // include the checkfile

/* reconstruct the hash and do some spam tests */
$testHash = md5($startTime . $startIP);
if  (
    ($testHash !== $startHash) || // checks hash values
    ($startBrowser !== $submitBrowser) || // checks whether browsers match
    ($submitTime - $startTime < 4) // checks the time used to complete the form
    )
    {
        /* discards the message and lands the spammer on the success page if any of the above fail */
        print "<meta http-equiv=\"refresh\" content=\"0;URL=https://example.tld/success.php\">";
        die;
}

unlink($checkFile); // deletes the temporary php file (variable names are case-sensitive)

/* get form values */
$name = $_POST['name'] ?? ''; // missing fields default to an empty string
$email = $_POST['email'] ?? '';
$phone = $_POST['phone'] ?? '';
$subject = $_POST['subject'] ?? '';
$message = $_POST['message'] ?? '';

/* examples of some additional possible spam tests */
$spam = 0;
/* each test is one pattern: alternatives go inside a single pair of delimiters */
if (preg_match('/http|www/i', $name)) { $spam = $spam + 20; } // why a URL in the name box?
if (preg_match('/yahoo\.com|gmail\.com|hotmail\.com/i', $email)) { $spam = $spam + 2; }
if (preg_match('/business loan|business lender|quick approval/i', $message)) { $spam = $spam + 10; }
if (preg_match('/\bseo\b|search engine optimization|first page of google/i', $message)) { $spam = $spam + 20; } // \b keeps "seo" from matching inside longer words
if ($spam >= 20) {
    print "<meta http-equiv=\"refresh\" content=\"0;URL=https://example.tld/success.php\">";
    die;
}
if ($spam >= 10) { $subject = "**SPAM** " . $subject; }

$mailTo = "someone@example.tld"; // assign the recipient email address

/* concatenate data for message */
$Body = "This is a Contact Form response from:";
$Body .= " ";
$Body .= $name;
$Body .= "\n";
$Body .= "Email Address: ";
$Body .= $email;
$Body .= "\n";
$Body .= "Phone: ";
$Body .= $phone;
$Body .= "\n";
$Body .= "Subject: ";
$Body .= $subject;
$Body .= "\n";
$Body .= "Message: ";
$Body .= $message;
$Body .= "\n";
$Body .= "Sent from IP address ";
$Body .= $submitIP;
$Body .= "\n";

// send email
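$replyTo = str_replace(["\r", "\n"], '', $email); // strip line breaks so the address cannot inject extra headers
$headers = "From: robot@example.tld\r\n"
         . "Reply-To: $replyTo\r\n"
         . "X-Mailer: PHP/" . phpversion();
$success = mail($mailTo, $subject, $Body, $headers);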

// redirect to success or failure page
if ($success){
    print "<meta http-equiv=\"refresh\" content=\"0;URL=https://example.tld/success.php\">";
    die;
}
else{
    print "<meta http-equiv=\"refresh\" content=\"0;URL=https://example.tld/failed.php\">";
    die;
}
?>
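The additional spam tests are easier to tune when they sit in one table. The sketch below uses the same rules and weights as the sample tests in the processor, with each set of alternatives written as a single pattern. The function name spamScore() and the sample inputs are made up for illustration.

```php
<?php
// Hypothetical helper (not in the original): scores a submission using the
// rules and weights from the sample tests in the processor script.
function spamScore(string $name, string $email, string $message): int
{
    $rules = [
        // [pattern, text to test, points]
        ['/http|www/i', $name, 20],
        ['/yahoo\.com|gmail\.com|hotmail\.com/i', $email, 2],
        ['/business loan|business lender|quick approval/i', $message, 10],
        ['/\bseo\b|search engine optimization|first page of google/i', $message, 20],
    ];

    $score = 0;
    foreach ($rules as [$pattern, $text, $points]) {
        if (preg_match($pattern, $text) === 1) {
            $score += $points; // each rule counts once, however many phrases match
        }
    }
    return $score;
}

echo spamScore('Jane', 'jane@example.tld', 'Do you ship to Canada?'), "\n";                  // 0
echo spamScore('Bob', 'bob@gmail.com', 'Quick approval on a business loan!'), "\n";          // 12
echo spamScore('www.example.tld', 'x@example.tld', 'Get on the first page of Google'), "\n"; // 40
```

With the thresholds from the script, the first message is delivered as is, the second is delivered with **SPAM** in the subject, and the third is discarded.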

It’s ancient code. I think it was written for PHP3. Or maybe PHP5. I forget. But it eliminates a huge percentage of spam from bots that attempt to bypass the contact page, while avoiding cookies or CAPTCHA images.

On servers where I have logging enabled, this stops about 95 percent of submissions before they ever get to the actual mail server. I’m also assuming zero false positives, because few, if any, legit senders would bypass the contact page (and if they do, I don’t want to deal with them anyway).