PHP Email Encoder – Prevent Spam Bots From Collecting Email Addresses


Police have criminals. PETA has Michael Vick. Bud Selig has Barry Bonds. Programmers have spammers.

Email spam is probably the most annoying part of my job. Whenever I'm placing email addresses on a page or coding another web form, I have to expend extra time preventing spammers from exploiting the information I put on the page. Spammers are my adversary and the war seemingly never ends. I do have a quick PHP script that I use when putting raw email addresses on a page:

The Function

function encode_email($e) {
	$output = ''; // initialise first so the loop does not append to an undefined variable
	for ($i = 0; $i < strlen($e); $i++) { $output .= '&#'.ord($e[$i]).';'; }
	return $output;
}

Usage

echo(encode_email('user@davidwalsh.name'));

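The above function takes a string input (the email address), loops through each character, replaces it with an HTML numeric character reference built from the character's ASCII code (for example, "@" becomes "&#64;"), and returns the encoded email address. Browsers render the references as the original characters, so visitors still see a normal address. That's all you need to do!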
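To make the encoding concrete, here is a minimal sketch of what the function produces for a short input, plus one way it might be used inside a mailto link. The encode_mailto() helper and the a@b.c address are illustrative assumptions, not part of the original script.

```php
<?php
// Same approach as the function above, with $output initialised up front.
function encode_email($e) {
    $output = '';
    for ($i = 0; $i < strlen($e); $i++) {
        $output .= '&#' . ord($e[$i]) . ';';
    }
    return $output;
}

// Hypothetical helper (an assumption for illustration): wraps the encoded
// address in a link whose href, including the "mailto:" prefix, is encoded too.
function encode_mailto($e) {
    $encoded = encode_email($e);
    return '<a href="' . encode_email('mailto:') . $encoded . '">' . $encoded . '</a>';
}

echo encode_email('a@b.c'), "\n";
// prints: &#97;&#64;&#98;&#46;&#99;

echo encode_mailto('a@b.c'), "\n";
```

Note that ord() works on single bytes, so this sketch only suits plain ASCII addresses; a multi-byte UTF-8 character would be split into separate, incorrect references.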

I realize that this is not a bulletproof solution. A good (and dedicated) spammer could take a page's source, decode every character reference back into plain text, and then parse the decoded source for addresses. Also, if you have a page with hundreds of addresses, the page download will be bloated, since every character of each address grows to five or six bytes. Do your contacts a favor though: use this script!
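The weakness is easy to demonstrate. The sketch below shows roughly what a harvester could do: decode the page with PHP's built-in html_entity_decode() and then search for address-like strings. The harvest_emails() name, the simplified regular expression, and the user@example.com address are assumptions made for illustration, not a description of any real bot.

```php
<?php
// The encoder from this post, with $output initialised.
function encode_email($e) {
    $output = '';
    for ($i = 0; $i < strlen($e); $i++) {
        $output .= '&#' . ord($e[$i]) . ';';
    }
    return $output;
}

// Hypothetical harvester: decode all character references, then match
// anything that looks like an address (deliberately simplified pattern).
function harvest_emails($html) {
    $decoded = html_entity_decode($html, ENT_QUOTES, 'UTF-8');
    preg_match_all('/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/', $decoded, $matches);
    return $matches[0];
}

$page = '<p>Contact: ' . encode_email('user@example.com') . '</p>';
print_r(harvest_emails($page));
```

The encoding only stops bots that scan the raw source for a literal "@" pattern; any bot that decodes first gets the address back.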

Discussion

  1. Do you think it would work better if you wrote the address as "name at domain (dot) com" in conjunction with this script?

  2. @Mark: I agree, but my customers wouldn’t allow me to use a “name (at) domain (dot) tld” format. They don’t understand how spam prevention works, or even care for that matter. The English-ized email format is good for personal websites.

  3. As David points out this solution is not bullet-proof as a dedicated spammer could run a bot against pages to convert ASCII to text.

    I wonder if encoding the email address using JavaScript would be a better alternative, although if JavaScript is turned off in the visitor’s browser (as unlikely as that is) then they get no email addresses.

    Also, adding to David’s comment about the client not really understanding spam prevention, a lot of visitors would be confused by an email address like user at davidwalsh (dot) name although I’ve seen such uses online, usually in forums where spam is more common and users are more likely to understand the email address in that format.

  4. Wartin

    Thanks for the function! I use a Javascript that translates an address from hex.

    <!--
    var hexa = '%74%69.......';   //mail in hex
    var desh = unescape(hexa);
    document.write('<a href="mailto:' + desh + '">' + 'Here' + '</a>');
    -->
    
    • Hi, don’t forget that spammers don’t see javascript.

      It’s a good method to encode on the server side.

  5. Thanks for the function, just what I needed.

    One note: a good (and dedicated) spammer is not really needed to break this; the reverse of this function can be one line long and applied by any bot to all the pages it visits. It is just about the only thing you can do short of using a script (be it AS3 or JS or Java). So don’t count on this to really protect other people’s sensitive information (like lists of addresses). Use it, but don’t count on it.

    @Mark: converting an address to name at domain (dot) com is quite useless too. Just ask yourself: can the reverse process be automated?

    $HTML=str_replace(' at ','@',$HTML);
    $HTML=str_replace(' (dot) ','.',$HTML);
    //  Done.
    

    Too easy. You could invent many different, colorful, ways to describe your email address, but in the end you’ll confuse more humans than machines.

    @Wartin: writing your email in hex inside the javascript is not much more secure than writing it encoded in HTML (just a little more). Remember you are not protecting it from a human (he’ll just read the output page and jot it down on a post it). The best thing to protect from a bot is stupid random code like:

    var MyEml='fran'.toLowerCase()+ 'cesco';
    MyEml+= ''+unescape('%40')+'serv';
    var MyEml2='er'+'.'+'co';
    MyEml +=MyEml2+'m';
    

    The important thing is not to make the address unreadable, but to avoid standardizing on a format, so that an automated process cannot find it. Of course you could also try to undo this with PHP code (thanks to the magic of regular expressions and a lot of replaces), but if everyone uses a different pattern, the success rate gets so low that coding to catch just a few cases becomes pointless.

    Thanks,

    Francesco

  6. I’ve been looking for a solution for this, but am not really happy with any of the solutions that take this route. I will probably implement a captcha > httpRequest > PHP > back to client approach.
    Inconvenience is a small price to pay, but I feel captchas are currently the best line of defence.

  7. Jose Cuervo

    I’m a newbie programmer and was looking for how to do this in PHP. This is just what I was looking for. Thank you! In my extreme boredom I also managed to port this code to Perl. See below:

    #! /usr/bin/perl

    # Perl email encode v 0.2
    # by Javier, nycjv321@gmail.com
    # ported from snippets of code found at http://davidwalsh.name/php-email-encode-prevent-spam

    use 5.014;
    use warnings;
    use strict;

    #declarations
    my $string; # string to be encoded
    my @string; # array of split $string
    my $x; # counter
    my $output; # stored output from encode function
    my $encode; # final encoded string

    # encode_string used to convert human text to their html counterparts
    sub encode_string {
    while ( my($index, $value) = each @string ) {
    $output .= '&#' . ord($value) . ';';

    }
    $_[0] = $output;
    }

    # prepare used to prepare array from human input
    sub prepare {
    chomp($string = <STDIN>);
    @string = split(//, $string);
    }

    &prepare;
    $encode = &encode_string(@string);
    say $encode;

  8. Thank you for the script, very useful. When using it, however, it produced a PHP error because the variable $output is not defined before being used.

    I solved the error by inserting
    $output = '';

    as the first line in the function.

    Thanks again for the script, very useful.

  9. Another two functions I found around the web:

    function hideEmail($email)
    {
        foreach(str_split($email, 1) as $character)
        {
            echo '&#' . ord($character) . ';';
        }
    }
    
    function protectMail($s) {
        $result = '';
        $s = 'mailto:' . $s;
        for ($i = 0; $i < strlen($s); $i++) {
            $result .= '&#' . ord(substr($s, $i, 1)) . ';';
        }
        return $result;
    }
    

    The last one is very similar to yours but it uses substr.

  10. Ashbringer

    Function gives error. $output value should be defined before loop.
