PHP Email Encoder – Prevent Spam Bots From Collecting Email Addresses


Police have criminals. PETA has Michael Vick. Bud Selig has Barry Bonds. Programmers have spammers.

Email spam is probably the most annoying part of my job. Whenever I'm placing email addresses on a page or coding another web form, I have to expend extra time preventing spammers from exploiting the information I put on the page. Spammers are my adversary and the war seemingly never ends. I do have a quick PHP script that I use when putting raw email addresses on a page:

The Function

function encode_email($e) {
	$output = ''; // initialize first so the .= below never touches an undefined variable
	for ($i = 0; $i < strlen($e); $i++) { $output .= '&#'.ord($e[$i]).';'; }
	return $output;
}

Usage

echo(encode_email('user@davidwalsh.name'));

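The above function takes a string input (the email address), loops through each character, replaces it with an HTML numeric character reference built from the character's ASCII value (for example, @ becomes &#64;), and returns the encoded email address. Browsers render those references as the original characters, so visitors still see a normal address. That's all you need to do!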
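To make the result concrete, here is a minimal sketch of what the encoding produces for a short address. The helper name encode_email_sketch and the sample address a@b.c are placeholders of mine, not part of the original script; the helper uses the same loop but initializes its output string first.

```php
<?php
// Hypothetical helper for illustration; same technique as encode_email().
function encode_email_sketch(string $email): string {
    $output = ''; // start from an empty string before appending
    for ($i = 0; $i < strlen($email); $i++) {
        // ord() gives the byte value; wrap it as a decimal character reference
        $output .= '&#' . ord($email[$i]) . ';';
    }
    return $output;
}

echo encode_email_sketch('a@b.c'), "\n";
// prints: &#97;&#64;&#98;&#46;&#99;
```

Viewed in a browser, that string renders as a@b.c, while the page source contains only the numeric references.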

I realize that this is not a bulletproof solution. A good (and dedicated) spammer would take a page's code, decode every numeric character reference back into its plain character, and proceed to parse the decoded code. The encoding also turns each typical address character into five or six bytes, so if you have a page with hundreds of addresses, the page download will be bloated. Do your contacts a favor though -- use this script!
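Both caveats are easy to see in a few lines. The sketch below assumes a bot simply runs PHP's built-in html_entity_decode() over the markup (a real harvester may work differently), and the sample string is the encoded form of the placeholder address a@b.c.

```php
<?php
// Entity-encoded form of the placeholder address "a@b.c".
$encoded = '&#97;&#64;&#98;&#46;&#99;';

// One built-in call undoes the encoding.
$decoded = html_entity_decode($encoded, ENT_QUOTES | ENT_HTML5, 'UTF-8');
echo $decoded, "\n"; // prints: a@b.c

// Size cost: 5 bytes of address become 25 bytes of markup.
echo strlen($decoded), ' -> ', strlen($encoded), " bytes\n"; // prints: 5 -> 25 bytes
```

So the encoding only stops harvesters that scan raw markup without decoding it, and it costs roughly five times the bytes per address.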

Discussion

  1. Do you think it would work better if you wrote the address as "name at domain (dot) com" in conjunction with this script?

  2. @Mark: I agree, but my customers wouldn’t allow me to use a “name (at) domain (dot) tld” format. They don’t understand how spam prevention works, or even care for that matter. The English-ized email format is good for personal websites.

  3. As David points out, this solution is not bulletproof, as a dedicated spammer could run a bot against pages to convert the ASCII codes back to text.

    I wonder if encoding the email address using JavaScript would be a better alternative, although if JavaScript is turned off in the visitor's browser (as unlikely as that is) then they get no email addresses.

    Also, adding to David’s comment about the client not really understanding spam prevention, a lot of visitors would be confused by an email address like user at davidwalsh (dot) name although I’ve seen such uses online, usually in forums where spam is more common and users are more likely to understand the email address in that format.

  4. Wartin

    Thanks for the function! I use a Javascript that translates an address from hex.

    <!--
    var hexa = '%74%69.......';   //mail in hex
    var desh = unescape(hexa);
    document.write('<a href="mailto:' + desh + '">' + 'Here' + '</a>');
    -->
    
    • Hi, don’t forget that spammers don’t see JavaScript.

      It’s a good method to encode on the server side.

  5. Thanks for the function, just what I needed.

    One note: a good (and dedicated) spammer is not really needed to break this. The reverse of this function can be one line long and applied by any bot to all the pages it visits. It is just “the only thing” you can do short of using a script (be it AS3 or JS or Java). So don’t count on this to really protect other people’s sensitive information (like lists of addresses). Use it, but don’t count on it.

    @Mark: converting an address to name at domain (dot) com is quite useless too, just answer yourself, can the reverse process be automated?

    $HTML=str_replace(' at ','@',$HTML);
    $HTML=str_replace(' (dot) ','.',$HTML);
    //  Done.
    

    Too easy. You could invent many different, colorful, ways to describe your email address, but in the end you’ll confuse more humans than machines.

    @Wartin: writing your email in hex inside the javascript is not much more secure than writing it encoded in HTML (just a little more). Remember you are not protecting it from a human (he’ll just read the output page and jot it down on a post it). The best thing to protect from a bot is stupid random code like:

    var MyEml = 'fran'.toLowerCase() + 'cesco';
    MyEml += '' + unescape('%40') + 'serv';   // %40 decodes to @
    var MyEml2 = 'er' + '.' + 'co';           // declare before use to avoid a ReferenceError
    MyEml += MyEml2 + 'm';
    

    The important thing is not to make the address unreadable, but to avoid standardizing on a format, so that an automated process cannot find it. Of course you could also try to undo this with PHP code (thanks to the magic of regular expressions and a lot of replaces), but if everybody uses a different pattern, the success rate gets so low that coding to catch just a few cases becomes pointless.

    Thanks,

    Francesco

  6. I’ve been looking for a solution for this, but am not really happy with any of the solutions that take this route. I will probably implement a captcha > httpRequest > PHP > back to client approach.
    Inconvenience is a small price to pay, but I feel captchas are currently the best line of defence.

  7. Jose Cuervo

    I’m new to programming and was looking for how to do this in PHP. This is just what I was looking for. Thank you! In my extreme boredom I also managed to port this code to Perl. See below:

    #! /usr/bin/perl

    # Perl email encode v 0.2
    # by Javier, nycjv321@gmail.com
    # ported from snippets of code found at http://davidwalsh.name/php-email-encode-prevent-spam

    use 5.014;
    use warnings;
    use strict;

    #declarations
    my $string; # string to be encoded
    my @string; # array of split $string
    my $x; # counter
    my $output; # stored output from encode function
    my $encode; # final encoded string

    # encode_string used to convert human text to their html counterparts
    sub encode_string {
    while ( my($index, $value) = each @string ) {
    $output .= '&#' . ord($value) . ';';

    }
    $_[0] = $output;
    }

    # prepare used to prepare array from human input
    sub prepare {
    chomp($string = <STDIN>);
    @string = split(//, $string);
    }

    &prepare;
    $encode = &encode_string(@string);
    say $encode;

  8. Thank you for the script, it is very useful. When using it, however, it produced a PHP error because the variable $output is not defined before being used.

    I solved the error by inserting
    $output = '';

    as the first line in the function.

    Thanks again for the script, very useful.

  9. Another two functions I found around the web:

    function hideEmail($email)
    {
        foreach(str_split($email, 1) as $character)
        {
            echo '&#' . ord($character) . ';';
        }
    }
    
    function protectMail($s) {
        $result = '';
        $s = 'mailto:' . $s;
        for ($i = 0; $i < strlen($s); $i++) {
          $result .= '&#' . ord(substr($s, $i, 1)) .
            ';';
        }
        return $result;
      }
    

    The last one is very similar to yours but it uses substr.

  10. Ashbringer

    Function gives error. $output value should be defined before loop.
