How I Stopped WordPress Comment Spam


I love almost every part of being a tech blogger:  learning, preaching, bantering, researching.  The one part about blogging that I absolutely loathe:  dealing with SPAM comments.  For the past two years, my blog has registered 8,000+ SPAM comments per day.  PER DAY.  Bloating my database with that trash slows down my blog in every which way, and recently I decided I was done with it.  I was also tired of moderating comments and seeing loads of SPAM comment notifications in my email inbox.  Done.  And like a boss...I stopped it.  Dead.  Here's how I did it!

How I Was Getting Spammed

There's no way to know for sure, but I suspect bots detected that I had a WordPress blog, knew the form keys for submitting comments, and posted accordingly.  I was getting comments for Viagra, Cialis, Michael Kors, Nike shoes, and more.  Stuff only bots would spend the time on.  It all had to be an automated attack aimed at any WordPress site it could detect -- nothing targeted.

What Wasn't Working

Everything.  I had used different WordPress plugins and continued to get SPAM.  Akismet is the premier WordPress SPAM protector and it wasn't able to prevent the problems -- and it included 100KB+ of JavaScript which slowed down my site.  I never used a CAPTCHA utility because SPAM should be a problem I deal with, not a barrier I put in front of all of you.  In the end, I was let down by numerous utilities.  I was disappointed but refused to give in.

What Worked

The first step was removing all of the anti-spam plugins, as there was a good chance they were interfering with each other and letting the SPAM in.  My solution was the generic anti-spam approach:  adding an INPUT to the form which should remain empty during the submission process.  Empty in value but present via key:  the premise is that bots which read form inputs populate every field value with rubbish just to make sure submissions aren't rejected based on empty values.
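
To make the "empty in value but present via key" premise concrete, here is a minimal sketch of that check written as a plain JavaScript function.  The field name `website_url` and the function name `looksLikeBot` are made up for illustration, and this is not the exact check my PHP further down performs (that one only tests for the key's presence):

```javascript
// Hypothetical trap field name; any name a bot would be tempted to fill works.
const TRAP_FIELD = 'website_url';

// `fields` is a plain object of submitted form values (e.g. a parsed POST body).
function looksLikeBot(fields, trapField = TRAP_FIELD) {
  // Key missing: the client never ran the script that adds the field.
  if (!Object.prototype.hasOwnProperty.call(fields, trapField)) {
    return true;
  }
  // Key present but filled in: something populated a field humans never see.
  return String(fields[trapField]).trim() !== '';
}
```

A submission passes only when the key exists and its value is empty; anything else is treated as a bot.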

How I Implemented Spam Protection

You can't simply add inputs on the server side -- they are output to the page and the bot can read those and populate (or not populate) them.  Creating those fields on the client side eliminates the simple bot / curl readers.  You can add said form fields with JavaScript (via any framework) and that's your best bet.  Before we do that, however, let's implement the server-side SPAM block check.

The WordPress PHP

Before accepting a comment on the server side, we need to check for the dynamic key presence.  All we need is an isset check:

// Fuck off spammers
function preprocess_new_comment($commentdata) {
	if(!isset($_POST['is_legit'])) {
		die('You are bullshit');
	}
	return $commentdata;
}
if(function_exists('add_action')) {
	add_action('preprocess_comment', 'preprocess_new_comment');
}

If the check fails, we reject the comment outright.  Of course this means that users without JavaScript support will have their comments rejected too, but on my blog SPAM submissions vastly outnumber visitors without JS support, so I'm fine with that.  It's a gamble, of course, but at 8,000+ SPAM comments per day it's worth it.

The JavaScript

The easy answer here is using basic JavaScript to inject the form field, and since every JS framework has its own syntax, I'll pass on providing code for all of them.  Adding said field upon page load seems a bit suspect to me, as an intelligent bot may be able to detect that.  In the case of my blog, I use MooTools and submit comments via AJAX, so I simply append the secret field within the JavaScript code upon submission:

var form = $('comment-form');

new Request({
    url: form.action,
    method: 'post',
    onRequest: function() {},
    onSuccess: function(content) {},
    onComplete: function() {}
}).send(form.toQueryString() + '&is_legit=1');

Adding that key upon submission has proven safe to this point.
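
If you don't use MooTools, the same idea can be sketched without any framework.  This is only an illustration under a few assumptions: the helper name `buildCommentBody` is made up, the form is assumed to have the id `comment-form` as above, and the browser is assumed to support `fetch`, `FormData` and `URLSearchParams`:

```javascript
// Build the POST body for a comment, appending the key only at submit time.
// `fields` is a plain object of form values; 'is_legit' matches the PHP check.
function buildCommentBody(fields) {
  const params = new URLSearchParams(fields);
  params.set('is_legit', '1');
  return params.toString();
}

// Browser wiring; skipped when no DOM is available.
if (typeof document !== 'undefined') {
  const form = document.getElementById('comment-form');
  if (form) {
    form.addEventListener('submit', function (event) {
      event.preventDefault();
      // Collect the visible fields, then add the key the server expects.
      const fields = Object.fromEntries(new FormData(form));
      fetch(form.action, {
        method: 'POST',
        headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
        body: buildCommentBody(fields)
      });
    });
  }
}
```

The key never appears in the page markup; it only exists in the request body built at submission.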

WINNING

After two weeks I've received 0 SPAM comments.  Zero.  None.  Nil.  Absolute zip.  I went from over 8,000 per day to none.  Better than Akismet, better than any plugin.  Take WordPress SPAM prevention into your own hands -- use client-side magic with a small PHP function to make your life easier!

Discussion

  1. Dustin

    Wow, amazing. Anxious to give this a try. Disqus was one possible though less perfect solution, however, I’m going to share this with others and maybe give it a shot!

  2. I use a hidden input with a nonce that is checked against $_SESSION, along with a custom header in my AJAX requests. I’d claim that it has been effective, but I have yet to enable comments to begin with.

    I think I might combine this strategy with my own in having the hidden input done through JavaScript instead, possibly with an empty input as well. Considering your results, I’m fairly certain that spam won’t be the issue I was afraid it would be.

  3. Should this not be standard on all forms? It’s a pretty simple thing to do; I’m surprised it’s not more common.

    Also, combining this with a nonce so a bot can’t just replay the same submission would be advantageous.

    • GottZ

      now guess how spam bots are made. if everyone uses the same spam prevention bot makers will just change their code.

  4. Barry

    Sounds like you ought to wrap that up into a WP plugin David :)

    • Yeah, that’d make it easier for regular Joes with the same problem.

  5. Very well done. I just wonder, why it does not work server-side? So the bot does not get the field altogether, I get. But if it has to be empty, it is still a trap, isn’t it?!?
    Regards,
    Doc Asarel
    https://liebdich.biz Email SIGNATURE for Love & Biz.
    https://github.com/DocAsarel for my code projects!!

  6. A nice touch would be a random key, possibly a nonce value but as the key rather than value. This would prevent a bot that was targeting you specifically and gaming your method. This would only matter if they were taking the time to attack you specifically, but would also be necessary for a generic use in forms done by a plugin or the WordPress core. I’m sure there would be ways that could be manipulated too, but this is really exciting as a simple technique to deal with a big problem.

    Congratulations on two weeks of being spam free!

  7. Very slick. I’ll have to use that on my contact form. On my blog I just use Disqus and I don’t seem to have a problem. I wonder if it has something to do with the fact that entire comments section is rendered via JavaScript, so a normal Curl doesn’t find the form.

  8. Jeremy

    Be careful. This sounds like “security through obscurity” to me. It might work for a while, but don’t be surprised if it stops being effective in the future. Especially if other people adopt the same convention.

    • +1

      What about tokens instead of a simple boolean? It could be improved.

    • Would work too!

    • Vishal

      Dear David

      Thanks for nice workaround. I just started Disqus and need to stop Spam comments. I believe your solution will work on “WP + Disqus ”

      Would it be possible for you to let me know which files i need to modify in wordpress 4.1?

      Thanks

    • MaxArt

      I think this will be safe for a _long_ while, since breaking it would imply understanding what the JavaScript does – and _why_ it has to be done. Not an easy task at all if the change has been hand-made (like David did).
      Of course, this won’t block a per-site spam attack, which means someone must decode what this site does with comments and adjust the spambot to be effective again. But that’s not how spammers work.

    • Isn’t all SPAM prevention “security through obscurity”, in a way? I can change the key/value any time I want, and even use a date or timestamp to make it generic.

      This isn’t by any means perfect, and probably could be strengthened, but it works great for me!

    • Hello David,

      Nice Idea, using your script I tried improving the security with the code. I think I agree with @MaxArt that static JS security is not really a big thing to break. but the thing is your idea and that works.

      here’s the working code for me

      //------------------------------------------------------------------------------------------------
      //      Spam Comments Checker
      //------------------------------------------------------------------------------------------------
      function get_the_user_ip() {
          if ( ! empty( $_SERVER['HTTP_CLIENT_IP'] ) ) {
          //check ip from share internet
          $ip = $_SERVER['HTTP_CLIENT_IP'];
          } elseif ( ! empty( $_SERVER['HTTP_X_FORWARDED_FOR'] ) ) {
          //to check ip is pass from proxy
          $ip = $_SERVER['HTTP_X_FORWARDED_FOR'];
          } else {
          $ip = $_SERVER['REMOTE_ADDR'];
          }
          return apply_filters( 'dm_get_ip', $ip );
      }
      
      function get_comment_token() {
          // Day-granular, so the token printed with the form still matches at submit time.
          return md5( get_the_user_ip() . date( 'Y-m-d' ) );
      }

      function preprocess_new_comment($commentdata) {
          // Reject when the token is missing or does not match.
          if( !isset( $_POST['is_valid_comment'] ) || trim( $_POST['is_valid_comment'] ) !== get_comment_token() ) {
              die( 'You are bullshit' );
          }
          return $commentdata;
      }

      if( function_exists( 'add_action' ) ) {
          add_action( 'preprocess_comment', 'preprocess_new_comment' );
          add_action( 'comment_form_after', 'comment_spam_prevention', 20 );
      }

      function comment_spam_prevention(){
          $userIdentity = get_comment_token();
          ?>
          <script>
          var cForm = jQuery('.comment-form');

          cForm.find('input[type=submit]').on('click', function(e){
              e.preventDefault();
              jQuery.ajax({
                  url: cForm.attr('action'),
                  method: 'post',
                  // Send the token in the POST body so PHP sees it in $_POST.
                  data: cForm.serialize() + '&is_valid_comment=<?php echo $userIdentity; ?>'
              }).done(function( data ) {
              })
              .fail(function() {
                  alert( "error" );
              });
          });
          </script>
          <?php
      }
      
      
      //------------------------------------------------------------------------------------------------
      //      EOF Spam Comments Checker
      //------------------------------------------------------------------------------------------------
      
      
    • If it makes a difference, I’ve been using a similar technique for maybe 5 years and it hasn’t failed me. I have the input in my html and just hide it with css. If the field isn’t empty, a bot filled it out and reject the comment. Seems to hold up pretty well so far.

  9. TM

    If someone uses a phantomjs based not to simulate a browser and submit spam comment using that, it will still get through, no?

  10. TM

    s/not/bot

  11. One of the slicker exploits I have seen in recent times that actually solve a problem. Thanks for sharing!

  12. You wrote about 100KB of JS for Akismet. I’ve stopped spam with Akismet too, but on the server side; Akismet has a good API for spam checking.

  13. David

    Great stuff. Can’t wait to give it a spin.

  14. Fredrik Larsson

    And now, every bot maker will update their bots accordingly

  15. dcsturm

    Hey David,

    thanks for sharing your thoughts about this problem. Sounds a lot like the “honeypot” method, having a form field that is not to be filled. One point on the “no-javascript-user” problem: why not have a field hidden by a (not “hidden”-named) CSS class, probably in a container element, which bots wouldn’t notice is hidden and so would fill? It could at least be a fallback, which could be removed via JS…

    Cheers,
    Daniel

    • Would work too. Since my use case is sending via AJAX, I don’t need the extra form field.

    • dcsturm

      Yeah, in your case not :) But I wanted to suggest this one as a no-javascript version.

      Cheers

  16. foxmask

    Hi,
    Personally I use WordPress Conditional Captcha, which has kept ~12,000 spam comments off my blog.
    If you take a look at it for 2 minutes you will see how it works in detail. This captcha does not show any additional fields on the comment page. If it detects potential spam in a comment, then at submission a 2nd page is displayed to the visitor; if that page is not filled in correctly, the comment goes to the trash. “Humans” (almost) never see this 2nd page. Only bots go to this dead end :P
    Regards.

  17. You could be interested by this post about how spammers are using headless browser (like PhantomJS) to bypass this type of form security : http://blog.vamsoft.com/2014/07/09/headless-browser-use-in-web-forum-spam/

    tldr : using headless browsers, bots are executing javascript.

    A better trick is to have a hidden field in your form, and change the value of this field when a user focuses one of the fields. This has been working quite well for me, for years, on a great number of forms.

  18. David, if you’re willing, would you be interested in doing a little experiment? We developed a WordPress plugin with a lot of focus on comment spam, I’d be interested to know if by replacing your current method with the “WordPress Simple Firewall” how effective that is against the volume of spam you’re receiving.

    We’re having really good results, but we’re always interested in improving the process. Maybe if your method proves to be more successful we’ll look at implementing your solution as an option.

    What do you think? Would you mind to give it a go?

    Cheers!

  19. Hey, I too have been facing this comment spam issue since last week. I think I need to pass this to my WordPress developer to fix the issue for our customer. Thanks for sharing… :)

  20. // Fuck off spammers

    ^ This is why I read this blog! I know a lot of form plugins will add an empty “do not fill” input, and this is something I do a lot as well.

  21. Nice tricks, but why did you use die before the wp_die function, which will not be executed?

  22. Probably a bit overboard but I’m that pissed. Want them to know they’re bullshit TWICE

  23. Classic web dev practice but surprisingly it’s still as effective today as it was back in the 90’s :)

  24. @David Walsh
    That is a good solution, but it would only prevent the spam bots and not the people who want to mess with you. I implemented such a system many years ago, but it’s quite impossible to stop someone from spamming you!

    • Dustin

      It’s still good progress considering over 60% of web traffic are bots.

    • Dustin, of course it’s a good solution. But if we could come up with some solution to end the spams for ever, it would be golden achievement. I have tried many things, but still it could not guarantee 100%

      Maybe one day…

  25. Lars

    What has worked best for us in the past is the honeypot method, using a field that is not of type “hidden” but is hidden either by moving it offscreen in CSS or even just with display:none; in the CSS.

    The field can be smartly named, using a name that’s often used, like “website” if you don’t have a website field in your form, or “address” if you don’t have an address field (or rotating through a list of names by day, week, or month if you have full server-side control). This solution is actually super simple in the end, and can be combined with other methods, but in the recent solutions we did, we only had this method in use and not one spam message went through. For fun, we pre-launched a system not long ago without this honeypot method, and even though the form was posting with AJAX, it was being attacked by spammers. We implemented our simple honeypot method and since that day not one spam message has gone through.

    This solution is so simple, yet has proved to work for a long time, and it is extra nice because it does not produce false positives, marking messages as spam that should not have been.

  26. Lars

    In addition David, no need to let the spammers know that they failed exactly. You can just say “thank you” every time, no matter if the message goes through or not. That is making it even harder for the spammers to figure out when they are “getting through”.

    Your simple but effective solution is public here, but in general spammers won’t know what’s going on behind the scenes, so if you always say, “all good”, then it’s harder for them to know when “they are good”. Obviously seeing the comment posted or not posted is a way for them to tell, but some bots don’t check that and they cannot know if the comment is being delayed somehow.

    It’s the small things that make the difference. Your solution is working well, but won’t be able to run as a generic solution as the major bot spammers will adjust faster than you think. You will need to implement some local randomized methods that make it hard for the bot to figure out “per site”; this could include where the honeypot field is rendered in the DOM by the server in combination with random naming and even random type of field.

  27. Dustin

    It looks like you can achieve a similar effect, or better, with a plugin. There’s one called Honeypot Comments, Anti-spam and one other that looks very good called, WP-SpamShield.

    http://wordpress.org/plugins/wp-spamshield/

    • Hi Dustin,

      I have gotten a boat load of spam lately on my site. Thanks a lot for the link. I really hope this helps me. I am getting tired of all these spam comments.

  28. We use this “honeypot” method at Blue Bay Travel and have done for a long time. Although they’re getting smarter, a lot of bots will just fill in every input field within a form – this captures 99% of spam for us easily.

    The extra benefit to honeypots – and the main reason we use it – is to not use CAPTCHA. We believe that using CAPTCHA is an awful decision because they’re not friendly at all.

    • Agreed — CAPTCHA is simply unacceptable.

    • Luca R.

      What do you say about “logical” captchas, as in “how much is 2 plus four?”.

      Would those be acceptable IYO?

  29. I must say, the nicest Captcha I saw, is the one used by Yahoo! It is nice and funny ( moving ). Anybody knows where to get it? Regards

  30. I worry about killing the page altogether with die();. Why not allow your javascript to do some more elegant error handling?

    It doesn’t matter too much to the bots – they’ll get stuck in a loop hell where the form is never submitted, but your non-JS users will see a nice, pretty “Please enable javascript” message.

    With all of the “THE GUB’MINT AN BIG OIL ARE WATCHIN YER EVERY MOVE!” floating around these days, a lot more people are turning off JavaScript because they got a chain email that said it would somehow keep them anonymous and secure online. Just a thought.

  31. I solved my problem with disqus: http://blog.marcomonteiro.net/post/why-disqus

    However, I don’t really use wordpress anymore. So that is not really a problem for me these days.

  32. I wonder what the accessibility ramifications are of having (visually) hidden fields in forms like this. Will screen readers read them out?

    • Screen readers ignore visibility: hidden; and/or display:none; so that’s not an accessibility issue. Headless JS browser bots could however detect this by running getComputedStyle(); on each input to bypass such techniques.

    • How is that a problem if the field is added dynamically on submit?

  33. Doesn’t this method simply imply “if no JS, you can’t send because you are probably a bot”? … and doesn’t it also mean that it is entirely ineffective against bots with JS capabilities?

    • Agreed . . . I’m wondering if I’m missing something, but I feel like the first half of the post describes a different method than the code actually shows? There’s talk of a form field that must remain empty to pass validation, yet given the php code block, the value of is_legit could actually be anything and pass.

      It seems like the code displayed just stops submissions without JS.

  34. It’s a honey pot + a JS requirement, so against headless JS-capable bots it is only as effective as the honey pot is. At least for now it should catch a fair chunk of unsophisticated opponents.

    • I don’t see where the honeypot is … It just adds a var “isLegit” on submit through JS, and that will occur for all JS-enabled bots. There is no honeypot field for bots to fill in. This is pure JS spam block.

  35. This is pretty common and generally referred to as a Honey Pot where you have a hidden form field and on the server side you check to make sure that it has not been set by a bot as a regular user would not see the hidden field and as a result a real person would not be able to fill the value in. What does seem to be a little different is using JavaScript to create the hidden field, that is interesting and I will have to test this out, thanks for sharing

    • It’s not a honey pot if there is no hidden field to populate for bots, not even if it’s added on submit (which neither humans or bots would populate anyway). It’s strictly JS block … If a JS-enabled bot clicks submit, it will be bypassed.

  36. Great idea! I get so tired of all those bloated spam plugins that never seems to work. Not to mention their spammy ads within the plugin settings page. Built a quick plugin based on what you did here. Instead of MooTools, it uses jQuery.

    https://github.com/bmarshall511/wordpress-zero-spam

  37. Great solution, I’m going to implement this on a couple of my websites later today.

    I signed into a couple of client websites a few weeks ago after no one had touched the comments section in 2-3 years. Close to 45,000 comments!

    I’d say the WordPress development community should do something about this, but I’m sure it’s a matter of the spammers almost immediately catching up to the latest WP updates and exploiting them anyway.

  38. WordPress just approved the Zero Spam plugin I wrote based off what you’ve got here: http://wordpress.org/plugins/zero-spam/

  39. I implemented a similar method in 2009 and I’m really happy someone else came up with a similar, yet close, solution!
    Here’s my version http://blog.plastical.com/2010/10/24/zerostring-antispam-easier/, but can’t wait to try yours.

  40. Jacob

    This technique is also known as the “honeypot”.

  41. I believe you loathe (feel intense dislike or disgust for) dealing with spam comments, not just loath (reluctant; unwilling.) it.

    I certainly loathe the spammers, and it’s wonderful you are fighting back in such a great way. Thank you for sharing this!

  42. p

    I wrote a very lightweight anti-spambot plugin and I’m using it in all my sites with great success. It even works without Javascript support (with fallback):

    http://wordpress.org/plugins/stop-spam-comments/

  43. Chris

    Hey David,
    that sounds a lot like a honey pot – I used something similar for a while but eventually they started getting through, and like you I refuse to use captcha.

    I started listening for events in Javascript such as keypress or touch as my method for determining humans.
    The other method I’ve seen used a lot lately is the two related input fields method, and I imagine that it would do fairly well if the relationship is obvious to a human but not so obvious to a robot (such as: what two colours make purple?).

    Anyway I’ll be interested to know in a couple of months if your approach holds up.

  44. Thanks for this and I hope to implement it on my site however, I’m not even sure where to start with this code. Which files to edit and where exactly? A sample or something along that would be great for people like me who don’t know wordpress too deeply.

    Much appreciated from anyone who can help.

  45. Well done. Please post a followup after a few months to report back on its integrity.

  46. First time I’ve seen the input injected with JS. I have seen the opposite: removing the input field from view with JS after the page has loaded, but I think it comes to the same result.

  47. Hi David. The solution looks nice, but it just hides the tip of the iceberg. You will be preventing bots from getting into your WP, but the 8000 spam attempts are still there, hammering your website. You should deter them from keeping it up. A nice way is to capture their attempts via htaccess and nullify them, but don’t send them to a 404, because that will still run queries and PHP execution threads on your server. You gotta catch’em and send them flying, to a non-existent URL on a weird domain, like 239r9234r3.com. htaccess can filter them and bounce their access before they even request a PHP thread. This has worked for me, since the bots are usually connecting directly via GET (not POST) to the comment form -and that’s why major antispam solutions fail-, so you must catch’em with a combination of conditions:

    RewriteEngine On
    RewriteCond %{REQUEST_URI} ^(.*)wp-comments-post\.php
    RewriteCond %{HTTP_REFERER} !^(.*)yourdomain\.com.*
    RewriteCond %{HTTP_REFERER} !^http://jetpack\.wordpress\.com/jetpack-comment/
    RewriteRule (.*) http://98jmtyxj2z9r3rhj920.com.ar/$ [R=301,L]
    

    This will block all direct hits to wp-comments-post.php, whether GET or POST, that don’t come from your website or from Jetpack comments (in case anybody uses it). Requests that match the rule are forwarded elsewhere and get a real 404 error that isn’t generated by my server.

    It’s an aggressive version of what wordpress.org recommends: they published the same rule, but it kicks in only when POST requests are detected, and I noticed bots sometimes use GET, thus bypassing the filter.

    Additionally, you may want to deactivate pingbacks/trackbacks from the settings AND in each and every post entry.

    This just works, and reduced my spam pest from 500/day to ZERO.

    Hope it helps :)
    All the best

    • Oh, and I forgot to mention: This method helped me to reduce the cpu load from average daily 2.8 to 0.7.
      Peace of mind is priceless. :)

  48. David, what about spam user registrations? Do you use a variant of your method for registrations?

  49. Jim Thome

    David, one successful method I’ve used is to include a hidden field with a MD5 hash of the current date: PHP code-> md5(date('M-d-Y')). If the value doesn’t match the actual date, the form is rejected. This stops bots from auto-filling form fields while transparently allowing legitimate submissions. Occasionally spam will come through but not on a regular basis.

    I’ve also had success limiting spam by submitting the form through AJAX, and configuring the AJAX script to only work if the correct AJAX headers are sent (thanks to your website for instructions on how to do this). http://davidwalsh.name/detect-ajax

  50. Great idea.

    But you’ve kind of given a reason why you don’t need javascript. Why not create the dummy input field and then use CSS to hide it?

    Bots will fill it with content, and you can check server side if any content is there.

    That nulls the javascript dependancy nicely and will still have the same effect.

  51. Have read your rants on Twitter, but haven’t read what your secret is until now.

    Before going full-blown sceptical mode, I’d like to jump high five you (\o/*\o/) to express that I think you totally rock for being original. In fact, I might give you a little kiss.

    Sceptical mode on.

    I’ve written many bots in the past for educational purposes. And fun of course. Most bots are relatively simple. They parse HTML and that’s it. However, as spam continues to play a large role in the deeper bits of the internet, I don’t think it’ll take long before we see advanced bots that embed open-source engines like Chromium, WebKit and Gecko to avoid client-side anti-spam measures. It’d be far-fetched and come with a major performance penalty, but it’ll be effective.

    If a bot I’ve just described comes across your site, it’ll just be a browser, running your JavaScript and your prevention will (unfortunately) fail. But there’s more…

    Considering the popularity of your site, you may encounter a site-specific bot sooner or later. By just looking around (or reading this article), they can easily see what security measures you took and work around them. Same thing goes with the honeypot method, where you add an extra field that is supposed to remain blank. But there’s even more… :(

    Because you’re awesome, it’s likely other developers will adopt this very same method and will publish similar articles describing this anti-spam measure. Eventually, many WordPress (and other) sites may use this method. As it gains popularity, it’s likely bots will adopt a countermeasure.

    What I’m really trying to say: please understand that it’s only a matter of time until the spam slowly comes back. This could be a couple of days to several years from now as the method described is original for now, but not difficult to implement into any existing bot.

    • That’s all security though, yes?

    • Sooner or later all security will fail, yes.

      I believe the prevention of spam consists of three major vectors. Complexity (∝) to implement a solution, uniqueness (∃!) of the anti-spam measure and time (τ).

      Complexity makes the threshold of implementing a solution higher. Take Captcha for example. It’s not unbreakable, but it took quite a while and it’s still hard to resolve the text programmatically successfully.

      Uniqueness reduces interest of implementing a solution. If you’re the only one using method x, then most hackers will see no point of implementing your anti-spam measure.

      But over time, sooner or later a bot will come across your path that will beat your system.

      Just two of these vectors are under your control. Working around your described measure is fairly easy: just add the extra parameter to the request. The uniqueness however balances that. By publishing this solution in public, however, I suspect the many developers that read this may adopt it, which only causes the uniqueness vector to decay rapidly, making everyone increasingly vulnerable to attacks.

      As I said, it could take years before a bot actually implements this method. I’d just like to point out that though this works, it’s really not that safe.

      PS. Don’t get me wrong. I’m not ranting or telling you’re wrong. I’m really just philosophising. I feel like a fortune teller, warning you for the obvious ;)

      “Give me your hand. YOU ARE IN GRAVE DANGER!”

    • Yes programmatically it’s hard to solve, but the services out there make it so no spammer has to program anything to solve it.

      Captcha is actually pretty easy to solve. GSA Captcha breaker + deathbycaptcha account. What GSA doesn’t catch (only about 15% of all captchas) the humans in third world countries will (99% of the remaining 15%). leaving an incredibly small percentage of captchas that can’t be solved by bots.

      Captchas are by far and wide completely useless. I can make 1,000,000 yahoo!, hotmail, (etcetera) Accounts with the click of a button and a couple hours of a program running. I’m not sure why the big guys like Microsoft use them on their sites. I would never subject genuine users of my website captchas and I suggest nobody else do it either.

      “Uniqueness reduces interest of implementing a solution. If you’re the only one using method x, then most hackers will see no point of implementing your anti-spam measure.”
      – That’s somewhat true, but depends on the PR of your website. Magic Submitter works off this type of machine learning as does Scrapebox (plus can run css/javascript in learning mode) so a spammer will more than likely spend the time teaching the software if the payoff is big enough. But it also depends on the volume of spam as well. SEO companies would likely try to spend the time to do it. However a self employed SEO will more than likely just submit manual comments.

  52. Found something that I had been looking for for quite some time now. I have tried a couple of different things, but now I’m going to try the points you have mentioned. I hope implementing this works out well.

    Thanks for sharing.

  53. Can anyone please let me know if there exists any built-in plugin for this purpose? I really like the post David! Thanks!

    • Erik

      Hi John,
      Yes, I found that the goodbyecaptcha plugin has an advanced implementation of this idea. It uses encrypted tokens with double protection instead of an empty field. I discovered it a few weeks ago, and since I installed it I have had no spam at all. It’s simply an excellent plugin.
      https://wordpress.org/plugins/goodbye-captcha/
      Hope it helps,
      All the best

  54. I use Akismet to prevent spam comments, and recently I removed the URL field from the comment section. It dramatically decreased spam comments on my blog.

    • I also use Akismet to prevent comment spam and yeah, removing the URL field can be great for stopping it, because this is the damn reason behind comment spam.

  55. vincenzo

    Hi, do i need to put the php script in the file functions.php? Thanks for your help

  56. Thanks @Abu Zafor for that! Will see

  57. Awesome! I’m gonna get rid of all the spam-preventing plugins. Though captcha worked in my case, I’ll still try this. I think it’s a better solution; who wants the dependency on unknown, bulkier scripts? Thanks man, you’re just brilliant.

  58. Hello,
    I was wondering if anybody tried to test for interaction on the submit button.
    If button is clicked, touched, or activated with enter key then it is most likely human.
    Would that work (js activated of course) and would that pose any accessibility problem?

  59. I removed the website URL field and comment form allowed tags which combined with Akismet stopped 100% of spam.

  60. Alex

    I use this plugin
    https://wordpress.org/plugins/cleantalk-spam-protect/
    It is simple and effective and no problems

  61. Hello, I’ve created a ruby gem for rails websites to prevent spam based on your solution :)

    https://github.com/rogeriochaves/anti_spam

    I’ll use on my projects! Thanks.

  62. I use the Stop Spammers plugin, which works best for me. Why complicate life by implementing captcha code that makes people type characters, when it is no longer effective and cannot protect against spammers? Download it, install it, and you’re done.
    I would recommend the plugin, because it is sensational. Check it out and see for yourself.

    https://wordpress.org/plugins/stop-spammer-registrations-plugin/

  63. I got some good ideas for anti-spam here, thanks. I am thinking that Javascript Ajax submission combined with the Honey Pot hidden field is the way to go.

  64. Well, you can learn a lot about spam prevention by just following OWASP guidelines.

    This ranges from what is allowed in the query string, cookies, and form fields, including their names – yes, I have had bots trying to manipulate form field names, even when they wouldn’t validate in the W3C validator.

    Up to which methods are allowed when – should search engines be allowed to POST?

    And analyzing header fields: is MSIE 6.0 still used by real users? And how many browsers actually do not accept GZIP, or fail to provide an Accept-Language header field value?

    Including simple HTML entity checks, e.g. substitute one char in a button’s value with the HTML equivalent and test at POST time that it is actually translated to the original char.

    Time check, using Star Date, a spinner, to test that the one POSTing to a page is the same as the one GETting it. Header fingerprinting and random numbers sent through both a hidden form field and via cookie.

    Layered security, in other words. Using the principle of least privilege, white lists, and aggressive input validation, among other great guidelines.

    All bots are smart in a way, but I have never seen one bot that is smart in ALL cases. Those bot programmers are a lazy bunch, and they’re only human. Humans make mistakes, and analyzing your logs closely will tell you when and what these mistakes are.
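To make the time check and the matched random number from the comment above a bit more concrete, here is a minimal sketch in JavaScript. The function name `checkSubmission`, its parameter names, and the 3-second threshold are assumptions made for illustration; none of them come from the comment. It also assumes the server wrote the same random token into a hidden field and a cookie when it served the form, and that it recorded the render time somewhere the client cannot tamper with.

```javascript
// Hypothetical helper: every name and the threshold are illustrative assumptions.
// formToken   - value posted back in the hidden form field
// cookieToken - value the browser returned in the cookie
// renderedAt  - time in ms when the server served the form (server-side record)
// now         - time in ms when the POST arrived
function checkSubmission({ formToken, cookieToken, renderedAt, now }, minMs = 3000) {
  // A client that never performed the GET has no cookie, or a mismatched token.
  if (!formToken || !cookieToken || formToken !== cookieToken) {
    return 'token-mismatch';
  }
  // People need a few seconds to type a comment; near-instant posts are suspicious.
  if (now - renderedAt < minMs) {
    return 'too-fast';
  }
  return 'ok';
}
```

Returning a reason string rather than a boolean is a design choice for this sketch: it makes it easier to log which layer rejected a request, in the spirit of the log analysis suggested above.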

  65. runej

    With headerless bots, we might need to look into content analysis in addition to already known methods, as the links themselves are essential to a spammer, no matter the spam method used.

    I can think of three places where it could be especially interesting; In the URL-part of the links in a message, in the link text, and in the actual content on the linked to page.

    There are certain key words you can look for, which combined should at least be considered suspicious. Like:

    1. buy,free,(no) prescription,online,genuine

    combined with

    2. drug(s),pill(s),viagra,medicine

    …I have a bigger black list running on my own home page on the same principles, so far though only for referer spam detection. (The disadvantage of a black list is obvious; it has to be updated manually one day or another.)

    A spammer would be able to defeat this by using other character sets with chars similar to those in the list. However this would only be possible in link text, not in the URL part, as that would give a wrong (or invalid) URL.

    I am also considering looking at the content-to-link ratio, as some bots just seem to deliver the links and not much content.

    Another approach to links in a post could be to ask the user to fill in a captcha if there are one or more links in the message. Not a fan of that, but it is not hard to implement.
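The keyword combination and the content-to-link ratio described in this comment could be sketched roughly as follows. The two word lists are copied from the comment; the function names, the link regex, and the threshold of five words per link are assumptions chosen for illustration and would need tuning against real spam.

```javascript
// Word lists taken from the comment above; everything else is an assumption.
const ACTION_WORDS = ['buy', 'free', 'prescription', 'online', 'genuine'];
const PRODUCT_WORDS = ['drug', 'drugs', 'pill', 'pills', 'viagra', 'medicine'];

// True if at least one of the given words appears in the list.
function hasAny(words, list) {
  return words.some((word) => list.includes(word));
}

function analyzeComment(text, minWordsPerLink = 5) {
  const links = text.match(/https?:\/\/\S+/gi) || [];
  // Keyword check looks at the whole text, including the URL part of links.
  const allWords = text.toLowerCase().match(/[a-z]+/g) || [];
  // Ratio check counts only the words outside the links.
  const prose = text.replace(/https?:\/\/\S+/gi, ' ');
  const proseWords = prose.match(/[A-Za-z]+/g) || [];

  const keywordHit = hasAny(allWords, ACTION_WORDS) && hasAny(allWords, PRODUCT_WORDS);
  const linkHeavy = links.length > 0 && proseWords.length / links.length < minWordsPerLink;
  return { keywordHit, linkHeavy, suspicious: keywordHit || linkHeavy };
}
```

As the comment notes, a plain ASCII word list like this can be sidestepped with look-alike characters in the link text, so the result is better treated as a reason to hold a comment for moderation than as a verdict.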

  66. Thanks for a great article, I hadn’t thought of it that way. WOW.
    It is really very clever to create a hidden field, and some of the comments here are helpful too. I’ll definitely implement it on my site.

  67. Touche!!!

  68. If the goal is merely to deny posters without Javascript, can’t we just create the “Submit” button itself dynamically via Javascript? If so, then bots won’t be able to submit the form in the first place, using up even fewer resources!

    • 100% Yep.

      This is something I noted in an earlier comment, and I find it strange the entire topic hasn’t been narrowed down to this. The only thing the method posted on this page does, is prevent non-javascript bots from posting … To be honest, it is very effective, and I use it for my own website without having seen any spam yet … Probably because the majority of spam still comes from non-javascript bots.

      There is another factor though. With WordPress, the post.php end-point for commenting and sending messages is “known”, so even if you force the submit button to trigger by javascript, bots will still be able to post to the WP end point. Therefore, from the perspective of WordPress, it makes sense to also implement the “isLegit” check into the PHP also, so that it will fend off bot posts.

      It doesn’t hurt to add two hidden honeypot fields either. One empty, and one populated, and both need to remain unchanged when posted. This will thwart JS-bots also.

      Essentially, you are right.
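The two-field idea in this reply can be expressed as a small check on the posted values. This is only a sketch: the field names `website_url` and `form_version` and the value `v1` are hypothetical placeholders, and the same logic would live in the PHP handler on a real WordPress site.

```javascript
// Hypothetical field names and expected value; pick your own.
const EMPTY_TRAP = 'website_url';   // rendered empty, must come back empty
const FILLED_TRAP = 'form_version'; // rendered with a value, must come back unchanged
const FILLED_TRAP_VALUE = 'v1';

// postedFields: plain object of field name -> submitted string value.
function honeypotsIntact(postedFields) {
  // Strict comparison also rejects posts where the key is missing entirely,
  // matching the "empty in value but present via key" premise of the article.
  const emptyOk = postedFields[EMPTY_TRAP] === '';
  const filledOk = postedFields[FILLED_TRAP] === FILLED_TRAP_VALUE;
  return emptyOk && filledOk;
}
```

A bot that fills every field trips the first check, and a bot that blanks or rewrites every field trips the second.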

  69. The only problem with this is that you can’t reply to comments in the WP-Admin Comments section… anyone had the same problem?

  70. Interesting. Akismet works fine for me; rarely does comment spam get through. But this is good to know, and honeypots like this definitely work.

  71. We seem to have Akismet working well, 0 spam. I think we will implement this solution also to harden things up.

  72. Had the same results by using a field called “email_v” and checking if it was filled in or not. I just hide the field visually via CSS. Bots fill it in because it has “email” in its name.

    I use this technique on all themes I develop. Zero SPAM. Bonus: accessibility.

  73. Mike

    Hello,
    I am a new WordPress user, but already after a few days I am getting spam comments.
    I’ve installed a plug-in for the captcha, but I hate captcha.
    I would like to use your method but I did not understand how to do it. Where do I need to put the code?

    ps. sorry for my bad english

    Thanks
    Mike

    • Try this in your HTML right after the element

      Then check at the server, that epost doesn’t contain a @

      What required does is, that it makes it a required field, meaning it has to be filled out. However, disabled, means that it cannot be filled out by the user in a browser. The maxlength is just for testing (I forgot why, probably I wanted to know, if any browser would truncate the number), but doesn’t seem to hurt. So, depending on the browser, it will contain either the numbers or nothing if a legit user posts.

      Since the field is disabled, it will be discarded when tabbing.

      Also, in LYNX, you cannot fill it out.

      So it works even with CSS disabled, and it should work with screen readers as well, although this I have not tested yet. The aim here is to make the screen reader not say anything about it, like with a hidden field.

      It is somehow similar to making a hidden field which no bot would fill out itself, but because it is not hidden, and because it is required, most bots will fill it out. Also because of the name “epost”, which many bots recognize as an email address field.

      Hide it via CSS, or move it off screen.

      #epost{display: none}

  74. Colin

    Gravity Forms includes a checkbox to enable a honeypot – same idea, surely?

  75. Hey David, this completely got rid of all of my spam on one of my clients’ websites, so thanks! Now, this method doesn’t work 100% on some contact form plugins because the bots are smart enough to run javascript – but I found a way to block those too. Most spam bots are using PhantomJS or Selenium, and can be blocked using some javascript:

    function isHeadless() {
        // Properties that automation drivers may leave on `document`.
        var documentDetectionKeys = [
            "__webdriver_evaluate",
            "__selenium_evaluate",
            "__webdriver_script_function",
            "__webdriver_script_func",
            "__webdriver_script_fn",
            "__fxdriver_evaluate",
            "__driver_unwrapped",
            "__webdriver_unwrapped",
            "__driver_evaluate",
            "__selenium_unwrapped",
            "__fxdriver_unwrapped",
        ];

        // Globals that automation tools may leave on `window`.
        var windowDetectionKeys = [
            "_phantom",
            "__nightmare",
            "_selenium",
            "callPhantom",
            "callSelenium",
            "_Selenium_IDE_Recorder",
        ];

        // Iterate over the array values directly (for...of, not for...in).
        for (const key of windowDetectionKeys) {
            if (window[key]) {
                return true;
            }
        }
        for (const key of documentDetectionKeys) {
            if (document[key]) {
                return true;
            }
        }

        // Look for driver cache objects stored under keys like "$cdc_...".
        for (const documentKey in document) {
            if (documentKey.match(/\$[a-z]dc_/) && document[documentKey]['cache_']) {
                return true;
            }
        }

        if (window.external && window.external.toString() && window.external.toString().indexOf('Sequentum') != -1) return true;

        // Attributes some drivers set on the root <html> element.
        if (document.documentElement.getAttribute('selenium')) return true;
        if (document.documentElement.getAttribute('webdriver')) return true;
        if (document.documentElement.getAttribute('driver')) return true;

        return false;
    }

    Now you can just do if(isHeadless()) return;
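One way to wire that check into a form is sketched below. The helper name `blockHeadlessSubmits` is an assumption; the detector is passed in as a parameter so that a function such as `isHeadless` from the comment above can be supplied. This only stops submissions made through the page itself, so a bot that posts straight to the endpoint still has to be handled on the server.

```javascript
// Hypothetical helper, not part of the original comment.
// form     - any object with addEventListener, e.g. a <form> element
// detector - a function returning true for suspected automation, e.g. isHeadless
function blockHeadlessSubmits(form, detector) {
  form.addEventListener('submit', function (event) {
    if (detector()) {
      // Stop the browser from sending the form.
      event.preventDefault();
    }
  });
}

// Usage in a page, assuming a form with the class "comment-form" exists:
// blockHeadlessSubmits(document.querySelector('.comment-form'), isHeadless);
```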

  76. Hey David, just wanted to say thanks for this guide! It helped when making my own anti-spam plugin. The main thing I had trouble with was finding preprocess_comment, but on further testing I decided to go with pre_comment_approved which allows you to

    return 'spam';

    to send it to spam instead of completely blocking the comment.

    Here’s the plugin I created and submitted to the WordPress plugin repo:
    https://wordpress.org/plugins/anti-spam-zapper/

  77. Why not use javascript to add the comment form too?

    That way you prevent the odd non-javascript user from submitting a comment that will never be processed.

  78. That way you prevent the odd non-javascript user from submitting a comment that will never be processed.

  79. Sunfire

    Is this still an effective method for blocking spam, particularly on contact forms, in 2021?

    • My version of the implementation went from daily spam comments to maybe 1-2 spam comments a year.

    • René

      It does! :)

  80. James

    Shouldn’t your example selector be:

    var form = $('.comment-form');

    ?

    Great work!

  81. Omar

    Just change the default names for the inputs and textarea; all these spam requests come from some Python script with predefined vars.
