Prevent Robot Indexing with Response Headers

By  on  

Every so often you have parts of your website that would be better off not indexed by search engines.  API calls, search result pages, PDF documents -- all examples of responses which may not have value outside of the current user.  No we all know we can signal to the search engines not to index pages using a META tag, but oftentimes service calls and documents don't get the luxury of a META tag.  Luckily you can add a header to prevent these responses from being indexed.

The header name is X-Robots-Tag should be easy to add using the server-side language you prefer.  For example, adding this header with PHP may look like:

header('X-Robots-Tag: noindex');

If you're using a Django-based python site, the could would look like:

response['X-Robots-Tag'] = 'noindex'

This header can also be set within your .htaccess or httpd configuration files:

<Files ~ "\.pdf$">
  Header set X-Robots-Tag "noindex"
</Files>

The truth is that there's no guarantee that something your server serves wont be indexed by a search engine, but small tweaks like this can ensure your search engine standing can improve and that users don't find their way to "dead" parts of your site via search engines.

Recent Features

  • By
    9 Mind-Blowing Canvas Demos

    The <canvas> element has been a revelation for the visual experts among our ranks.  Canvas provides the means for incredible and efficient animations with the added bonus of no Flash; these developers can flash their awesome JavaScript skills instead.  Here are nine unbelievable canvas demos that...

  • By
    Designing for Simplicity

    Before we get started, it's worth me spending a brief moment introducing myself to you. My name is Mark (or @integralist if Twitter happens to be your communication tool of choice) and I currently work for BBC News in London England as a principal engineer/tech...

Incredible Demos

  • By
    Create a Clearable TextBox with the Dojo Toolkit

    Usability is a key feature when creating user interfaces;  it's all in the details.  I was recently using my iPhone and it dawned on my how awesome the "x" icon is in its input elements.  No holding the delete key down.  No pressing it a...

  • By
    Fullscreen API

    As we move toward more true web applications, our JavaScript APIs are doing their best to keep up.  One very simple but useful new JavaScript API is the Fullscreen API.  The Fullscreen API provides a programmatic way to request fullscreen display from the user, and exit...

Discussion

  1. Chris

    I have a big problem with spam registration on an ExpressionEngine site I help manage. Could this help? I have no development experience, fyi…

  2. This a really old post, but for those who are using nGinx instead of Apache, you can do

    location ~* \.(doc|pdf)$ {
        add_header  X-Robots-Tag "noindex, noarchive, nosnippet";
    }
    

Wrap your code in <pre class="{language}"></pre> tags, link to a GitHub gist, JSFiddle fiddle, or CodePen pen to embed!