Prevent Robot Indexing with Response Headers

Every so often you have parts of your website that would be better off not indexed by search engines.  API calls, search result pages, PDF documents: all are examples of responses which may not have value outside of the current user.  Now, we all know we can signal to search engines not to index pages using a META tag, but oftentimes service calls and documents don't get the luxury of a META tag.  Luckily you can add a response header to prevent these responses from being indexed.

The header name is X-Robots-Tag, and it should be easy to add using the server-side language you prefer.  For example, adding this header with PHP may look like:

header('X-Robots-Tag: noindex');

If you're using a Django-based Python site, the code would look like the following, where response is the HttpResponse object your view returns:

response['X-Robots-Tag'] = 'noindex'
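
If you'd rather not set the header in each view, the same idea can be applied in one place.  The following is a minimal sketch, assuming a WSGI-based Python app; NoIndexMiddleware, NOINDEX_PATHS, demo_app, and call are hypothetical names made up for illustration and are not part of Django or any library:

```python
import re

# Assumption: these are the paths we don't want indexed (API calls and PDFs).
NOINDEX_PATHS = re.compile(r"^/api/|\.pdf$")


class NoIndexMiddleware:
    """Wrap a WSGI app and add X-Robots-Tag: noindex to matching responses."""

    def __init__(self, app, pattern=NOINDEX_PATHS):
        self.app = app
        self.pattern = pattern

    def __call__(self, environ, start_response):
        path = environ.get("PATH_INFO", "")

        def patched_start_response(status, headers, exc_info=None):
            # Append the header only when the request path matches.
            if self.pattern.search(path):
                headers = list(headers) + [("X-Robots-Tag", "noindex")]
            return start_response(status, headers, exc_info)

        return self.app(environ, patched_start_response)


def demo_app(environ, start_response):
    """Stand-in application used only to exercise the middleware."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


def call(app, path):
    """Invoke a WSGI app for one path and return its response headers."""
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["headers"] = dict(headers)

    b"".join(app({"PATH_INFO": path, "REQUEST_METHOD": "GET"}, start_response))
    return captured["headers"]


app = NoIndexMiddleware(demo_app)
print(call(app, "/files/report.pdf"))
print(call(app, "/about"))
```

Keeping the path rule in a single regular expression means you can adjust what gets excluded without editing individual views.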

This header can also be set within your .htaccess or httpd configuration files (the Header directive requires mod_headers to be enabled):

<Files ~ "\.pdf$">
  Header set X-Robots-Tag "noindex"
</Files>
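
Whichever way you set the header, it's worth confirming that responses actually carry it.  Here's a rough sketch of a check in Python; robots_directives, is_noindex, and fetch_headers are hypothetical helper names, and the parsing assumes a plain comma-separated value (it does not handle values prefixed with a specific crawler's user agent name):

```python
from urllib.request import urlopen


def robots_directives(headers):
    """Collect lowercase directives from every X-Robots-Tag header.

    `headers` is an iterable of (name, value) pairs.
    """
    directives = set()
    for name, value in headers:
        if name.lower() == "x-robots-tag":
            for part in value.split(","):
                part = part.strip().lower()
                if part:
                    directives.add(part)
    return directives


def is_noindex(headers):
    """Return True if the headers ask crawlers not to index the response."""
    return "noindex" in robots_directives(headers)


def fetch_headers(url):
    """Fetch a URL and return its response headers as (name, value) pairs."""
    with urlopen(url) as response:
        return response.headers.items()


sample = [("Content-Type", "application/pdf"), ("X-Robots-Tag", "noindex")]
print(is_noindex(sample))
```

To check a live page, you could pass the result of fetch_headers to is_noindex, using a URL from your own site.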

The truth is that there's no guarantee that something your server serves won't be indexed by a search engine, but small tweaks like this can help your search engine standing and keep users from finding their way to "dead" parts of your site via search engines.

Discussion

  1. Chris

    I have a big problem with spam registration on an ExpressionEngine site I help manage. Could this help? I have no development experience, fyi…

  2. This is a really old post, but for those who are using nginx instead of Apache, you can do

    location ~* \.(doc|pdf)$ {
        add_header  X-Robots-Tag "noindex, noarchive, nosnippet";
    }
    
