Disallow Robots Using Robots.txt
I develop customer websites on a publicly accessible web server so that my customers can check the progress of their website at any time. I could use .htaccess to require a username and password for each site, but then I'd constantly be reminding customers what their password is. My big concern is preventing search engines from finding their way to my development server. Luckily I can add a robots.txt file to my development server websites that tells search engine crawlers not to index them. (Note that robots.txt is only a request — well-behaved crawlers honor it, but it is not a security mechanism.)
User-agent: *
Disallow: /
The above directive prevents the search engines from indexing any pages or files on the website. Say, however, that you simply want to keep search engines out of the folder that contains your administrative control panel. You'd code:
User-agent: *
Disallow: /administration/
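If you want to sanity-check a rule like this before deploying it, Python's standard-library urllib.robotparser applies the same matching logic most crawlers use. Here's a minimal sketch — example.com and the file paths are placeholders:

```python
from urllib.robotparser import RobotFileParser

# Parse the folder-only rule from above.
rp = RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /administration/",
])

# The administration folder is off-limits; the rest of the site is crawlable.
print(rp.can_fetch("*", "http://example.com/administration/index.php"))  # False
print(rp.can_fetch("*", "http://example.com/index.php"))                 # True
```

Swapping in "Disallow: /" would make can_fetch return False for every URL, which is the blanket rule shown earlier.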
Or if you wanted to allow in all spiders except Google's Googlebot, you'd code:
User-agent: Googlebot
Disallow: /
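The same standard-library parser confirms that this rule singles out Googlebot while leaving other crawlers unaffected. A quick sketch, again with example.com as a placeholder (Bingbot stands in for any other crawler):

```python
from urllib.robotparser import RobotFileParser

# Parse the Googlebot-only rule from above.
rp = RobotFileParser()
rp.parse([
    "User-agent: Googlebot",
    "Disallow: /",
])

# Googlebot is blocked from the whole site; other user agents fall through
# to the default (allowed), since no rule matches them.
print(rp.can_fetch("Googlebot", "http://example.com/"))  # False
print(rp.can_fetch("Bingbot", "http://example.com/"))    # True
```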
What would you prevent the search engines from seeing?