Robots.txt

June 5, 2009 by Benjamin Christie 

The robots.txt file tells search engines which pages they may include in their search results and which they may not. It also points them to the location of the sitemap.xml file on your site.

Writing a robots.txt file is straightforward. It’s a plain text file that you can create in Notepad, and in it you disallow the directories you don’t want search engines to read.

A sample robots.txt file:

—————————————
User-agent: *
Disallow: /controlpanel
Disallow: /downloads
Disallow: /images

# BEGIN XML-SITEMAP-PLUGIN
Sitemap: http://www.yourdomain.com/sitemap.xml
# END XML-SITEMAP-PLUGIN

—————————————
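To see how a crawler might interpret the sample above, here is a minimal Python sketch using the standard library's urllib.robotparser module. The crawler name "ExampleBot" and the two page URLs are made up for illustration, and site_maps() assumes Python 3.8 or later. Real search engines may apply their own matching rules, so treat this as a rough check rather than a guarantee.

```python
from urllib.robotparser import RobotFileParser

# The sample rules from this post, split into lines for the parser.
SAMPLE_RULES = """\
User-agent: *
Disallow: /controlpanel
Disallow: /downloads
Disallow: /images

Sitemap: http://www.yourdomain.com/sitemap.xml
""".splitlines()


def build_parser(lines):
    """Return a parser loaded with the given robots.txt lines."""
    parser = RobotFileParser()
    parser.parse(lines)
    return parser


parser = build_parser(SAMPLE_RULES)

# "ExampleBot" is a hypothetical crawler name; the rules apply to it
# because the sample uses "User-agent: *".
print(parser.can_fetch("ExampleBot", "http://www.yourdomain.com/images/logo.png"))  # False
print(parser.can_fetch("ExampleBot", "http://www.yourdomain.com/about.html"))       # True

# Sitemap locations declared in the file (Python 3.8+).
print(parser.site_maps())
```

Note that this parser matches each Disallow rule as a path prefix, so /images also covers anything whose path begins with /images.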

Once you’ve created the robots.txt file, place it in the main directory of your website. For example, if your domain is www.yourdomain.com, you will place the file at www.yourdomain.com/robots.txt
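Because the file always sits at the root of the domain, its address can be worked out from any page on the site. The short Python sketch below shows one way to do that; the function name robots_txt_url and the example page URL are assumptions made for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url):
    """Return the robots.txt address for the site that serves page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# A made-up page URL on the example domain used in this post.
print(robots_txt_url("http://www.yourdomain.com/downloads/file.zip?ref=home"))
# prints http://www.yourdomain.com/robots.txt
```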

If you need more information, an in-depth outline of the robots.txt file can be seen here.

Comments

2 Responses to “Robots.txt”

  1. stephen on December 7th, 2009 7:54 am

    I don’t understand why you would want to use robot.txt — what’s the logic behind excluding spiders?

    Thanks –
    best, Stephen

  2. Benjamin Christie on December 7th, 2009 8:02 am

    Great question Stephen.

    The majority of people don’t want to block search engines. However, the Wall St Journal recently announced in the media that it is planning to block Google from indexing its newspaper. Robots.txt is how it would do this.

    Cheers

    Benjamin

Copyright © 2010 Gourmet Ads Pty Limited