A robots.txt file is a text file that gives instructions to the search engine spiders that visit your site.

Major search engines such as Google, Yahoo and MSN all read the robots.txt file and obey the instructions placed in it.

What kind of instructions can you place in a robots.txt file? Well, for one, you can allow search engines to spider your site or disallow them from doing so.

Other uses for a robots.txt file include:

  • disallowing search engine bots from spidering certain sections of your site such as private files

  • setting the pace at which a robot can spider your site (this can help reduce bandwidth usage)

  • completely banning certain engines from spidering your site while allowing others to spider all or part of it.
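The last two uses in the list can be sketched with Python's standard urllib.robotparser module, which reads a robots.txt the way a compliant spider would. The spider name "BadBot", the /private/ path and the 10-second delay are made up for illustration. Crawl-delay is a non-standard directive, so support for it varies from engine to engine.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: "BadBot", /private/ and the delay are illustrative.
ROBOTS_TXT = """\
User-agent: BadBot
Disallow: /

User-agent: *
Crawl-delay: 10
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# BadBot is banned from the whole site.
banned = parser.can_fetch("BadBot", "/index.html")

# Every other spider is kept out of /private/ only.
private_ok = parser.can_fetch("Googlebot", "/private/notes.html")
home_ok = parser.can_fetch("Googlebot", "/index.html")

# Requested pause, in seconds, between requests for spiders under "*".
delay = parser.crawl_delay("Googlebot")

print(banned, private_ok, home_ok, delay)
```

Here the banned spider gets its own User-agent group, while the Crawl-delay line sits in the group that applies to everyone else.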

How do you set up a robots.txt file for your site?

Easy: create a new text file named robots.txt and upload it to the root of your site (where your homepage index file resides). The examples below show what you can place in it.

Placing this in your robots.txt file:

User-agent: *
Disallow: /

instructs all compliant spiders not to crawl anything on your site.

While placing this in your robots.txt file:

User-agent: *
Disallow:

allows all spiders to crawl your entire site.
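The difference between these two files can be checked with a minimal Python sketch, assuming the standard urllib.robotparser module stands in for a compliant spider. The helper name allowed and the spider name "AnyBot" are hypothetical.

```python
from urllib.robotparser import RobotFileParser

def allowed(robots_txt, agent, path):
    """Return True if `agent` may fetch `path` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)

BLOCK_ALL = "User-agent: *\nDisallow: /\n"
ALLOW_ALL = "User-agent: *\nDisallow:\n"

# "AnyBot" is a made-up spider name.
blocked_result = allowed(BLOCK_ALL, "AnyBot", "/index.html")
open_result = allowed(ALLOW_ALL, "AnyBot", "/index.html")

print(blocked_result, open_result)
```

A single slash is all that separates the two: "Disallow: /" matches every path, while an empty Disallow matches none.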

You can also place this in your robots.txt file:

User-agent: *
Disallow: /tmp
Disallow: /logs

and it will instruct all compliant spiders not to spider the specified folders.
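Disallow values are matched as path prefixes, so a rule like /tmp also covers any path that merely starts with those characters. The sketch below illustrates this with Python's standard urllib.robotparser. The spider name "ExampleBot" and the page paths are assumptions made for the example.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /tmp
Disallow: /logs
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# "ExampleBot" is a made-up spider name; the paths are illustrative.
in_tmp = parser.can_fetch("ExampleBot", "/tmp/cache.html")
lookalike = parser.can_fetch("ExampleBot", "/tmpfiles/page.html")
elsewhere = parser.can_fetch("ExampleBot", "/articles/robots.html")

print(in_tmp, lookalike, elsewhere)
```

If only the folder itself should be covered, writing the rule with a trailing slash (Disallow: /tmp/) avoids catching lookalike paths such as /tmpfiles.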

You can also be specific and place this in the robots.txt file:

User-agent: Googlebot
Disallow: /tmp
Disallow: /logs

and only Google's spider will be told to stay out of the specified folders; other spiders are unaffected.
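As a rough check of the Googlebot-only rules, the following Python sketch parses them with the standard urllib.robotparser module and asks on behalf of two different spiders. The page paths are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /tmp
Disallow: /logs
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# The rules name Googlebot, so only Googlebot is restricted.
google_logs = parser.can_fetch("Googlebot", "/logs/access.html")
google_home = parser.can_fetch("Googlebot", "/index.html")

# Slurp has no matching group and there is no "*" group, so nothing is blocked.
slurp_logs = parser.can_fetch("Slurp", "/logs/access.html")

print(google_logs, google_home, slurp_logs)
```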

Here is a list of the main search engines and their user agents:

AltaVista: Scooter
Infoseek: Infoseek
Hotbot: Slurp
AOL: Slurp
Excite: ArchitextSpider
Google: Googlebot
Goto: Slurp
Lycos: Lycos
MSN: Slurp
Netscape: Googlebot
NorthernLight: Gulliver
WebCrawler: ArchitextSpider
Iwon: Slurp
Fast: Fast
DirectHit: Grabber
Yahoo Web Pages: Googlebot
Looksmart Web Pages: Slurp

To sum up, a robots.txt file is yet another important tool a webmaster has at their disposal to manage the activities of search engines.
