
The proper way to use the robots.txt file


When optimizing their web sites, most webmasters overlook the robots.txt file, yet it is an important file for your site. It tells spiders and crawlers what they can and cannot index. This helps keep them out of folders that you do not want indexed, such as the admin or stats folder, and away from content that they cannot index.

Here is a list of the fields that you can include in a robots.txt file and their meanings:

1) User-agent: This field names the specific robot that the access policy applies to, or “*” for all robots. The examples further down show both forms.
2) Disallow: This field lists the files and folders to exclude from the crawl.
3) #: The number sign starts a comment.
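
To make the three fields concrete, here is a minimal Python sketch that reads robots.txt text, drops the comments, and groups the Disallow paths by User-agent. The function name parse_robots and the sample text are made up for illustration, and the logic is deliberately simplified; it is not how any particular crawler reads the file.

```python
def parse_robots(text):
    """Group Disallow paths by the User-agent they apply to (simplified sketch)."""
    policies = {}
    current_agents = []
    reading_rules = False
    for raw in text.splitlines():
        # Everything after a number sign is a comment, so drop it.
        line = raw.split("#", 1)[0].strip()
        if not line or ":" not in line:
            continue
        field, value = line.split(":", 1)
        field, value = field.strip().lower(), value.strip()
        if field == "user-agent":
            if reading_rules:
                # A User-agent line after some rules starts a new group.
                current_agents = []
                reading_rules = False
            current_agents.append(value)
            policies.setdefault(value, [])
        elif field == "disallow":
            reading_rules = True
            if value:  # an empty Disallow blocks nothing
                for agent in current_agents:
                    policies[agent].append(value)
    return policies


# Hypothetical sample file, not taken from a real site.
SAMPLE = """\
# Keep every robot out of the scripts folder
User-agent: *
Disallow: /cgi-bin/   # a comment can also follow a rule
"""

policies = parse_robots(SAMPLE)
print(policies)  # {'*': ['/cgi-bin/']}
```

The comment lines vanish from the result, and the one remaining rule is filed under the “*” agent it belongs to.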

Here are some examples of a robots.txt file for redball.com:

User-agent: *
Disallow:

The above would let all spiders index all content.

Here is another example:

User-agent: *
Disallow: /cgi-bin/

The above would block all spiders from indexing the cgi-bin directory.
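
If you want to check how a well-behaved robot might read the two files above, Python's standard urllib.robotparser module can serve as a stand-in crawler. This is only a sketch: the robot name "ExampleBot" and the page URLs are assumptions made up for the test, and real spiders may interpret edge cases differently.

```python
from urllib.robotparser import RobotFileParser


def load_rules(text):
    """Feed robots.txt text to the standard-library parser."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


# The two example files from the article.
ALLOW_ALL = "User-agent: *\nDisallow:\n"
BLOCK_CGI = "User-agent: *\nDisallow: /cgi-bin/\n"

open_site = load_rules(ALLOW_ALL)
cgi_blocked = load_rules(BLOCK_CGI)

# Hypothetical URLs, used only to exercise the rules.
script_url = "http://redball.com/cgi-bin/form.pl"
home_url = "http://redball.com/index.html"

open_allows_script = open_site.can_fetch("ExampleBot", script_url)
blocked_allows_script = cgi_blocked.can_fetch("ExampleBot", script_url)
blocked_allows_home = cgi_blocked.can_fetch("ExampleBot", home_url)

print(open_allows_script)     # True: an empty Disallow blocks nothing
print(blocked_allows_script)  # False: the URL is inside /cgi-bin/
print(blocked_allows_home)    # True: pages outside /cgi-bin/ stay open
```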

User-agent: googlebot
Disallow:

User-agent: *
Disallow: /admin.php
Disallow: /cgi-bin/
Disallow: /admin/
Disallow: /stats/

In the above example, googlebot can index everything, while all other spiders cannot index admin.php or the cgi-bin, admin, and stats directories. Notice that you can block single files such as admin.php.
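
The per-robot behaviour of that last file can be checked the same way with Python's standard urllib.robotparser module. Again this is a sketch under assumptions: "ExampleBot" stands in for any spider other than googlebot, and the three paths are invented for the test.

```python
from urllib.robotparser import RobotFileParser

# The last example file from the article.
ROBOTS_TXT = """\
User-agent: googlebot
Disallow:

User-agent: *
Disallow: /admin.php
Disallow: /cgi-bin/
Disallow: /admin/
Disallow: /stats/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Hypothetical paths and a hypothetical second robot name.
paths = ["/admin.php", "/stats/report.html", "/index.html"]
agents = ["googlebot", "ExampleBot"]

results = {
    agent: {
        path: parser.can_fetch(agent, "http://redball.com" + path)
        for path in paths
    }
    for agent in agents
}

for agent in agents:
    for path in paths:
        verdict = "allowed" if results[agent][path] else "blocked"
        print(f"{agent:10} {path:20} {verdict}")
```

With these rules, googlebot should be allowed on all three paths, while ExampleBot should be blocked from /admin.php and /stats/report.html but allowed on /index.html.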

Author's URL: Jimmy Whisenhunt