Robots Text File
 

     

You will need to create a text file called robots.txt and place it at the root level of your server. In this file you can include directives that tell robots they are barred from all or certain parts of your server. Well-behaved robots that adhere to the robots exclusion standard will look for this file when they visit your site.

Here's an example of what your robots.txt could include:

User-agent: *
Disallow: /tmp
Disallow: /personal/topsecret

In the first line, the asterisk indicates that these restrictions apply to all robots; you could instead name specific robots here if you wanted the rules to apply only to them.

The second and third lines instruct robots not to visit any URL on the site whose path begins with /tmp or /personal/topsecret.
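One way to check how a well-behaved robot would read these rules is to feed them to the robots.txt parser in Python's standard library. This is only a sketch: the host www.example.com, the robot name ExampleBot, and the sample paths are made up for illustration and are not part of the original example.

```python
from urllib.robotparser import RobotFileParser

# The example rules from above, one directive per line.
EXAMPLE_RULES = """\
User-agent: *
Disallow: /tmp
Disallow: /personal/topsecret
"""


def build_parser(rules_text):
    """Parse robots.txt text directly, without touching the network."""
    parser = RobotFileParser()
    parser.parse(rules_text.splitlines())
    return parser


def is_allowed(parser, path, robot_name="ExampleBot"):
    """Return True if the named robot may fetch the given path."""
    # www.example.com is a placeholder host; only the path is matched.
    return parser.can_fetch(robot_name, "http://www.example.com" + path)


if __name__ == "__main__":
    parser = build_parser(EXAMPLE_RULES)
    # Hypothetical paths chosen to show that matching is by path prefix.
    for path in ["/index.html",
                 "/tmp/scratch.html",
                 "/tmpfiles/a.html",
                 "/personal/resume.html",
                 "/personal/topsecret/plan.html"]:
        verdict = "allowed" if is_allowed(parser, path) else "blocked"
        print(path, "->", verdict)
```

Note that /tmpfiles/a.html is blocked as well, because the rule is compared against the start of the path. Writing Disallow: /tmp/ with a trailing slash would limit the rule to the /tmp directory itself.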

To see how web sites use the robots.txt file, point your browser at the robots.txt file at the root of any site, for instance:

http://www.whitehouse.gov/robots.txt
http://www.sun.com/robots.txt
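Because the file always sits at the root level, its address can be worked out from any page on the same site. The helper below is a small sketch of that idea; the function name robots_url and the page path in the usage line are assumptions made for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url):
    """Return the root-level robots.txt address for the site hosting page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


if __name__ == "__main__":
    # The path /products/index.html is a made-up example page.
    print(robots_url("http://www.sun.com/products/index.html"))
```

The result can then be opened in a browser just like the addresses listed above.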

To create your own robots.txt file, use a basic text editor (rather than a word processor) and follow the examples that you find on other web sites.

There is also a META tag that controls robot access to web pages at the individual file level. Not surprisingly, it is called the Robots META tag.
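The tag goes in the head of a page and commonly takes the form <meta name="robots" content="noindex, nofollow">. As a rough sketch of how a robot might read it, the following uses Python's built-in HTML parser to collect the directives. The class name, function name, and sample page are all invented for this illustration.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives from any <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        # Attribute values may be None when a tag has a bare attribute.
        if (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            for directive in content.split(","):
                directive = directive.strip().lower()
                if directive:
                    self.directives.append(directive)


def robots_directives(html_text):
    """Return the list of robots directives found in a page."""
    parser = RobotsMetaParser()
    parser.feed(html_text)
    return parser.directives


# A made-up page that asks robots not to index it or follow its links.
SAMPLE_PAGE = """\
<html>
<head>
<title>Private notes</title>
<meta name="robots" content="noindex, nofollow">
</head>
<body>Not for search engines.</body>
</html>
"""

if __name__ == "__main__":
    print(robots_directives(SAMPLE_PAGE))
```

Unlike robots.txt, this tag travels with the page itself, so it is useful when you can edit a file but not the root of the server.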

Simply creating the robots.txt text file and placing it in the top folder, alongside your index.html file, gives you control over which spiders are allowed access and to which sections of your website. You can make it as open or as restrictive as you like, and as specific and granular as you like.
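To illustrate that granularity, here is a sketch of a file that shuts one named spider out of the whole site while keeping everyone else out of /tmp only, checked with the robots.txt parser in Python's standard library. The robot names ExampleBot and OtherBot, the host www.example.com, and the paths are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# One group of rules for a named robot, one for all other robots.
GRANULAR_RULES = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /tmp
"""


def may_fetch(rules_text, robot_name, path):
    """Return True if robot_name may fetch path under the given rules."""
    parser = RobotFileParser()
    parser.parse(rules_text.splitlines())
    # www.example.com is a placeholder host; only the path is matched.
    return parser.can_fetch(robot_name, "http://www.example.com" + path)


if __name__ == "__main__":
    for robot in ["ExampleBot", "OtherBot"]:
        for path in ["/index.html", "/tmp/scratch.html"]:
            verdict = "allowed" if may_fetch(GRANULAR_RULES, robot, path) else "blocked"
            print(robot, path, "->", verdict)
```

A blank line separates the two groups, and a robot that finds a group naming it follows that group instead of the catch-all one.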
 

 

ePortfolio for Terence Sullivan
Last updated 8/01/2002
tsulliva@comwares.net