If you want page-by-page control over what is or is not indexed
by spiders, rather than relying only on the site-wide rules in the
robots.txt file, then the robots META attribute was designed to make
your life easier.
In its complete form, it looks like the following:
<META NAME="robots" CONTENT="all | none |
index | noindex | follow | nofollow">
The default for the robots attribute is "all".
This allows the page to be indexed and its links to be followed. "None"
tells the spider not to index the page, and not to follow the hyperlinks
on the page to other pages. "Index" indicates that this page
may be indexed by the spider, while "follow" means that
the spider is free to follow the links from this page to other pages.
The negated values work the same way, thus this META tag:
<META NAME="robots" CONTENT="noindex">
tells the spider not to index this page,
but allows it to follow the links on the page and index the pages they lead to.
"Nofollow" allows the page itself to be indexed, but the
links may not be followed. For more information on the robots META
attribute, see the W3C HTML 4 specification at
http://www.w3.org/TR/REC-html40/appendix/notes.html#h-B.4
which is the authoritative documentation on robots and the META tags
associated with optimizing pages for search engines.
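
The values described above can be illustrated with a minimal Python sketch of how a spider might turn a page's robots META tag into two yes/no decisions: may this page be indexed, and may its links be followed. The names RobotsMetaParser and robots_policy are hypothetical, chosen only for this illustration. The sketch also assumes that several values may be combined in one CONTENT attribute as a comma-separated list, and that a later value overrides an earlier one; real spiders may resolve conflicts differently.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the CONTENT values of every <META NAME="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling us.
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").strip().lower() != "robots":
            return
        content = attr.get("content") or ""
        # Assumption: values are comma-separated and case-insensitive.
        for value in content.split(","):
            value = value.strip().lower()
            if value:
                self.directives.append(value)


def robots_policy(html):
    """Return (may_index, may_follow) for a page (hypothetical helper)."""
    parser = RobotsMetaParser()
    parser.feed(html)
    may_index, may_follow = True, True  # the default is "all"
    for value in parser.directives:
        if value == "all":
            may_index, may_follow = True, True
        elif value == "none":
            may_index, may_follow = False, False
        elif value == "index":
            may_index = True
        elif value == "noindex":
            may_index = False
        elif value == "follow":
            may_follow = True
        elif value == "nofollow":
            may_follow = False
        # Unknown values are ignored in this sketch.
    return may_index, may_follow


if __name__ == "__main__":
    page = '<HTML><HEAD><META NAME="robots" CONTENT="noindex"></HEAD></HTML>'
    print(robots_policy(page))
```

A page with no robots META tag comes back as indexable and followable, matching the "all" default, while "noindex" alone turns off indexing but leaves link following on.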