A sitemap.html file is a simple web page that contains traditional
links to all the subsections of your website. These are plain <a
href="http://site/folder/webpage.htm"> links. Webbots,
or the search spiders used by public search engines such as Google,
AltaVista, Teoma, etc., all follow the same basic rules. They start at
the top of your website with the base page and follow "links"
from there to internal pages on the website. They do not follow script
navigators, and they handle image maps poorly. They are looking for simple
standard links. The result is that if your website has internal pages
that cannot be reached by following links from the top of the website,
the spiders never find those pages, never parse and index their words, and
thus never include them in the search database. Those pages are hidden from
the world.
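
The link-following behavior described above can be illustrated with a small
simulation. This is only a sketch under stated assumptions, not how any real
search engine is implemented: the three-page SITE dictionary, the page names,
and the reachable_pages helper are all hypothetical. It shows that a page
referenced only from a script is never discovered.

```python
from collections import deque
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects href values from plain <a> tags only, as a simple spider would."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def reachable_pages(site, start="/index.html"):
    """Return the set of pages found by following <a> links from the base page."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        parser = LinkExtractor()
        parser.feed(site.get(page, ""))
        for href in parser.links:
            # Only follow links to pages that exist and are not yet visited.
            if href in site and href not in seen:
                seen.add(href)
                queue.append(href)
    return seen


# Hypothetical site: /hidden.html is referenced only inside a script navigator.
SITE = {
    "/index.html": '<a href="/about.html">About</a>'
                   '<script>go("/hidden.html")</script>',
    "/about.html": "<p>About us</p>",
    "/hidden.html": "<p>Only reachable through a script navigator</p>",
}

print(sorted(reachable_pages(SITE)))  # ['/about.html', '/index.html']
```

In this sketch /hidden.html never appears in the result, because no plain
link leads to it.
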
The key to making your site open and indexable by the spiders is to
give them a master key and site blueprint in the form they understand.
Give them a sitemap.html document and you give them access to your site.
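
One possible way to produce such a document is to generate it from a list of
pages. The sketch below is illustrative only: the build_sitemap function, the
page URLs, and the labels are hypothetical, and a hand-written sitemap.html
serves the spiders just as well.

```python
from html import escape


def build_sitemap(pages, title="Site Map"):
    """Build a sitemap.html document from a list of (url, label) pairs.

    Every entry becomes a plain <a href="..."> link, the form spiders follow.
    """
    items = "\n".join(
        f'    <li><a href="{escape(url, quote=True)}">{escape(label)}</a></li>'
        for url, label in pages
    )
    return (
        "<!DOCTYPE html>\n"
        "<html>\n"
        f"<head><title>{escape(title)}</title></head>\n"
        "<body>\n"
        f"  <h1>{escape(title)}</h1>\n"
        "  <ul>\n"
        f"{items}\n"
        "  </ul>\n"
        "</body>\n"
        "</html>\n"
    )


# Hypothetical pages; replace these with the real subsections of your site.
PAGES = [
    ("/products/index.htm", "Products"),
    ("/support/faq.htm", "Support FAQ"),
    ("/about/contact.htm", "Contact"),
]

if __name__ == "__main__":
    with open("sitemap.html", "w", encoding="utf-8") as f:
        f.write(build_sitemap(PAGES))
```

For the spiders to find the generated file, the base page of the site still
needs an ordinary link to sitemap.html.
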
Of course, if you have content not intended to be shared, look at
the ROBOTS.TXT file and the ROBOTS META Tag discussions for limiting
access.