Submitted by Compugasm on April 21, 2006 - 02:09
Google has Sitemaps. This is a new way to tell search engines what is on your site. When I was on a Windows server, my host had an automatic way to create a Google sitemap. Now that I'm on Linux, there is no automated method. I found this pretty comprehensive resource from Sci7.com.
Google's website describes the service:
The Sitemap Protocol allows you to inform search engine crawlers about URLs on your Web sites that are available for crawling. A Sitemap consists of a list of URLs and may also contain additional information about those URLs, such as when they were last modified, how frequently they change, etc.
Google has made the sitemap available for use by other search engines. It can be incorporated into webserver and content management software.
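To make the description above concrete, here is a minimal Python sketch of what such a file might look like when generated. It assumes the sitemaps.org 0.9 namespace (Google's own early schema used a different namespace URL), and the function name `build_sitemap`, the dict layout of each entry, and the example.com URLs are all hypothetical choices for illustration. Only `loc` is required for each URL; `lastmod`, `changefreq` and `priority` are the optional extras.

```python
import xml.etree.ElementTree as ET

# Assumption: the sitemaps.org 0.9 namespace; adjust if your target
# search engine documents a different schema URL.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return sitemap XML as a string.

    `entries` is a list of dicts (a hypothetical layout for this sketch).
    Each dict needs a "loc" key and may carry the optional "lastmod",
    "changefreq" and "priority" keys.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = entry["loc"]
        # Optional fields are written only when the caller supplied them.
        for field in ("lastmod", "changefreq", "priority"):
            if field in entry:
                ET.SubElement(url, field).text = str(entry[field])
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    pages = [
        {"loc": "http://www.example.com/", "changefreq": "daily", "priority": 1.0},
        {"loc": "http://www.example.com/catalogue/widget.html",
         "lastmod": "2006-04-21", "changefreq": "monthly", "priority": 0.5},
    ]
    print(build_sitemap(pages))
```

ElementTree escapes characters such as `&` in URLs for you, which is the main reason to prefer it over pasting strings together by hand.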
Why use a Google Sitemap
- The sitemap enables you to indicate the relative importance of the pages on your site. Perhaps you have a product page in your catalogue that is linked to from lots of sites on the web, and therefore it has a high PageRank. This page may often appear above your home page in the search engine results page, even for searches that are for your company generally rather than for that specific page. A priority value in the sitemap lets you signal that the home page matters more.
- Do you have dynamic content that is not indexed by search engines? Because sitemaps are new, they should not be relied upon to ensure deep indexing, but the goal is to help search engines find pages on your site that they would not normally find.
- The sitemap protocol enables webmasters to suggest to search engine robots how often particular pages should be indexed. This could potentially reduce the bandwidth used by search engine robots on dynamic sites.
It is worth noting that much of this functionality has been available for a number of years. META tags can be used to indicate how frequently a page is expected to change, and RSS and RDF formats have provided lists of deep links to search engines. Google's Sitemap webpages list RSS, and even a plain text file with one URL per line, among the ways of providing some of the information that can now be presented to search engines in a sitemap file.
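The plain text variant is simple enough to produce in a few lines. The following Python sketch assumes only what is stated above, namely one URL per line; the function name `build_text_sitemap` and the choice to drop blank lines and duplicates are my own additions rather than anything the protocol demands.

```python
def build_text_sitemap(urls):
    """Return the contents of a plain text sitemap: one URL per line.

    Blank entries and repeated URLs are skipped (a convenience of this
    sketch, not a rule of the format). Order of first appearance is kept.
    """
    seen = set()
    lines = []
    for url in urls:
        url = url.strip()
        if url and url not in seen:
            seen.add(url)
            lines.append(url)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    links = [
        "http://www.example.com/",
        "http://www.example.com/about.html",
        "http://www.example.com/",  # duplicate, will be dropped
    ]
    # "sitemap.txt" is a hypothetical file name.
    with open("sitemap.txt", "w", encoding="utf-8") as handle:
        handle.write(build_text_sitemap(links))
```

A text file like this carries no modification dates, change frequencies or priorities, so it only covers the "list of URLs" part of what an XML sitemap can express.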
Telling search engines about a sitemap
Currently Google is the only search engine with a mechanism that lets webmasters request that their sitemaps be read, and there are two ways to do this. If you have a WordPress blog, you can rebuild the sitemap; check the WordPress documentation for how to do this.
If you don't have WordPress, you can sign up for a Google account and add your sitemaps manually via the form on Google's site. This lets you monitor how often Google reads your sitemap and view any error notifications.
- https://www.google.com/webmasters/sitemaps - Google's official site on the protocol.
- Google Groups - discussion on the sitemap protocol.
- Code for producing sitemaps:
There are a number of sources of information from which a sitemap can be generated, the simplest being the webserver's file system. However, this is not appropriate for most modern dynamic, database-driven sites, for which generating the sitemap from the database will probably be the most appropriate course of action. A further source of information is the webserver logs. Generating sitemaps from server logs opens up the possibility of assigning priority to particular pages based on their popularity.
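As a rough illustration of two of these sources, the Python sketch below walks a document root to collect URLs with their last-modified dates, and separately turns a list of requested paths (as you might pull out of an access log) into priority values. Everything named here is hypothetical: the function names, the dict layout, the `.html`/`.htm` filter, and the scaling rule that gives the most requested page 1.0 and never goes below 0.1. Log parsing itself is left out because log formats vary, and file names are not URL-escaped.

```python
import os
from collections import Counter
from datetime import datetime, timezone


def entries_from_directory(doc_root, base_url, extensions=(".html", ".htm")):
    """Walk doc_root and return a list of {"loc", "lastmod"} dicts.

    Assumes files under doc_root map directly onto URLs under base_url,
    which holds for static sites but usually not for dynamic ones.
    """
    entries = []
    for dirpath, _dirnames, filenames in os.walk(doc_root):
        for name in sorted(filenames):
            if not name.lower().endswith(extensions):
                continue
            path = os.path.join(dirpath, name)
            # Convert the file path into a URL path with forward slashes.
            rel = os.path.relpath(path, doc_root).replace(os.sep, "/")
            modified = datetime.fromtimestamp(os.path.getmtime(path), tz=timezone.utc)
            entries.append({
                "loc": base_url.rstrip("/") + "/" + rel,
                "lastmod": modified.strftime("%Y-%m-%d"),
            })
    return entries


def priorities_from_hits(requested_paths):
    """Map each path to a priority between 0.1 and 1.0.

    requested_paths is an iterable with one item per logged request.
    The most requested path gets 1.0; others are scaled relative to it.
    This scaling rule is an assumption of the sketch, not part of the
    sitemap protocol, which only says priority runs from 0.0 to 1.0.
    """
    hits = Counter(requested_paths)
    if not hits:
        return {}
    top = max(hits.values())
    return {path: round(max(0.1, count / top), 1) for path, count in hits.items()}
```

The two pieces can be combined by looking up each entry's path in the priority mapping before the sitemap is written out. For a database-driven site the same idea applies, with a query over the content table taking the place of the directory walk.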
- Sitemap generator for vBulletin
- Sitemap generator for Serendipity sites.
- Google's Sitemap generator - written in Python - can extract data from log files or a directory structure.
- Sitemap for Movable Type, and another with better handling of the priority and lastmodified fields.
- A Perl Module - establishing a Sitemap object - while this doesn't provide much functionality at the moment, its use will aid the interoperability of scripts producing sitemaps compliant with the new standard.
- An ASP script for producing a Google sitemap based on the structure of the filesystem.
- Online tool for turning a list of links into an XML Google sitemap. As Google accepts a plain text file containing URLs as a sitemap, this is of questionable usefulness.
Author: Sci7 Ltd. Cambridge UK.