Small Business Software

Contains a wealth of information and resources for small business owners and entrepreneurs.

Spidering Websites
By Sharon Housley

Website spidering refers to the automated process by which a search engine indexes a website. An automated program, known as a web crawler or spider, goes through the website following the links on each page, and gathers pertinent information from each page until it has indexed the entire website.
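The link-following process described above can be sketched in a few lines of Python. This is only an illustration, not how any particular search engine works: the site is a hypothetical set of pages held in memory (the `SITE` dictionary and the example.com URLs are assumptions made for the example), so the sketch runs without network access.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Hypothetical website, stored in memory for the purpose of the example.
SITE = {
    "http://example.com/": '<a href="/products.htm">Products</a> <a href="/about.htm">About</a>',
    "http://example.com/products.htm": '<a href="/">Home</a> <a href="/widget.htm">Widget</a>',
    "http://example.com/about.htm": '<a href="/">Home</a>',
    "http://example.com/widget.htm": "<p>No links here.</p>",
    "http://example.com/orphan.htm": "<p>No other page links to this one.</p>",
}


def crawl(start_url, fetch):
    """Visit a page, queue every new link found on it, and repeat."""
    seen = {start_url}
    queue = deque([start_url])
    indexed = []
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:  # page could not be retrieved
            continue
        indexed.append(url)
        extractor = LinkExtractor()
        extractor.feed(html)
        for href in extractor.links:
            absolute = urljoin(url, href)  # resolve relative links
            if absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return indexed


indexed = crawl("http://example.com/", SITE.get)
for url in indexed:
    print(url)
```

In this sketch the page orphan.htm is never reached, because no other page links to it. A spider can only find pages that it can get to by following links.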

If a search engine is unable to spider a website, it may be unable to index some or all of the content on that site. As a result, the website may not appear in that search engine's results, even when associated keywords are searched for. Potential customers may use search engines to seek out a product or service, but if a website does not appear in the search results due to missing or incomplete indexing, that website may be losing out on an opportunity. As such, it is very important to make sure the search engine spiders can indeed "crawl" and index your website.

Webmasters can do a number of things to improve the "crawlability" of their websites and make them more spider-friendly:


Display Using HTML


HTML is by far the easiest type of content for search engines to spider. If the webmaster uses scripting or Flash to display some of the site's content or navigation, the search engine spiders may have a difficult time following the links.
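To see why plain HTML links are easier to follow, consider a simple link extractor that reads only the HTML markup and does not run any scripts. The sketch below is an assumption about how a basic spider might behave, not a description of any real search engine, and the page content and file names are made up for the example.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects href values from <a> tags found in the HTML markup."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


# Hypothetical page with one plain link and two script-driven ones.
PAGE = """
<a href="/plain.htm">Plain HTML link</a>
<span onclick="window.location='/onclick.htm'">Scripted navigation</span>
<script>
document.write('<a href="/scripted.htm">Link written by script</a>');
</script>
"""

extractor = LinkExtractor()
extractor.feed(PAGE)
found_links = extractor.links
print(found_links)
```

Only the plain HTML link is found. The other two destinations exist only inside script code, which this simple extractor never executes.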

Use a Sitemap

Sitemaps are simply roadmaps for a website. A sitemap will help ensure that all the pages on the website are indexed by the search engine. Create a proper sitemap for the website, and then submit the sitemap to the major search engines.

Sitemap Details - http://www.small-business-software.net/ins-and-outs-of-sitemaps.htm
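As a rough illustration, the sketch below builds a small sitemap in the XML format defined at sitemaps.org, which is assumed here to be the kind of sitemap submitted to search engines. The page URLs and the `build_sitemap` helper are hypothetical and exist only for this example.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML string with one <url><loc> entry per page."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical list of pages that should be indexed.
pages = [
    "http://www.example.com/",
    "http://www.example.com/products.htm",
    "http://www.example.com/about.htm",
]

sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

The resulting text would typically be saved as a file such as sitemap.xml on the website and then submitted to the search engines.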


Robots.txt

A properly formatted robots.txt file tells search engine spiders which parts of the website they may crawl, and specifies any parts that should not be crawled. The robots.txt file must be placed in the website's root directory.
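The effect of a robots.txt file can be checked with Python's standard `urllib.robotparser` module. The sketch below uses a made-up robots.txt and made-up example.com URLs, so treat the rules as an illustration rather than a recommendation for any particular site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: all spiders may crawl everything
# except the /private/ and /cgi-bin/ directories.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /cgi-bin/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Ask whether a spider is allowed to fetch each URL.
allowed = parser.can_fetch("*", "http://www.example.com/products.htm")
blocked = parser.can_fetch("*", "http://www.example.com/private/report.htm")

print("products.htm may be crawled:", allowed)
print("private/report.htm may be crawled:", blocked)
```

Running a check like this before publishing a robots.txt file can help catch a rule that accidentally blocks pages that should be indexed.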

Secure

Keep in mind that a search engine spider cannot follow links behind a password-protected login. Pages on a secure server (https) can be spidered only if they are reachable without logging in. Any important web pages that require indexing should never be located behind a password.

Avoid ID=

Avoid using "ID=" or similar parameters in webpage URLs. Search engines may ignore URLs that include "ID=" as a parameter.
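A quick way to audit a list of URLs for this problem is sketched below. The rule used here (flag any query parameter named "id", or whose name ends in "id", such as "sessionid") is an assumption about what "similar parameters" means, and the `has_id_parameter` helper and the URLs are hypothetical.

```python
from urllib.parse import parse_qs, urlparse


def has_id_parameter(url):
    """Return True if the query string has a parameter whose name ends in 'id'."""
    query = parse_qs(urlparse(url).query, keep_blank_values=True)
    # Rough heuristic: matches "ID", "id", "sessionid", "productid", and so on.
    return any(name.lower().endswith("id") for name in query)


# Hypothetical URLs to audit.
urls = [
    "http://www.example.com/product.php?ID=42",
    "http://www.example.com/page.php?sessionid=abc123",
    "http://www.example.com/widgets/blue-widget.htm",
]

flagged = [url for url in urls if has_id_parameter(url)]
for url in flagged:
    print("Consider renaming:", url)
```

A descriptive URL such as the third one carries no query parameters at all, which avoids the problem entirely.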

No Frames

Avoid using frames if possible. Content that is contained in a frame is difficult for search engines to spider, and may not be indexed as part of the page that displays it.

Consider implementing these few easy steps to increase the spiderability of your website and help ensure that the site will be properly indexed.


About the Author:
Sharon Housley manages marketing for FeedForAll http://www.feedforall.com software for creating, editing, and publishing RSS feeds and podcasts. In addition, Sharon manages marketing for RecordForAll http://www.recordforall.com audio recording and editing software.

**********************************************************

This article may be used freely in opt-in publications and websites, provided that the resource box is included and the links are active. A courtesy copy of the issue or a link to any online posting would be greatly appreciated; send an email to sharon@notepage.net .

Additional articles for publication are available at http://www.small-business-software.net/free-website-content.htm

**********************************************************

 

 


Copyright (c) 2003-2012 NotePage, Inc. All rights reserved.