Thursday, March 5, 2009

Factors Responsible for Non-indexing of a Web Page

  • Spam penalties. If you've been caught violating the search engines' terms of service (spamming), they'll drastically scale back the number of your pages in the index until you beg for reinclusion.

  • Hidden links. If the navigation links on your site are hidden within JavaScript, Flash, or other non-HTML methods, the search engine spiders are unlikely to be able to follow them (see the navigation sketch after this list).

  • Dynamic URLs. If your URLs are excessively long, have many parameters, or contain ID or session parameters, the search engines might elect not to index them.

  • Incorrect robots.txt file. Your robots.txt file tells the search spider which pages to include in and exclude from the crawl--if you've coded the file incorrectly, you might be excluding lots of pages you meant to include (see the robots.txt sketch after this list).

  • Incorrect robots tagging. Just like the robots.txt file, a robots meta tag tells the spider to include or exclude an individual page--you might be telling the spider to exclude the page by mistake (see the meta tag example after this list).

  • Poor quality pages. If your page is excessively long, contains HTML coding errors, or uses frames, it's unlikely to be indexed correctly.

  • Improper redirects. If your page uses a meta refresh or JavaScript redirect, spiders typically ignore it and don't index the page (see the redirect example after this list).

  • User interaction required. If your page launches a pop-up window or demands that a form be filled out, spiders won't be able to comply.
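
Here is a minimal sketch of the hidden-links problem; the page name products.html is a made-up example. A spider that doesn't execute JavaScript never sees the first link, while the plain HTML anchor is always crawlable:

    <!-- Navigation written out by script: invisible to spiders
         that don't run JavaScript -->
    <script type="text/javascript">
      document.write('<a href="products.html">Products</a>');
    </script>

    <!-- A plain HTML link that any spider can follow -->
    <a href="products.html">Products</a>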
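
The robots.txt mistake can be a single character. In the sketch below the /private/ directory is hypothetical; the intended rule would hide just that folder, but the actual rule's bare slash tells every spider to skip the entire site:

    # Intended rule -- exclude only the /private/ directory:
    #   Disallow: /private/
    # Actual rule -- the bare slash excludes the whole site:
    User-agent: *
    Disallow: /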
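
With the robots meta tag, one word in the page's <head> decides whether the page stays in the index. A sketch of both forms:

    <head>
      <!-- Asks spiders to index this page and follow its links -->
      <meta name="robots" content="index, follow">

      <!-- One word changed, and the page drops out of the index:
           <meta name="robots" content="noindex, follow"> -->
    </head>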
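
And these are the redirect styles spiders tend to ignore; example.com stands in for a real destination. A server-side 301 redirect is generally the spider-friendly alternative:

    <!-- Meta refresh: most spiders won't follow it -->
    <meta http-equiv="refresh" content="0; url=http://www.example.com/new-page.html">

    <!-- JavaScript redirect: invisible to spiders that don't run scripts -->
    <script type="text/javascript">
      window.location.href = 'http://www.example.com/new-page.html';
    </script>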
