Wednesday, June 15, 2005

Problems With Google Sitemaps

DMNews.com, by Stephan Spencer of Netconcepts.

Spencer outlines what he sees as two major problems with Google Sitemaps, for one of which he has a solution for sale....

"First, it doesn’t solve the duplicate pages problem that a great many dynamic sites have. Even the Google Store suffers from this....

Duplicate pages, on their own, may not sound like a problem for Webmasters so much as for Google itself, which has to dedicate additional resources to maintaining all this redundant content in its index. However, they do have serious implications for Webmasters, because they result in PageRank dilution -- where multiple versions of a page split up the “votes” (links) and PageRank score that a single version of the page would aggregate.

This brings me to the second, related problem with Google Sitemaps: It doesn’t do anything to alleviate the phenomenon of PageRank dilution. PageRank dilution results in lower PageRank, which in turn results in lower rankings. For example, consider that the above-mentioned Google Store’s product page (the “Black is Back T-Shirt”) is in Google’s index five times instead of just once. So each of those five variations earns only a fraction of the total potential PageRank score that it could have earned if all the links pointed to a single “Black is Back T-Shirt” page"....
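
To make the dilution arithmetic concrete, here's a quick back-of-the-envelope sketch (my own illustration, not Spencer's) using the simplified PageRank formula PR = (1 - d) + d × (sum of inbound link contributions), with a made-up per-link value: 50 links concentrated on one URL versus the same 50 split across five duplicates.

```python
# Back-of-the-envelope PageRank dilution, using the simplified formula
# PR = (1 - d) + d * (sum of contributions from inbound links).
# The per-link value and link count are made up for illustration.

DAMPING = 0.85        # the commonly cited PageRank damping factor
LINK_VALUE = 0.02     # hypothetical PageRank passed by each inbound link

def pagerank(num_links, value_per_link=LINK_VALUE, d=DAMPING):
    """One-step simplified PageRank for a page with num_links equal inlinks."""
    return (1 - d) + d * num_links * value_per_link

links = 50

# Case 1: all 50 links point at a single canonical product page.
single = pagerank(links)

# Case 2: the same 50 links split evenly across five duplicate URLs
# (e.g., session-ID variants of the same "Black is Back T-Shirt" page).
each_duplicate = pagerank(links // 5)

print(f"one canonical URL:    PR = {single:.2f}")          # PR = 1.00
print(f"each of 5 duplicates: PR = {each_duplicate:.2f}")  # PR = 0.32
```

Each duplicate ends up with roughly a third of the score the consolidated page would earn, which is the "fraction of the total potential PageRank" Spencer is talking about.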

He suggests both of the above issues could be rectified by extending robots.txt with some additional directives "that specify:

· Which parameter in a dynamic URL is the “key field.”

· Which parameter is the product ID and which is the category ID (specifically for online catalogs).

· Which parameters are superfluous or don't significantly vary the content displayed.

Armed with this information, Googlebot will be able to not only eliminate duplicate pages but also intelligently choose the most appropriate version to save in its index and then associate with that page the PageRank of ALL versions of the page. The days of session IDs killing a site’s Google visibility would be over! Google admits in its Sitemaps FAQ that session IDs are still a problem even with the advent of Google Sitemaps:

Question: URLs on my site have session IDs in them. Do I need to remove them?

Answer: Yes. Including session IDs in URLs may result in incomplete and redundant crawling of your site."
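
The directives Spencer proposes don't exist in robots.txt today, so purely as a thought experiment, here's a sketch of the kind of URL canonicalization a crawler could perform if it were told which parameters are superfluous. The directive names in the comments and the parameter names and URLs (sessionid, sort, ref, prod, store.example.com) are all hypothetical.

```python
# Hypothetical sketch of how a crawler might use Spencer's proposed
# robots.txt hints to canonicalize URLs. The directive names and the
# parameter classifications below are invented for illustration; no such
# extension exists in the actual robots.txt standard.
from urllib.parse import urlparse, parse_qsl, urlencode, urlunparse

# Stand-in for parsed robots.txt extensions (hypothetical):
#   Key-Parameter: prod
#   Superfluous-Parameters: sessionid, sort, ref
SUPERFLUOUS = {"sessionid", "sort", "ref"}

def canonicalize(url: str) -> str:
    """Drop superfluous query parameters so duplicate URLs collapse to one."""
    parts = urlparse(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query)
            if k.lower() not in SUPERFLUOUS]
    kept.sort()  # stable parameter order, so equivalent URLs compare equal
    return urlunparse(parts._replace(query=urlencode(kept)))

urls = [
    "http://store.example.com/item?prod=42&sessionid=abc123",
    "http://store.example.com/item?sessionid=xyz789&prod=42",
    "http://store.example.com/item?prod=42&sort=price",
]
# All three variants collapse to one canonical URL.
print({canonicalize(u) for u in urls})
```

In Spencer's scheme, Googlebot would then index only that one canonical URL and credit it with the PageRank of all the variants, which is exactly the consolidation the quoted passage above describes.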


