Tuesday, November 09, 2004

Affiliate sites in Google: thread & A study of host pairs with replicated content

Anyone besides me not swallowed the "Hilltop" magic pill yet?: Posted by caveman, Nov 4, 2004 (utc 0): "WRT Hilltop, there are two different areas of assessment that we have paid a lot of attention to:

1) affiliation, and its consequences

2) themed links, and their consequences

Tested: "One thing we did was to identify a pair of very similar sites in different categories. The sites were deemed similar by virtue of size, construction, PR, linking patterns, and performance in the SERPs. Call them site A and site B.

For site A we went and got 20 good backlinks (PR 6-7) from non-affiliated sites, in categories unrelated to site A's category. No help; the site stayed buried.

For site B we went and got 8 good backlinks (PR 5-7) from closely related sites (two hubs, six authority). Within four weeks site B had popped back to its former glory while most webmasters in the immediate post-Florida environment were still bemoaning the disappearance of their sites..."

caveman concludes: "Post-Florida, the URLs were typically associated with authority sites. Before Florida, when we saw that, the URLs more typically reflected high PR pages. The assumption here is that a really important backlink is displayed, but that seems a good assumption to me.

On a related note, though I can't technically call this Hilltop, we have virtual certainty that links from unaffiliated, relevant pages that are tightly connected to our own topics perform better than identical links from unrelated pages, for certain kw searches."

caveman later posts: "The way I read it, the Hilltop/LocalRank "affiliate" filter is quite subtle... would need a pretty heavily cross/interlinked domain farm targeting a single category with relatively few "outside" links for a dramatic drop in the SERPs."

ciml: "Monika Henzinger co-wrote an interesting paper on affiliation detection": A study of host pairs with replicated content

From the paper: we define two hosts to be mirrors if:

1) A high percentage of paths (that is, the portions of the URL after the hostname) are valid on both web sites, and

2) these common paths link to documents that have similar content.

Therefore, hosts that replicate content but rename paths are not considered mirrors under our definition.

The paper proceeds as follows: in Section 2 we establish a classification of mirroring; Section 3 describes our approach to detecting and classifying mirrored hosts; Section 4 presents data from our experiment; Section 5 discusses motives for mirroring; Section 6 presents other applications of this technique; Section 7 mentions related work and in Section 8 we draw some conclusions.
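To make the two-part mirror definition quoted from the paper concrete, here is a minimal Python sketch. It is not the paper's method. The function name `are_mirrors`, the three thresholds, and the use of `difflib.SequenceMatcher` as the "similar content" measure are all assumptions made for illustration; the excerpt does not say how "a high percentage" or "similar" are measured. Each host is represented as a plain dict mapping a path to that page's text.

```python
from difflib import SequenceMatcher


def are_mirrors(host_a, host_b,
                min_shared_paths=0.8,   # hypothetical cutoff for "a high percentage of paths"
                min_similarity=0.9,     # hypothetical cutoff for "similar content"
                min_similar_docs=0.8):  # hypothetical share of common paths that must match
    """Return True if two hosts look like mirrors under the quoted definition.

    host_a, host_b: dicts mapping a URL path (the part after the hostname)
    to the text of the document served at that path.
    """
    paths_a, paths_b = set(host_a), set(host_b)
    common = paths_a & paths_b
    if not common:
        # No path is valid on both hosts, so condition 1 cannot hold.
        return False

    # Condition 1: a high percentage of paths are valid on both hosts.
    # Measured against the smaller host (an assumption of this sketch).
    shared_fraction = len(common) / min(len(paths_a), len(paths_b))
    if shared_fraction < min_shared_paths:
        return False

    # Condition 2: the common paths lead to documents with similar content.
    similar = sum(
        1 for path in common
        if SequenceMatcher(None, host_a[path], host_b[path]).ratio() >= min_similarity
    )
    return similar / len(common) >= min_similar_docs
```

Because the comparison is keyed on paths, a host that copies every page but renames the paths shares no paths with the original and is rejected at the first step, which matches the last sentence of the quoted definition.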
