
November 19, 2008

Comments

Jeremy Hylton

There was a brief thread about the blogroll problem on the Google Group for blogsearch: http://groups.google.com/group/google-blog-search/browse_thread/thread/8244fc8731f47970

What I wrote there was:
"We have changed the way we index blog posts to include the full
content of the page. We've had occasional complaints about the use of
the feed content, particularly the problem with partial feeds that you
mentioned. The indexing change has improved the results for a lot of
queries, both because we have the full content of the page and because
we extract links that are missing from the feeds. The downside of
this change is that we see more results that match only the blogroll
and other parts of the page that are common to all of a blog's posts.

We expected some problems from blogroll matches, but may have
underestimated the impact on searches using the link: operator or
where the query matches a blog or blogger's name. We do expect to fix
the problem you're seeing. We'll use the full page content, but
exclude the content that isn't really part of the post. I'm not sure
if we'll be able to make the change before the end of the year, but we
are working on it and are pretty confident that it can be solved.
We'll post an update here when we've got a solution."
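
To make the idea of excluding "content that isn't really part of the post" concrete, here is a minimal sketch in Python. It is not how Google Blog Search does it; it only assumes that each post page has already been split into text blocks, and it treats any block that appears on every page of a blog (a blogroll, a footer) as shared rather than post content. The function name `strip_shared_blocks`, the `threshold` parameter, and the sample data are all hypothetical.

```python
from collections import Counter


def strip_shared_blocks(pages, threshold=1.0):
    """Drop text blocks that recur across a blog's post pages.

    pages: list of pages, each a list of text blocks (strings).
    threshold: fraction of pages a block must appear on to count as
        shared; 1.0 means it must appear on every page.
    This is an illustrative heuristic, not a real indexing pipeline.
    """
    page_count = len(pages)
    if page_count < 2:
        # With a single page there is nothing to compare against.
        return [list(blocks) for blocks in pages]

    # Count each distinct block once per page.
    seen_on = Counter()
    for blocks in pages:
        seen_on.update(set(blocks))

    shared = {b for b, c in seen_on.items() if c / page_count >= threshold}
    return [[b for b in blocks if b not in shared] for blocks in pages]


# Hypothetical sample: two posts from one blog with the same sidebar.
pages = [
    ["Post about partial feeds", "Blogroll: Alice, Bob", "Powered by Typepad"],
    ["Post about link: queries", "Blogroll: Alice, Bob", "Powered by Typepad"],
]
cleaned = strip_shared_blocks(pages)
```

With this sample, only the post-specific block survives on each page, so a query for a name that occurs only in the blogroll would no longer match either post. A real system would need a far more robust way of segmenting pages and of tolerating small differences between copies of the shared blocks.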

