Jorn had a Forbes article on the IBM research this morning. Here’s the IBM site. The demo application is only for people inside the IBM firewall, but they have some sample search results compared with AltaVista searches for the same strings.
Clever builds on Jon Kleinberg’s HITS (Hypertext-Induced Topic Search) algorithm, which seeks to find authoritative sources (Authorities) of information on the web, together with sites (Hubs) featuring good compilations of such authoritative sources. The original HITS algorithm, devised while Kleinberg was a visiting scientist at IBM Almaden, first uses a standard text search engine to gather a “root set” of pages matching the query subject. Next, it adds to the pool all pages pointing to or pointed to by the root set. Thereafter, it uses only the links between these pages to distill the best authorities and hubs. The key insight is that these links capture the annotative power (and effort) of millions of individuals independently building web pages.
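The link-analysis step described above can be sketched in a few lines of Python. This is only a minimal illustration of the mutually reinforcing hub and authority scores, not IBM's Clever code: the function names `hits` and `normalize`, the fixed iteration count, and the tiny link graph are all made up for the example, and the text-search step that gathers the root set is assumed to have already happened.

```python
import math


def normalize(scores):
    """Scale scores so their squares sum to 1 (keeps the values from blowing up)."""
    norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
    return {page: v / norm for page, v in scores.items()}


def hits(links, iterations=50):
    """Return (hub, auth) score dicts for a pool of pages.

    links maps each page to the pages it points to. The pool is assumed to be
    the root set plus its neighbours, already collected.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)

    hub = {page: 1.0 for page in pages}
    auth = {page: 1.0 for page in pages}

    for _ in range(iterations):
        # A page's authority is the sum of the hub scores of pages linking to it.
        new_auth = {page: 0.0 for page in pages}
        for src, targets in links.items():
            for dst in targets:
                new_auth[dst] += hub[src]

        # A page's hub score is the sum of the authority scores of pages it links to.
        new_hub = {
            page: sum(new_auth[dst] for dst in links.get(page, ()))
            for page in pages
        }

        auth = normalize(new_auth)
        hub = normalize(new_hub)

    return hub, auth


# Hypothetical pool for a "doctors" query: three link lists and one keyword troller.
links = {
    "hub1": ["ama", "nih"],
    "hub2": ["ama", "nih"],
    "hub3": ["ama"],
    "spam": ["spam-domain"],
}

hub, auth = hits(links)
print("best authority:", max(auth, key=auth.get))
```

In this toy pool the page that the most good link lists point to ends up as the top authority, while the keyword-stuffed page that nobody else links to sinks, which is the effect described in the doctors example.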
This method gets around the problem of keyword trolling. For example, a search for “doctors” on AltaVista yielded people trying to sell medical-related domain names, while the Clever search brought up the AMA and other sites.
Stanford’s Google project does similar rankings. I tried the doctor search there and found similar results.
The problem with this backlinking approach comes for those of us who keep our sites’ links in a backend database, where an indexer may never see them. We need to publish a list of links and keywords in a form that indexers can read. Dave Winer and Infoseek have proposed standards for this.