The nodes of a web graph are distributed over a cluster of computers. Tables, also distributed over the computers, map source (destination) locations to lists of destination (source) locations. To support forward traversal of hyperlinks, a table maps the location of a web page X to the locations of all web pages that X links to. To support backward traversal, a table maps the location of a web page Y to the locations of all web pages that link to Y. URLs identifying web pages are mapped to fixed-size checksums, which reduces the storage required for each node while still providing a way to map a URL to a node. The mapping is chosen to preserve information about the web server component of the URL. Nodes can then be partitioned across the machines in the cluster such that nodes corresponding to URLs on the same web server are assigned to the same machine.
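
The scheme can be illustrated with a minimal Python sketch. It is not the patented implementation: the 64-bit checksum layout (host hash in the high bits, hash of the rest of the URL in the low bits), the use of truncated SHA-256, the modulo partitioning rule, and all names (`url_checksum`, `machine_for`, `WebGraph`) are assumptions made for illustration. The sketch also uses the checksum itself as the node identifier, whereas the abstract speaks of node locations.

```python
import hashlib
from collections import defaultdict
from urllib.parse import urlsplit

HOST_BITS = 32     # assumed width of the web-server part of the checksum
PATH_BITS = 32     # assumed width of the remainder of the checksum
NUM_MACHINES = 4   # assumed cluster size


def _hash_bits(text: str, bits: int) -> int:
    """Return the top `bits` bits of a SHA-256 digest of `text` (bits <= 64)."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") >> (64 - bits)


def url_checksum(url: str) -> int:
    """Map a URL to a fixed-size checksum whose high bits depend only on the host."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    rest = parts.path + ("?" + parts.query if parts.query else "")
    return (_hash_bits(host, HOST_BITS) << PATH_BITS) | _hash_bits(rest, PATH_BITS)


def machine_for(checksum: int, num_machines: int = NUM_MACHINES) -> int:
    """Pick a machine from the host bits, so one web server maps to one machine."""
    return (checksum >> PATH_BITS) % num_machines


class WebGraph:
    """Forward and backward link tables, one pair per simulated machine."""

    def __init__(self, num_machines: int = NUM_MACHINES):
        self.num_machines = num_machines
        self.forward = [defaultdict(list) for _ in range(num_machines)]
        self.backward = [defaultdict(list) for _ in range(num_machines)]

    def add_link(self, src_url: str, dst_url: str) -> None:
        src, dst = url_checksum(src_url), url_checksum(dst_url)
        # The forward entry lives on the source's machine,
        # the backward entry on the destination's machine.
        self.forward[machine_for(src, self.num_machines)][src].append(dst)
        self.backward[machine_for(dst, self.num_machines)][dst].append(src)

    def links_from(self, url: str) -> list:
        """Forward traversal: checksums of the pages that `url` links to."""
        node = url_checksum(url)
        return self.forward[machine_for(node, self.num_machines)].get(node, [])

    def links_to(self, url: str) -> list:
        """Backward traversal: checksums of the pages that link to `url`."""
        node = url_checksum(url)
        return self.backward[machine_for(node, self.num_machines)].get(node, [])


graph = WebGraph()
graph.add_link("http://a.example/x", "http://a.example/y")
graph.add_link("http://b.example/z", "http://a.example/y")
print(len(graph.links_from("http://a.example/x")))
print(len(graph.links_to("http://a.example/y")))
```

Because the machine is chosen from the host bits alone, every page on a given web server lands on the same machine in this sketch, so links that stay within one site can be followed without leaving that machine. Distinct URLs can still collide on the same checksum; a real system would have to decide how to handle that.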

 