A machine learning classifier is used to determine whether a web page
belongs to a blog, based on characteristics of the page (e.g., the
presence of words such as "permalink", or hosting on a known blogging
site). The classifier may be initially trained using
human-judged examples. Once web pages have been classified as blog pages,
they may be further identified or categorized as top-level blogs, for
example based on their URLs.
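
The flow described above can be sketched in a few lines of Python. This is
only an illustration under stated assumptions, not the classifier the text
describes: the host list, the two features, the perceptron learner, and the
"empty path" rule for top-level blogs are all hypothetical choices made for
the example, as are the function names.

```python
from urllib.parse import urlparse

# Hypothetical list of blogging hosts; the source does not name any.
KNOWN_BLOG_HOSTS = ("blogspot.com", "wordpress.com", "typepad.com")


def extract_features(url, html):
    """Map a page to a small feature vector (illustrative features only)."""
    host = urlparse(url).hostname or ""
    text = html.lower()
    on_blog_host = any(host == h or host.endswith("." + h) for h in KNOWN_BLOG_HOSTS)
    return [
        1.0,                                   # bias term
        1.0 if "permalink" in text else 0.0,   # page mentions "permalink"
        1.0 if on_blog_host else 0.0,          # hosted on a known blogging site
    ]


def score(weights, features):
    """Dot product of the weight vector and the feature vector."""
    return sum(w * x for w, x in zip(weights, features))


def train_perceptron(examples, epochs=10):
    """Train on (url, html, is_blog) triples labeled by a human judge.

    A perceptron is an assumed stand-in; the source does not say which
    learning algorithm is used.
    """
    weights = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        for url, html, is_blog in examples:
            features = extract_features(url, html)
            predicted = score(weights, features) > 0
            if predicted != is_blog:
                # Move the weights toward (or away from) the misclassified example.
                sign = 1.0 if is_blog else -1.0
                weights = [w + sign * x for w, x in zip(weights, features)]
    return weights


def is_blog_page(weights, url, html):
    """Classify a page as blog or non-blog using the trained weights."""
    return score(weights, extract_features(url, html)) > 0


def is_top_level_blog(url):
    """Assumed rule: a blog URL with an empty path is the blog's top level."""
    return urlparse(url).path in ("", "/")
```

In this sketch, is_top_level_blog would be applied only to pages that
is_blog_page has already accepted, which mirrors the two-stage order in the
text. A production system would likely use many more features and a
different learner.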