I would:
* list the most visited websites that used to be updated once a day, but haven't been updated for many months. I would like to see archive sites.
* list the top expressions used in the anchors to link to a certain site.
* create an interface that allows you to start with a site and then go to the most visited external link (in Google). The journey continues until a threshold is reached.
* define the distance between two sites: how many links do you have to click to get from a page of site A to a page of site B?
* start with a query and see how people modify that query to obtain better results.
* list the sites that deliver the most clicked news stories in Google News.
* list the most used words in a language.
* describe a site using vocabulary richness: how many different words do BBC News or MySpace use?
* list the results for a query not with respect to page relevancy, but with respect to site relevancy. Who is more entitled to talk about the Sony Ericsson K800i: BBC News or MobileReviews.com?
* find out what people who use OS/2, BeOS or Amiga are searching for.
* discover unknown parts of the web: what sites have no backlinks?
* create a verbose interface for Google that explains why a page was included in the results.
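
Two of the ideas above, the distance between two sites and vocabulary richness, are easy to try on a small scale. The following is only a rough sketch: the link graph `toy_links`, the `.example` site names, and the function names `site_distance` and `vocabulary_richness` are all made up for illustration, and the graph is kept at site level rather than page level to stay short. The distance is computed with a plain breadth-first search, and richness is simply the number of distinct words next to the total number of words.

```python
from collections import deque
import re


def site_distance(links, start, target):
    """Return the fewest link clicks needed to get from start to target.

    links maps a site to the sites it links to (a hypothetical, hand-made
    graph here). Returns None when target cannot be reached.
    """
    if start == target:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        site, clicks = queue.popleft()
        for neighbour in links.get(site, ()):
            if neighbour == target:
                return clicks + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, clicks + 1))
    return None


def vocabulary_richness(text):
    """Return (distinct words, total words) for a piece of text."""
    # Crude tokenizer: lowercase runs of ASCII letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    return len(set(words)), len(words)


# Made-up link graph, for illustration only.
toy_links = {
    "a.example": ["b.example", "c.example"],
    "b.example": ["d.example"],
    "c.example": ["d.example"],
    "d.example": ["e.example"],
}

if __name__ == "__main__":
    print(site_distance(toy_links, "a.example", "e.example"))
    print(vocabulary_richness("the cat and the hat"))
```

With real data the graph would be far too large for an in-memory dictionary, so this only shows the shape of the computation, not how a search engine would actually do it.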