Thursday, March 13, 2014

WEEK 10 READING NOTES

"The citation (link) graph of the web is an important resource that has largely gone unused in existing web search engines. We have created maps containing as many as 518 million of these hyperlinks, a significant sample of the total. These maps allow rapid calculation of a web page’s "PageRank", an objective measure of its citation importance that corresponds well with people’s subjective idea of importance. Because of this correspondence, PageRank is an excellent way to prioritize the results of web keyword searches. For most popular subjects, a simple text matching search that is restricted to web page titles performs admirably when PageRank prioritizes the results (demo available at google.stanford.edu). For the type of full text searches in the main Google system, PageRank also helps a great deal."
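To make the idea of computing importance from the link graph concrete, here is a minimal sketch of PageRank by repeated iteration. It is my own illustration, not code from the paper: the function name `pagerank`, the four-page toy graph, the damping factor of 0.85, and the choice of the normalized variant (ranks sum to 1) are all assumptions made for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank for a small link graph.

    links maps each page to the list of pages it links to.
    Every linked page must also appear as a key in links.
    The damping value and iteration count are illustrative assumptions.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start from a uniform distribution

    for _ in range(iterations):
        # Every page gets a small baseline share regardless of its inlinks.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            if outlinks:
                # A page splits its damped rank evenly among the pages it cites.
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # A page with no outlinks spreads its rank over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical toy web: C is cited by A, B and D, so it should come out on top.
links = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}
ranks = pagerank(links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy graph the page with the most incoming citations ends up with the highest score, and the page nobody links to ends up with the lowest, which matches the intuition of "citation importance" in the quote.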

In the past, people tended to assume that a page mentioning a keyword must be relevant to the user, so relevance was judged largely by word frequency. In the early days of search engines, the working mechanism was therefore far from "artificial intelligence". Search engines that preceded Google, such as AltaVista and Excite, ranked results by a priority score that could be influenced in many ways. If a page had many visits or a high keyword frequency, it could rank near the top even when it was hardly relevant to what the user needed. Obviously, this kind of ranking mechanism was easy to cheat.
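A rough sketch of why frequency-based ranking is easy to cheat: the snippet below ranks two made-up pages purely by how often a keyword appears. The function names and the sample pages are hypothetical, and real engines of that era surely used more signals than this; the point is only to show the weakness.

```python
from collections import Counter


def keyword_frequency_score(page_text, keyword):
    """Count how many times the keyword appears in the page (case-insensitive)."""
    words = page_text.lower().split()
    return Counter(words)[keyword.lower()]


def rank_by_frequency(pages, keyword):
    """Return page names ordered from highest to lowest keyword count."""
    return sorted(
        pages,
        key=lambda name: keyword_frequency_score(pages[name], keyword),
        reverse=True,
    )


# Two invented pages: one genuine, one stuffed with the keyword.
pages = {
    "honest_review": "this camera review compares lens quality and battery life",
    "stuffed_page": "camera camera camera camera buy cheap pills here camera",
}

ranking = rank_by_frequency(pages, "camera")
print(ranking)  # ['stuffed_page', 'honest_review']
```

The stuffed page wins simply because it repeats the word, even though it says nothing useful about cameras.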
