Wednesday, February 5, 2014

WEEK 5 READING NOTES

"The model decomposes into two parts: a document collection network and a query network. The document collection network is large, but can be precomputed: it maps from documents to terms to concepts. The concepts are a thesaurus-based expansion of the terms appearing in the document. The query network is relatively small but a new network needs to be built each time a query comes in, and then attached to the document network. The query network maps from query terms, to query subexpressions (built using probabilistic or ``noisy'' versions of AND and OR operators), to the user's information need. "

This kind of network is very useful in the IR field. It helps the system gather related information starting from a single word, because the thesaurus-based concepts link each term to related ones. Once the document network has been constructed, the whole collection can be retrieved through it. However, building a new query network for every incoming query and attaching it to the document network may cause problems for the system in terms of storage and speed. The network combines the Boolean model with probabilistic information retrieval, but using them this way is not yet practical enough. The evolution of retrieval models never stops. Thus, the construction of such networks could be based on newly developed models, not only on earlier ones. Developers also need to be aware of the potential problems.
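To make the two parts in the quoted passage more concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions, not the model from the reading: the toy documents, the thesaurus, the use of relative term frequency as a belief value, and the names `build_document_network`, `evaluate`, and `rank` are all hypothetical. AND is taken as a plain product and OR as 1 minus the product of the complements, which is one simple probabilistic reading of those operators.

```python
from math import prod

# Hypothetical toy data, made up for illustration only.
DOCS = {
    "d1": "cats chase mice",
    "d2": "dogs chase cats and cats",
    "d3": "stock markets fell",
}
THESAURUS = {"cats": "feline", "dogs": "canine", "mice": "rodent"}


def p_or(values):
    """Probabilistic OR: 1 minus the product of the complements."""
    return 1.0 - prod(1.0 - v for v in values)


def build_document_network(docs, thesaurus):
    """Precompute document -> term -> concept beliefs (done once, offline)."""
    network = {}
    for doc_id, text in docs.items():
        words = text.split()
        beliefs = {}
        for word in set(words):
            # Assumption: relative term frequency stands in for the belief.
            belief = words.count(word) / len(words)
            beliefs[word] = belief
            concept = thesaurus.get(word)
            if concept is not None:
                # Several terms may support the same concept; combine with OR.
                beliefs[concept] = p_or([beliefs.get(concept, 0.0), belief])
        network[doc_id] = beliefs
    return network


def evaluate(node, beliefs):
    """Evaluate a small query network given as nested tuples.

    A leaf is a term or concept name; an inner node is ("and", ...) or ("or", ...).
    """
    if isinstance(node, str):
        return beliefs.get(node, 0.0)
    op, *children = node
    values = [evaluate(child, beliefs) for child in children]
    if op == "and":
        return prod(values)
    if op == "or":
        return p_or(values)
    raise ValueError(f"unknown operator: {op!r}")


def rank(query, network):
    """Attach the query network to every document and sort by belief."""
    scored = [(evaluate(query, beliefs), doc_id) for doc_id, beliefs in network.items()]
    return sorted(scored, reverse=True)


if __name__ == "__main__":
    network = build_document_network(DOCS, THESAURUS)  # large, precomputed part
    query = ("and", "chase", ("or", "feline", "canine"))  # small, built per query
    for score, doc_id in rank(query, network):
        print(doc_id, round(score, 3))
```

The sketch also shows the cost concern from the notes: `build_document_network` runs once, but `evaluate` has to be run against every document each time a new query arrives.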
