Wednesday, January 29, 2014

WEEK 4 READING NOTES

"Many users, particularly professionals, prefer Boolean query models. Boolean queries are precise: a document either matches the query or it does not. This offers the user greater control and transparency over what is retrieved. And some domains, such as legal materials, allow an effective means of document ranking within a Boolean model: Westlaw returns documents in reverse chronological order, which is in practice quite effective. In 2007, the majority of law librarians still seem to recommend terms and connectors for high recall searches, and the majority of legal users think they are getting greater control by using them. However, this does not mean that Boolean queries are more effective for professional searchers. Indeed, experimenting on a Westlaw subcollection, Turtle (1994) found that free text queries produced better results than Boolean queries prepared by Westlaw's own reference librarians for the majority of the information needs in his experiments. A general problem with Boolean search is that using AND operators tends to produce high precision but low recall searches, while using OR operators gives low precision but high recall searches, and it is difficult or impossible to find a satisfactory middle ground."

Boolean query models let users express their queries simply and efficiently, but the model still has significant limitations. So far, high precision and high recall cannot be achieved at the same time. In practice, a Boolean model or a vector space model tends to deliver only one of the two. Users sometimes want high recall and sometimes want high precision, so they need to know how to choose the right model for the task. Professional searchers know the background of these models and can choose appropriately. Searchers who are unfamiliar with the models find the choice hard, and may know how to use only one of them. Training users in retrieval models is therefore very necessary.
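To make the AND/OR trade-off from the quoted passage concrete for myself, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the seven toy documents, the relevance judgments in RELEVANT, and the helper names (build_index, boolean_and, boolean_or, precision_recall) are invented here and do not come from the reading or from Westlaw.

```python
from collections import defaultdict

# Toy collection: hypothetical documents invented for this illustration.
DOCS = {
    1: "court ruling on patent infringement",
    2: "patent application filing guide",
    3: "copyright infringement in music",
    4: "patent infringement damages awarded",
    5: "lawsuit over stolen patent design",
    6: "infringement claim on licensed invention",
    7: "weather report for wednesday",
}

# Hypothetical relevance judgments for the information need
# "disputes about patent infringement" (assumed, not measured).
RELEVANT = {1, 4, 5, 6}


def build_index(docs):
    """Build a simple inverted index: term -> set of document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def boolean_and(index, terms):
    """Documents that contain every query term."""
    postings = [index.get(t, set()) for t in terms]
    return set.intersection(*postings) if postings else set()


def boolean_or(index, terms):
    """Documents that contain at least one query term."""
    return set().union(*(index.get(t, set()) for t in terms))


def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a retrieved set of document ids."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


if __name__ == "__main__":
    index = build_index(DOCS)
    query = ["patent", "infringement"]
    for name, op in (("AND", boolean_and), ("OR", boolean_or)):
        result = op(index, query)
        p, r = precision_recall(result, RELEVANT)
        print(f"{name}: docs={sorted(result)} precision={p:.2f} recall={r:.2f}")
```

On this toy data the AND query returns only documents 1 and 4, so everything retrieved is relevant but half of the relevant documents are missed. The OR query finds all four relevant documents but also pulls in documents 2 and 3. Neither operator gives the middle ground, which is the problem the passage describes.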
