Improving efficiency and flexibility of information retrieval systems

Author(s): Wu, Hao
Date Accessioned: 2016-06-02T13:27:39Z
Date Available: 2016-06-02T13:27:39Z
Publication Date: 2016
Abstract: The development of information retrieval (IR), and of the search engine in particular, is one of the revolutionary technical advances of the past century. It has changed the way people communicate and share knowledge, and it frees people from the hassle of seeking and remembering information, saving time and energy that can be used more effectively elsewhere. As the amount of information grows exponentially, so do the complexities and the costs. Search-engine efficiency, which affects both user experience and the provider's revenue, is becoming increasingly important. Existing techniques such as dynamic pruning can help improve the efficiency of IR systems. However, they do not solve the whole puzzle, and in many situations (e.g., long queries or a large number of returned documents) the efficiency gains are far from sufficient. To improve the efficiency of query processing, we create an analytical model that explains the query processing time of IR systems. This model uses few features and is more accurate than previous models. Inspired by the model, we address several IR efficiency problems. First, to reduce query processing time when k (i.e., the number of returned documents) is large, we propose a document prioritizing method that improves efficiency more than state-of-the-art methods without hurting effectiveness. Second, we study a special case of long query processing, pseudo-relevance feedback, and improve its efficiency with a new incremental approach. Third, we further explore the trade-off between efficiency and effectiveness and propose several methods that improve efficiency at the cost of an acceptable loss in effectiveness in a time-constrained environment. In addition to query processing efficiency, we also explore another kind of efficiency: how easily people can implement different search engines. Current toolkits can help users implement various retrieval functions with their API-based frameworks. However, their APIs are usually complicated, and it is still difficult for inexperienced users to implement retrieval functions. To improve this situation, we introduce an information retrieval toolkit called Virtual IR Lab. Compared with existing IR toolkits, it uses a simpler but more efficient architecture. By applying automatic code-generation techniques, the toolkit helps users implement various retrieval functions conveniently. Its friendly and flexible design makes it a good fit for both education and research.
Advisor: Fang, Hui
Degree: Ph.D.
Department: University of Delaware, Department of Electrical and Computer Engineering
Unique Identifier: 951021840
URL: http://udspace.udel.edu/handle/19716/17766
Publisher: University of Delaware
URI: http://search.proquest.com/docview/1776481144?accountid=10457
dc.subject.lcsh: Information storage and retrieval systems.
dc.subject.lcsh: Information retrieval.
Title: Improving efficiency and flexibility of information retrieval systems
Type: Thesis
Files
Original bundle
Name:
2016_WuHao_PhD.pdf
Size:
3.03 MB
Format:
Adobe Portable Document Format