Personalization and diversification of search results
University of Delaware
There has been a lot of research in information retrieval on different aspects of search, such as personalization, diversity, and evaluation measures. In this thesis, we hypothesize that personalization and diversification can coexist. We propose two novel approaches: one for personalization, which extends a state-of-the-art personalization method by incorporating feedback from the query logs of similar users, and one for subtopic retrieval, which uses N-grams as document representatives for diversity. There is a general consensus among researchers that personalization and diversity are opposed to each other: personalization favors information based on user interests, while diversity supports the maximum information gain for a given query by selecting documents that cover all perspectives of the query. Our model aims to provide users with maximally diverse information while taking user interests into account. For example, given the query "RSS", which has numerous meanings such as Rich Site Summary, Rashtriya Swayamsevak Sangh, and Remote Sensing Service, the proposed system should not only accommodate the different aspects of the query in its results but also consider user interests. For users interested in politics, documents on the Rashtriya Swayamsevak Sangh aspect should be ranked higher than documents on the technical aspect, i.e., Rich Site Summary.
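The "RSS" example can be illustrated with a minimal sketch of a greedy re-ranker that trades off aspect coverage against user interest. This is not the method proposed in the thesis, which relies on query-log feedback and N-gram document representatives. The scoring rule, the `lam` trade-off weight, the document identifiers, and the interest weights below are all hypothetical, chosen only to show how the two goals can be combined in one ranking.

```python
def rerank(docs, interests, lam=0.5):
    """Greedy re-ranking sketch (illustrative assumption, not the thesis's method).

    docs      -- list of (doc_id, aspect) pairs
    interests -- hypothetical user profile mapping aspect -> weight in [0, 1]
    lam       -- hypothetical trade-off: 1.0 = interest only, 0.0 = diversity only
    """
    remaining = list(docs)
    covered = set()   # aspects already represented in the ranking
    ranked = []
    while remaining:
        def score(doc):
            _, aspect = doc
            # Reward a document whose aspect has not been shown yet.
            novelty = 0.0 if aspect in covered else 1.0
            return lam * interests.get(aspect, 0.0) + (1 - lam) * novelty
        best = max(remaining, key=score)  # ties go to the earliest document
        remaining.remove(best)
        covered.add(best[1])
        ranked.append(best)
    return ranked


# Made-up candidate documents for the query "RSS".
docs = [
    ("d1", "Rich Site Summary"),
    ("d2", "Rich Site Summary"),
    ("d3", "Rashtriya Swayamsevak Sangh"),
    ("d4", "Remote Sensing Service"),
    ("d5", "Rashtriya Swayamsevak Sangh"),
]

# Made-up profile of a user interested in politics.
politics_user = {
    "Rashtriya Swayamsevak Sangh": 0.8,
    "Rich Site Summary": 0.1,
    "Remote Sensing Service": 0.1,
}

ranking = rerank(docs, politics_user)
for doc_id, aspect in ranking:
    print(doc_id, aspect)
```

Under these assumed weights, a Rashtriya Swayamsevak Sangh document is ranked first, and the top three results still cover all three meanings of the query before any aspect is repeated.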