User simulations in search sessions

Date
2019
Publisher
University of Delaware
Abstract
When users interact with a search engine to satisfy an information need, in most cases they reformulate their queries many times before abandoning the search. The reformulations may stem from an evolving information need, low-quality search results, a broad information need that requires many queries, or other reasons that remain unclear. During the search process, users inspect the snippets listed on search result pages, click on the documents they find relevant and attractive, spend time on the clicked documents, and eventually abandon the search.

User models have been a vital part of information retrieval research, both for understanding users' behavior during the search process and for assisting them in retrieving information useful for their information need. In this thesis, we present several efforts to build user models for the information search scenario. We first introduce models that predict a single click on a search engine result page; these models make use of the other snippets listed alongside the snippet for which we predict the click, the user's search history, and other users' search histories. We then propose models that estimate user dwell time on clicked documents using document, snippet, and session features.

Next, we model search session abandonment using users' defined information needs, submitted queries, session features, document features, and features extracted from other users' sessions. We show that session and query features have a major impact on predicting users' session abandonment decisions.

Our next endeavor is generating user search queries when information needs are clearly defined. We use a two-phase process in which we first generate queries by sampling from a language model and then score them based on their discriminative power among topics. Evaluating query generation methods is the hardest part of this task; here, we care most about whether the queries we generate are "good" for evaluating retrieval systems, and in particular for evaluating systems that use features derived from session history. We found that the generated search queries create user search sessions very similar to actual search sessions.

Our final effort is modeling complete artificial user search sessions by simulating query reformulations, click decisions, dwell times, and session abandonments. This phase is the most important because we put all our machine-learned user models together to create artificial search sessions and use a user experiment to answer the following questions: (1) "Can artificially created search sessions satisfy user information needs?" and (2) "Can users classify artificial and actual user sessions?" For the user experiment, we randomly pick 48 actual sessions for 8 topics from a user search session corpus and merge them with 48 generated search sessions into a single corpus. We then ask users to follow the search sessions and rate their search satisfaction on a scale of 1 to 5, where 1 means no satisfaction and 5 means complete satisfaction. We also ask users to predict the type of each session by selecting "C" for a computer-generated session or "A" for an actual user search session. Our findings show that session type has no significant effect on user search satisfaction and that users are barely better than random at predicting session type.
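The two-phase query generation process described in the abstract (sample candidate queries from a language model, then score them by how well they discriminate among topics) can be illustrated with a minimal sketch. The abstract does not specify the language model or the scoring function, so everything below is an assumption made for illustration: a maximum-likelihood unigram model built from a topic description, a log-ratio score against the average over all topics, and the hypothetical names `unigram_model`, `sample_query`, `discriminative_score`, and `generate_queries`. It is not the thesis's actual method.

```python
import math
import random
from collections import Counter


def unigram_model(text):
    """Maximum-likelihood unigram model over whitespace tokens (assumed model)."""
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    return {term: c / total for term, c in counts.items()}


def sample_query(model, length, rng):
    """Phase 1: draw query terms from the topic model (with replacement)."""
    terms = list(model)
    weights = [model[t] for t in terms]
    return rng.choices(terms, weights=weights, k=length)


def discriminative_score(query, topic, models, floor=1e-6):
    """Phase 2 (assumed scoring): mean log-ratio of each term's probability
    under the target topic to its average probability across all topics."""
    score = 0.0
    for term in query:
        p_topic = models[topic].get(term, floor)
        p_avg = sum(m.get(term, floor) for m in models.values()) / len(models)
        score += math.log(p_topic / p_avg)
    return score / len(query)


def generate_queries(topic, models, n_candidates=50, length=3, top_k=5, seed=0):
    """Sample candidates, drop duplicates, and keep the top_k by score."""
    rng = random.Random(seed)
    sampled = [tuple(sample_query(models[topic], length, rng))
               for _ in range(n_candidates)]
    candidates = list(dict.fromkeys(sampled))  # dedupe, keep sampling order
    ranked = sorted(candidates,
                    key=lambda q: discriminative_score(q, topic, models),
                    reverse=True)
    return [" ".join(q) for q in ranked[:top_k]]


if __name__ == "__main__":
    # Hypothetical topic descriptions, invented for this example.
    topics = {
        "flights": "find cheap flights and airline tickets to europe",
        "recipes": "find easy vegetarian recipes and cheap dinner ideas",
    }
    models = {name: unigram_model(text) for name, text in topics.items()}
    for query in generate_queries("flights", models):
        print(query)
```

Under this assumed score, terms shared across topics (such as "find" or "cheap" in the hypothetical descriptions) contribute close to zero, so candidates built from topic-specific terms rank higher. Because terms are sampled with replacement, a candidate may repeat a term; a fuller implementation would likely filter such candidates.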
Keywords
Click prediction, Search sessions, Session abandonment, User models, User simulations