Hypothesizing Desired Graph Characteristics from Queries
University of Delaware
Information-graphics (non-pictorial graphics such as bar charts and line graphs that depict attributes of entities and relations among entities) have attracted increasing attention from both research and industry. Although information retrieval research has focused heavily on retrieving textual documents, relatively little work has addressed retrieving information-graphics in response to user-written queries. Common image retrieval techniques, such as image meta search and content-based image retrieval (CBIR), work well for ranking ordinary images and pictures but often fall short on information-graphics because they do not take the graphics' structure and content into account. This research analyzes a user's query to identify the content of the independent and dependent axes of graphs that might be relevant to the query, and to identify the kind of high-level message that relevant graphs would convey. Natural language processing techniques are used to extract features from the query, and machine learning is used to build models that hypothesize the content of the axes and the intended message. Results show that the constructed models achieve an accuracy of about 81%.
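As an illustration of the general idea only, the following minimal sketch extracts simple features from a query and trains a small naive Bayes classifier to hypothesize a message category. Everything in it is an assumption made for illustration: the cue-word lists, the three category labels (`trend`, `rank`, `comparison`), the training queries, and the choice of naive Bayes are hypothetical and are not the feature set, categories, data, or learner used in this research. The sketch also covers only the intended message, not the content of the axes.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical cue words; not the feature set used in the research.
TREND_CUES = {"trend", "change", "changed", "over", "since", "growth", "rise", "fall"}
RANK_CUES = {"highest", "lowest", "most", "least", "top", "rank", "largest"}
COMPARE_CUES = {"compare", "versus", "vs", "than", "difference"}


def extract_features(query):
    """Turn a query into symbolic features: its tokens plus cue-type flags."""
    tokens = re.findall(r"[a-z0-9]+", query.lower())
    feats = ["tok=" + t for t in tokens]
    if any(t in TREND_CUES for t in tokens):
        feats.append("has_trend_cue")
    if any(t in RANK_CUES for t in tokens):
        feats.append("has_rank_cue")
    if any(t in COMPARE_CUES for t in tokens):
        feats.append("has_compare_cue")
    if any(re.fullmatch(r"(19|20)\d\d", t) for t in tokens):
        feats.append("has_year")
    return feats


class NaiveBayes:
    """Multinomial naive Bayes with add-one smoothing over query features."""

    def fit(self, examples):
        self.label_counts = Counter()
        self.feat_counts = defaultdict(Counter)
        self.vocab = set()
        for query, label in examples:
            self.label_counts[label] += 1
            for feat in extract_features(query):
                self.feat_counts[label][feat] += 1
                self.vocab.add(feat)
        return self

    def predict(self, query):
        feats = extract_features(query)
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label in sorted(self.label_counts):
            # Log prior plus smoothed log likelihood of each feature.
            score = math.log(self.label_counts[label] / total)
            denom = sum(self.feat_counts[label].values()) + len(self.vocab)
            for feat in feats:
                score += math.log((self.feat_counts[label][feat] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up training queries with hypothetical message labels.
TRAINING = [
    ("how has oil price changed since 2000", "trend"),
    ("growth of internet users over time", "trend"),
    ("which country has the highest gdp", "rank"),
    ("top companies with most revenue", "rank"),
    ("compare revenue of apple versus microsoft", "comparison"),
    ("difference in income between men and women", "comparison"),
]

model = NaiveBayes().fit(TRAINING)

if __name__ == "__main__":
    for q in ["unemployment rise since 2008", "which state has the lowest tax"]:
        print(q, "->", model.predict(q))
```

A real system would need a much larger labeled query set and richer linguistic features; the point here is only the shape of the pipeline, in which query text is mapped to features and a learned model maps features to a hypothesized message category.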