A study of relation extraction for biomedical text

Author(s): Peng, Yifan
Date Accessioned: 2017-03-08T13:42:10Z
Date Available: 2017-03-08T13:42:10Z
Publication Date: 2016
Abstract: A crucial area of Biomedical Natural Language Processing is relation extraction, the task of identifying relations between entities. One main challenge of relation extraction is textual variation. Such variation makes it difficult for pattern-based approaches to encode all the patterns necessary for achieving high recall, and it limits the generalizability of machine-learning models, especially when the training data is small. This thesis examines the representation of sentences for relation extraction. In particular, we are concerned with finding a suitable level of abstraction, which will improve the performance of relation extraction systems and in turn lead to advances in other text-mining fields. This thesis describes three steps along these lines. First, we propose an automatic approach for sentence simplification. It reduces sentence complexity by detecting various syntactic constructs and generating simplified sentences. Second, we describe a framework to facilitate the development of pattern-based biomedical relation extraction systems. The framework leverages various linguistic theories to semi-automatically generate lexico-syntactic patterns. It also applies sentence simplification and semantic relations to increase pattern coverage. Finally, we propose a structured representation called the Extended Dependency Graph (EDG). It provides an abstract representation that accounts for textual variations, not only by considering syntactic dependencies between words in a sentence, but also by utilizing information beyond syntax to capture dependencies. In each of these steps, we conduct experiments to evaluate the efficacy of the ideas. The results (1) show that various text-mining approaches can benefit from sentence simplification, (2) demonstrate that we can create state-of-the-art pattern-based systems using the framework to extract different types of relations, and (3) validate the utility of EDG in both pattern-based and machine-learning relation extraction systems.
Advisor: Shanker, Vijay
Advisor: Wu, Cathy
Degree: Ph.D.
Department: University of Delaware, Department of Computer and Information Sciences
Unique Identifier: 974785127
URL: http://udspace.udel.edu/handle/19716/21125
Publisher: University of Delaware
URI: https://search.proquest.com/docview/1840890304?accountid=10457
Title: A study of relation extraction for biomedical text
Type: Thesis
Files
Original bundle
Name: 2016_PengYifan_PhD.pdf
Size: 1.1 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 2.22 KB
Format: Item-specific license agreed upon to submission