Biomedical relation extraction with reduced manual effort

Li, Gang

Author(s)	Li, Gang
Date Accessioned	2018-09-17T12:15:03Z
Date Available	2018-09-17T12:15:03Z
Publication Date	2018
SWORD Update	2018-07-27T13:03:50Z
Abstract	Biomedical relation extraction is a critical text-mining task that concerns the automatic extraction of related bio-entities from text. Rule-based and machine learning methods are the two main approaches to relation extraction. While both can be used to develop high-performance relation extraction systems, each requires a considerable amount of manual effort at different phases of system development. This hinders the rapid application of either method to new types of relations. This dissertation focuses on developing techniques to assist the development of biomedical relation extraction systems using rule-based and machine learning methods. For rule-based systems, one main component requiring manual effort is pattern design. Domain experts often need to examine a considerable number of documents and exhaustively collect every pattern that can be used to extract relations. We leverage various kinds of linguistic knowledge to automatically generate a comprehensive set of patterns. Our first approach is instantiated in miRTex, a system that extracts three kinds of miRNA-gene relations that regulate a wide range of biological processes and are involved in diseases. Only a small number of triggers and rules are needed to achieve state-of-the-art performance. Our second approach translates ideas from Lexicalized Tree Adjoining Grammar to dependency graphs for pattern generation and adopts the Extended Dependency Graph as an abstract sentence representation. This approach is applied to extract five types of post-translational modifications, a class of relations that plays an important role in cellular functions. Evaluations on the BioNLP 2011 EPI task show that the resulting system achieves state-of-the-art performance. For machine learning systems, a sizable training corpus is needed to train the extraction model, but annotating such a corpus is time- and labor-intensive. We adopt distant supervision in two ways. First, we develop noise reduction techniques to improve the quality of the automatically generated large training set, leading to improvements over existing results for distantly supervised biomedical relation extraction. Second, we employ distant supervision in conjunction with human-labeled data and deep neural networks to achieve state-of-the-art performance on some benchmark relation extraction tasks.	en_US
Advisor	Shanker, Vijay K.
Advisor	Wu, Cathy H.
Degree	Ph.D.
Department	University of Delaware, Department of Computer and Information Sciences
DOI	https://doi.org/10.58088/4cgk-n623
Unique Identifier	1052612895
URL	http://udspace.udel.edu/handle/19716/23793
Language	en
Publisher	University of Delaware	en_US
URI	https://search.proquest.com/docview/2089996568?accountid=10457
Keywords	Applied sciences	en_US
Keywords	BioNLP	en_US
Keywords	Biomedical text mining	en_US
Keywords	Deep learning	en_US
Keywords	Distant supervision	en_US
Keywords	Pattern generation	en_US
Keywords	Relation extraction	en_US
Title	Biomedical relation extraction with reduced manual effort	en_US
Type	Thesis	en_US

Files

Original bundle

Name: Li_udel_0060D_13327.pdf
Size: 2.77 MB
Format: Adobe Portable Document Format

Collections

Doctoral Dissertations (Winter 2014 to Present)