Open Access Publications
Open access publications by faculty, postdocs, and graduate students in the Department of Computer and Information Sciences.
DiMeX: A Text Mining System for Mutation-Disease Association Extraction
(Public Library of Science, 2016-04-13)
Mahmood, A. S. M. Ashique; Wu, Tsung-Jung; Mazumder, Raja; Vijay-Shanker, K.

The number of published articles describing associations between mutations and diseases is increasing at a fast pace. There is a pressing need to gather such mutation-disease associations into public knowledge bases, but manual curation slows down the growth of such databases. We have addressed this problem by developing a text-mining system (DiMeX) to extract mutation-disease associations from publication abstracts. DiMeX consists of a series of natural language processing modules that preprocess input text and apply syntactic and semantic patterns to extract mutation-disease associations. DiMeX achieves high precision and recall, with F-scores of 0.88, 0.91 and 0.89 when evaluated on three different datasets for mutation-disease associations. DiMeX includes a separate component that extracts mutation mentions in text and associates them with genes. This component has also been evaluated on different datasets and shown to achieve state-of-the-art performance. The results indicate that our system outperforms the existing mutation-disease association tools, addressing the low-precision problems suffered by most approaches. DiMeX was applied to a large set of abstracts from Medline to extract mutation-disease associations, as well as other relevant information including patient/cohort size and population data. The results are stored in a database that can be queried and downloaded at http://biotm.cis.udel.edu/dimex/. We conclude that this high-throughput text-mining approach has the potential to significantly assist researchers and curators in enriching mutation databases.