Browsing by Author "Mahmood, A. S. M. Ashique"
Now showing 1 - 3 of 3

Item
DiMeX: A Text Mining System for Mutation-Disease Association Extraction (Public Library of Science, 2016-04-13)
Mahmood, A. S. M. Ashique; Wu, Tsung-Jung; Mazumder, Raja; Vijay-Shanker, K.
The number of published articles describing associations between mutations and diseases is increasing at a fast pace. There is a pressing need to gather such mutation-disease associations into public knowledge bases, but manual curation slows down the growth of such databases. We have addressed this problem by developing a text-mining system (DiMeX) to extract mutation-disease associations from publication abstracts. DiMeX consists of a series of natural language processing modules that preprocess input text and apply syntactic and semantic patterns to extract mutation-disease associations. DiMeX achieves high precision and recall, with F-scores of 0.88, 0.91 and 0.89 when evaluated on three different datasets for mutation-disease associations. DiMeX includes a separate component that extracts mutation mentions in text and associates them with genes. This component has also been evaluated on different datasets and shown to achieve state-of-the-art performance. The results indicate that our system outperforms the existing mutation-disease association tools, addressing the low-precision problems suffered by most approaches. DiMeX was applied to a large set of abstracts from Medline to extract mutation-disease associations, as well as other relevant information including patient/cohort size and population data.
The results are stored in a database that can be queried and downloaded at http://biotm.cis.udel.edu/dimex/. We conclude that this high-throughput text-mining approach has the potential to significantly assist researchers and curators in enriching mutation databases.

Item
Text mining of mutations and their impact from biomedical literature (University of Delaware, 2018)
Mahmood, A. S. M. Ashique
The increasing amount of research focusing on genetic mutations has triggered a rapid growth in the number of published articles describing mutations and their effect on diseases, drug responses and protein functionalities. With the advent of precision medicine, which aims at identifying targeted therapies that have maximal efficacy for individual patients, there is a pressing need to gather such mutational information from text into public knowledge bases. But manual curation slows down the growth of such databases. We have applied natural language processing (NLP) techniques to locate and extract mutational information from text that will assist curators and researchers. In particular, in this dissertation, we have addressed the following tasks: mutation detection, mutation-disease association, mutation impact on drug responses and impact of mutations on protein-protein interactions from research literature.
We have developed a system, MeX, to detect a wide range of mutation mentions from text. Evaluations on several publicly available corpora show that we have achieved state-of-the-art performance in mutation detection. The mutation detector also applies a novel algorithm to associate mutations with genes. We have developed a system, DiMeX, which finds the association between mutations and diseases from abstracts of published articles. Our system outperformed the current state-of-the-art when evaluated on multiple corpora. We have developed a system, eGARD, to identify the impact of genomic anomalies on drug responses.
Evaluations showed high performance for eGARD, which will significantly reduce manual curation time. Finally, we have developed a text mining system to extract mutation impact on protein-protein interaction. This type of information will provide further insight into how mutations affect protein functions, and thereby play a role in the development and progression of diseases. Our system outperformed the current state-of-the-art approaches for the task. To enable easier access to data and make it available to computational bioinformatics tools, we have applied DiMeX and eGARD at Medline scale and stored the results in databases.

Item
WebGIVI: a web-based gene enrichment analysis and visualization tool (BioMed Central, 2017-05-04)
Sun, Liang; Zhu, Yongnan; Mahmood, A. S. M. Ashique; Tudor, Catalina O.; Ren, Jia; Vijay-Shanker, K.; Chen, Jian; Schmidt, Carl J.
BACKGROUND: A major challenge of high throughput transcriptome studies is presenting the data to researchers in an interpretable format. In many cases, the outputs of such studies are gene lists which are then examined for enriched biological concepts. One approach to help the researcher interpret large gene datasets is to associate genes and informative terms (iTerms) that are obtained from the biomedical literature using the eGIFT text-mining system. However, examining large lists of iTerm and gene pairs is a daunting task.
RESULTS: We have developed WebGIVI, an interactive web-based visualization tool (http://raven.anr.udel.edu/webgivi/) to explore gene:iTerm pairs. WebGIVI was built with the Cytoscape and Data-Driven Documents (D3) JavaScript libraries and can be used to relate genes to iTerms and then visualize gene and iTerm pairs.
WebGIVI can accept a gene list that is used to retrieve the gene symbols and corresponding iTerm list. This list can be submitted to visualize the gene:iTerm pairs using two distinct methods: a Concept Map or a Cytoscape Network Map. In addition, WebGIVI supports uploading and visualizing any two-column, tab-separated data.
CONCLUSIONS: WebGIVI provides an interactive and integrated network graph of genes and iTerms that allows filtering, sorting, and grouping, which can aid biologists in developing hypotheses based on the input gene lists. In addition, WebGIVI can visualize hundreds of nodes and generate a high-resolution image, which is important for most research publications. The source code can be freely downloaded at https://github.com/sunliang3361/WebGIVI. The WebGIVI tutorial is available at http://raven.anr.udel.edu/webgivi/tutorial.php.
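
The two-column, tab-separated input mentioned in the WebGIVI abstract can be illustrated with a short Python sketch. This is not taken from the WebGIVI source code: the function name `group_pairs` is hypothetical, the sample gene and term values are made up for illustration, and it is only an assumption that the first column holds a gene symbol and the second an iTerm.

```python
import csv
import io
from collections import defaultdict


def group_pairs(tsv_text):
    """Group two-column, tab-separated rows as {first column: [second column, ...]}.

    Hypothetical helper for illustration only; not part of WebGIVI.
    """
    groups = defaultdict(list)
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        # Skip blank lines and rows that do not have exactly two columns.
        if len(row) != 2:
            continue
        gene, iterm = row[0].strip(), row[1].strip()
        # Keep each gene:iTerm pair once, preserving first-seen order.
        if gene and iterm and iterm not in groups[gene]:
            groups[gene].append(iterm)
    return dict(groups)


if __name__ == "__main__":
    # Made-up sample rows (assumed layout: gene symbol <TAB> iTerm).
    sample = "BRCA1\tDNA repair\nBRCA1\tbreast cancer\nTP53\tapoptosis\n\nTP53\tapoptosis\n"
    for gene, iterms in group_pairs(sample).items():
        print(gene, "->", ", ".join(iterms))
```

Grouping the rows by the first column gives one node per gene with its linked terms, which is roughly the shape a network or concept-map view needs; how WebGIVI itself parses uploads is not described in the abstract.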