Open Access Publications
Open access publications by faculty, staff, postdocs, and graduate students at the Center for Bioinformatics and Computational Biology.
Browsing Open Access Publications by Author "Arighi,Cecilia"
BioC-compatible full-text passage detection for protein-protein interactions using extended dependency graph
(Oxford University Press, 4/12/16)
Peng, Yifan; Arighi, Cecilia; Wu, Cathy H.; Vijay-Shanker, K.; Arighi, Cecilia Noemi; Wu, Cathy Huey-Hwa; Shanker, Vijay K
There has been a large growth in the number of biomedical publications that report experimental results. Many of these results concern detection of protein-protein interactions (PPI). In BioCreative V, we participated in the BioC task and developed a PPI system to detect text passages with PPIs in full-text articles. By adopting the BioC format, the output of the system can be added to the biocuration pipeline with little integration effort. A distinctive feature of our PPI system is that it uses an extended dependency graph, an intermediate level of representation that attempts to abstract away syntactic variations in text. As a result, we are able to use only a limited set of rules to extract PPI pairs in sentences, and additional rules to detect further passages for PPI pairs. For evaluation, we used the 95 articles that were provided for the BioC annotation task. We retrieved the unique PPIs from the BioGRID database for these articles and show that our system achieves a recall of 83.5%. To evaluate the detection of passages with PPIs, we further annotated the Abstract and Results sections of 20 documents from the dataset and show that an f-value of 80.5% was obtained. To evaluate the generalizability of the system, we also conducted experiments on AIMed, a well-known PPI corpus. We achieved an f-value of 76.1% for sentence detection and an f-value of 64.7% for unique PPI detection.

BioCreative V BioC track overview: collaborative biocurator assistant task for BioGRID
(Oxford University Press, 8/2/16)
Kim, Sun; Dogan, Rezarta Islamaj; Chatr-Aryamontri, Andrew; Chang, Christie S.; Oughtred, Rose; Rust, Jennifer; Batista-Navarro, Riza; Carter, Jacob; Ananiadou, Sophia; Matos, Sergio; Santos, Andre; Campos, David; Oliveira, Jose Luis; Singh, Onkar; Jonnagaddala, Jitendra; Dai, Hong-Jie; Su, Emily Chia-Yu; Chang, Yung-Chun; Su, Yu-Chen; Chu, Chun-Han; Chen, Chien Chin; Hsu, Wen-Lian; Peng, Yifan; Arighi, Cecilia; Wu, Cathy H.; Vijay-Shanker, K.; Aydin, Ferhat; Husunbeyi, Zehra Melce; Ozgur, Arzucan; Shin, Soo-Yong; Kwon, Dongseop; Dolinski, Kara; Tyers, Mike; Wilbur, W. John; Comeau, Donald C.; Arighi, Cecilia Noemi; Wu, Cathy Huey-Hwa; Shanker, Vijay K
BioC is a simple XML format for text, annotations and relations, and was developed to achieve interoperability for biomedical text processing. Following the success of BioC in BioCreative IV, the BioCreative V BioC track addressed a collaborative task to build an assistant system for BioGRID curation. In this paper, we describe the framework of the collaborative BioC task and discuss our findings based on the user survey.
This track consisted of eight subtasks including gene/protein/organism named entity recognition, protein-protein/genetic interaction passage identification and annotation visualization. Using BioC as their data-sharing and communication medium, nine teams, world-wide, participated and contributed either new methods or improvements of existing tools to address different subtasks of the BioC track. Results from different teams were shared in BioC and made available to other teams as they addressed different subtasks of the track. In the end, all submitted runs were merged using a machine learning classifier to produce an optimized output. The biocurator assistant system was evaluated by four BioGRID curators in terms of practical usability. The curators' feedback was overall positive and highlighted the user-friendly design and the convenient gene/protein curation tool based on text mining.