Protein Ontology (PRO): enhancing and scaling up the representation of protein entities
Date
2016-11-28
Publisher
Oxford University Press
Abstract
The Protein Ontology (PRO; http://purl.obolibrary.org/obo/pr) formally defines and describes taxon-specific and taxon-neutral protein-related entities in three major areas: proteins related by evolution; proteins produced from a given gene; and protein-containing complexes. PRO thus serves as a tool for referencing protein entities at any level of specificity. To enhance this ability, and to facilitate the comparison of such entities described in different resources, we developed a standardized representation of proteoforms using UniProtKB as a sequence reference and PSI-MOD as a post-translational modification reference. We illustrate its use in facilitating an alignment between PRO and Reactome protein entities. We also address issues of scalability, describing our first steps into the use of text mining to identify protein-related entities, the large-scale import of proteoform information from expert curated resources, and our ability to dynamically generate PRO terms. Web views for individual terms are now more informative about closely-related terms, including for example an interactive multiple sequence alignment. Finally, we describe recent improvement in semantic utility, with PRO now represented in OWL and as a SPARQL endpoint. These developments will further support the anticipated growth of PRO and facilitate discoverability of and allow aggregation of data relating to protein entities.
Description
Publisher's PDF
Citation
Darren A. Natale, Cecilia N. Arighi, Judith A. Blake, Jonathan Bona, Chuming Chen, Sheng-Chih Chen, Karen R. Christie, Julie Cowart, Peter D'Eustachio, Alexander D. Diehl, Harold J. Drabkin, William D. Duncan, Hongzhan Huang, Jia Ren, Karen Ross, Alan Ruttenberg, Veronica Shamovsky, Barry Smith, Qinghua Wang, Jian Zhang, Abdelrahman El-Sayed, Cathy H. Wu; Protein Ontology (PRO): enhancing and scaling up the representation of protein entities, Nucleic Acids Research, Volume 45, Issue D1, 4 January 2017, Pages D339–D346, https://doi.org/10.1093/nar/gkw1075