Predicting nsSNPs that disrupt protein-protein interactions using docking

IEEE Computational Intelligence Society ; IEEE Computer Society ; IEEE Control Systems Society ; IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery
The human genome contains a large number of protein polymorphisms due to individual genome variation. How many of these polymorphisms lead to altered protein-protein interactions is unknown. We have developed a method to address this question. Complexes found in both the SKEMPI database (of affinity constants among interacting proteins) and the CAPRI 4.0 docking benchmark were docked using HADDOCK, yielding a training set of 166 mutant pairs. A random forest classifier that uses the differences in docking scores between the 166 mutant pairs and their wild-types was used to distinguish between variants that have either completely or partially lost binding ability. 50% of non-binders were correctly predicted, with a false discovery rate of only 2%. The model was tested on a set of 15 HIV-1 - human, as well as 7 human - human glioblastoma-related, mutant protein pairs: 50% of combined non-binders were correctly predicted, with a false discovery rate of 10%. The model was also used to identify 10 protein-protein interactions between human proteins and their HIV-1 partners that are likely to be abolished by rare non-synonymous single-nucleotide polymorphisms (nsSNPs). These nsSNPs may represent novel and potentially therapeutically valuable targets for anti-viral therapy by disruption of viral binding.
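The two figures quoted for the classifier, the fraction of non-binders correctly predicted (recall) and the false discovery rate, can be illustrated with a short sketch. The function name `recall_and_fdr`, the label strings, and the counts below are hypothetical; the counts are invented only so that they reproduce the 50% and 2% figures and are not the study's actual confusion matrix.

```python
def recall_and_fdr(true_labels, predicted_labels, positive="non-binder"):
    """Return (recall, false discovery rate) for the `positive` class."""
    tp = fp = fn = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if pred == positive and truth == positive:
            tp += 1  # non-binder correctly predicted
        elif pred == positive:
            fp += 1  # binder wrongly called a non-binder
        elif truth == positive:
            fn += 1  # non-binder that was missed
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    fdr = fp / (tp + fp) if (tp + fp) else 0.0
    return recall, fdr


# Hypothetical counts (NOT from the paper): 98 true non-binders, 20 true binders.
truth = ["non-binder"] * 98 + ["binder"] * 20
pred = (
    ["non-binder"] * 49 + ["binder"] * 49   # half of the non-binders are found
    + ["non-binder"] * 1 + ["binder"] * 19  # one binder is a false positive
)

recall, fdr = recall_and_fdr(truth, pred)
print(f"recall = {recall:.0%}, FDR = {fdr:.0%}")
```

Under these assumed counts, half of the true non-binders are recovered, and 1 of the 50 predicted non-binders is wrong, which corresponds to a 2% false discovery rate.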
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication.
N. Goodacre; N. Edwards; M. Danielsen; P. Uetz; C. Wu, "Predicting nsSNPs that disrupt protein-protein interactions using docking," in IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. PP, no. 99, pp. 1-1, doi: 10.1109/TCBB.2016.2520931