Jonas Reeb, Maximilian Hecht, Yannick Mahlich, Yana Bromberg, Burkhard Rost.
Abstract
Developments in experimental and computational biology are advancing our understanding of how protein sequence variation impacts molecular protein function. However, the leap from the micro level of molecular function to the macro level of the whole organism, e.g. disease, remains barred. Here, we present new results emphasizing earlier work that suggested some links from molecular function to disease. We focused on non-synonymous single nucleotide variants, also referred to as single amino acid variants (SAVs). Building upon OMIA (Online Mendelian Inheritance in Animals), we introduced a curated set of 117 disease-causing SAVs in animals. Methods optimized to capture effects upon molecular function often correctly predict human (OMIM) and animal (OMIA) Mendelian disease-causing variants. We also predicted effects of human disease-causing variants in the mouse model, i.e. we put OMIM SAVs into mouse orthologs. Overall, fewer variants were predicted with effect in the model organism than in the original organism. Our results, along with other recent studies, demonstrate that predictions of molecular effects capture some important aspects of disease. Thus, in silico methods focusing on the micro level of molecular function can help to understand the macro system level of disease.
Year: 2016 PMID: 27536940 PMCID: PMC4990455 DOI: 10.1371/journal.pcbi.1005047
Source DB: PubMed Journal: PLoS Comput Biol ISSN: 1553-734X Impact factor: 4.475
Fig 1. Predictions of SAV effects upon function and disease across species.
The numbers above bars give the number of SAVs in the set.
A: Three methods (SNAP2 [16], SIFT [27], PolyPhen-2 [12]) predicted SAV effects upon molecular function (TrEffect/TrNeutral) and upon disease (OMIM). Exclusively for this panel, SNAP2 was trained without using disease SAVs from OMIM [5] or HumVar [28]. The SNAP2 version trained exclusively on molecular function clearly captured aspects of OMIM disease SAVs (leftmost bar, OMIM, higher than the second bar from the left, TrEffect). TrNeutral was the SNAP2 training set of variants without effect. Comparing the bars for TrNeutral and OMIM for each method pointed to differential thresholds: PolyPhen-2 correctly predicted more effect in OMIM than SNAP2 but also incorrectly predicted more effect in the neutral data, i.e. it simply predicted more variants to have an effect.
B: OMIM is repeated from A. SNAP2 captured disease signals in humans and animals at similar levels. OMIA contained disease SAVs from animals other than mouse and rat (mostly dog and cattle).
C: SNAP2 predicted OMIM SAVs with less effect in mouse orthologs than in human. Left bar (OMIM with mouse ortholog): SNAP2 predictions for the subset of all 4,229 OMIM SAVs for which we found a mouse ortholog. Right bar (OMIM in mouse): SNAP2 predictions when putting the human SAV into the mouse sequence.
D: Disease variants happen in non-random positions. Left bar (NotOMIM conserved): in each protein with an OMIM SAV, we predicted the effect of all SAVs at non-OMIM positions with a level of sequence conservation ≥ that of the OMIM variant. Right bar (NotOMIM not conserved): predictions for SAVs in non-OMIM positions with conservation < that of the OMIM SAV. Obviously, OMIM SAVs were very well conserved.
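The partition described for panel D can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `split_non_omim_savs`, the per-position conservation list, and the restriction to a single OMIM position per protein are all assumptions made for illustration. The caption does not say how conservation was scored, so any per-residue score where higher means more conserved would fit.

```python
from typing import List, Tuple

# The 20 standard amino acids (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# A SAV is represented here as (0-based position, wild-type residue, variant residue).
SAV = Tuple[int, str, str]


def split_non_omim_savs(
    sequence: str,
    conservation: List[float],
    omim_position: int,
) -> Tuple[List[SAV], List[SAV]]:
    """Hypothetical helper: enumerate all SAVs at non-OMIM positions and split them
    by whether the position is at least as conserved as the OMIM position.

    Assumes one OMIM position per protein and one conservation score per residue.
    """
    if len(sequence) != len(conservation):
        raise ValueError("need exactly one conservation score per residue")

    threshold = conservation[omim_position]
    conserved: List[SAV] = []      # "NotOMIM conserved": conservation >= threshold
    not_conserved: List[SAV] = []  # "NotOMIM not conserved": conservation < threshold

    for pos, wild_type in enumerate(sequence):
        if pos == omim_position:
            continue  # skip the OMIM position itself
        bucket = conserved if conservation[pos] >= threshold else not_conserved
        for variant in AMINO_ACIDS:
            if variant != wild_type:
                bucket.append((pos, wild_type, variant))

    return conserved, not_conserved


# Toy example with made-up scores: position 1 plays the role of the OMIM position.
conserved, not_conserved = split_non_omim_savs("MKT", [0.9, 0.5, 0.2], omim_position=1)
print(len(conserved), len(not_conserved))
```

Each of the two resulting lists would then be passed to an effect predictor such as SNAP2 to obtain the fraction of variants predicted to have an effect, which is what the two bars in panel D compare.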