
Machine-learning of complex evolutionary signals improves classification of SNVs.

Sapir Labes, Doron Stupp, Naama Wagner, Idit Bloch, Michal Lotem, Ephrat L Lahad, Paz Polak, Tal Pupko, Yuval Tabach.

Abstract

Conservation is a strong predictor of the pathogenicity of single-nucleotide variants (SNVs). However, some positions that present complex conservation patterns across vertebrates stray from this paradigm. Here, we analyzed the association between complex conservation patterns and the pathogenicity of SNVs in the 115 disease genes that had sufficient variant data. We show that conservation is not a one-rule-fits-all solution, since its accuracy depends strongly on the analyzed set of species and genes. For example, pairwise comparisons between human and 99 vertebrate species showed that species differ in how well their conservation predicts the clinical outcome of variants across different genes. Furthermore, certain genes were less amenable to conservation-based variant prediction, while for others specific species optimized prediction. These insights led to the development of EvoDiagnostics, which uses the conservation against each species as a feature within a random-forest machine-learning classification algorithm. EvoDiagnostics outperformed traditional conservation algorithms, deep-learning-based methods and most ensemble tools in every prediction task, highlighting the strength of optimizing conservation analysis per species and per gene. Overall, we suggest a new and more biologically relevant approach for analyzing conservation, which improves prediction of variant pathogenicity.
© The Author(s) 2022. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics.
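
The per-species idea in the abstract can be illustrated with a minimal sketch. It is not the EvoDiagnostics implementation: the binary "same residue as human" encoding, the species names, the toy alignment columns and the labels are all assumptions made up for illustration. It builds one conservation feature per species for each variant position, then scores each species on its own with the naive rule "conserved with human implies pathogenic".

```python
from typing import Dict, List

# Hypothetical species columns; the study compares human against 99 vertebrates.
SPECIES = ["mouse", "chicken", "zebrafish"]

# Toy, invented data: (human residue, aligned residue per species, label),
# where label 1 = pathogenic and 0 = benign.
VARIANTS = [
    ("R", {"mouse": "R", "chicken": "R", "zebrafish": "R"}, 1),
    ("K", {"mouse": "K", "chicken": "R", "zebrafish": "-"}, 0),
    ("L", {"mouse": "L", "chicken": "L", "zebrafish": "I"}, 1),
    ("A", {"mouse": "A", "chicken": "S", "zebrafish": "T"}, 0),
]


def conservation_features(human: str, column: Dict[str, str]) -> List[int]:
    """One binary feature per species: 1 if the species carries the human
    residue at this position, 0 otherwise (gaps and missing species count as 0)."""
    return [1 if column.get(s, "-") == human else 0 for s in SPECIES]


def per_species_accuracy(rows: List[List[int]], labels: List[int]) -> List[float]:
    """Accuracy of the single-species rule 'conserved (1) => pathogenic (1)',
    computed separately for every species column."""
    n = len(labels)
    return [
        sum(1 for row, y in zip(rows, labels) if row[j] == y) / n
        for j in range(len(SPECIES))
    ]


features = [conservation_features(h, col) for h, col, _ in VARIANTS]
labels = [y for _, _, y in VARIANTS]
accuracy = per_species_accuracy(features, labels)

for name, acc in zip(SPECIES, accuracy):
    print(f"{name}: {acc:.2f}")
```

In this toy set the species columns differ in how well they separate the two classes, which is the kind of per-species variation the abstract describes. The same feature rows and labels could be passed to a random-forest implementation (for example scikit-learn's `RandomForestClassifier`) so that all species are weighed jointly rather than one at a time.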

Year:  2022        PMID: 35402908      PMCID: PMC8988715          DOI: 10.1093/nargab/lqac025

Source DB:  PubMed          Journal:  NAR Genom Bioinform        ISSN: 2631-9268


