
Gene pathogenicity prediction of Mendelian diseases via the random forest algorithm.

Sijie He, Weiwei Chen, Hankui Liu, Shengting Li, Dongzhu Lei, Xiao Dang, Yulan Chen, Xiuqing Zhang, Jianguo Zhang.

Abstract

The study of Mendelian diseases and the identification of their causative genes are of great significance in genetics. Two questions in particular are worth studying: how to evaluate the pathogenicity of a gene, and how many Mendelian disease genes exist in total. Very few studies have addressed these questions to date, so we attempt to answer them here. We calculated a gene pathogenicity prediction (GPP) score with a machine learning approach (the random forest algorithm) to evaluate the pathogenicity of genes. Applied to the testing gene set, the GPP score achieved an accuracy of 80%, a recall of 93% and an area under the curve of 0.87. Our results estimate that a total of 10,384 protein-coding genes are Mendelian disease genes. Furthermore, we found that the GPP score was positively correlated with disease severity. These results indicate that the GPP score may provide a robust and reliable guideline for predicting the pathogenicity of protein-coding genes. To our knowledge, this is the first attempt to estimate the total number of Mendelian disease genes.
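The abstract reports three evaluation metrics for the GPP score: accuracy, recall and area under the curve. As a reading aid, the following minimal sketch shows how such metrics can be computed from per-gene scores and known labels. It is not the authors' pipeline: the function names, the 0.5 threshold, and the toy labels and scores are all hypothetical, and the abstract does not say which threshold or software was used.

```python
from typing import Sequence, Tuple


def accuracy_and_recall(
    labels: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> Tuple[float, float]:
    """Return (accuracy, recall) when genes scoring >= threshold are called disease genes.

    The 0.5 default threshold is an assumption made for illustration only.
    """
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(p == y for p, y in zip(predictions, labels))
    true_positives = sum(p == 1 and y == 1 for p, y in zip(predictions, labels))
    positives = sum(labels)
    return correct / len(labels), true_positives / positives


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive gene outscores a randomly chosen negative gene (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = known disease gene, 0 = non-disease gene.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]

acc, rec = accuracy_and_recall(toy_labels, toy_scores)
auc = roc_auc(toy_labels, toy_scores)
print(f"accuracy={acc:.2f} recall={rec:.2f} AUC={auc:.3f}")
# prints: accuracy=0.75 recall=0.75 AUC=0.875
```

Accuracy and recall depend on the chosen score threshold, whereas the area under the curve summarizes ranking quality across all thresholds, which is one reason such results are often reported together.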

Year: 2019    PMID: 31069506    DOI: 10.1007/s00439-019-02021-9

Source DB: PubMed    Journal: Hum Genet    ISSN: 0340-6717    Impact factor: 4.132

