Deleterious SNP prediction: be mindful of your training data!

Matthew A Care, Chris J Needham, Andrew J Bulpitt, David R Westhead.

Abstract

MOTIVATION: To predict which of the vast number of human single nucleotide polymorphisms (SNPs) are deleterious to gene function or likely to be disease associated is an important problem, and many methods have been reported in the literature. All methods require data sets of mutations classified as 'deleterious' or 'neutral' for training and/or validation. While different workers have used different data sets there has been no study of which is best. Here, the three most commonly used data sets are analysed. We examine their contents and relate this to classifiers, with the aims of revealing the strengths and pitfalls of each data set, and recommending a best approach for future studies.
RESULTS: The data sets examined are shown to be substantially different in content, particularly with regard to amino acid substitutions, reflecting the different ways in which they are derived. This leads to differences in classifiers and reveals some serious pitfalls of some data sets, making them less than ideal for non-synonymous SNP prediction.
AVAILABILITY: Software is available on request from the authors.

Year:  2007        PMID: 17234639     DOI: 10.1093/bioinformatics/btl649

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937

