PaPI: pseudo amino acid composition to score human protein-coding variants.

Ivan Limongelli, Simone Marini, Riccardo Bellazzi.

Abstract

BACKGROUND: High-throughput sequencing technologies can identify the whole genomic variation of an individual. Gene-targeted and whole-exome experiments focus mainly on coding sequence variants involving single or multiple nucleotides. Analyzing the biological significance of this multitude of genomic variants is challenging and computationally demanding.
RESULTS: We present PaPI, a new machine-learning approach that classifies and scores human coding variants by estimating the probability that they damage protein function. The novelty of this approach lies in its use of pseudo amino acid composition, through which wild-type and mutated protein sequences are represented in a discrete model. A machine-learning classifier was trained on a set of known deleterious and benign coding variants, with the aim of scoring unobserved variants by taking into account hidden sequence patterns in the human genome that potentially lead to disease. We show that the combination of amphiphilic pseudo amino acid composition, evolutionary conservation and homologous-protein-based methods outperforms several prediction algorithms and can also score complex variants such as deletions, insertions and indels.
CONCLUSIONS: This paper describes a machine-learning approach to predict the deleteriousness of human coding variants. A freely available web application (http://papi.unipv.it) implementing the method has been developed; it can score up to thousands of variants in a single run.
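
As a rough illustration of the pseudo amino acid composition idea mentioned in the abstract, the sketch below turns a protein sequence into a fixed-length vector: 20 normalized residue frequencies plus `lam` sequence-order correlation terms. It is not PaPI's actual feature extraction. The Kyte-Doolittle hydropathy scale, the use of a single property (the amphiphilic variant uses both hydrophobicity and hydrophilicity), the weight `w = 0.05`, `lam = 3`, the function name `pseaac`, and the toy sequences are all assumptions made for the example.

```python
from collections import Counter
from statistics import mean, pstdev

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy values (assumption: chosen only for illustration).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def normalize(scale):
    """Rescale a property to zero mean and unit variance over the 20 residues."""
    mu = mean(scale.values())
    sd = pstdev(scale.values())
    return {aa: (value - mu) / sd for aa, value in scale.items()}


def pseaac(seq, lam=3, w=0.05, scale=KYTE_DOOLITTLE):
    """Return a (20 + lam)-dimensional pseudo amino acid composition vector.

    seq must contain only the 20 standard one-letter residue codes.
    """
    if len(seq) <= lam:
        raise ValueError("sequence must be longer than lam")
    h = normalize(scale)
    counts = Counter(seq)
    freqs = [counts[aa] / len(seq) for aa in AMINO_ACIDS]
    # Correlation between residues k positions apart, for k = 1..lam.
    taus = [
        mean(h[seq[i]] * h[seq[i + k]] for i in range(len(seq) - k))
        for k in range(1, lam + 1)
    ]
    denom = sum(freqs) + w * sum(taus)
    return [f / denom for f in freqs] + [w * t / denom for t in taus]


# Hypothetical wild-type peptide and a single-residue substitution (Y5D).
wild = "MKTAYIAKQRQISFVKSHFSRQ"
mutant = wild[:4] + "D" + wild[5:]
wild_vec = pseaac(wild)
mutant_vec = pseaac(mutant)
# Component-wise difference: one possible input to a downstream classifier.
delta = [m - v for v, m in zip(wild_vec, mutant_vec)]
```

Because the vector length depends only on `lam`, sequences of different lengths (including those changed by insertions or deletions) map to the same feature space, which is what makes a discrete representation like this convenient for comparing wild-type and mutated proteins.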

Year: 2015; PMID: 25928477; PMCID: PMC4411653; DOI: 10.1186/s12859-015-0554-8

Source DB: PubMed; Journal: BMC Bioinformatics; ISSN: 1471-2105; Impact factor: 3.169


