
Relating amino acid sequence to phenotype: analysis of peptide-binding data.

M R Segal, M P Cummings, A E Hubbard.

Abstract

We illustrate data analytic concerns that arise in the context of relating genotype, as represented by amino acid sequence, to phenotypes (outcomes). The present application examines whether peptides that bind to a particular major histocompatibility complex (MHC) class I molecule have characteristic amino acid sequences. However, the concerns identified and addressed are considerably more general. It is recognized that simple rules for predicting binding based solely on preferences for specific amino acids in certain (anchor) positions of the peptide's amino acid sequence are generally inadequate and that binding is potentially influenced by all sequence positions as well as between-position interactions. The desire to elucidate these more complex prediction rules has spawned various modeling attempts, the shortcomings of which provide motivation for the methods adopted here. Because of (i) this need to model between-position interactions, (ii) amino acids constituting a highly (20) multilevel unordered categorical covariate, and (iii) there frequently being numerous such covariates (i.e., positions) comprising the sequence, standard regression/classification techniques are problematic due to the proliferation of indicator variables required for encoding the sequence position covariates and attendant interactions. These difficulties have led to analyses based on (continuous) properties (e.g., molecular weights) of the amino acids. However, there is potential information loss in such an approach if the properties used are incomplete and/or do not capture the mechanism underlying association with the phenotype. Here we demonstrate that handling unordered categorical covariates with numerous levels and accompanying interactions can be done effectively using classification trees and recently devised bump-hunting methods. 
We further tackle the question of whether observed associations are attributable to amino acid properties as well as addressing the assessment and implications of between-position covariation.
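The "proliferation of indicator variables" mentioned in the abstract can be made concrete with a small count. The sketch below is illustrative only and is not taken from the paper: it assumes reference-level (dummy) coding, where each position contributes 19 indicators for the 20 amino acids, and it assumes a peptide length of 9, which the abstract does not state. The function name `indicator_counts` is hypothetical.

```python
from math import comb

N_AMINO_ACIDS = 20  # levels of the unordered categorical covariate at each position


def indicator_counts(n_positions: int, n_levels: int = N_AMINO_ACIDS) -> dict:
    """Count indicator variables needed under reference-level (dummy) coding.

    Assumption (not from the abstract): one level per position is the
    reference, so each position needs n_levels - 1 indicators.
    """
    per_position = n_levels - 1
    # Main effects: one block of indicators per sequence position.
    main_effects = n_positions * per_position
    # Two-way interactions: every pair of positions, every pair of indicators.
    pairwise = comb(n_positions, 2) * per_position ** 2
    return {
        "main_effects": main_effects,
        "pairwise_interactions": pairwise,
        "total": main_effects + pairwise,
    }


# Assumed peptide length of 9 (hypothetical; chosen only for illustration).
counts = indicator_counts(n_positions=9)
print(counts)
```

Under these assumptions, main effects alone need 171 indicators and two-way interactions add 12,996 more, which is typically far more than the number of peptides available. This is the difficulty that motivates tree-based methods, which split directly on subsets of amino acid levels.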

Year: 2001; PMID: 11414594; DOI: 10.1111/j.0006-341x.2001.00632.x

Source DB: PubMed; Journal: Biometrics; ISSN: 0006-341X; Impact factor: 2.571


