Kernel-based logistic regression model for protein sequence without vectorialization.

Youyi Fong, Saheli Datta, Ivelin S Georgiev, Peter D Kwong, Georgia D Tomaras.

Abstract

Protein sequence data arise more and more often in vaccine and infectious disease research. These types of data are discrete, high-dimensional, and complex. We propose to study the impact of protein sequences on binary outcomes using a kernel-based logistic regression model, which models the effect of protein through a random effect whose variance-covariance matrix is mostly determined by a kernel function. We propose a novel, biologically motivated, profile hidden Markov model (HMM)-based mutual information (MI) kernel. Hypothesis testing can be carried out using the maximum of the score statistics and a parametric bootstrap procedure. To improve the power of testing, we propose intuitive modifications to the test statistic. We show through simulation studies that the profile HMM-based MI kernel can be substantially more powerful than competing kernels, and that the modified test statistics bring incremental gains in power. We use these proposed methods to investigate two problems from HIV-1 vaccine research: (1) identifying segments of HIV-1 envelope (Env) protein that confer resistance to neutralizing antibody and (2) identifying segments of Env that are associated with attenuation of protective vaccine effect by antibodies of isotype A in the RV144 vaccine trial.
© The Author 2014. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
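The testing procedure described in the abstract (maximum of score statistics over a kernel parameter, calibrated by a parametric bootstrap) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the kernel is a simple Hamming-distance kernel standing in for the profile HMM-based MI kernel, the null model contains only an intercept, the grid `rhos` is arbitrary, and the per-parameter standardization uses bootstrap means and standard deviations. The function names are hypothetical.

```python
import math
import random


def hamming_kernel(seqs, rho):
    """Stand-in kernel (assumption): K[i][j] = exp(-rho * Hamming distance)."""
    n = len(seqs)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            d = sum(a != b for a, b in zip(seqs[i], seqs[j]))
            K[i][j] = K[j][i] = math.exp(-rho * d)
    return K


def score_stat(y, K):
    """Q = r' K r, where r = y - mean(y) is the residual under an intercept-only null."""
    n = len(y)
    mu = sum(y) / n
    r = [yi - mu for yi in y]
    return sum(r[i] * K[i][j] * r[j] for i in range(n) for j in range(n))


def max_score_test(seqs, y, rhos, n_boot=200, seed=0):
    """Return (max standardized score statistic, parametric bootstrap p-value)."""
    rng = random.Random(seed)
    kernels = [hamming_kernel(seqs, rho) for rho in rhos]
    obs = [score_stat(y, K) for K in kernels]

    # Parametric bootstrap: simulate outcomes from the fitted null model.
    mu0 = sum(y) / len(y)
    boot = []
    for _ in range(n_boot):
        y_star = [1 if rng.random() < mu0 else 0 for _ in y]
        boot.append([score_stat(y_star, K) for K in kernels])

    # Standardize each statistic by its bootstrap mean and standard deviation
    # so that values are comparable across kernel parameters.
    m = len(rhos)
    means = [sum(b[k] for b in boot) / n_boot for k in range(m)]
    sds = [
        max(1e-12, math.sqrt(sum((b[k] - means[k]) ** 2 for b in boot) / (n_boot - 1)))
        for k in range(m)
    ]

    def max_std(q):
        return max((q[k] - means[k]) / sds[k] for k in range(m))

    t_obs = max_std(obs)
    t_boot = [max_std(b) for b in boot]
    p_value = (1 + sum(t >= t_obs for t in t_boot)) / (n_boot + 1)
    return t_obs, p_value


if __name__ == "__main__":
    # Toy data, made up for illustration only.
    seqs = ["ACDE", "ACDF", "ACDE", "GHIK", "GHIK", "GHIL"]
    y = [1, 1, 1, 0, 0, 0]
    print(max_score_test(seqs, y, rhos=[0.1, 0.5, 1.0, 2.0]))
```

Taking the maximum over the grid addresses the fact that the kernel parameter is not identified under the null hypothesis (the "Davies problem" listed in the keywords); the bootstrap then supplies a reference distribution for that maximum.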

Keywords:  Davies problem; Kernel methods; Maximum of score statistics

Year:  2014        PMID: 25532524      PMCID: PMC4794618          DOI: 10.1093/biostatistics/kxu056

Source DB:  PubMed          Journal:  Biostatistics        ISSN: 1465-4644            Impact factor:   5.279


