
Framework for kernel regularization with application to protein clustering.

Fan Lu, Sündüz Keles, Stephen J Wright, Grace Wahba.

Abstract

We develop and apply a previously undescribed framework that is designed to extract information in the form of a positive definite kernel matrix from possibly crude, noisy, incomplete, inconsistent dissimilarity information between pairs of objects, obtainable in a variety of contexts. Any positive definite kernel defines a consistent set of distances, and the fitted kernel provides a set of coordinates in Euclidean space that attempts to respect the information available while controlling for complexity of the kernel. The resulting set of coordinates is highly appropriate for visualization and as input to classification and clustering algorithms. The framework is formulated in terms of a class of optimization problems that can be solved efficiently by using modern convex cone programming software. The power of the method is illustrated in the context of protein clustering based on primary sequence data. An application to the globin family of proteins resulted in a readily visualizable 3D sequence space of globins, where several subfamilies and subgroupings consistent with the literature were easily identifiable.
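The statement that a positive definite kernel defines a consistent set of distances and a set of Euclidean coordinates can be illustrated with a small sketch. This is not the authors' fitting procedure (which the abstract describes as a convex cone programming problem) and it does not use any protein data; it only assumes the standard identities d_ij^2 = K_ii + K_jj - 2 K_ij and coordinates taken from the leading eigenpairs of K. The function names, the toy kernel, and the choice of three dimensions are illustrative assumptions.

```python
import numpy as np


def kernel_to_sq_distances(K):
    """Squared distances induced by a kernel: d_ij^2 = K_ii + K_jj - 2 K_ij."""
    diag = np.diag(K)
    return diag[:, None] + diag[None, :] - 2.0 * K


def kernel_to_coordinates(K, dim=3):
    """Coordinates from the `dim` largest eigenpairs of a symmetric PSD kernel."""
    eigvals, eigvecs = np.linalg.eigh(K)        # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:dim]     # indices of the largest ones
    top_vals = np.clip(eigvals[order], 0.0, None)  # guard against tiny negatives
    return eigvecs[:, order] * np.sqrt(top_vals)


# Toy kernel (an assumption for illustration): Gram matrix of 5 random 3D points.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
K = X @ X.T

D2 = kernel_to_sq_distances(K)          # pairwise squared distances implied by K
Z = kernel_to_coordinates(K, dim=3)     # one 3D coordinate vector per object
```

Because the toy kernel has rank 3, the 3D coordinates in `Z` reproduce the kernel and its distances up to rounding; for a fitted kernel of higher rank, truncating to three dimensions would give only an approximation suitable for visualization.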


Year:  2005        PMID: 16109767      PMCID: PMC1187947          DOI: 10.1073/pnas.0505411102

Source DB:  PubMed          Journal:  Proc Natl Acad Sci U S A        ISSN: 0027-8424            Impact factor:   11.205


