Identifying relevant positions in proteins by Critical Variable Selection.

Silvia Grigolon, Silvio Franz, Matteo Marsili.

Abstract

Evolution in its course has found a variety of solutions to the same optimisation problem. The advent of high-throughput genomic sequencing has made available extensive data from which, in principle, one can infer the underlying structure on which biological functions rely. In this paper, we present a new method aimed at extracting sites that encode structural and functional properties from a set of protein primary sequences, namely a multiple sequence alignment. The method, called critical variable selection, is based on the idea that subsets of relevant sites correspond to subsequences that occur with a particularly broad frequency distribution in the dataset. By applying this algorithm to in silico sequences, to the response regulator receiver and to the voltage sensor domain of ion channels, we show that this procedure not only recovers the information encoded in single-site statistics and pairwise correlations but also captures dependencies going beyond pairwise correlations. The method proposed here is complementary to statistical coupling analysis, in that the most relevant sites predicted by the two methods differ markedly. We find robust and consistent results for datasets as small as a few hundred sequences. These results reveal a hidden hierarchy of sites that is consistent with present knowledge of biologically relevant sites and evolutionary dynamics. This suggests that critical variable selection is capable of identifying a core of sites encoding functional and structural information in a multiple sequence alignment.
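The phrase "particularly broad frequency distribution" can be made concrete with a small sketch. The abstract does not give a formula, so the score below is an assumption: it uses the entropy of the distribution of subsequence frequencies, a measure that appears in related work on relevance, and it is not claimed to be the authors' exact procedure. The names `frequency_entropy` and `toy_msa` are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import log


def frequency_entropy(msa, sites):
    """Entropy of the frequency distribution of subsequences on `sites`.

    Assumption (not stated in the abstract): "broadness" is scored as
    H[K] = -sum_k (k * m_k / M) * log(k * m_k / M),
    where m_k is the number of distinct subsequences seen exactly k times
    and M is the number of sequences in the alignment.
    """
    M = len(msa)
    # Restrict every sequence to the chosen sites and count each subsequence.
    counts = Counter(tuple(seq[i] for i in sites) for seq in msa)
    # m[k] = number of distinct subsequences observed exactly k times.
    m = Counter(counts.values())
    # k * m_k / M is the fraction of sequences whose subsequence occurs k times.
    return -sum((k * mk / M) * log(k * mk / M) for k, mk in m.items())


# Hypothetical toy alignment: six sequences of length four.
toy_msa = ["ACDA", "ACDC", "ACEA", "GCEC", "GCDA", "ACDA"]

# Compare a candidate subset of sites with a fully conserved single site.
score_pair = frequency_entropy(toy_msa, [0, 2])
score_conserved = frequency_entropy(toy_msa, [1])
```

Under this assumed score, a fully conserved site yields zero, because every sequence shares one subsequence, while subsets whose subsequences occur with many different frequencies score higher. A selection procedure would then search for the subset of sites maximising the score; that search is not shown here.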

Year:  2016        PMID: 26974515     DOI: 10.1039/c6mb00047a

Source DB:  PubMed          Journal:  Mol Biosyst        ISSN: 1742-2051


  6 in total

1.  Evolutionarily Conserved Interactions within the Pore Domain of Acid-Sensing Ion Channels.

Authors:  Marina A Kasimova; Timothy Lynagh; Zeshan Pervez Sheikh; Daniele Granata; Christian Bernsen Borg; Vincenzo Carnevale; Stephan Alexander Pless
Journal:  Biophys J       Date:  2019-09-06       Impact factor: 4.033

2.  Multiscale relevance and informative encoding in neuronal spike trains.

Authors:  Ryan John Cubero; Matteo Marsili; Yasser Roudi
Journal:  J Comput Neurosci       Date:  2020-01-28       Impact factor: 1.621

3.  Evolution of an intricate J-protein network driving protein disaggregation in eukaryotes.

Authors:  Nadinath B Nillegoda; Antonia Stank; Duccio Malinverni; Niels Alberts; Anna Szlachcic; Alessandro Barducci; Paolo De Los Rios; Rebecca C Wade; Bernd Bukau
Journal:  Elife       Date:  2017-05-15       Impact factor: 8.140

4.  ConKit: a python interface to contact predictions.

Authors:  Felix Simkovic; Jens M H Thomas; Daniel J Rigden
Journal:  Bioinformatics       Date:  2017-07-15       Impact factor: 6.937

5.  Applications of contact predictions to structural biology. (Review)

Authors:  Felix Simkovic; Sergey Ovchinnikov; David Baker; Daniel J Rigden
Journal:  IUCrJ       Date:  2017-04-18       Impact factor: 4.769

6.  An introduction to the maximum entropy approach and its application to inference problems in biology. (Review)

Authors:  Andrea De Martino; Daniele De Martino
Journal:  Heliyon       Date:  2018-04-13
