Analysis of information content for biological sequences.

Jian Zhang.

Abstract

Decomposing a biological sequence into modular domains is a basic prerequisite for identifying functional units in biological molecules. Commonly used segmentation procedures have two steps. First, a set of sequences homologous to the target sequence is collected and aligned. Then this multiple alignment is parsed into several blocks, and the functionally important ones are identified by a semi-automatic method that combines manual analysis and expert knowledge. In this paper, we present a novel exploratory approach to parsing and analyzing such multiple alignments. It is based on a type of analysis-of-variance (ANOVA) decomposition of the sequence information content. Unlike the traditional change-point method, this approach takes into account not only the composition biases but also the overdispersion effects among the blocks. The new approach is tested on families of ribosomal proteins and shows promising performance. It provides a better way to judge which residues in these proteins are important, making it possible to find subsets of residues that are critical to these proteins.
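The abstract does not give the paper's formulas, so the following is only a minimal sketch of the quantity the method builds on: the per-column information content of a multiple alignment. It assumes the common Shannon definition (maximum entropy of a 20-letter alphabet minus the observed column entropy) and skips gap characters. The function name `column_information`, the toy alignment, and the gap handling are illustrative assumptions, not the author's ANOVA decomposition.

```python
import math
from collections import Counter


def column_information(alignment, alphabet_size=20):
    """Return the information content (in bits) of each alignment column.

    Assumption: information = log2(alphabet_size) - Shannon entropy of the
    observed residues, with gap characters ('-') ignored. This is a common
    textbook definition, not necessarily the one used in the paper.
    """
    ncols = len(alignment[0])
    if any(len(seq) != ncols for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    max_bits = math.log2(alphabet_size)
    info = []
    for j in range(ncols):
        residues = [seq[j] for seq in alignment if seq[j] != "-"]
        if not residues:
            # A column made only of gaps carries no information.
            info.append(0.0)
            continue
        n = len(residues)
        counts = Counter(residues)
        entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
        info.append(max_bits - entropy)
    return info


# Hypothetical toy alignment: four short sequences, one gap.
toy_alignment = ["MKVL", "MKIL", "MRVL", "MK-L"]
info = column_information(toy_alignment)
```

Fully conserved columns (the first and last here) reach the maximum of log2(20) bits, while variable columns score lower. A block-level analysis such as the one described above would then compare these values within and between candidate blocks.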

Year:  2002        PMID: 12162888     DOI: 10.1089/106652702760138583

Source DB:  PubMed          Journal:  J Comput Biol        ISSN: 1066-5277            Impact factor:   1.479


  2 in total

1.  Multipattern consensus regions in multiple aligned protein sequences and their segmentation.

Authors:  David K Y Chiu; Yan Wang
Journal:  EURASIP J Bioinform Syst Biol       Date:  2006

2.  Rooted triple consensus and anomalous gene trees.

Authors:  Gregory B Ewing; Ingo Ebersberger; Heiko A Schmidt; Arndt von Haeseler
Journal:  BMC Evol Biol       Date:  2008-04-25       Impact factor: 3.260
