Information theoretic perspective on genome clustering.

Alaguraj Veluchamy, Preeti Mehta, K V Srividhya, Hirendra Vikram, M K Govind, Ramneek Gupta, Abdul Aziz Bin Dukhyil, Raed Abdullah Alharbi, Saleh Abdullah Aloyuni, Mohamed M Hassan, S Krishnaswamy.

Abstract

Shannon's information-theoretic view of communication helps one understand how information is stored and processed in one-dimensional sequences. An information-theoretic analysis of 937 completely sequenced prokaryotic genomes and 238 eukaryotic chromosomes is presented. Information content (Id) values were used to cluster these chromosomes. Chargaff's second parity rule (i.e., compositional self-complementarity), an empirical fact, is observed in all the genomes except that of the proteobacterium Candidatus Hodgkinia cicadicola. High information content arising from biased base composition is found in all 14 chromosomes of Plasmodium falciparum and in two prokaryotic genomes, viz. Buchnera aphidicola str. Cc (Cinara cedri) and Candidatus Carsonella ruddii PV. Despite variations in size and composition, neither prokaryotic nor eukaryotic genomes deviate significantly from an equiprobable and random situation. When a simple distance-based method is used to cluster them, the chromosomes of a given eukaryotic organism tend to show similar informational restraints. In certain eukaryotes, Id values are also similar for the two arms (p and q) of a chromosome. The results of this study confirm that information content can provide insights into the clustering of genomes and the evolution of their messaging strategies. An efficient and robust standalone Perl CGI tool based on this information theory algorithm was created for the analysis of whole genomes and is available at https://github.com/AlagurajVeluchamy/InformationTheory.
© 2020 The Author(s).
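
As a rough illustration of the quantities mentioned in the abstract, the following Python sketch computes the Shannon entropy of a sequence's base composition, its divergence from the equiprobable case (2 bits per base for a four-letter alphabet), and a simple single-strand check of Chargaff's second parity rule. This is a minimal sketch under stated assumptions, not the authors' Perl tool: the function names are hypothetical, and the exact definition of Id used in the paper may differ from the composition-based divergence shown here.

```python
import math
from collections import Counter

BASES = "ACGT"


def base_frequencies(seq):
    """Relative frequencies of A, C, G, T; other symbols (e.g. N) are ignored."""
    counts = Counter(b for b in seq.upper() if b in BASES)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T symbols")
    return {b: counts[b] / total for b in BASES}


def shannon_entropy(freqs):
    """Shannon entropy H = -sum(p * log2 p), in bits per base."""
    return -sum(p * math.log2(p) for p in freqs.values() if p > 0)


def divergence_from_equiprobability(freqs):
    """log2(4) - H: zero for equal base usage, larger for biased composition.

    Assumption: used here as a stand-in for an information-content value.
    """
    return math.log2(len(BASES)) - shannon_entropy(freqs)


def parity_deviation(freqs):
    """|f(A) - f(T)| + |f(G) - f(C)| on a single strand.

    Values near zero are consistent with Chargaff's second parity rule.
    """
    return abs(freqs["A"] - freqs["T"]) + abs(freqs["G"] - freqs["C"])


if __name__ == "__main__":
    # Toy sequences for illustration only, not real genome data.
    for name, seq in [("balanced", "ACGT" * 25), ("AT-rich", "AATTAATTAC" * 10)]:
        f = base_frequencies(seq)
        print(name, shannon_entropy(f), divergence_from_equiprobability(f),
              parity_deviation(f))
```

Per-chromosome values computed this way could then be compared with any simple distance measure to group chromosomes, in the spirit of the distance-based clustering the abstract describes.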

Keywords:  Genome arrangement; Genome clustering; Genome evolution; Information theory; Nucleotide distribution; Shannon redundancy

Year: 2020; PMID: 33732074; PMCID: PMC7938122; DOI: 10.1016/j.sjbs.2020.12.039

Source DB: PubMed; Journal: Saudi J Biol Sci; ISSN: 2213-7106; Impact factor: 4.219


