Rare k-mer DNA: Identification of sequence motifs and prediction of CpG island and promoter.

Ezzeddin Kamil Mohamed Hashim, Rosni Abdullah.

Abstract

Empirical analysis of k-mer DNA has proven to be an effective tool for finding unique patterns in DNA sequences, which can lead to the discovery of potential sequence motifs. In an extensive empirical k-mer study of hundreds of organisms, researchers found that unique multi-modal k-mer spectra occur only in the genomes of organisms from the tetrapod clade, which includes all mammals. The multi-modality is caused by the formation of the two lowest modes; the k-mers under these modes are referred to as rare k-mers. The suppression of the two lowest modes (that is, of the rare k-mers) can be attributed to the CG dinucleotides they contain. In addition, rare k-mers are selectively distributed in certain genomic features: CpG islands (CGIs), promoters, 5' UTRs, and exons. We correlated the rare k-mers with hundreds of annotated features using several bioinformatic tools, performed further intrinsic rare k-mer analyses within the correlated features, and modeled the resulting rare k-mer clustering property as a classifier to predict the correlated CGI and promoter features. Our correlation results show that rare k-mers are highly associated with several annotated features: CGIs, promoters, 5' UTRs, and open chromatin regions. Our intrinsic results show that rare k-mers have several unique topological, compositional, and clustering properties in CGI and promoter features. Finally, our RWC (rare-word clustering) method ranks among the top three in eight of the CGI and promoter evaluations, across eight of the benchmarked datasets.

Crown Copyright © 2015. Published by Elsevier Ltd. All rights reserved.
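
The notions the abstract relies on (a k-mer spectrum, a set of rare k-mers, their CG content, and their local clustering) can be made concrete with a minimal Python sketch. This is an illustration only, not the authors' RWC method: the function names, the `max_count` cutoff used to call a k-mer "rare", and the non-overlapping `window` size are all hypothetical choices. The abstract defines rare k-mers by the two lowest modes of the spectrum and does not give a numeric threshold.

```python
from collections import Counter


def count_kmers(seq, k):
    """Count every overlapping k-mer that consists only of A, C, G, T."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):  # skip k-mers with N or other symbols
            counts[kmer] += 1
    return counts


def kmer_spectrum(counts):
    """Map each occurrence count to the number of distinct k-mers with that count."""
    return Counter(counts.values())


def rare_kmers(counts, max_count):
    """Assumed cutoff: treat observed k-mers seen at most max_count times as rare.

    k-mers that never occur in the sequence are not in `counts` and are ignored.
    """
    return {kmer for kmer, n in counts.items() if n <= max_count}


def cg_fraction(kmers):
    """Fraction of the given k-mers that contain the CG dinucleotide."""
    if not kmers:
        return 0.0
    return sum("CG" in kmer for kmer in kmers) / len(kmers)


def rare_density(seq, k, rare, window):
    """Fraction of k-mer start positions that are rare, per non-overlapping window."""
    seq = seq.upper()
    flags = [seq[i:i + k] in rare for i in range(len(seq) - k + 1)]
    return [
        sum(flags[s:s + window]) / len(flags[s:s + window])
        for s in range(0, len(flags), window)
    ]
```

On a real genome, one would plot `kmer_spectrum` to look for the multi-modal shape, pick the rare set from the lowest modes rather than from a fixed cutoff, and compare `rare_density` inside and outside annotated CGI or promoter intervals. A classifier in the spirit of the abstract would then threshold or otherwise model that density, which this sketch does not attempt.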

Keywords:  CGI; Classification; Genome; Rare-word; k-tuple; n-mer

Year:  2015        PMID: 26427337     DOI: 10.1016/j.jtbi.2015.09.014

Source DB:  PubMed          Journal:  J Theor Biol        ISSN: 0022-5193            Impact factor:   2.691


  5 in total

1.  Evolutionary mechanism and biological functions of 8-mers containing CG dinucleotide in yeast.

Authors:  Yan Zheng; Hong Li; Yue Wang; Hu Meng; Qiang Zhang; Xiaoqing Zhao
Journal:  Chromosome Res       Date:  2017-02-09       Impact factor: 5.239

2.  Informational laws of genome structures.

Authors:  Vincenzo Bonnici; Vincenzo Manca
Journal:  Sci Rep       Date:  2016-06-29       Impact factor: 4.379

3.  PseUI: Pseudouridine sites identification based on RNA sequence information.

Authors:  Jingjing He; Ting Fang; Zizheng Zhang; Bei Huang; Xiaolei Zhu; Yi Xiong
Journal:  BMC Bioinformatics       Date:  2018-08-29       Impact factor: 3.169

4.  Intrinsic laws of k-mer spectra of genome sequences and evolution mechanism of genomes.

Authors:  Zhenhua Yang; Hong Li; Yun Jia; Yan Zheng; Hu Meng; Tonglaga Bao; Xiaolong Li; Liaofu Luo
Journal:  BMC Evol Biol       Date:  2020-11-23       Impact factor: 3.260

5.  Methylation-driven model for analysis of dinucleotide evolution in genomes.

Authors:  Jian-Hong Sun; Shi-Meng Ai; Shu-Qun Liu
Journal:  Theor Biol Med Model       Date:  2020-04-08       Impact factor: 2.432
