Xianjun Lai, Sairam Behera, Zhikai Liang, Yanli Lu, Jitender S Deogun, James C Schnable.
Abstract
One method for identifying noncoding regulatory regions of a genome is to quantify rates of divergence between related species, as functional sequence will generally diverge more slowly. Most alignment-based approaches to identifying these conserved noncoding sequences (CNSs) have had relatively large minimum sequence lengths (≥15 bp) compared with the average length of known transcription factor binding sites. To circumvent this constraint, STAG-CNS, which can simultaneously integrate data from the promoters of conserved orthologous genes in three or more species, was developed. Using data from up to six grass species made it possible to identify conserved sequences as short as 9 bp with a false discovery rate ≤0.05. These CNSs exhibit greater overlap with open chromatin regions identified using DNase I hypersensitivity assays, and are enriched in the promoters of genes involved in transcriptional regulation. STAG-CNS was further employed to characterize the loss of conserved noncoding sequences associated with retained duplicate genes from the ancient maize polyploidy. Genes with fewer retained CNSs show lower overall expression, although this bias is more apparent in samples of complex organ systems containing many cell types, suggesting that CNS loss may correspond to a reduced number of expression contexts rather than lower expression levels across the entire ancestral expression domain.
Keywords: comparative genomics; conserved noncoding sequence; grain crops; longest path algorithm; suffix tree
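The abstract and keywords describe combining promoter sequences from three or more species and selecting short conserved blocks with a longest path algorithm. The sketch below illustrates that general idea only and is not the STAG-CNS implementation. As simplifying assumptions, it swaps the suffix tree for a plain dictionary of k-mers, considers exact same-strand matches only, and records just the first occurrence of each k-mer in each sequence. The function names (`first_positions`, `shared_seeds`, `longest_collinear_chain`) and the toy sequences are hypothetical.

```python
def first_positions(seq, k):
    """Map each k-mer in seq to the index of its first occurrence."""
    pos = {}
    for i in range(len(seq) - k + 1):
        pos.setdefault(seq[i:i + k], i)
    return pos


def shared_seeds(seqs, k=9):
    """Return (kmer, positions) for every k-mer present in all sequences.

    positions holds one start index per input sequence, in input order.
    Seeds are sorted by position, so the first sequence gives a
    topological order for the chaining step.
    """
    tables = [first_positions(s.upper(), k) for s in seqs]
    common = set(tables[0]).intersection(*tables[1:])
    seeds = [(kmer, tuple(t[kmer] for t in tables)) for kmer in common]
    return sorted(seeds, key=lambda seed: seed[1])


def longest_collinear_chain(seeds):
    """Longest path in the DAG with an edge a -> b whenever seed a
    starts before seed b in every sequence (quadratic-time DP)."""
    if not seeds:
        return []
    best = [1] * len(seeds)    # best chain length ending at each seed
    prev = [-1] * len(seeds)   # back-pointer for path reconstruction
    for j, (_, pj) in enumerate(seeds):
        for i in range(j):
            pi = seeds[i][1]
            if all(x < y for x, y in zip(pi, pj)) and best[i] + 1 > best[j]:
                best[j] = best[i] + 1
                prev[j] = i
    j = max(range(len(seeds)), key=best.__getitem__)
    chain = []
    while j != -1:
        chain.append(seeds[j])
        j = prev[j]
    return chain[::-1]


if __name__ == "__main__":
    # Toy "promoters": two 9 bp blocks shared in the same order.
    promoters = [
        "TTTTACGTACGGACCCCCGATTACAGTAA",
        "GGACGTACGGATTTTTTTGATTACAGTCCC",
        "ACGTACGGAAGAGAGGATTACAGT",
    ]
    for kmer, positions in longest_collinear_chain(shared_seeds(promoters)):
        print(kmer, positions)
```

Requiring a seed to occur in every species is what makes short seeds usable in this toy setting: a chance 9 bp match between two sequences is far more likely than the same match recurring, in the same order, across all of them. A real tool would also need to handle repeated k-mers, overlapping seeds, and an empirical false discovery estimate, none of which this sketch attempts.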
Year: 2017; PMID: 28602693; DOI: 10.1016/j.molp.2017.05.010
Source DB: PubMed; Journal: Mol Plant; ISSN: 1674-2052; Impact factor: 13.164