Zhixin Zhao, Cheng Guo, Sreeskandarajan Sutharzan, Pei Li, Craig S Echt, Jie Zhang, Chun Liang.
Abstract
Tandem repeats (TRs) are widespread in the genomes of prokaryotes and eukaryotes. Based on the sequenced genomes and gene annotations of 31 plant and algal species in Phytozome version 8.0 (http://www.phytozome.net/), we examined TRs on a genome-wide scale, characterized their distributions and motif features, and explored their putative biological functions. Among the 31 species, no significant correlation was detected between TR density and genome size. Interestingly, the green alga Chlamydomonas reinhardtii (42,059 bp/Mbp) and castor bean Ricinus communis (55,454 bp/Mbp) showed much higher TR densities than all other species (13,209 bp/Mbp on average). In the 29 land plants, including 22 dicots, 5 monocots, and 2 bryophytes, the 5'-UTR and upstream intergenic 200-nt (UI200) regions had the first and second highest TR densities, whereas in the two green algae (C. reinhardtii and Volvox carteri) the first and second highest densities were found in intron and coding sequence (CDS) regions, respectively. In CDS regions, trinucleotide and hexanucleotide motifs were the most frequently represented in all species. In intron regions, especially in the two green algae, significantly more TRs were detected near the intron-exon junctions. Within intergenic regions in dicots and monocots, more TRs were found near both the 5' and 3' ends of genes. GO annotation in the two green algae revealed that genes with TRs in introns are significantly involved in transcriptional and translational processing. As the first systematic examination of TRs in plant and green algal genomes, our study showed that TRs display a nonrandom distribution in both intragenic and intergenic regions, suggesting that they have potential roles in transcriptional or translational regulation in plants and green algae.
Keywords: SSR; genomes; green algae; plants; tandem repeats
Year: 2014 PMID: 24192840 PMCID: PMC3887541 DOI: 10.1534/g3.113.008524
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.154
Figure 1. Phylogenetic tree of the 31 species shown in Phytozome version 8.0 (http://www.phytozome.net). Abbreviated names (the first two letters of the genus and species names combined) and common names are listed in parentheses.
Figure 2. Schematic of the intragenic and intergenic regions used for TR analysis. UI200: 1–200 nt upstream of 5′UTR; UI500: 201–700 nt upstream of 5′UTR; UI1000: 701–1700 nt upstream of 5′UTR; DI200: 1–200 nt downstream of 3′UTR; DI500: 201–700 nt downstream of 3′UTR; and DI1000: 701–1700 nt downstream of 3′UTR.
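The window definitions in the Figure 2 caption can be turned into genomic coordinates with a short calculation. The following is a minimal sketch, not the authors' pipeline: it assumes 1-based inclusive coordinates, a transcript span that runs from the start of the 5′UTR to the end of the 3′UTR, and a hypothetical helper name `flanking_windows`. It clips windows at the start of the sequence but does not handle neighboring genes or the end of a chromosome.

```python
# Offsets (in nt) from the transcript boundary, taken from the Figure 2 caption.
WINDOWS = {
    "UI200": ("up", 1, 200),
    "UI500": ("up", 201, 700),
    "UI1000": ("up", 701, 1700),
    "DI200": ("down", 1, 200),
    "DI500": ("down", 201, 700),
    "DI1000": ("down", 701, 1700),
}


def flanking_windows(tx_start, tx_end, strand):
    """Return {window name: (start, end) or None} in 1-based inclusive coordinates.

    tx_start and tx_end are the lowest and highest genomic coordinates of the
    transcript (UTRs included), so tx_start <= tx_end on either strand.
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    if tx_start > tx_end:
        raise ValueError("tx_start must not exceed tx_end")

    result = {}
    for name, (side, near, far) in WINDOWS.items():
        # A window sits at lower coordinates when it is upstream on the
        # plus strand or downstream on the minus strand.
        at_lower_coords = (side == "up") == (strand == "+")
        if at_lower_coords:
            lo, hi = tx_start - far, tx_start - near
        else:
            lo, hi = tx_end + near, tx_end + far
        # Drop windows that fall entirely before position 1; clip partial ones.
        result[name] = None if hi < 1 else (max(lo, 1), hi)
    return result


if __name__ == "__main__":
    for name, span in flanking_windows(2000, 3000, "+").items():
        print(name, span)
```

On the minus strand the same offsets are mirrored, so UI200 for a transcript at 2000–3000 lies at 3001–3200 rather than 1800–1999.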
Figure 3. Genome size vs. genomic TR density in 31 land plants and green algae. Abbreviated names combine the first two letters of the genus and species names (see Figure 1). The linear trendline was made using Excel.
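TR density is reported in bp/Mbp, and Figure 3 relates it to genome size with a linear trendline. The sketch below shows one way to compute both quantities. It is an ordinary least-squares fit written in plain Python, offered as an equivalent of a spreadsheet trendline rather than a reproduction of the authors' analysis; the function names and the input values in the usage example are hypothetical.

```python
import math


def tr_density(tr_bp, genome_bp):
    """TR density in bp per Mbp: repeat-covered bases per million genome bases."""
    if genome_bp <= 0:
        raise ValueError("genome_bp must be positive")
    return tr_bp / genome_bp * 1_000_000


def linear_trend(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept, plus Pearson r."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


if __name__ == "__main__":
    # Hypothetical genome sizes (Mbp) and TR densities (bp/Mbp), for illustration only.
    sizes = [120.0, 350.0, 480.0, 730.0]
    densities = [15000.0, 11000.0, 14000.0, 12500.0]
    print(linear_trend(sizes, densities))
```

A Pearson r close to zero corresponds to the abstract's finding of no significant correlation between TR density and genome size, although a significance test would still be needed to support that conclusion.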
Figure 4. The relative TR densities in different intragenic and intergenic regions. (A) Two green algae. (B) Twenty species including dicots and bryophytes. (C) Five monocot land plants. (D) Four land plant species without UTR annotations.
The most frequent TR motifs and GC contents in different genomic regions
| Group Name (Genome GC Content Range) | Region | Average GC Content, % | Top Motifs |
|---|---|---|---|
| Dicots (32%–40%) | Whole genome | 35.61 | A/T; AT; ATT/AAT, AAG/CTT |
| | UI1000 | 32.63 | A/T; AT; ATT/AAT, AAG/CTT |
| | UI500 | 31.75 | A/T; AT; ATT/AAT |
| | UI200 | 35.56 | A/T; AT, CT; CTT/AAG |
| | 5′UTR | 39.89 | CT/AG; CTT/AAG |
| | CDS | 44.34 | AAG/CTT |
| | Intron | 33.19 | T; AT; ATT |
| | 3′UTR | 35.41 | T; AT; ATT/AAT |
| | DI200 | 34.58 | T/A; AT, AG; AAT/ATT, AAG/CTT |
| | DI500 | 32.78 | T/A; AT; AAT/ATT, AAG/CTT |
| | DI1000 | 33.50 | T/A; AT; AAT/ATT, AAG/CTT |
| Monocots and bryophytes (34%–55%) | Whole genome | 43.29 | AT; AAT/ATT, CCG/CGG |
| | UI1000 | 42.39 | AT; AAT/ATT, CCG/CGG |
| | UI500 | 43.38 | AT; ATT, CCG |
| | UI200 | 49.20 | AT; CCG |
| | 5′UTR | 54.79 | AG/CT; CCG |
| | CDS | 53.32 | CGG/CCG |
| | Intron | 39.15 | C; CT |
| | 3′UTR | 42.34 | CTT/AAG, CCG, GT |
| | DI200 | 45.27 | AT; CGG/CCG |
| | DI500 | 42.67 | AT; CGG/CCG |
| | DI1000 | 42.56 | AT; CGG/CCG |
| Green algae (>55%) | Whole genome | 59.58 | AC/GT; CCG/CGG; AAGCATATGCGATCTGC |
| | UI1000 | 57.38 | AC/GT; CCG/CGG; AAGCATATGCGATCTGC |
| | UI500 | 55.84 | GT/AC |
| | UI200 | 52.32 | GT/AC |
| | 5′UTR | 52.82 | AGC |
| | CDS | 66.17 | CCG/CGG, AGC |
| | Intron | 58.38 | GT/AC |
| | 3′UTR | 55.42 | GCT/AGC; GT |
| | DI200 | 53.75 | GT/AC |
| | DI500 | 56.32 | GT/AC; AAGCATATGCGATCTGC |
| | DI1000 | 57.16 | GT/AC; AAGCATATGCGATCTGC |
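Motifs in the table are written as reverse-complement pairs such as AAG/CTT, since a repeat on one strand reads as its reverse complement on the other. The sketch below illustrates how such pairs, together with their cyclic rotations, can be collapsed to a single representative, and how the GC content of a sequence can be computed. The choice of the lexicographically smallest form as the representative is an assumption made for illustration; the source does not state which convention the authors used, and the function names are hypothetical.

```python
_COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Reverse complement of an uppercase or lowercase A/C/G/T sequence."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))


def gc_content(seq):
    """Percentage of G and C bases in a non-empty sequence."""
    if not seq:
        raise ValueError("sequence must not be empty")
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def canonical_motif(motif):
    """Smallest string among all rotations of the motif and of its reverse complement.

    Rotations are included because a tandem array of AAG also reads as
    AGA or GAA depending on where counting starts.
    """
    motif = motif.upper()
    candidates = []
    for form in (motif, reverse_complement(motif)):
        for i in range(len(form)):
            candidates.append(form[i:] + form[:i])
    return min(candidates)


if __name__ == "__main__":
    for m in ("CTT", "AAG", "GT", "CGG"):
        print(m, canonical_motif(m), round(gc_content(m), 2))
```

Under this convention AAG and CTT both map to AAG, and CCG and CGG both map to CCG, so counts for the two strands of the same repeat can be pooled.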
Figure 5. The relative distribution position of TRs in the intron and intergenic regions. (A) UI200 region in dicots and monocots. (B) DI200 region in dicots and monocots. (C) Intron region in the 31 investigated species. (D) UI200 region in bryophytes and green algae. (E) DI200 region in bryophytes and green algae.
Figure 6. The percentage of TRs in different intragenic and intergenic regions. (A) mRNAs. (B) UI200 region. (C) 5′-UTR region. (D) 3′-UTR region. (E) CDS region. (F) Intron region.
The most significant GO functions of genes with TRs in introns in C. reinhardtii
| Term | P-value |
|---|---|
| Catalytic activity | 5.384e−35 |
| Hydrolase activity | 1.344e−10 |
| Oxidoreductase activity | 7.792e−6 |
| Peptidase activity | 1.109e−4 |
| Peptidase activity acting on L-amino acid peptides | 1.766e−4 |
| Endopeptidase activity | 5.793e−4 |