Shu-Ye Jiang, Srinivasan Ramachandran.
Abstract
Long terminal repeat (LTR) retrotransposons are the major class I mobile elements in plants. They play crucial roles in gene expansion, diversification and evolution. However, the genes they capture have not yet been identified and characterized genome-wide in most plants, although many genomes have been completely sequenced. In this study, we identified 7,043 and 23,915 full-length LTR retrotransposons in the rice and sorghum genomes, respectively. A high percentage of rice full-length LTR retrotransposons were distributed near the centromeric region of each chromosome. In contrast, sorghum full-length LTR retrotransposons were not enriched in centromeric regions. This difference could be due to differing retrotransposition activity during and after divergence from their common ancestor, and thus might contribute to species divergence. A total of 672 and 1,343 genes have been captured by these elements in rice and sorghum, respectively. Gene Ontology (GO) and gene set enrichment analysis (GSEA) identified no over-represented GO term among LTR captured rice genes. For LTR captured sorghum genes, GO terms with functions in DNA/RNA metabolism and chromatin organization were over-represented. Only 36% of LTR captured rice genes were expressed, and expression divergence was estimated as 11.9%. A higher percentage of LTR captured rice genes have evolved into pseudogenes under neutral selection. In contrast, a higher percentage of LTR captured sorghum genes were under purifying selection, and 72.4% of them were expressed. Thus, a higher percentage of LTR captured sorghum genes were functional. Small RNA analysis suggested that some LTR captured genes in rice and sorghum might be involved in negative regulation. On the other hand, positive selection was observed in both rice and sorghum LTR captured genes, and some of these genes were still expressed and functional. 
The data suggest that some of these LTR captured genes might have evolved new gene functions.
Year: 2013 PMID: 23923055 PMCID: PMC3726574 DOI: 10.1371/journal.pone.0071118
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. LTR retrotransposons in the rice, sorghum and maize genomes.
(A) General information on the rice, sorghum and maize genomes, including genome size and annotated genes. The rice genome size and annotation were based on release 7 of the pseudomolecules (http://rice.plantbiology.msu.edu/). The genome sizes of sorghum and maize, as well as their annotations, were estimated according to Paterson et al. (2009) [15] and Schnable et al. (2009) [5]. (B) The genome size occupied by LTR retrotransposons and its percentage of the whole genome in rice, sorghum and maize. (C) Genome-wide identification of full-length LTR retrotransposons, with the genome size they occupy and their percentages in rice, sorghum and maize. (D) LTR captured genes in rice, sorghum and maize.
Figure 2. Chromosomal distributions of full-length LTR retrotransposons in the rice and sorghum genomes.
Density distributions are based on the physical positions of corresponding LTR retrotransposons. X-axis indicates chromosomal positions (Mb). Y-axis indicates retrotransposon density (the percentage of total number of retrotransposons). (A) and (B) show the distributions of retrotransposons in the rice and sorghum genomes, respectively. Centromere positions are marked with red dots on each chromosome. Blue lines indicate high percentages of full-length LTR retrotransposons near chromosome centromere regions and green curves indicate that high percentages of retrotransposons are not located near centromere regions.
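The density measure described in this caption (the share of all retrotransposons falling at each chromosomal position) can be sketched as a simple binning step. This is only an illustrative sketch: the function name `density_by_bin`, the 1 Mb bin width, and the sample coordinates are assumptions, not values taken from the study.

```python
from collections import Counter

def density_by_bin(positions_bp, bin_size_bp=1_000_000):
    """Map each bin start (in Mb) to the percentage of all elements in that bin.

    positions_bp: physical start positions (bp) of retrotransposons on one
    chromosome. The 1 Mb default bin width is an assumption for illustration.
    """
    if not positions_bp:
        return {}
    # Count how many elements fall into each fixed-width bin.
    counts = Counter(pos // bin_size_bp for pos in positions_bp)
    total = len(positions_bp)
    # Convert counts to percentages of the total number of elements.
    return {
        bin_index * bin_size_bp / 1e6: 100.0 * n / total
        for bin_index, n in sorted(counts.items())
    }

# Hypothetical coordinates, for illustration only.
example = density_by_bin([100_000, 900_000, 1_500_000, 3_200_000])
```

Plotting the returned percentages against the bin starts would give a curve of the kind the caption describes, with X in Mb and Y as a percentage of the total.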
Figure 3. LTR retrotransposon mediated gene expansion in rice and sorghum.
(A) and (B) Examples showing the expansion of full-length LTR captured genes in rice and sorghum, respectively. Pink boxes indicate the predicted 5′-LTR (left) and 3′-LTR (right) as well as their lengths. Red boxes show PBSs with their sequences and positions. Brown boxes show PPTs with their sequences and positions. Blue boxes indicate LTR captured genes and their positions. Green arrowheads indicate the start and end positions of an LTR retrotransposon.
Figure 4. Gene set enrichment analysis and protein domain analysis in LTR retrotransposon captured genes.
(A) Gene set enrichment analysis in sorghum. Blue and red columns indicate the percentages of each GO term in all LTR captured sorghum proteins and in total annotated proteins, respectively. The percentage is calculated as the frequency of each GO term among all LTR captured proteins with a GO term assigned, or among all annotated proteins with a GO term assigned. “P”, “F” and “C” in (A) indicate the three GO categories: biological process, molecular function and cellular component, respectively. The GO terms in (A) are numbered as follows: 1, DNA integration; 2, reproductive cellular process; 3, DNA metabolic process; 4, RNA-dependent DNA replication; 5, DNA replication; 6, multicellular organism reproduction; 7, reproductive process in a multicellular organism; 8, chromatin assembly or disassembly; 9, chromatin organization; 10, RNA-directed DNA polymerase activity; 11, DNA polymerase activity; 12, aspartic-type endopeptidase activity; 13, aspartic-type peptidase activity; 14, nucleotidyltransferase activity; 15, ribonuclease H activity; 16, endonuclease activity, active with either ribo- or deoxyribonucleic acids and producing 5′-phosphomonoesters; 17, chromatin binding; 18, endoribonuclease activity, producing 5′-phosphomonoesters; 19, nucleic acid binding; 20, endopeptidase activity; 21, peptidase activity; 22, hydrolase activity; 23, di-, tri-valent inorganic cation transmembrane transporter activity; 24, endoribonuclease activity; 25, nuclease activity; 26, endonuclease activity; 27, ribonuclease activity; 28, peptidase activity, acting on L-amino acid peptides; 29, zinc ion transmembrane transporter activity; 30, chromosomal part; 31, chromatin. (B) and (C) Pfam domain/motif analysis in rice and sorghum, respectively. Green and pink columns indicate the percentages of each domain in the LTR captured proteins and in total proteins, respectively. Domains commonly detected as over- or under-represented in both rice and sorghum are highlighted with pink lines. 
Two stars indicate statistically significant differences at P value <0.01. PF00067, oxidoreductase activity, acting on paired donors, with incorporation or reduction of molecular oxygen; PF00069, protein phosphorylation; PF00076, nucleic acid binding; PF00122, nucleotide binding; PF00201, transferase activity, transferring hexosyl groups; PF00385, ‘chromo’ (CHRomatin Organisation MOdifier) domain; PF00400, WD domain, G-beta repeat; PF00560, protein binding; PF00642, nucleic acid binding; PF00806, RNA binding; PF00931, apoptosis; PF01190, Pollen proteins Ole e I like; PF01535, PPR repeat; PF05754, Domain of unknown function (DUF834); PF07714, protein phosphorylation; PF11835, Domain of unknown function (DUF3355).
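The percentage used in panel (A) can be illustrated with a short sketch under one reading of the caption: for each GO term, the number of proteins carrying that term divided by the number of proteins that have at least one GO term assigned. The function name `go_term_percentages`, the input layout, and the protein identifiers are hypothetical; the significance test behind the two-star marks is not reproduced here.

```python
from collections import Counter

def go_term_percentages(annotations):
    """Percentage of GO-annotated proteins that carry each GO term.

    annotations: dict mapping a protein identifier to a set of GO term labels.
    Proteins with no GO term are excluded from the denominator, following one
    reading of the caption (an assumption, not a confirmed procedure).
    """
    annotated = [set(terms) for terms in annotations.values() if terms]
    if not annotated:
        return {}
    # Count each term once per protein that carries it.
    counts = Counter(term for terms in annotated for term in terms)
    return {term: 100.0 * n / len(annotated) for term, n in counts.items()}

# Hypothetical proteins, for illustration only.
captured = {
    "protein_1": {"DNA integration", "chromatin binding"},
    "protein_2": {"DNA integration"},
    "protein_3": set(),  # no GO term assigned, so not counted
}
percentages = go_term_percentages(captured)
```

Running the same function on the LTR captured set and on the full annotated proteome would give the two column heights that the figure compares for each term.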
Figure 5. Distribution of Ka, Ks and Ka/Ks values of LTR captured gene pairs in rice and sorghum.
(A) The distribution curves of Ka/Ks values for LTR captured gene pairs in rice (green curve) and sorghum (pink curve). (B) Average values of Ka, Ks and Ka/Ks in rice (green bars) and sorghum (pink bars). (C) The distribution curves of Ks values for LTR captured gene pairs in rice (green curve) and sorghum (pink curve).
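The Ka/Ks values plotted here are conventionally read against a ratio of 1: values well below 1 suggest purifying selection, values near 1 suggest neutral evolution, and values above 1 suggest positive selection. The sketch below applies that convention; the function name `classify_selection` and the tolerance band around 1 are illustrative assumptions, not thresholds reported by the study.

```python
def classify_selection(ka, ks, tolerance=0.1):
    """Label a gene pair by its Ka/Ks ratio using the conventional cutoff of 1.

    ka: nonsynonymous substitution rate; ks: synonymous substitution rate.
    tolerance: half-width of the band around 1 treated as neutral
    (an assumed value for illustration).
    """
    if ks <= 0:
        # The ratio is undefined when no synonymous substitutions are observed.
        return "undefined"
    ratio = ka / ks
    if ratio > 1 + tolerance:
        return "positive"
    if ratio < 1 - tolerance:
        return "purifying"
    return "neutral"

# Hypothetical values, for illustration only.
labels = [classify_selection(0.05, 0.5), classify_selection(0.5, 0.5)]
```

In practice a statistical test on each gene pair, rather than a fixed tolerance, would usually decide whether a ratio differs from 1.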
Figure 6. Pseudogenes in rice LTR retrotransposon captured genes.
(A) Total number of pseudogenes identified in version 7 of the rice genome pseudomolecules (blue column) and among LTR captured genes (red column). (B) Pseudogene percentages among all annotated genes in version 7 of the rice genome (blue column) and among LTR captured genes (red column). (C) An example of the expansion and evolution of a pseudogene in rice. (D) An expanded gene whose encoded protein is truncated, as supported by the corresponding full-length cDNA sequence, and which is therefore regarded as a pseudogene.
Figure 7. Expression and smRNA analysis.
(A) and (B) Summary of expression profiling of LTR captured genes in rice and sorghum, respectively. The analysis was based on collected full-length cDNA/EST, MPSS, microarray expression data and/or RNA-Seq data. (C) Expression percentages of LTR captured genes and total annotated genes in rice and sorghum. (D) Expression divergence of genes expanded by LTR retrotransposons in rice and sorghum. (E) Unique and expressed small RNAs located on LTR captured genes in rice and sorghum.