Ilya Kirov, Murad Omarov, Pavel Merkulov, Maxim Dudnikov, Sofya Gvaramiya, Elizaveta Kolganova, Roman Komakhin, Gennady Karlov, Alexander Soloviev.
Abstract
LTR retrotransposons (RTEs) play a crucial role in plant genome evolution and adaptation. Although RTEs are generally silenced in somatic plant tissues under non-stressed conditions, some expressed RTEs (exRTEs) escape genome defense mechanisms. As our understanding of exRTE organization in plants is rudimentary, we systematically surveyed the genomic and transcriptomic organization and mobilome (transposition) activity of sunflower (Helianthus annuus L.) exRTEs. We identified 44 transcribed RTEs in the sunflower genome and demonstrated their distinct genomic features: more recent insertion time, longer open reading frame (ORF) length, and smaller distance to neighboring genes. We showed that GAG-encoding ORFs are present at significantly higher frequencies in exRTEs than in non-expressed RTEs. Most exRTEs exhibit variation in copy number among sunflower cultivars, and one exRTE, Gagarin, produces extrachromosomal circular DNA in seedlings, demonstrating recent and ongoing transposition activity. Nanopore direct RNA sequencing of full-length RTE RNA revealed complex patterns of alternative splicing in RTE RNAs, resulting in isoforms that carry ORFs for distinct RTE proteins. Together, our study demonstrates that tens of expressed sunflower RTEs with specific genomic organization shape the hidden layer of the transcriptome, pointing to the evolution of specific strategies that circumvent existing genome defense mechanisms.
Keywords: gag; mobilome; nanopore; retrotransposons; splicing; sunflower; transcription
Year: 2020 PMID: 33297579 PMCID: PMC7730604 DOI: 10.3390/ijms21239331
Source DB: PubMed Journal: Int J Mol Sci ISSN: 1422-0067 Impact factor: 5.923
Figure 1. RNAseq-based identification of expressed LTR retrotransposons (exRTEs) in sunflower. (A): Scheme of the pipeline used in this study. (B): Venn diagram showing the number of exRTEs with detectable expression in stressed and non-stressed conditions. (C): Examples of RNAseq coverage plots for two exRTEs with distinct patterns of expression. (D): RT-PCR showing the expression of selected exRTEs in five-day-old seedlings.
NCBI accession numbers of RNAseq read archives used in this study.
| Accession Number in NCBI | Description | Number of Reads after Quality Filtering |
|---|---|---|
| SRR7691052 | Pistil | 19,498,261 |
| SRR7691053 | Stamen | 21,945,535 |
| SRR7691054 | Ligule | 21,602,172 |
| SRR7691055 | Leaf | 20,908,077 |
| SRR7691059 | NaCl 12 h | 23,081,013 |
| SRR7691051 | NaCl 3 h | 21,727,606 |
| SRR7691057 | Seeds | 22,697,764 |
| SRR7691056 | Roots | 18,190,537 |
| SRR7691047 | PEG 12 h | 20,937,047 |
| SRR7691048 | PEG 6 h | 23,482,045 |
| SRR4996808 | Ovary | 13,636,857 |
| SRR4996851 | ABA leaves | 31,312,818 |
| SRR4996849 | MeJA leaves | 20,970,863 |
Figure 2. Classification and insertion time analysis of expressed LTR retrotransposons (exRTEs). (A): A bar plot comparing the number of Copia and Gypsy RTEs among exRTEs and non-expressed RTEs (n-exRTEs). Three stars indicate significant differences based on Fisher’s Exact Test for Count Data; p-value < 0.001. (B): Distribution of insertion time (mya: million years ago) values for Copia (blue) and Gypsy (red) superfamilies in the sunflower genome. (C): Clade-based classification of (n-)exRTEs. Three stars indicate significant differences based on Fisher’s Exact Test for Count Data; p-value < 0.001. (D): Insertion time (mya: million years ago) calculated for the individual (n-)exRTE clades. Two and three stars indicate significant differences based on the Wilcoxon rank sum test, p-value < 0.01 and < 0.001, respectively.
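The Fisher's exact test named in the Figure 2 caption can be illustrated with a minimal pure-Python sketch for a 2×2 table (superfamily by expression status). The function name `fisher_exact_two_sided` and the counts used below are illustrative assumptions, not the authors' code or data; the two-sided p-value is taken as the sum of the probabilities of all tables that are no more likely than the observed one.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Hypothetical layout: rows = exRTEs / n-exRTEs, columns = Copia / Gypsy.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of seeing x in the top-left cell
        # when all margins are held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum every table at least as extreme as the observed one
    # (small relative tolerance guards against float ties).
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, p)


# Placeholder counts, invented for illustration only.
p_value = fisher_exact_two_sided(12, 3, 40, 160)
print(f"two-sided p = {p_value:.3g}")
```

In practice a statistics library would normally be used instead; the hand-written version is shown only to make the computation behind the starred comparisons explicit.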
Figure 3. Open reading frame (ORF) length and domain composition analysis. (A): Boxplot comparing the distribution of maximum ORF length for exRTEs and n-exRTEs. The line in each box represents the median size of the longest ORF. Top and bottom edges of the box indicate the 75th and 25th percentiles, respectively. (B): Bar plot showing the proportion of (n-)exRTEs with predicted proteins exhibiting similarity to canonical RTE proteins. (C): Bar plot showing the percentage of (n-)exRTEs with similarities to various combinations of canonical RTE proteins. (D): Pie charts showing the number of (n-)exRTEs possessing ORFs that encode proteins with similarity to GAG (left), and the number of these proteins with and without the RNA-binding motif (RBM; CX2CX4HX4C, right). (E): Logo of the amino acid sequence of the RBM of (n-)exRTE GAG proteins. *** indicates significant differences based on Fisher’s Exact Test for Count Data, p-value < 0.001.
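The RNA-binding motif pattern CX2CX4HX4C from the Figure 3 caption translates directly into a regular expression, where each X stands for any amino acid. The sketch below is one way to scan a protein sequence for it; the helper name `find_rbm` and the example peptide are made up for illustration and do not come from the study.

```python
import re

# C, any 2 residues, C, any 4 residues, H, any 4 residues, C.
RBM_PATTERN = re.compile(r"C[A-Z]{2}C[A-Z]{4}H[A-Z]{4}C")


def find_rbm(protein: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for each non-overlapping motif hit."""
    return [(m.start(), m.group()) for m in RBM_PATTERN.finditer(protein.upper())]


# Hypothetical peptide containing a single motif instance.
hits = find_rbm("MKKCFNCGKEGHIARDCR")
print(hits)
```

A GAG candidate with an empty result would fall into the "without RBM" group under this simple criterion; a real analysis might additionally rely on domain-level similarity searches.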
Figure 4. The distance of (n-)exRTEs to the adjacent genes. (A): Box plot of the distance (kb) to the closest genes for exRTEs and n-exRTEs. (B): Examples of exRTEs and their proximity to the closest genes. (C): Bar plot showing the number of (n-)exRTEs with different distances to the closest genes. Three stars indicate significant differences based on the Wilcoxon rank sum test, p-value < 0.001.
Figure 5. Mobilome activity of exRTEs. (A): Heatmap showing the variation in the ratio between the normalized RTE coverage and the minimum read coverage for this RTE across 13 sunflower cultivars (ACM ratio; log2-transformed values are shown). A bar plot of insertion time values and boxplots showing the distribution of reads per kilobase per million reads (RPKM) and of normalized log2-transformed RTE coverage by genomic reads of the 13 cultivars are drawn on the right side. (B): Inverse PCR with total genomic DNA (gDNA) and genomic DNA enriched for extrachromosomal circular DNA (eccDNA). A scheme of primer annealing on eccDNA with a single LTR (the result of homologous recombination between the two LTRs of RTE cDNA) and with double LTRs (the result of non-homologous end-joining of the two LTRs of RTE cDNA) is shown at the top. Diamonds and stars indicate exRTEs used for the eccDNA assay. Stars point to the “Gagarin” RTE.
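The two quantities named in the Figure 5 caption can be written out as a short sketch. It assumes the standard RPKM definition and reads the ACM ratio as each cultivar's normalized coverage divided by the minimum coverage of that RTE across cultivars, followed by a log2 transform. The function names, cultivar labels, and numbers are hypothetical, and the normalization step that produces the input coverage values is not specified here.

```python
from math import log2


def rpkm(read_count: int, length_bp: int, total_reads: int) -> float:
    """Reads per kilobase of feature per million mapped reads."""
    return read_count * 1e9 / (length_bp * total_reads)


def log2_acm_ratio(coverage_by_cultivar: dict[str, float]) -> dict[str, float]:
    """log2 of (coverage in a cultivar / minimum coverage across cultivars).

    Assumes every coverage value is strictly positive; a zero minimum
    would need a pseudocount, which is not handled in this sketch.
    """
    minimum = min(coverage_by_cultivar.values())
    if minimum <= 0:
        raise ValueError("coverage values must be strictly positive")
    return {name: log2(cov / minimum) for name, cov in coverage_by_cultivar.items()}


# Invented coverage values for one RTE in three cultivars.
example = {"cultivar_A": 2.0, "cultivar_B": 4.0, "cultivar_C": 8.0}
print(log2_acm_ratio(example))
```

Under this reading, the cultivar with the lowest coverage always scores 0, and higher values indicate relatively more copies of the RTE in a cultivar's genome.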
Figure 6. Detection of RTE transcripts by nanopore direct RNA sequencing (DRS). (A): Organization of the main protein domains and LTRs for Tyran (exRTEs, left) and Varan (RTE, right). DRS reads mapped to the genomic sequences of these RTEs are colored blue. Primer positions used for isoform verification are shown below them. (B): Isoform composition for Varan and Tyran based on DRS read mapping and alignment of cDNA sequences obtained with ORF primers (only for Tyran). (C): RNAseq (leaves) read coverage plot. (D): Results of RT-PCR with primers designed on Varan and Tyran isoforms and Actin (Act) in RNA extracted from 5-day-old seedlings.
Sequence Read Archive (SRA) accessions of genomic reads of 13 sunflower cultivars [33].
| SRA | Cultivar |
|---|---|
| SRR10484607 | SAM227 |
| SRR10484608 | SAM060 |
| SRR10484609 | SAM167 |
| SRR10484610 | SAM175 |
| SRR10737894 | SAM210 |
| SRR5140325 | SAM012 |
| SRR5140331 | SAM011 |
| SRR5140336 | SAM010 |
| SRR5140395 | SAM006 |
| SRR5907847 | ann04-nwAR |
| SRR5907848 | ann05-ccNM |
| SRR5907869 | ann01-cwIA |
| SRR5912489 | SAM009 |
RT-PCR primers.
| Gene/TE ids | Primers |
|---|---|
| Actin | TTCAACGTTCCCGCCATGTA; |
| TE01s125448413 | ATTGGCTTCGATCCATCTCGACG; |
| TE04s156439376 | CACTGTGACTTGTGGACATCCCC; |
| TE05s178342634 | CCGGGTCAACCTGTCATGGATTT; |
| TE13s189833316 | ACCACTTAGCAGCACAAACTCGT; |
| TE05s252167574 | AGCCGTACAGAGACGAAGAGACA; |
| TE09s40344039 | GATCTGGAGCATGCGTATGGAGG; |
| TE04s45676682 | TACCAGCAAGAATTTGAGCGGCT; |
| TE03s54222821 | TAGAACTCTTGCTAGGGCGTGGA; |
| TE08s71329455 | GATGGGTGATGGTTCGGGTGAAA; |
| Int1/2 (Tyran) | CCAGTCACCAGGATTCTCCC; |
| ORF (Tyran) | AGGGTGATAGTTCTGGGTCCT; |
| gRNA (Varan) | CTGTTTCAGCCCATACAGCGACT; |
Primers for eccDNA amplification.
| Gene/TE Id | Primers |
|---|---|
| Tyran | TCACTTGCTTGGAGATATGGGT; |
| Gagarin | CGAAGAGGCTACTTGGGAGA; |
| TE13s189833316 | CAAAACCCGCTTCAAAGAAA; |
| TE05s252167574 | GGTGAGGTTGACGGTGGTAT; |
| TE04s45676682 | GGATTTGTTTGTTTTAATGTGATG; |
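When working with primer sequences such as those listed above, two routine checks are the GC fraction and the reverse complement (the strand a primer anneals to). The helpers below are a generic sketch with hypothetical names; they are not part of the study's methods and assume sequences containing only A, C, G, and T.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


# Gagarin eccDNA primer as listed in the table above.
primer = "CGAAGAGGCTACTTGGGAGA"
print(reverse_complement(primer), round(gc_fraction(primer), 2))
```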