Ning Chang, Qingqing Sun, Jinglei Hu, Chuanjing An, and Hongbo Gao.
Abstract
Most eukaryotic genes contain introns, which are removed from the pre-RNA during RNA processing. In contrast to introns in animals, which are usually several kilobase pairs (kb) long, those in plants are generally very small, mostly ranging from dozens of base pairs (bp) to a few hundred bp. According to annotation version 10.0 of the Arabidopsis thaliana genome, there are 127,854 introns in the nuclear genes; 99.23% of them are shorter than 1 kb, and only 16 are annotated as larger than 5 kb. These are the extremely large introns (ELIs) in Arabidopsis. To learn whether these are true introns and how large introns can be in Arabidopsis, RT-PCR analysis of the genes containing these ELIs was carried out. The results indicated that some of these putative introns are indeed ELIs. These ELIs are mainly composed of transposons or transposable elements (TEs), except one, whose counterparts are also very long in diverse plant species. Thus, this study confirms the existence of introns larger than 5 kb or even 10 kb in Arabidopsis.
Keywords: Arabidopsis; large intron; reverse transcription PCR; transposon
Year: 2017 PMID: 28800125 PMCID: PMC5575664 DOI: 10.3390/genes8080200
Source DB: PubMed Journal: Genes (Basel) ISSN: 2073-4425 Impact factor: 4.096
Figure 2. Gene structures of AT3G60961.1, AT1G58602.1, AT3G05410.2, AT5G13250.1, AT2G34100.2, and AT5G22090.2. (A) The structure of AT3G60961.1. (B) The structure of AT1G58602.1. (C) The structure of AT3G05410.2 in the Col and Ler ecotypes. (D) The structure of AT5G13250.1. (E) The structure of AT2G34100.2. (F) The structure of AT5G22090.2. White boxes represent the 5’ and 3’ untranslated regions (UTRs), black boxes represent the coding sequence, and black lines represent the introns. Arrows indicate the positions of forward and reverse primers for RT-PCR analysis.
Figure 1. The distribution of intron lengths in Arabidopsis thaliana according to the annotation of TAIR10. (A) In the genome of A. thaliana, there are 127,854 introns of nuclear genes ranging from 8 bp to 57,631 bp. Among those, 62,565 introns are shorter than 100 bp, while 64,310 introns are from 100 to 999 bp. There are 844, 77, 23, 19, 12, and 4 introns in the range of 1000–1999 bp, 2000–2999 bp, 3000–3999 bp, 4000–4999 bp, 5000–9999 bp, and >10,000 bp, respectively. (B) A list of introns larger than 5000 bp.
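The bin counts in the Figure 1 caption can be cross-checked against the totals quoted in the abstract. The following is a minimal sketch of that arithmetic only; the dictionary keys and the `summarize` helper are illustrative names, not part of the original analysis.

```python
# Intron counts per length bin, copied from the Figure 1 caption (TAIR10).
# The bin labels are illustrative; only the counts come from the caption.
BIN_COUNTS = {
    "<100": 62_565,
    "100-999": 64_310,
    "1000-1999": 844,
    "2000-2999": 77,
    "3000-3999": 23,
    "4000-4999": 19,
    "5000-9999": 12,
    ">10000": 4,
}


def summarize(bin_counts):
    """Return the total intron count and the two shares quoted in the abstract."""
    total = sum(bin_counts.values())
    under_1kb = bin_counts["<100"] + bin_counts["100-999"]
    over_5kb = bin_counts["5000-9999"] + bin_counts[">10000"]
    return {
        "total": total,
        "under_1kb": under_1kb,
        "percent_under_1kb": round(100 * under_1kb / total, 2),
        "over_5kb": over_5kb,
    }


summary = summarize(BIN_COUNTS)
print(summary)
```

Under these counts the bins sum to the stated 127,854 introns, the share below 1 kb rounds to 99.23%, and the two largest bins together give the 16 introns above 5 kb.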
An analysis of the introns larger than 5 kb in protein-coding genes in A. thaliana.
| Gene and Annotated Intron | Length (bp) | Validation Results | Description of Gene | Description of Intron | Total Size of TEs in the Intron |
|---|---|---|---|---|---|
| AT2G34100.2-2 | 11,602 | Validated; 11,602 bp | Nonsense-mediated mRNA decay-like protein | This intron contains 13 genes or TEs: | 6325 bp (54.52%) |
| AT3G60961.1-1 | 10,234 | Validated; 10,247 bp | P-loop containing nucleoside triphosphate hydrolases superfamily protein | This intron was also annotated as AT3G60965, a copia-like retrotransposon. | 9903 bp (96.64%) |
| AT2G34110.1-1 | 9724 | Not validated | Hypothetical protein | NA | |
| AT1G58602.1-1 | 7384 | Validated; 7385 bp | LRR and NB-ARC domains-containing disease resistance protein | This intron contains 4 TEs: | 6164 bp (83.47%) |
| AT1G58602.1-2 and AT1G58602.1-3 | 6070 | Validated; 6070 bp | LRR and NB-ARC domains-containing disease resistance protein | These two annotated introns and the exon between them were found to be an intron of 6070 bp. It contains 10 TEs: | 5192 bp (85.54%) |
| AT3G05410.2-2 | 5748 | Validated; 5748 bp | Photosystem II reaction center OEC23 protein | A major part of the intron is a transposon, | 5261 bp (91.53%) |
| AT5G13250.1-3 | 5670 | Validated; 5673 bp | RING finger protein | This intron contains | 530 bp (9.34%) |
| AT3G52700.1-1 | 5134 | Not validated | Hypothetical protein | NA | |
| AT5G22090.2-1 | 5082 | Validated; 5082 bp | FAF-like protein | This intron contains 3 TEs: | 1985 bp (39.06%) |
TE: transposable element; mRNA: messenger RNA; LRR: leucine-rich repeat; NB: Nucleotide Binding; ARC: Apaf-1, certain R gene products, and CED-4; OEC: Oxygen-Evolving Complex; RING: Really Interesting New Gene; FAF: Fantastic Four; NA: not applicable.
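The TE percentages in the table can be recomputed from the listed sizes. The sketch below assumes the percentage is the total TE size divided by the intron length; the source does not state which length is the denominator. The row tuples are transcribed from the table, and `te_percent` is a hypothetical helper name.

```python
# (intron, annotated length bp, validated length bp, total TE bp, reported %)
# transcribed from the table above; unvalidated introns are omitted.
ROWS = [
    ("AT2G34100.2-2", 11602, 11602, 6325, 54.52),
    ("AT3G60961.1-1", 10234, 10247, 9903, 96.64),
    ("AT1G58602.1-1", 7384, 7385, 6164, 83.47),
    ("AT1G58602.1-2 and -3", 6070, 6070, 5192, 85.54),
    ("AT3G05410.2-2", 5748, 5748, 5261, 91.53),
    ("AT5G13250.1-3", 5670, 5673, 530, 9.34),
    ("AT5G22090.2-1", 5082, 5082, 1985, 39.06),
]


def te_percent(te_bp, intron_bp):
    """TE share of the intron in percent, rounded to two decimals."""
    return round(100 * te_bp / intron_bp, 2)


def agrees(value, reported):
    """Compare two percentages that were both rounded to two decimals."""
    return abs(value - reported) < 0.005


# Rows whose reported percentage is reproduced by each candidate denominator.
matches_validated = [r[0] for r in ROWS if agrees(te_percent(r[3], r[2]), r[4])]
mismatches_annotated = [r[0] for r in ROWS if not agrees(te_percent(r[3], r[1]), r[4])]

print("reproduced with validated length:", len(matches_validated), "of", len(ROWS))
print("not reproduced with annotated length:", mismatches_annotated)
```

With these numbers, every reported percentage is reproduced when the validated length is used, while the three introns whose validated and annotated lengths differ are not reproduced from the annotated length. This suggests, but does not prove, that the percentages are relative to the validated lengths.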
Figure 3. RT-PCR analysis of the genes shown in Figure 2. (A) RT-PCR analysis of AT3G60961.1, AT1G58602.1, AT3G05410.2, and AT5G13250.1 with RNA extracted from leaves. Positions of primers are shown in Figure 2. (B) RT-PCR analysis of AT2G34100.2 and AT5G22090.2 with RNA extracted from floral tissues. Positions of primers are shown in Figure 2. (C) Semi-quantitative RT-PCR analysis of AT3G05410.2 in the Col and Ler ecotypes. Black triangles above indicate that the cDNA was serially diluted three times with a dilution factor of four (from left to right). The PP2AA3 gene (AT1G13320) was used as a control.
Lengths of intron AT3G05410.2-2 and its counterparts in other plants.
| Organism | Length (bp) |
|---|---|
| Arabidopsis thaliana | 5748 |
| | 456 |
| | 461 |
| | 391 |
| | 77 |
| | 83 |
| | 18 |
| | 254 |
| | 130 |
| | 2903 |
Lengths of intron AT5G13250.1-3 and its counterparts in other plants.
| Organism | Length (bp) |
|---|---|
| Arabidopsis thaliana | 5670 |
| | 8873 |
| | 6986 |
| | 5302 |
| | 6587 |
| | 9945 |
| | 16,721 |
| | 6684 |
| | 6244 |
| | 5931 |
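The two length tables can be summarized side by side to show the contrast the abstract describes. This is a rough sketch that takes the first value in each table to be the Arabidopsis intron, an assumption based on its match with the annotated lengths of 5748 bp and 5670 bp. The list names and the `compare` helper are illustrative only.

```python
from statistics import median

# Lengths (bp) in table order; the first entry is assumed to be the
# A. thaliana intron, the rest its counterparts in other plants.
OEC23_INTRON = [5748, 456, 461, 391, 77, 83, 18, 254, 130, 2903]   # AT3G05410.2-2
RING_INTRON = [5670, 8873, 6986, 5302, 6587, 9945, 16721, 6684, 6244, 5931]  # AT5G13250.1-3


def compare(lengths):
    """Summarize the counterpart lengths relative to the first (reference) entry."""
    reference, counterparts = lengths[0], lengths[1:]
    return {
        "reference": reference,
        "min": min(counterparts),
        "median": median(counterparts),
        "max": max(counterparts),
        "counterparts_over_5kb": sum(n > 5000 for n in counterparts),
    }


oec23 = compare(OEC23_INTRON)
ring = compare(RING_INTRON)
print("AT3G05410.2-2:", oec23)
print("AT5G13250.1-3:", ring)
```

Under this reading, none of the nine counterparts of AT3G05410.2-2 exceeds 5 kb, whereas all nine counterparts of AT5G13250.1-3 do. This is consistent with the abstract's statement that one ELI has counterparts that are also very long in diverse plant species.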