Wenzhen Cheng, Yunlin Zhou, Xin Miao, Chuanjing An, Hongbo Gao.
Abstract
Most eukaryotic genes contain introns, which are noncoding sequences that are removed during pre-mRNA processing. Introns are usually preserved across evolutionary time. However, the sizes of introns vary greatly. In Arabidopsis, some introns are longer than 10 kilobase pairs (kb) and others are predicted to be shorter than 10 bp. To identify the shortest intron in the genome, we analyzed the predicted introns in version 10 of the annotated Arabidopsis thaliana genome and found 103 predicted introns that are 30 bp or shorter, which make up only 0.08% of all introns in the genome. However, our own bioinformatics and experimental analyses found no evidence for the existence of these predicted introns. The predicted introns of 30-39 bp, 40-49 bp, and 50-59 bp in length are also rare and constitute only 0.07%, 0.2%, and 0.28% of all introns in the genome, respectively. An analysis of 30 predicted introns 31-59 bp long verified two in this range, both of which were 59 bp long. Thus, this study suggests that there is a lower limit to the size of introns in A. thaliana, which is useful for understanding the evolution and processing of small introns in plants in general.
Year: 2018 PMID: 30184083 PMCID: PMC6161759 DOI: 10.1093/gbe/evy197
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Fig. 1.—Distribution of the predicted introns shorter than 100 bp. (A) In the Arabidopsis genome, there are 62,565 introns shorter than 100 bp. A classification of these introns by size is shown. The numbers of introns 50–59 bp and 40–49 bp in length are 357 and 253, respectively. (B) The length of introns 30 bp or shorter and the number of introns of each length, as predicted by TAIR.
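The counts in this caption can be cross-checked against the percentages quoted in the abstract. The sketch below is only a consistency check: the total intron count is not stated in the text, so it is estimated here from the rounded figure "103 introns = 0.08%" and is therefore approximate. The function names are illustrative.

```python
def implied_total(count, percent):
    """Estimate the total from a count and the (rounded) percentage it represents."""
    return count / (percent / 100.0)


def percent_of_total(count, total):
    """Express a count as a percentage of the total."""
    return 100.0 * count / total


# Figures taken from the abstract and the caption of Fig. 1.
# The resulting total is an estimate, not a number reported in the source.
total_introns = implied_total(103, 0.08)

share_40_49 = percent_of_total(253, total_introns)
share_50_59 = percent_of_total(357, total_introns)

print(f"implied total introns: about {total_introns:.0f}")
print(f"40-49 bp: {share_40_49:.2f}%   50-59 bp: {share_50_59:.2f}%")
```

Under this estimate the two bins round to 0.2% and 0.28%, which agrees with the values given in the abstract.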
Fig. 2.—A diagram of the primer design principle for the RT-PCR analysis. In addition to the putative very small intron, a second intron was included in each RT-PCR amplicon to confirm that the PCR product was derived from a true cDNA fragment. Black boxes represent exons, and black lines represent introns. Upper: the very small intron lies before the larger intron; lower: the very small intron lies after the larger intron. The arrows indicate the positions of the primers for RT-PCR analysis.
Fig. 3.—Electrophoresis analysis of the RT-PCR products of the selected 48 predicted introns. Each number (1–48) corresponds to one predicted intron from the RT-PCR analysis, also shown in table 1. Bands of cDNA are marked with ▲; bands of genomic DNA are marked with *. Molecular weight markers are 100 bp DNA ladders (New England Biolabs).
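The primer design in Fig. 2 implies three distinguishable product sizes, which the following minimal sketch makes explicit. It assumes the two primers flank both the putative very small intron and the second, larger intron. All lengths in the usage example are hypothetical and are not taken from the study.

```python
def expected_products(exonic_bp, small_intron_bp, large_intron_bp):
    """Return expected amplicon sizes (bp) for primers flanking both introns.

    exonic_bp: total exonic sequence between the primer ends, primers included.
    """
    return {
        # Genomic DNA template: both introns are still present.
        "gDNA": exonic_bp + small_intron_bp + large_intron_bp,
        # cDNA in which both introns were spliced out (small intron is real).
        "cDNA_small_intron_spliced": exonic_bp,
        # cDNA in which only the larger intron was removed; the predicted
        # small intron is retained, i.e. it is really exonic sequence.
        "cDNA_small_intron_retained": exonic_bp + small_intron_bp,
    }


# Hypothetical lengths, chosen only for illustration.
sizes = expected_products(exonic_bp=300, small_intron_bp=25, large_intron_bp=90)
for template, size in sizes.items():
    print(template, size)
```

Because a very small intron changes the product size by only a few tens of base pairs, the loss of the larger intron is what identifies a band as cDNA. Whether the small intron was removed is then read from the sequence of the product.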
Table 1. Some Basic Information on the Very Small Introns Analyzed in This Study
| No. | Predicted Intron | Size (bp) | Sequencing Results | Existence | No. of Splice Variants | Homologous Genes |
|---|---|---|---|---|---|---|
| 1 | AT1G62580.1-5 | 27 | cDNA | No | 3 | |
| 2 | AT2G04395.1-2 | 29 | cDNA | No | 5 | |
| 3 | AT5G51795.1-2 | 28 | gDNA | UJ | 1 | |
| 4 | AT2G07240.1-4 | 30 | gDNA | UJ | 1 | No homolog |
| 5 | AT2G21330.3-6 | 30 | cDNA | No | 3 | |
| 6 | AT2G44980.1-10 | 30 | cDNA | No | 3 | No homolog |
| 7 | AT5G50080.1-1 | 27 | cDNA | No | 2 | |
| 8 | AT3G53740.1-3 | 27 | cDNA | No | 4 | |
| 9 | AT2G41700.2-18 | 30 | cDNA | No | 2 | No homolog |
| 10 | AT2G31370.5-6 | 28 | cDNA | No | 7 | |
| 11 | AT1G51490.1-10 | 23 | cDNA | No | 1 | No homolog |
| 12 | AT3G51260.2-3 | 21 | cDNA | No | 2 | |
| 13 | AT3G55280.3-3 | 18 | cDNA | No | 3 | |
| 14 | AT4G35300.3-3 | 30 | cDNA | No | 11 | No homolog |
| 15 | AT1G01620.2-1 | 29 | cDNA | No | 2 | |
| 16 | AT3G53980.2-2 | 25 | cDNA | No | 2 | |
| 17 | AT3G59350.3-6 | 23 | cDNA | No | 6 | |
| 18 | AT2G05520.2-2 | 21 | cDNA | No | 6 | |
| 19 | AT2G10930.1-1 | 29 | gDNA | UJ | 1 | |
| 20 | AT4G38300.1-2 | 28 | cDNA | No | 1 | |
| 21 | AT1G71280.1-2 | 25 | cDNA | No | 2 | |
| 22 | AT3G28170.1-1 | 10 | gDNA | UJ | 1 | No homolog |
| 23 | AT1G18050.1-3 | 8 | cDNA | No | 1 | No homolog |
| 24 | AT5G22050.1-7 | 20 | gDNA | UJ | 2 | No homolog |
| 25 | AT2G40920.2-1 | 16 | cDNA | No | 2 | |
| 26 | AT1G27290.2-2 | 16 | cDNA | No | 2 | No homolog |
| 27 | AT1G02950.3-4 | 15 | cDNA | No | 5 | No homolog |
| 28 | AT1G31170.3-5 | 15 | cDNA | No | 5 | No homolog |
| 29 | AT5G48760.2-1 | 13 | cDNA | No | 2 | |
| 30 | AT2G14720.2-1 | 10 | cDNA | No | 2 | |
| 31 | AT5G30341.1-1 | 30 | cDNA | No | 1 | |
| 32 | AT4G06479.1-1 | 29 | gDNA | UJ | 1 | No homolog |
| 33 | AT2G13125.1-1 | 29 | gDNA | UJ | 1 | |
| 34 | AT2G06500.1-1 | 29 | gDNA | UJ | 1 | No homolog |
| 35 | AT1G49015.1-2 | 29 | gDNA | UJ | 1 | |
| 36 | AT3G28020.1-4 | 28 | gDNA | UJ | 1 | |
| 37 | AT1G76720.1-13 | 26 | gDNA | UJ | 2 | |
| 38 | AT2G18530.1-2 | 24 | gDNA | UJ | 1 | |
| 39 | AT2G24340.1-3 | 24 | gDNA | UJ | 1 | No homolog |
| 40 | AT3G27600.1-1 | 23 | gDNA | UJ | 1 | No homolog |
| 41 | AT2G13125.1-2 | 23 | gDNA | UJ | 1 | No homolog |
| 42 | AT2G11010.1-4 | 23 | gDNA | UJ | 1 | |
| 43 | AT1G35860.1-1 | 23 | gDNA | UJ | 1 | No homolog |
| 44 | AT2G05440.4-2 | 21 | cDNA | No | 9 | |
| 45 | AT3G05450.1-1 | 19 | cDNA | No | 1 | No homolog |
| 46 | AT1G72270.1-13 | 18 | cDNA | No | 1 | |
| 47 | AT4G13850.2-5 | 15 | cDNA | No | 4 | |
| 48 | AT1G24460.1-4 | 14 | cDNA | No | 2 | No homolog |
| S1 | AT1G76530.1-5 | 31 | cDNA | 80 bp | 3 | |
| S2 | AT4G01780.1-2 | 32 | gDNA | UJ | 1 | No homolog |
| S3 | AT4G28670.1-3 | 34 | cDNA | 74 bp | 1 | No homolog |
| S4 | AT4G20900.1-4 | 35 | cDNA | 83 bp | 2 | |
| S5 | AT5G07510.2-2 | 36 | cDNA | No | 3 | |
| S6 | AT2G36010.2-1 | 36 | cDNA | 516 bp | 3 | No homolog |
| S7 | AT3G56300.1-5 | 37 | cDNA | No | 3 | |
| S8 | AT1G02670.1-5 | 37 | cDNA | No | 7 | |
| S9 | AT1G16150.1-2 | 38 | cDNA | 92 bp | 1 | |
| S10 | AT1G14390.1-4 | 39 | cDNA | 96 bp | 1 | |
| S11 | AT3G13920.2-5 | 41 | cDNA | No | 5 | |
| S12 | AT4G04710.1-3 | 43 | cDNA | 82 bp | 4 | |
| S13 | AT5G40600.1-1 | 44 | cDNA | 534 bp | 4 | No homolog |
| S14 | AT1G19090.1-3 | 44 | cDNA | No | 1 | |
| S15 | AT4G04680.1-4 | 45 | cDNA | No | 2 | |
| S16 | AT1G48740.1-4 | 45 | cDNA | 81 bp | 4 | |
| S17 | AT3G11040.1-9 | 45 | cDNA | 111 bp | 2 | |
| S18 | AT2G35075.1-2 | 46 | gDNA | UJ | 1 | No homolog |
| S19 | AT4G15300.1-3 | 46 | cDNA | 88 bp | 3 | |
| S20 | AT4G14310.2-2 | 47 | cDNA | 363 bp | 2 | No homolog |
| S21 | AT3G43290.1-1 | 51 | gDNA | UJ | 1 | |
| S22 | AT3G56160.1-1 | 52 | cDNA | 133 bp | 5 | No homolog |
| S23 | AT3G09090.2-12 | 55 | cDNA | 71 bp | 3 | No homolog |
| S24 | AT1G15120.2-5 | 55 | cDNA | 92 bp | 2 | |
| S25 | AT4G12750.1-7 | 56 | cDNA | 98 bp | 1 | No homolog |
| S26 | AT4G21820.1-9 | 56 | cDNA | 102 bp | 3 | No homolog |
| S27 | AT4G24930.1-3 | 59 | cDNA | Yes | 1 | No homolog |
| S28 | AT3G23080.2-4 | 59 | cDNA | Yes | 3 | |
| S29 | AT2G30650.1-3 | 59 | gDNA | UJ | 2 | No homolog |
| S30 | AT2G29390.1-5 | 59 | cDNA | 95 bp | 6 | |
Note.—Genes with an E value of 1E–7 or smaller in the BLAST search were considered homologous genes. Some of the introns were larger than predicted, so their actual sizes are shown in the table.
gDNA: genomic DNA; UJ: unable to judge, because the sequence of the PCR product is the same as the genomic DNA.
AT4G00430, AT3G61430, AT4G23400, AT2G45960, AT2G16850, AT4G35100, AT3G53420, AT2G37170, AT2G37180, AT4G00413, AT3G54820.
AT2G43230, AT3G17410, AT2G47060, AT3G62220, AT2G30740, AT1G48210, AT1G06700, AT2G30730.
AT2G05380, AT2G05530, AT2G05440, AT2G05441, AT2G05510.
AT4G04720, AT4G04695, AT4G04740, AT4G04700, AT4G21940.
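The homolog criterion in the table note (E value of 1E–7 or smaller) can be sketched as a simple filter. The source does not say how the BLAST output was processed, so everything here is an assumption: the input is taken to be BLAST+ tabular output (`-outfmt 6`) with the default column order, in which the E value is the 11th field, and self-hits are skipped. The function name and the sample records are hypothetical.

```python
E_VALUE_CUTOFF = 1e-7  # threshold stated in the table note


def homologs_from_blast_tab(lines, cutoff=E_VALUE_CUTOFF):
    """Collect subject IDs per query whose E value is at or below the cutoff.

    Assumes default BLAST+ tabular columns: query ID in field 1,
    subject ID in field 2, E value in field 11.
    """
    homologs = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject, evalue = fields[0], fields[1], float(fields[10])
        if subject == query:
            continue  # a gene matching itself is not a homolog
        if evalue <= cutoff:
            homologs.setdefault(query, set()).add(subject)
    return homologs


# Hypothetical records, not data from the study.
sample = [
    "geneA\tgeneA\t100.0\t500\t0\t0\t1\t500\t1\t500\t0.0\t900",
    "geneA\tgeneB\t85.0\t480\t70\t2\t1\t480\t5\t484\t1e-50\t400",
    "geneA\tgeneC\t40.0\t60\t36\t0\t1\t60\t10\t69\t0.002\t30",
]
print(homologs_from_blast_tab(sample))
```

A query that ends up with no entry in the returned dictionary would correspond to a "No homolog" row under this assumed procedure.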
Fig. 4.—Electrophoresis analysis of the RT-PCR products from the selected 30 predicted introns 31–59 bp in length. S1–S30 correspond to the introns that were analyzed, which are also shown in table 1. Bands of cDNA are marked with ▲; bands of genomic DNA are marked with *. Molecular weight markers are 100 bp DNA ladders (New England Biolabs).
Fig. 5.—Analysis of the introns in AT1G71280. Upper: the predicted gene model; the very small predicted intron on the right, AT1G71280.1-2, does not exist. Lower: the confirmed gene model; the intron AT1G71280.1-1 is 66 bp.