Florent Hubé, Claire Francastel.
Abstract
Introns represent almost half of the human genome, yet the vast majority of them are eliminated from eukaryotic transcripts through RNA splicing. Nevertheless, they feature key elements and functions that deserve further interest. At the level of DNA, introns are genomic segments that can shelter independent transcription units for coding and non-coding RNAs, whose transcription may interfere with that of the host gene, as well as regulatory elements that can influence gene expression and splicing itself. From the RNA perspective, some introns can be subjected to alternative splicing. Intron retention appears to provide some plasticity to the nature of the protein produced, its distribution in a given cell type and the timing of its translation. Intron retention may also serve as a switch to produce coding or non-coding RNAs from the same transcription unit. Conversely, splicing of introns has been directly implicated in the production of small regulatory RNAs. Hence, splicing of introns also appears to provide plasticity to the type of RNA produced from a genetic locus (coding, non-coding, short or long). We addressed these aspects to add to our understanding of the mechanisms that control the fate of introns and could be instrumental in regulating genomic output and hence cell fate.
Year: 2015 PMID: 25710723 PMCID: PMC4394429 DOI: 10.3390/ijms16034429
Source DB: PubMed Journal: Int J Mol Sci ISSN: 1422-0067 Impact factor: 5.923
Distribution of exons and introns across human chromosomes. The human genome dataset was downloaded from the UCSC main table browser (GRCh38, December 2013 build). Data was processed using tabular software. Analysis was performed using the 34,856 genes containing 312,351 exons and 277,495 introns. Caution: introns shorter than 30–40 nt likely reflect artifacts or errors (see text).
| Chr # | Total # Genes | Total # Exons | Total # Introns | Max # Exons/Gene | Chromosome Size (bp) | Avg # of Exons/Gene | Avg Exon Length (bp) ± Std Dev | Avg Intron Length (bp) ± Std Dev | Total Exon Length (bp) | Total Intron Length (bp) | Shortest Exon (bp) | Shortest Intron (bp) | Shortest Gene (bp) | Longest Exon (bp) | Longest Intron (bp) | Longest Gene (bp) | Genes per Million bp | Intronless Genes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 3592 | 31,744 | 28,152 | 138 | 248,956,422 | 8.8 | 313 ± 705 | 5283 ± 16,017 | 9,934,447 | 148,725,908 | 3 | 1 | 41 | 12,573 | 451,448 | 1,491,100 | 14.4 | 544 |
| 2 | 2208 | 24,805 | 22,597 | 363 | 242,193,529 | 11.2 | 299 ± 718 | 6574 ± 21,058 | 7,407,514 | 147,572,357 | 1 | 14 | 49 | 17,969 | 866,400 | 1,900,275 | 9.1 | 203 |
| 3 | 1916 | 18,292 | 16,376 | 118 | 198,295,559 | 9.5 | 317 ± 759 | 7669 ± 25,134 | 5,801,464 | 125,580,914 | 3 | 1 | 21 | 24,927 | 842,378 | 1,502,150 | 9.7 | 181 |
| 4 | 1234 | 10,880 | 9646 | 84 | 190,214,555 | 8.8 | 343 ± 740 | 8687 ± 26,112 | 3,736,564 | 83,794,301 | 8 | 1 | 44 | 9856 | 912,253 | 1,474,687 | 6.5 | 197 |
| 5 | 1496 | 13,193 | 11,697 | 90 | 181,538,259 | 8.8 | 342 ± 778 | 7678 ± 23,233 | 4,508,195 | 89,805,936 | 6 | 12 | 43 | 22,753 | 772,519 | 1,519,058 | 8.2 | 205 |
| 6 | 1793 | 16,106 | 14,313 | 146 | 170,805,979 | 9.0 | 326 ± 719 | 6876 ± 20,012 | 5,256,442 | 98,421,513 | 6 | 1 | 50 | 15,177 | 478,750 | 1,987,246 | 10.5 | 227 |
| 7 | 1629 | 15,280 | 13,651 | 108 | 159,345,973 | 9.4 | 315 ± 763 | 7770 ± 24,375 | 4,806,589 | 106,063,166 | 2 | 1 | 53 | 21,017 | 657,297 | 2,304,636 | 10.2 | 187 |
| 8 | 1207 | 10,041 | 8834 | 86 | 145,138,636 | 8.3 | 335 ± 779 | 8354 ± 26,109 | 3,363,013 | 73,802,469 | 5 | 12 | 23 | 15,980 | 955,098 | 2,059,454 | 8.3 | 160 |
| 9 | 1402 | 12,839 | 11,437 | 98 | 138,394,717 | 9.2 | 314 ± 714 | 6073 ± 16,910 | 4,029,327 | 69,453,705 | 3 | 5 | 54 | 10,345 | 344,501 | 2,298,478 | 10.1 | 200 |
| 10 | 1346 | 12,626 | 11,280 | 68 | 133,797,422 | 9.4 | 321 ± 723 | 7821 ± 23,764 | 4,058,525 | 88,222,345 | 5 | 67 | 50 | 11,090 | 482,575 | 1,783,674 | 10.1 | 131 |
| 11 | 2112 | 17,709 | 15,597 | 90 | 135,086,622 | 8.4 | 311 ± 959 | 5306 ± 20,073 | 5,501,031 | 82,757,607 | 2 | 1 | 50 | 91,671 | 811,152 | 1,468,409 | 15.6 | 377 |
| 12 | 1773 | 17,618 | 15,845 | 173 | 133,275,309 | 9.9 | 300 ± 686 | 5076 ± 15,654 | 5,281,875 | 80,428,519 | 9 | 5 | 50 | 14,194 | 403,400 | 1,249,864 | 13.3 | 169 |
| 13 | 715 | 5982 | 5267 | 83 | 114,364,328 | 8.4 | 339 ± 957 | 9101 ± 29,323 | 2,025,232 | 47,933,871 | 5 | 66 | 48 | 37,567 | 740,920 | 1,468,616 | 6.3 | 94 |
| 14 | 1171 | 9771 | 8600 | 116 | 107,043,718 | 8.3 | 319 ± 734 | 6212 ± 20,436 | 3,115,048 | 53,426,644 | 4 | 14 | 46 | 17,546 | 479,079 | 1,464,560 | 10.9 | 229 |
| 15 | 1188 | 11,480 | 10,292 | 104 | 101,991,189 | 9.7 | 306 ± 698 | 5672 ± 17,698 | 3,514,500 | 58,372,375 | 8 | 21 | 33 | 11,532 | 732,200 | 887,042 | 11.6 | 220 |
| 16 | 1468 | 13,212 | 11,744 | 63 | 90,338,345 | 9.0 | 285 ± 607 | 3892 ± 17,423 | 3,766,310 | 45,712,994 | 3 | 1 | 51 | 10,024 | 778,855 | 1,694,208 | 16.3 | 159 |
| 17 | 2069 | 19,482 | 17,413 | 85 | 83,257,441 | 9.4 | 289 ± 615 | 3531 ± 13,106 | 5,622,686 | 61,484,756 | 8 | 1 | 47 | 10,345 | 1,043,910 | 1,143,719 | 24.9 | 251 |
| 18 | 495 | 4568 | 4073 | 75 | 80,373,285 | 9.2 | 362 ± 850 | 10,552 ± 25,422 | 1,653,586 | 42,976,591 | 9 | 75 | 50 | 14,862 | 411,175 | 1,195,732 | 6.2 | 58 |
| 19 | 2449 | 18,534 | 16,085 | 106 | 58,617,616 | 7.6 | 297 ± 655 | 2382 ± 6158 | 5,507,290 | 38,316,474 | 6 | 2 | 47 | 21,693 | 255,789 | 301,152 | 41.8 | 358 |
| 20 | 983 | 8035 | 7052 | 80 | 64,444,167 | 8.2 | 312 ± 675 | 5515 ± 18,323 | 2,509,662 | 38,891,514 | 8 | 31 | 50 | 10,441 | 544,980 | 2,057,697 | 15.3 | 103 |
| 21 | 463 | 3745 | 3282 | 47 | 46,709,983 | 8.1 | 309 ± 728 | 6401 ± 18,061 | 1,157,367 | 21,006,837 | 9 | 9 | 60 | 13,351 | 323,564 | 834,698 | 9.9 | 95 |
| 22 | 797 | 7029 | 6232 | 55 | 50,818,468 | 8.8 | 312 ± 712 | 4321 ± 12,968 | 2,194,256 | 26,927,006 | 4 | 1 | 52 | 12,955 | 355,998 | 701,852 | 15.7 | 84 |
| X | 1152 | 8062 | 6910 | 84 | 156,040,895 | 7.0 | 350 ± 873 | 6898 ± 24,018 | 2,820,104 | 47,662,441 | 7 | 67 | 48 | 37,027 | 536,479 | 1,368,337 | 7.4 | 244 |
| Y | 198 | 1318 | 1120 | 46 | 57,227,415 | 6.7 | 259 ± 540 | 10,637 ± 33,468 | 341,789 | 14,179,080 | 22 | 67 | 64 | 8690 | 493,512 | 686,139 | 3.5 | 11 |
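Several columns in the table above can be recomputed from the raw counts, which gives a quick consistency check. The following is a minimal sketch, not the authors' procedure (the caption only mentions "tabular software"). The function name `derived_columns` and its field names are hypothetical. It assumes each gene is represented by a single transcript, so that a gene with k exons contributes k − 1 introns, and it reads the gene density column as genes per million base pairs of chromosome.

```python
def derived_columns(n_genes, n_exons, chrom_size_bp):
    """Recompute derived table columns from raw per-chromosome counts.

    Hypothetical helper for illustration; assumes one transcript per gene.
    """
    return {
        # Each gene with k exons has k - 1 introns, so summing over genes:
        "introns": n_exons - n_genes,
        "avg_exons_per_gene": round(n_exons / n_genes, 1),
        # Gene density per million base pairs of chromosome
        "genes_per_million_bp": round(n_genes / (chrom_size_bp / 1e6), 1),
    }

# First row of the table (chromosome size 248,956,422 bp)
first_row = derived_columns(3592, 31_744, 248_956_422)
print(first_row)

# Genome-wide totals quoted in the caption: 34,856 genes and 312,351 exons
genome_introns = 312_351 - 34_856
print(genome_introns)
```

Under these assumptions the recomputed values agree with the tabulated intron count, average exons per gene and gene density for that row, and with the 277,495 introns quoted in the caption.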
Figure 1. Size distribution of human introns. The human genome data was downloaded from the UCSC main table browser (GRCh38, December 2013 build). Data was processed using tabular software. Analysis was performed using 277,495 introns. The lower size limit for human introns, indicated by the dark arrow, lies between 60 and 70 bases.
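An intron size distribution of this kind can be derived from exon coordinates alone. The sketch below assumes the UCSC gene-prediction table layout, in which `exonStarts` and `exonEnds` are comma-separated strings with a trailing comma and use 0-based, end-exclusive coordinates. The helper names, the example transcript and the 10-nt bin width are hypothetical choices for illustration, not the authors' pipeline.

```python
from collections import Counter

def intron_lengths(exon_starts, exon_ends):
    """Return intron lengths for one transcript from UCSC-style coordinate strings."""
    starts = [int(x) for x in exon_starts.split(",") if x]
    ends = [int(x) for x in exon_ends.split(",") if x]
    # An intron runs from the end of one exon to the start of the next.
    return [starts[i + 1] - ends[i] for i in range(len(starts) - 1)]

def size_histogram(lengths, bin_width=10):
    """Count introns per size bin; keys are the lower bound of each bin."""
    return Counter((n // bin_width) * bin_width for n in lengths)

# Hypothetical transcript with four exons (coordinates invented for the example)
lengths = intron_lengths("100,300,420,1000,", "200,400,600,1200,")
hist = size_histogram(lengths)

# Flag introns below a cutoff; 30 nt follows the caution given with the tables
suspect = [n for n in lengths if n < 30]
print(lengths, dict(hist), suspect)
```

Applied to every transcript in the downloaded table, the histogram gives the kind of distribution plotted in the figure, and the `suspect` list isolates the very short introns that the table captions warn about.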
Distribution of exons and introns across human coding genes. Human genome data was downloaded from the UCSC main table browser (GRCh38, December 2013 build). Data was processed using tabular software. Analysis was performed using 26,177 coding genes, containing 278,420 exons and 252,243 introns. See the caution in Table 1 for the shortest introns (<30–40 nt).
| Chr # | Total # Genes | Total # Exons | Total # Introns | Max # Exons/Gene | Chromosome Size (bp) | Avg # of Exons/Gene | Avg Exon Length (bp) ± Std Dev | Avg Intron Length (bp) ± Std Dev | Total Exon Length (bp) | Total Intron Length (bp) | Shortest Exon (bp) | Shortest Intron (bp) | Shortest Gene (bp) | Longest Exon (bp) | Longest Intron (bp) | Longest Gene (bp) | Genes per Million bp | Intronless Genes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 2729 | 28,554 | 25,825 | 138 | 248,956,422 | 10.5 | 307 ± 702 | 5136 ± 15,685 | 8,778,766 | 132,633,451 | 3 | 1 | 270 | 12,573 | 451,448 | 1,491,100 | 11.0 | 149 |
| 2 | 1653 | 22,244 | 20,591 | 363 | 242,193,529 | 13.5 | 294 ± 723 | 6283 ± 20,404 | 6,534,873 | 128,437,100 | 1 | 37 | 582 | 17,969 | 866,400 | 1,900,275 | 6.8 | 35 |
| 3 | 1448 | 16,336 | 14,888 | 118 | 198,295,559 | 11.3 | 312 ± 765 | 7535 ± 24,995 | 5,098,251 | 5,098,251 | 3 | 3 | 294 | 24,927 | 24,927 | 1,502,150 | 7.3 | 44 |
| 4 | 989 | 9991 | 9002 | 84 | 190,214,555 | 10.1 | 338 ± 729 | 8471 ± 25,955 | 3,372,765 | 76,252,775 | 8 | 1 | 576 | 9856 | 912,253 | 1,474,687 | 5.2 | 107 |
| 5 | 1103 | 11,709 | 10,606 | 90 | 181,538,259 | 10.6 | 337 ± 785 | 7240 ± 22,309 | 3,944,829 | 76,788,283 | 6 | 21 | 530 | 22,753 | 772,519 | 1,519,058 | 6.1 | 85 |
| 6 | 1432 | 14,660 | 13,228 | 146 | 170,805,979 | 10.2 | 317 ± 713 | 6665 ± 19,530 | 4,653,325 | 88,160,529 | 6 | 1 | 354 | 15,177 | 478,750 | 1,987,246 | 8.4 | 109 |
| 7 | 1189 | 13,187 | 11,998 | 108 | 159,345,973 | 11.1 | 306 ± 744 | 7897 ± 24,804 | 4,032,145 | 94,745,609 | 2 | 1 | 600 | 14,889 | 657,297 | 2,304,636 | 7.5 | 55 |
| 8 | 869 | 8779 | 7910 | 86 | 145,138,636 | 10.1 | 327 ± 768 | 7942 ± 24,970 | 2,871,568 | 62,820,216 | 5 | 12 | 663 | 15,980 | 955,098 | 2,059,454 | 6.0 | 26 |
| 9 | 1041 | 11,352 | 10,311 | 98 | 138,394,717 | 10.9 | 304 ± 703 | 5971 ± 16,684 | 3,453,046 | 61,566,944 | 3 | 5 | 411 | 10,345 | 344,501 | 2,298,478 | 7.5 | 73 |
| 10 | 978 | 11,096 | 10,118 | 68 | 133,797,422 | 11.3 | 310 ± 715 | 7926 ± 24,491 | 3,440,785 | 80,199,711 | 5 | 67 | 563 | 11,090 | 482,575 | 1,783,674 | 7.3 | 26 |
| 11 | 1703 | 16,196 | 14,493 | 90 | 135,086,622 | 9.5 | 301 ± 665 | 5136 ± 19,917 | 4,880,180 | 74,431,941 | 2 | 1 | 484 | 18,173 | 811,152 | 1,468,409 | 12.6 | 211 |
| 12 | 1407 | 16,186 | 14,779 | 173 | 133,275,309 | 11.5 | 294 ± 688 | 4939 ± 15,284 | 4,757,204 | 72,993,897 | 9 | 5 | 396 | 14,194 | 403,400 | 1,249,864 | 10.6 | 50 |
| 13 | 426 | 4920 | 4494 | 83 | 114,364,328 | 11.5 | 330 ± 864 | 8615 ± 29,149 | 1,624,165 | 38,717,496 | 5 | 66 | 1310 | 21,022 | 740,920 | 1,468,616 | 3.7 | 10 |
| 14 | 832 | 8745 | 7913 | 116 | 107,043,718 | 10.5 | 319 ± 741 | 6107 ± 20,447 | 2,786,125 | 48,327,558 | 4 | 14 | 465 | 17,546 | 479,079 | 1,464,560 | 7.8 | 38 |
| 15 | 769 | 9686 | 8917 | 104 | 101,991,189 | 12.6 | 299 ± 689 | 5508 ± 16,227 | 2,900,248 | 49,116,901 | 8 | 21 | 918 | 10,227 | 550,366 | 887,042 | 7.5 | 23 |
| 16 | 1117 | 11,738 | 10,621 | 63 | 90,338,345 | 10.5 | 281 ± 607 | 3831 ± 17,251 | 3,296,823 | 40,686,903 | 5 | 1 | 397 | 10,024 | 778,855 | 1,694,208 | 12.4 | 20 |
| 17 | 1619 | 17,644 | 16,025 | 85 | 83,257,441 | 10.9 | 280 ± 605 | 3427 ± 13,031 | 4,943,096 | 54,914,295 | 8 | 1 | 445 | 9719 | 1,043,910 | 1,143,719 | 19.4 | 68 |
| 18 | 364 | 4082 | 3718 | 75 | 80,373,285 | 11.2 | 360 ± 872 | 10,356 ± 24,257 | 1,469,520 | 38,504,335 | 9 | 75 | 906 | 14,862 | 411,175 | 1,195,732 | 4.5 | 17 |
| 19 | 1904 | 16,757 | 14,853 | 106 | 58,617,616 | 8.8 | 296 ± 659 | 2283 ± 5233 | 4,964,098 | 33,906,702 | 6 | 2 | 541 | 21,693 | 121,730 | 301,152 | 32.5 | 54 |
| 20 | 741 | 7114 | 6373 | 80 | 64,444,167 | 9.6 | 307 ± 669 | 5564 ± 18,912 | 2,180,506 | 35,459,411 | 11 | 66 | 666 | 10,441 | 544,980 | 2,057,697 | 11.5 | 23 |
| 21 | 311 | 3183 | 2872 | 47 | 46,709,983 | 10.2 | 292 ± 700 | 5831 ± 16,574 | 930,502 | 16,745,676 | 9 | 9 | 147 | 11,938 | 323,564 | 834,698 | 6.7 | 54 |
| 22 | 598 | 6190 | 5592 | 55 | 50,818,468 | 10.4 | 300 ± 702 | 4156 ± 11,646 | 1,857,403 | 23,242,726 | 8 | 1 | 686 | 12,955 | 322,908 | 701,852 | 11.8 | 14 |
| X | 854 | 7222 | 6368 | 84 | 156,040,895 | 8.5 | 344 ± 774 | 6428 ± 22,011 | 2,483,142 | 40,934,557 | 10 | 67 | 501 | 10,363 | 536,479 | 1,368,337 | 5.5 | 79 |
| Y | 101 | 849 | 748 | 46 | 57,227,415 | 8.4 | 267 ± 576 | 9483 ± 32,196 | 226,383 | 5,888,994 | 24 | 67 | 737 | 8690 | 493,512 | 686,139 | 1.8 | 9 |
Distribution of exons and introns across human non-coding genes. Human genome data was downloaded from the UCSC main table browser (GRCh38, December 2013 build). Data was processed using tabular software. Analysis was performed using 8679 non-coding genes containing 33,931 exons and 25,252 introns. See the caution in Table 1 for the shortest introns (<30–40 nt).
| Chr # | Total # Genes | Total # Exons | Total # Introns | Max # Exons/Gene | Chromosome Size (bp) | Avg # of Exons/Gene | Avg Exon Length (bp) ± Std Dev | Avg Intron Length (bp) ± Std Dev | Total Exon Length (bp) | Total Intron Length (bp) | Shortest Exon (bp) | Shortest Intron (bp) | Shortest Gene (bp) | Longest Exon (bp) | Longest Intron (bp) | Longest Gene (bp) | Genes per Million bp | Intronless Genes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 863 | 3190 | 2327 | 46 | 248,956,422 | 3.7 | 362 ± 731 | 6916 ± 19,245 | 1,155,681 | 16,092,457 | 4 | 44 | 41 | 11,846 | 300,899 | 670,478 | 3.5 | 395 |
| 2 | 555 | 2561 | 2006 | 55 | 242,193,529 | 4.6 | 341 ± 668 | 9539 ± 26,653 | 872,641 | 19,135,257 | 3 | 14 | 49 | 11,633 | 415,325 | 1,126,123 | 2.3 | 168 |
| 3 | 468 | 1956 | 1488 | 45 | 198,295,559 | 4.2 | 360 ± 702 | 9009 ± 26,455 | 703,213 | 13,405,364 | 12 | 67 | 21 | 8244 | 427,004 | 581,065 | 2.4 | 137 |
| 4 | 245 | 889 | 644 | 36 | 190,214,555 | 3.6 | 409 ± 847 | 11,710 ± 28,055 | 363,799 | 7,541,526 | 12 | 60 | 44 | 9848 | 250,403 | 491,647 | 1.3 | 90 |
| 5 | 393 | 1484 | 1091 | 30 | 181,538,259 | 3.8 | 380 ± 722 | 11,932 ± 30,494 | 563,366 | 13,017,653 | 15 | 12 | 43 | 8875 | 340,222 | 932,203 | 2.2 | 120 |
| 6 | 361 | 1446 | 1085 | 36 | 170,805,979 | 4.0 | 417 ± 772 | 9457 ± 25,027 | 603,117 | 10,260,984 | 7 | 73 | 50 | 8695 | 326,934 | 621,277 | 2.1 | 118 |
| 7 | 440 | 2093 | 1653 | 48 | 159,345,973 | 4.8 | 370 ± 871 | 6847 ± 20,981 | 774,444 | 11,317,557 | 13 | 70 | 53 | 21,017 | 414,132 | 630,440 | 2.8 | 132 |
| 8 | 338 | 1262 | 924 | 29 | 145,138,636 | 3.7 | 389 ± 844 | 11,886 ± 34,163 | 491,445 | 10,982,253 | 15 | 71 | 23 | 12,722 | 499,303 | 541,308 | 2.3 | 134 |
| 9 | 361 | 1487 | 1126 | 45 | 138,394,717 | 4.1 | 388 ± 791 | 7004 ± 18,841 | 576,281 | 7,886,761 | 14 | 68 | 54 | 7835 | 308,685 | 310,090 | 2.6 | 127 |
| 10 | 368 | 1530 | 1162 | 28 | 133,797,422 | 4.2 | 404 ± 776 | 6904 ± 16,076 | 617,740 | 8,022,634 | 15 | 73 | 50 | 7617 | 212,605 | 337,030 | 2.8 | 105 |
| 11 | 409 | 1513 | 1104 | 29 | 135,086,622 | 3.7 | 410 ± 2455 | 7541 ± 21,908 | 620,851 | 8,325,666 | 5 | 44 | 50 | 91,671 | 295,436 | 663,821 | 3.0 | 166 |
| 12 | 366 | 1432 | 1066 | 27 | 133,275,309 | 3.9 | 366 ± 666 | 6974 ± 19,998 | 524,671 | 7,434,622 | 17 | 15 | 50 | 10,432 | 266,879 | 373,979 | 2.7 | 119 |
| 13 | 289 | 1062 | 773 | 26 | 114,364,328 | 3.7 | 378 ± 1306 | 11,923 ± 30,179 | 401,067 | 9,216,375 | 14 | 76 | 48 | 37,567 | 330,963 | 562,471 | 2.5 | 84 |
| 14 | 339 | 1026 | 687 | 46 | 107,043,718 | 3.0 | 321 ± 679 | 7422 ± 20,278 | 328,923 | 5,099,086 | 23 | 76 | 46 | 8430 | 289,502 | 437,743 | 3.2 | 191 |
| 15 | 419 | 1794 | 1375 | 35 | 101,991,189 | 4.3 | 342 ± 742 | 6731 ± 25,217 | 614,252 | 9,255,474 | 10 | 21 | 33 | 11,532 | 732,200 | 797,140 | 4.1 | 197 |
| 16 | 351 | 1474 | 1123 | 50 | 90,338,345 | 4.2 | 319 ± 601 | 4476 ± 18,977 | 469,487 | 5,026,091 | 3 | 47 | 51 | 7148 | 368,335 | 531,096 | 3.9 | 139 |
| 17 | 450 | 1838 | 1388 | 61 | 83,257,441 | 4.1 | 370 ± 699 | 4734 ± 13,885 | 679,590 | 6,570,461 | 15 | 70 | 47 | 10,345 | 220,687 | 325,488 | 5.4 | 183 |
| 18 | 131 | 486 | 355 | 22 | 80,373,285 | 3.7 | 379 ± 642 | 12,598 ± 35,371 | 184,066 | 4,472,256 | 26 | 89 | 50 | 4791 | 326,668 | 545,072 | 1.6 | 41 |
| 19 | 545 | 1777 | 1232 | 30 | 58,617,616 | 3.3 | 306 ± 611 | 3579 ± 12,785 | 543,192 | 4,409,772 | 11 | 62 | 47 | 11,194 | 255,789 | 292,306 | 9.3 | 304 |
| 20 | 242 | 921 | 679 | 43 | 64,444,167 | 3.8 | 357 ± 720 | 5055 ± 11,393 | 329,156 | 3,432,103 | 8 | 31 | 50 | 10,441 | 138,007 | 195,695 | 3.8 | 80 |
| 21 | 152 | 562 | 410 | 32 | 46,709,983 | 3.7 | 404 ± 869 | 10,393 ± 25,888 | 226,865 | 4,261,161 | 16 | 79 | 60 | 13,351 | 256,374 | 539,254 | 3.3 | 41 |
| 22 | 199 | 839 | 640 | 30 | 50,818,468 | 4.2 | 401 ± 775 | 5757 ± 21,234 | 336,853 | 3,684,280 | 4 | 9 | 52 | 8320 | 355,998 | 411,958 | 3.9 | 70 |
| X | 298 | 840 | 542 | 17 | 156,040,895 | 2.8 | 401 ± 1472 | 12,413 ± 40,403 | 336,962 | 6,727,884 | 7 | 78 | 48 | 37,027 | 405,107 | 1,033,350 | 1.9 | 165 |
| Y | 97 | 469 | 372 | 26 | 57,227,415 | 4.8 | 246 ± 467 | 16,221 ± 58,381 | 115,406 | 1,313,894 | 22 | 90 | 64 | 5836 | 353,508 | 320,464 | 1.7 | 2 |
Figure 2. Selected tracks for the human MTOR (A), NF1 (B) and MOB2 (C) genes. (A) The mechanistic target of rapamycin (serine/threonine kinase) gene (MTOR; chr1:11,166,588–11,322,608) is composed of 58 exons spanning 156,020 nucleotides. It contains 3 nested genes: MTOR-AS1 (MTOR antisense RNA 1; chr1:11,203,955–11,209,595), which encodes an antisense RNA spanning introns 24 to 27 and overlapping exon 25; ANGPTL7 (chr1:11,249,346–11,256,038), embedded in intron 30; and RPL39P6 (chr1:11,293,020–11,293,169), a pseudogene located in intron 43. (B) The neurofibromin 1 gene (NF1; chr17:29,421,945–29,704,695), composed of 58 exons spanning 282,751 nucleotides, holds 4 nested genes, all transcribed from the opposite strand: 3 coding genes located in intron 30 and 1 pseudogene (AK4P1; chr17:29,672,539–29,673,205) located in intron 42. The 3 coding nested genes are the oligodendrocyte myelin glycoprotein gene (OMG; chr17:29,621,668–29,624,380) and the ecotropic viral integration site 2A (EVI2A; chr17:29,643,428–29,648,767) and 2B (EVI2B; chr17:29,630,788–29,641,130) genes. (C) The MOB kinase activator 2 gene (MOB2; chr11:1,490,685–1,785,501) contains 13 nested genes: the dual specificity phosphatase 8 (DUSP8) coding gene; the KRTAP5-1/KRTAP5-2 antisense RNA 1 (KRTAP5-AS1) pseudogene; keratin associated protein 5-1 (KRTAP5-1) and 5 paralogous genes (KRTAP5-2 to -6); family with sequence similarity 99, member A (FAM99A) and member B (FAM99B), which are non-coding genes; the interferon induced transmembrane protein 10 (IFITM10) and cathepsin D (CTSD) coding genes; and one unannotated gene corresponding to EST GenBank AF085962. Remarkably, the KRTAP5-1 gene is embedded within an intron of the KRTAP5-AS1 pseudogene and oriented in the opposite direction, and KRTAP5-AS1 is itself oriented in the opposite direction to the MOB2 intron.
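Whether one gene is nested in another comes down to an interval-containment test on genomic coordinates. The sketch below applies that test to the MTOR coordinates quoted in the caption. The `Locus` type and `is_nested` helper are hypothetical names introduced for illustration. The test only checks containment within the host gene span. Assigning a nested gene to a specific intron would additionally require the host's exon coordinates, which the caption does not give.

```python
from typing import NamedTuple

class Locus(NamedTuple):
    name: str
    chrom: str
    start: int
    end: int

def is_nested(inner, host):
    """True if `inner` lies entirely within the span of `host` on the same chromosome."""
    return (
        inner.chrom == host.chrom
        and host.start <= inner.start
        and inner.end <= host.end
    )

# Coordinates as quoted in the figure caption
mtor = Locus("MTOR", "chr1", 11_166_588, 11_322_608)
candidates = [
    Locus("MTOR-AS1", "chr1", 11_203_955, 11_209_595),
    Locus("ANGPTL7", "chr1", 11_249_346, 11_256_038),
    Locus("RPL39P6", "chr1", 11_293_020, 11_293_169),
    Locus("OMG", "chr17", 29_621_668, 29_624_380),  # nested in NF1, not MTOR
]

nested_in_mtor = [c.name for c in candidates if is_nested(c, mtor)]
print(nested_in_mtor)
```

The same check can be run with NF1 or MOB2 as the host. Strand is deliberately ignored here, since nested genes in the caption are found on either strand.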
Figure 3. Schematic representation of canonical- (A), recursive- (B), and intra- (C) splicing. Boxes and lines are exons and introns, respectively. Splicing processes are shown by dotted lines.
Some examples of intronic snoRNAs and miRNAs and their host genes.
| Name | Genome Position | Host Intron | Host Gene | Host Gene Genome Position | Gene Function |
|---|---|---|---|---|---|
|  | chr21:33,749,496–33,749,631 | Intron 5 | URB Ribosome Biogenesis 1 homolog (URB1) | chr21:33,683,330–33,765,312 | Ribosome biogenesis |
|  | chr20:17,943,353–17,943,589 | Intron 1 | Sorting Nexin 5 (SNX5) | chr20:17,922,244–17,949,490 | Member of the sorting nexin family, involved in intracellular trafficking |
|  | chr20:2,443,605–2,443,686 | Intron 2 | Small Nuclear Ribonucleoprotein Polypeptides B and B1 (SNRPB) | chr20:2,442,288–2,451,499 | Nuclear proteins that are found in common among U1, U2, U4/U6, and U5 small ribonucleoprotein particles (snRNPs) |
|  | chr6:133,136,446–133,136,518 | Intron 3 | Ribosomal protein S12 (RPS12) | chr6:133,135,708–133,138,703 | Component of the ribosomal 40S subunit |
|  | chr6:133,137,941–133,138,016 | Intron 4 | Ribosomal protein S12 (RPS12) | chr6:133,135,708–133,138,703 | Component of the ribosomal 40S subunit |
|  | chr6:133,138,358–133,138,490 | Intron 5 | Ribosomal protein S12 (RPS12) | chr6:133,135,708–133,138,703 | Component of the ribosomal 40S subunit |
|  | chr18:51,748,654–51,748,782 | Intron 1 | Methyl-CpG Binding Domain protein 2 (MBD2) | chr18:51,677,971–51,751,158 | Represses transcription from methylated gene promoters |
|  | chr19:52,785,050–52,785,146 | Intron 1 | Zinc Finger protein 766 (ZNF766) | chr19:52,772,824–52,795,976 | Unknown |
|  | chr19:49,063,529–49,063,611 | Intron 1 | Sulfotransferase family, cytosolic, 2B, member 1 (SULT2B1) | chr19:49,055,429–49,102,684 | Catalyzes the sulfate conjugation of many hormones, neurotransmitters, drugs, and xenobiotic compounds |
|  | chr19:47,730,201–47,730,276 | Intron 2 | BCL2 Binding Component 3 (BBC3) | chr19:47,724,079–47,736,023 | Member of the BCL-2 family of proteins, cooperates with direct activator proteins to induce mitochondrial outer membrane permeabilization and apoptosis |
|  | chr14:101,318,727–101,318,824 | Intron 9 | Maternally Expressed 3 (non-protein coding) (MEG3) | chr14:101,292,445–101,327,360 | Long ncRNA tumor suppressor. Interacts with the tumor suppressor p53, and regulates p53 target gene expression |
|  | chr1:10,287,776–10,287,861 | Intron 1 | Kinesin family member 1B (KIF1B) | chr1:10,270,764–10,441,661 | Transports mitochondria and synaptic vesicle precursors |
|  | chr19:47,730,199–47,730,278 | Intron 2 | BCL2 Binding Component 3 (BBC3) | chr19:47,724,079–47,736,023 | Member of the BCL-2 family of proteins, cooperates with direct activator proteins to induce mitochondrial outer membrane permeabilization and apoptosis |
|  | chr1:117,637,265–117,637,350 | Intron 18 | Transcription Termination Factor, RNA polymerase II (TTF2) | chr1:117,602,949–117,645,491 | Member of the SWI2/SNF2 family of proteins |
Figure 4. Different mechanisms of microRNA biogenesis. The first three panels correspond to the canonical miRNA pathway, either intergenic (miRNAs) or intronic (intronic miRNAs), and the last two panels represent new alternative pathways, either independent of the Drosha/DGCR8 microprocessor (mirtrons) or independent of DGCR8 but dependent on U1 snRNP (simtrons), both dependent on splicing to produce miRNAs. DGCR8 stands for DiGeorge syndrome critical region gene 8. Adapted from [48].
Figure 5. Intron distribution relative to their position in genes. About 20% of all introns subjected to alternative splicing (AS) are intron 1.
Figure 6. Carcino-embryonic antigen related cell adhesion molecule 6 (CEACAM6) and its novel spliced variant CEACAM6-Long (CEACAM6-L) from rat testis. Gene, mRNA and protein representation reproduced using data from [69]. Exons are numbered. The twisted arrow and the star correspond to the ATG and stop codon, respectively. Intron 3 is denoted i3. Colored boxes correspond to exons and thin lines to introns, except for i3, which is shown as a black box. White and grey boxes represent untranslated regions (UTR) and coding sequences, respectively.
Figure 7. Both coding and non-coding RNAs can be produced by a given genetic locus. Intron-spliced isoforms are translated into protein while intron-retaining transcripts escape surveillance machineries and produce functional ncRNAs. Exons are numbered and represented by grey boxes, and introns by thin lines. The arrow and the star represent the ATG and stop codon, respectively. In the mature transcript, the retained intron appears as a black box, and UTR and coding sequences as white and grey boxes, respectively.