Qingfa Wu, Yeong C Kim, Jian Lu, Zhenyu Xuan, Jun Chen, Yonglan Zheng, Tom Zhou, Michael Q Zhang, Chung-I Wu, San Ming Wang.
Abstract
BACKGROUND: Transcripts expressed in eukaryotes are classified as poly A+ or poly A- transcripts according to the presence or absence of a 3′ poly A tail. Most transcripts identified so far are poly A+ transcripts, whereas poly A- transcripts remain largely unknown. METHODOLOGY/PRINCIPAL…
Year: 2008 PMID: 18665230 PMCID: PMC2481391 DOI: 10.1371/journal.pone.0002803
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. The Total RNA Detection system.
A universal RNA adaptor was first added to the 3′ ends of all RNA templates. The abundant 18S and 28S ribosomal RNAs were then subtracted using biotinylated ribosomal-specific probes. Small RNAs, including degraded RNA intermediates, were removed by size filtration. The enriched transcripts were converted into double-stranded cDNA using a primer based on the 3′ end RNA adaptor. The cDNAs were then digested with NlaIII, and the 3′ cDNAs were isolated using streptavidin beads. An adaptor was added to the 5′ ends of the 3′ cDNAs, which were then amplified by PCR using the 5′ adaptor-based sense primer and the 3′ end RNA adaptor-based antisense primer. The amplified 3′ cDNAs were sequenced from the 3′ end on the 454 system. See Materials and Methods for further details.
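The NlaIII digestion step can be pictured in silico. The sketch below is only an illustration, not the authors' software: it assumes the cDNA is given as a plain 5′ to 3′ string, uses the NlaIII recognition sequence CATG, and returns the bases downstream of the 3′-most site. The function name `three_prime_fragment` and the example sequence are hypothetical.

```python
NLAIII_SITE = "CATG"  # NlaIII recognition sequence


def three_prime_fragment(cdna, site=NLAIII_SITE):
    """Return the bases downstream of the 3'-most site, or None if no site exists.

    Simplification: the recognition site itself is dropped and strand
    overhangs are ignored; only the top-strand sequence is considered.
    """
    seq = cdna.upper()
    pos = seq.rfind(site)  # index of the last (3'-most) occurrence
    if pos == -1:
        return None  # an uncut cDNA yields no 3' fragment in this model
    return seq[pos + len(site):]


if __name__ == "__main__":
    example = "GGACATGTTCCATGAGGCTTAAGC"  # made-up sequence for illustration
    print(three_prime_fragment(example))
```

A transcript without any CATG site returns `None` here, which reflects that a digestion-based approach of this kind cannot produce a 3′ fragment for such a molecule.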
Summary of the sequence information.
| Items | Number (%) |
| Total sequences | 273,949 |
| With 3′ end tag | 241,864 (100) |
| 28S ribosome RNA | 29,049 (12) |
| 18S ribosome RNA | 43,831 (18) |
| 5.8S RNA | 346 (0.1) |
| 5S RNA | 17 (0) |
| tRNA | 1,228 (0.5) |
| Mitochondrion RNA, <11 bps, low quality | 18,783 (8) |
| Final qualified sequences | 148,520 (61) |
| Final distinct sequences | 52,571 |
| Genome-mapped sequences | 13,782 |
After removing sequences of 28S, 18S, 5.8S, 5S, tRNAs, mitochondrial RNA, <11 bps, and low quality.
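The percentages in the table appear to be expressed relative to the 241,864 sequences that carry a 3′ end tag. A minimal sketch of that arithmetic, with the counts copied from the table and the helper name `percent_of_tagged` chosen here for illustration:

```python
TAGGED = 241_864  # sequences with a 3' end tag (the 100% row)

# Counts copied from the summary table.
counts = {
    "28S ribosomal RNA": 29_049,
    "18S ribosomal RNA": 43_831,
    "5.8S RNA": 346,
    "5S RNA": 17,
    "tRNA": 1_228,
    "Mitochondrial RNA, <11 bps, low quality": 18_783,
    "Final qualified sequences": 148_520,
}


def percent_of_tagged(n, total=TAGGED):
    """Share of the tagged sequences, in percent."""
    return 100.0 * n / total


if __name__ == "__main__":
    for name, n in counts.items():
        print(f"{name}: {percent_of_tagged(n):.1f}%")
```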
Novelty, classification and abundance distribution of 3′ EST.
| | Total (%) | Poly A- (%) | Poly A+ (%) | Bimorphic (%) |
| Total | 13,782 (100) | 3,299 (24) | 4,898 (36) | 5,585 (40) |
| Compare to known mRNA | ||||
| Non match | 4,086 (30) | 2,984 (22) | 924 (7) | 178 (1) |
| Match | 9,696 (70) | 315 (2) | 3,974 (29) | 5,407 (39) |
| RefSeq | 1,629 | 0 | 647 | 982 |
| mRNA | 3,149 | 0 | 1,272 | 1,877 |
| EST | 6,350 | 0 | 2,732 | 3,618 |
| SAGE tag | 4,727 | 0 | 1,951 | 2,776 |
| Abundance distribution | ||||
| Total copies | 55,172 (100) | 6,369 (12) | 40,317 (73) | 8,486 (15) |
| >1000 | 5 | 0 | 5 | 0 |
| 501 to 1000 | 6 | 1 | 5 | 0 |
| 101 to 500 | 22 | 6 | 15 | 1 |
| 51 to 100 | 26 | 6 | 18 | 2 |
| 11 to 50 | 153 | 24 | 110 | 19 |
| 6 to 10 | 235 | 28 | 133 | 74 |
| 2 to 5 | 3,084 | 596 | 1,241 | 1,247 |
| 1 | 9,936 | 2,323 | 3,371 | 4,242 |
The 315 3′ ESTs matched to histone sequences.
The 3′ end distribution of histone 3′ ESTs.
| Histone mRNA | Number of matched 3′ ESTs | Match distal to 3′ end of histone mRNA | Match proximal to 3′ end of histone mRNA |
| HIST1H1B | 4 | 2 | 2 |
| HIST1H1C | 3 | 3 | |
| HIST1H1E | 1 | 1 | |
| HIST1H1T | 2 | 2 | |
| HIST1H2AB | 15 | 1 | 14 |
| HIST1H2AC | 1 | 1 | |
| HIST1H2AE | 1 | 1 | |
| HIST1H2AG | 3 | 3 | |
| HIST1H2AI | 2 | 2 | |
| HIST1H2AJ | 1 | 1 | |
| HIST1H2AK | 3 | 2 | 1 |
| HIST1H2AM | 1 | 1 | |
| HIST1H2BB | 2 | 2 | |
| HIST1H2BC | 1 | 1 | |
| HIST1H2BD | 8 | 1 | 7 |
| HIST1H2BE | 1 | 1 | |
| HIST1H2BF | 2 | 2 | |
| HIST1H2BG | 10 | 3 | 7 |
| HIST1H2BH | 4 | 2 | 2 |
| HIST1H2BJ | 1 | 1 | |
| HIST1H2BN | 2 | 2 | |
| HIST1H3A | 7 | 2 | 5 |
| HIST1H3B | 20 | 2 | 18 |
| HIST1H3C | 5 | 5 | |
| HIST1H3D | 16 | 1 | 15 |
| HIST1H3E | 1 | 1 | |
| HIST1H3F | 29 | 29 | |
| HIST1H3G | 23 | 23 | |
| HIST1H3H | 18 | 1 | 17 |
| HIST1H4A | 8 | 8 | |
| HIST1H4B | 8 | 1 | 7 |
| HIST1H4C | 26 | 26 | |
| HIST1H4E | 8 | 5 | 3 |
| HIST1H4H | 2 | 2 | |
| HIST1H4I | 9 | 9 | |
| HIST1H4J | 11 | 4 | 7 |
| HIST1H4K | 4 | 4 | |
| HIST2H2AA | 9 | 4 | 5 |
| HIST2H2AC | 5 | 5 | |
| HIST4H4 | 9 | 8 | 1 |
| H1FX | 1 | 1 | |
| H2AFX | 7 | 7 | |
| H2AFY | 1 | 1 | |
| H3F3A | 8 | 6 | 2 |
| H3F3B | 1 | 1 | |
| H4/o | 11 | 3 | 8 |
| Total (%) | 315 (100) | 63 (20) | 252 (80) |
H3F3A mRNA is polyadenylated (Wells D, Kedes L. PNAS 82, 2834, 1985)
Figure 2. Example of histone 3′ EST distribution.
Fifteen 3′ ESTs that map to the full-length histone 1H2AB cDNA sequence (NM_003513) are clustered proximal to the 3′ end of the full-length sequence. See Table 3, Table S3 and Figure S1 for the distribution of other histone 3′ ESTs.
Mapping 3′ ESTs to the human genome reference sequences (HG18).
| | Total (%) | Poly A- | Poly A+ | Bimorphic |
| Total mapped 3′ EST | 13,467 | 2,984 | 4,898 | 5,585 |
| Single mapped 3′ EST | 8,178 (100) | 2,113 (100) | 2,760 (100) | 3,305 (100) |
| Intergenic mapping | 2,310 (28) | 808 (38) | 751 (27) | 751 (23) |
| Intragenic mapping | 5,868 (72) | 1,305 (62) | 2,009 (73) | 2,554 (77) |
| Distribution of Intragenic mapping | ||||
| Total | 5,868 (100) | 1,305 (100) | 2,009 (100) | 2,554 (100) |
| Sense | 5,426 (93) | 1,123 (86) | 1,871 (93) | 2,432 (95) |
| | 4,182 | 976 | 1,385 | 1,816 |
| | 1,125 | 125 | 441 | 558 |
| | 127 | 22 | 45 | 58 |
| Antisense | 442 (8) | 182 (14) | 138 (7) | 122 (5) |
| | 385 | 175 | 113 | 95 |
| | 24 | 5 | 5 | 13 |
| | 37 | 2 | 20 | 14 |
The mapping difference is statistically significant (p = 3.67 × 10⁻¹⁶, χ² test).
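The note does not say which groups were compared. As a hedged illustration only, the sketch below assumes one possible reading: a 2×2 χ² test of poly A- versus poly A+ single-mapped 3′ ESTs across intergenic and intragenic locations, with Yates' continuity correction. Under that assumption the p value comes out on the order of 10⁻¹⁶, in line with the reported figure, but the choice of contrast and correction is an assumption and not something stated in the source. The function names are hypothetical.

```python
import math


def yates_chi2_2x2(a, b, c, d):
    """Chi-square statistic with Yates' correction for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    # Continuity correction: shrink |ad - bc| by n/2, never below zero.
    corrected = max(abs(a * d - b * c) - n / 2.0, 0.0)
    return n * corrected ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


if __name__ == "__main__":
    # Counts from the mapping table (assumed contrast, see lead-in):
    # rows = intergenic, intragenic; columns = poly A-, poly A+.
    chi2 = yates_chi2_2x2(808, 751, 1305, 2009)
    print(f"chi2 = {chi2:.2f}, p = {chi2_sf_df1(chi2):.2e}")
```

For one degree of freedom the tail probability has the closed form erfc(√(χ²/2)), so the standard library is enough and no statistics package is needed.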
Overlapping genome position of 3′ ESTs.
| Type of overlapping | Poly A+ (%) | Poly A- (%) | Bimorphic (%) |
| Common in 3 | 633 (23) | 474 (22) | 749 (23) |
| Common in 2 | 832 (30) | 468 (22) | 898 (27) |
| Poly A+/Poly A- | 481 | 471 | |
| Poly A-/bimorphic | | 541 | 576 |
| Poly A+/bimorphic | 1,028 | | 1,059 |
| Only in 1 | 1,295 (47) | 1,171 (55) | 1,658 (50) |
| Total | 2,760 (100) | 2,113 (100) | 3,305 (100) |
Multiple overlaps exist among the sequences in the "Common in 2" category.
Evolutionary conservation of the 3′ EST-mapped intergenic regions*
| Species | Divergence (mean) | Divergence (median) | Total sequences | Poly A- | Poly A+ | Bimorphic |
| Chimp | 0.013 | 0.000 | 1,311 | 451 | 423 | 426 |
| Macaque | 0.056 | 0.042 | 1,263 | 436 | 399 | 418 |
| Mouse | 0.359 | 0.339 | 441 | 252 | 303 | 336 |
| Rat | 0.367 | 0.342 | 425 | 240 | 297 | 321 |
| Rabbit | 0.298 | 0.274 | 532 | 236 | 284 | 304 |
| Dog | 0.267 | 0.245 | 759 | 338 | 355 | 375 |
| Cow | 0.269 | 0.240 | 733 | 317 | 334 | 371 |
| Armadillo | 0.255 | 0.234 | 568 | 210 | 263 | 295 |
| Elephant | 0.254 | 0.227 | 600 | 222 | 276 | 301 |
| Tenrec | 0.328 | 0.304 | 404 | 192 | 242 | 266 |
| Opossum | 0.431 | 0.412 | 250 | 139 | 197 | 233 |
| Chicken | 0.388 | 0.342 | 130 | 52 | 105 | 104 |
| Frog | 0.364 | 0.297 | 77 | 31 | 49 | 58 |
| Zebrafish | 0.454 | 0.368 | 49 | 28 | 36 | 45 |
| Tetraodon | 0.455 | 0.360 | 43 | 34 | 31 | 38 |
| Fugu | 0.489 | 0.422 | 34 | 28 | 26 | 44 |
| Total non-redundant 3′ EST (%) | | | 1,322 (100) | 467 (32) | 428 (32) | 437 (33) |
*The 2,310 3′ ESTs that each map to a single intergenic region were used for this analysis.
P values were not used for chimp and macaque because these species are too closely related to humans.
Figure 3. Experimental verification of novel 3′ ESTs.
(A) 3′ end verification for each subtype of 3′ EST. 3′ ESTs from the poly A-, poly A+, and bimorphic subtypes were selected for confirmation. Known poly A+ transcripts were used as positive controls. cDNA generated by random priming and cDNA generated by oligo dT priming were used as templates. R: cDNA generated by random priming; T: cDNA generated by oligo dT priming. See Table S7 for primer information. (B) Verification of 3′ ESTs mapped to intronic microRNA precursors. RT-PCR was used to verify the 3′ ESTs that map to intronic microRNA precursors. Amplified products were cloned and sequenced. See Table S4C for primer information. (C) Northern blot verification of poly A- 3′ ESTs. Two poly A- 3′ ESTs were used as probes (Table S7), and RNAs from five human cell lines were used for detection.
Poly A- 3′ EST overlapped with poly A- “transfrag”
| Compartment | “Transfrag” (%) | Poly A- 3′ EST (%) |
| Cytosolic | 6,185 (14) | 37 (18) |
| Nuclear | 17,995 (41) | 53 (25) |
| Both cytosolic and nuclear | 20,200 (46) | 120 (57) |
| Total | 44,380 (100) | 210 (100) |