Henk P J Buermans, Yavuz Ariyurek, Gertjan van Ommen, Johan T den Dunnen, Peter A C 't Hoen.
Abstract
BACKGROUND: MicroRNAs are small non-coding RNA transcripts that regulate post-transcriptional gene expression. The millions of short sequence reads generated by next-generation sequencing technologies make this technique particularly suitable for profiling known and novel microRNAs. A modification to the small-RNA expression kit (SREK, Ambion) library preparation method for the SOLiD sequencing platform is described that generates microRNA sequencing libraries compatible with the Illumina Genome Analyzer.
Year: 2010 PMID: 21171994 PMCID: PMC3022920 DOI: 10.1186/1471-2164-11-716
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Amplification primers. A: Alternative primers used during miRNA sequencing library preparation to make the SREK protocol compatible with the Illumina Genome Analyzer. * indicates a phosphorothioate bond. B: Library fragment separation on a 6% PAGE gel. Different library fragment sizes can be discerned, which represent different types of RNA ligated in between the SREK adapters, as indicated at the left of the gel. Due to the length of the alternative primers used during library amplification, the size of an empty library fragment is 125 bp. The miRNA fraction of the library is located at 140-150 bp.
Figure 2. E-miR pipeline flowchart. Schematic overview of the E-miR pipeline. A more detailed description of each of the individual steps can be found in the Methods section.
E-miR data processing table
| | EMtot-single | EMs-single | EMs-multi | HTs-single | HTs-multi |
|---|---|---|---|---|---|
| Input | 5,794,222 | 5,294,849 | 4,042,101 | 6,517,852 | 3,292,684 |
| < 15 nt | 2,979,930 | 2,368,163 | 2,011,435 | 3,027,338 | 1,596,093 |
| ≥15 nt | 2,814,292 | 2,926,686 | 2,030,666 | 3,490,514 | 1,696,591 |
| NM | 577,710 | 569,163 | 294,637 | 921,828 | 467,263 |
| R* | 967,090 | 1,138,353 | 814,299 | 1,383,114 | 653,343 |
| U0 | 684,178 | 571,752 | 477,298 | 578,162 | 295,420 |
| U1 | 439,677 | 472,181 | 333,808 | 428,136 | 201,149 |
| U2 | 145,637 | 175,237 | 110,624 | 179,274 | 79,416 |
| U0&U1 | 1,123,855 | 1,043,933 | 811,106 | 1,006,298 | 496,569 |
| miRNA | 557,167 [49.58%] | 765,899 [73.37%] | 586,779 [72.34%] | 442,242 [43.95%] | 215,293 [43.36%] |
| miscRNA | 2,934 [0.26%] | 1,341 [0.13%] | 981 [0.12%] | 3,590 [0.36%] | 1,849 [0.37%] |
| pseudogene | 10 [0%] | 14 [0%] | 6 [0%] | 17 [0%] | 8 [0%] |
| rRNA | 260 [0.02%] | 2,392 [0.23%] | 1,990 [0.25%] | 1,738 [0.17%] | 869 [0.18%] |
| snRNA | 366 [0.03%] | 360 [0.03%] | 285 [0.04%] | 614 [0.06%] | 344 [0.07%] |
| snoRNA | 45,052 [4.01%] | 22,576 [2.16%] | 14,706 [1.81%] | 32,513 [3.23%] | 15,284 [3.08%] |
| tRNA | 10,031 [0.89%] | 13,908 [1.33%] | 11,625 [1.43%] | 27,266 [2.71%] | 13,998 [2.82%] |
| other | 508,035 [45.2%] | 237,443 [22.75%] | 194,734 [24.01%] | 498,318 [49.52%] | 248,924 [50.13%] |
Overview of sequence data processing by the Perl analysis pipeline for the heart tube (HT) and embryo (EM) samples for the single-read (single) and multiplexed (multi) libraries. Total RNA input and small RNA input samples are indicated with the 'tot' and 's' suffixes, respectively. Indicated from top to bottom are the total number of input sequence reads, sequence reads that were shorter than 15 nt after 3' adapter truncation, sequences accepted for further analysis after 3' adapter truncation, genome alignment classes and the number of sequences annotated to non-coding RNA transcripts. For the genome alignment classes, NM indicates reads that could not be mapped to the genome, R* indicates reads that were mapped to repeat areas, and U0, U1 and U2 indicate reads that were mapped to the genome with zero, one or two mismatches, respectively. Percentages of annotated transcripts are calculated relative to the sum of U0 and U1 reads. The 'other' category indicates reads not annotated to any of the chicken non-coding RNAs in the Ensembl database.
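The arithmetic behind the table can be checked with a short sketch. The Python snippet below is only an illustration and is not part of the authors' Perl pipeline; the dictionary keys and the function name `annotation_percent` are hypothetical, while the counts are copied from the EMtot-single column.

```python
# Read counts for the EMtot-single column, copied from the table above.
# Key names are illustrative labels, not identifiers from the E-miR pipeline.
counts = {
    "input": 5_794_222,
    "short": 2_979_930,     # < 15 nt after 3' adapter truncation
    "accepted": 2_814_292,  # >= 15 nt, kept for alignment
    "NM": 577_710,
    "R*": 967_090,
    "U0": 684_178,
    "U1": 439_677,
    "U2": 145_637,
    "miRNA": 557_167,
    "snoRNA": 45_052,
}


def annotation_percent(annotated, u0, u1):
    """Percentage of annotated reads relative to the sum of U0 and U1 reads."""
    return round(100 * annotated / (u0 + u1), 2)


# Short and accepted reads together account for all input reads.
length_split_ok = counts["short"] + counts["accepted"] == counts["input"]

# The five alignment classes together account for all accepted reads.
alignment_classes = ("NM", "R*", "U0", "U1", "U2")
alignment_split_ok = (
    sum(counts[name] for name in alignment_classes) == counts["accepted"]
)

mirna_pct = annotation_percent(counts["miRNA"], counts["U0"], counts["U1"])
snorna_pct = annotation_percent(counts["snoRNA"], counts["U0"], counts["U1"])
print(mirna_pct, snorna_pct)
```

For this column the recomputed percentages agree with the bracketed values in the miRNA and snoRNA rows.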
Figure 3. EM and HT miRNA expression. A: Venn diagram indicating the number of 5p (blue) and 3p (red) miRNA transcripts expressed with at least 5 tpm in either HT or whole embryo libraries. B: Scatter plots comparing the single (x-axis) and multiplex (y-axis) sequencing runs, using the sum of isomirs per transcript to calculate miRNA transcript expression levels in sqrt(tpm). Heart tube and embryo samples are indicated with triangles and diamonds, respectively. Pearson correlation coefficients are indicated at the top of the plot for each summation method. C: Heatmap clustering comparing all samples used in this study. All expressed miRNA transcripts were used to generate this heatmap. Indicated at the right are the tissues (EM and HT), total (tot) or small-enriched (s) RNA, and the single or multiplex runs. The different summation methods used to calculate miRNA transcript expression levels for each sample, i.e., the sum of all isomirs, the sum of uniquely aligned isomirs without mismatches and the most abundant isomir, are indicated with '_sum', '_U0' and '_mab', respectively. Horizontal and vertical labels are identical. D: Scatter plot indicating average miRNA expression in sqrt(tpm) for heart tube (x-axis) and whole embryo (y-axis). Open and closed black circles represent non-significant and significantly differentially expressed miRBase miRNA transcripts, respectively. The top left insert depicts an enlarged section of the 0-20 sqrt(tpm) area.
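The three summation methods named in panel C can be sketched as follows. This is a hedged Python illustration, not the authors' Perl code: the isomir counts are invented, the function names `summarize` and `to_tpm` are hypothetical, and tpm is assumed here to mean counts scaled per million aligned reads.

```python
import math


def summarize(isomirs, method):
    """Collapse isomir read counts into one value per miRNA transcript.

    isomirs: list of (read_count, alignment_class) tuples.
    method:  'sum' (all isomirs), 'U0' (uniquely aligned isomirs without
             mismatches) or 'mab' (most abundant isomir).
    """
    if method == "sum":
        return sum(count for count, _ in isomirs)
    if method == "U0":
        return sum(count for count, cls in isomirs if cls == "U0")
    if method == "mab":
        return max((count for count, _ in isomirs), default=0)
    raise ValueError(f"unknown summation method: {method}")


def to_tpm(count, total_aligned):
    """Scale a raw count to transcripts per million aligned reads (assumed)."""
    return count * 1_000_000 / total_aligned


# Invented isomir counts for a single miRNA transcript.
isomirs = [(600, "U0"), (300, "U1"), (100, "U0")]
total_aligned = 2_500_000  # invented library size

for method in ("sum", "U0", "mab"):
    tpm = to_tpm(summarize(isomirs, method), total_aligned)
    print(method, tpm, math.sqrt(tpm))
```

The square-root transform compresses the range of highly expressed transcripts, so that abundant and rare miRNAs can be shown on the same axes.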
Figure 4. Short amplification primer libraries. A: 4-12% PAGE gel visualizing improved library fraction separation by using the shorter amplification primers. B: Scatter plot showing highly correlated miRNA expression levels [sqrt(tpm)] between libraries generated using the long (x-axis) and short (y-axis) amplification primers. Expression levels were calculated relative to the sum of aligned transcripts.
Cardiac enriched miRNAs
| miRNA transcript | Location | EM-s | HT-s | Fold Difference |
|---|---|---|---|---|
| miRNA | chr2:105670407-105670428 | 0 ± 0 | 9 ± 1.3 | inf |
| miRNA | chr20:8107876-8107896 | 0 ± 0 | 23 ± 6.9 | inf |
| miRNA | chr23:4664062-4664083 | 0 ± 0 | 8 ± 2.9 | inf |
| miRNA | chr20:2599386-2599408 | 2.8 ± 1.3 | 2542.5 ± 574 | 906 |
| miRNA | chr20:2599348-2599370 | 15.5 ± 1.1 | 3048.4 ± 318.2 | 196 |
| miRNA | chr23:4664099-4664120 | 2.8 ± 1.3 | 496.9 ± 44.9 | 177 |
| miRNA | chr23:4663916-4663936 | 1.7 ± 1.1 | 142.9 ± 22.7 | 83 |
| miRNA | chr20:8107838-8107858 | 35.4 ± 20.4 | 2366.1 ± 777.2 | 67 |
| miRNA | chr23:4663953-4663973 | 91.1 ± 24.2 | 6086.5 ± 1328.3 | 67 |
| miRNA | chr4:2151195-2151215 | 1.4 ± 2 | 32.5 ± 0.4 | 23 |
| miRNA | chr4:2151238-2151260 | 30.2 ± 6.1 | 1277.3 ± 123.2 | 42 |
| miRNA | chr1:59948716-59948737 | 45.3 ± 12.6 | 1447.5 ± 231.3 | 32 |
| miRNA | chr20:8109147-8109169 | 1.1 ± 0.2 | 25.5 ± 3.8 | 23 |
| miRNA | chr5:42365992-42366012 | 8.6 ± 6.9 | 130.5 ± 0.5 | 15 |
| miRNA | chr1:59948755-59948776 | 13.8 ± 2.1 | 228.3 ± 47.4 | 17 |
| miRNA | chr1:104486649-104486675 | 8.2 ± 4.6 | 84.6 ± 5.7 | 10 |
| miRNA | chr3:110384948-110384968 | 93 ± 21.6 | 866.5 ± 52.1 | 9.3 |
| miRNA | chr12:10938277-10938299 | 1.7 ± 1.1 | 15 ± 1.6 | 8.8 |
| miRNA | chr28:1055205-1055224 | 26.5 ± 11.3 | 204.5 ± 18.6 | 7.7 |
| miRNA | chr14:759503-759526 | 31.8 ± 0.3 | 213.2 ± 26 | 6.7 |
| miRNA | chr2:62758752-62758773 | 2.1 ± 1.2 | 12.5 ± 0.6 | 6.1 |
| miRNA | chr13:7555655-7555676 | 10.6 ± 4.2 | 60.6 ± 8.3 | 5.7 |
| miRNA | chr2:148337300-148337321 | 1452 ± 39.6 | 7275.3 ± 46.5 | 5.0 |
| miRNA | chr1:104486710-104486736 | 915.9 ± 102.7 | 3702.3 ± 152.3 | 4.0 |
| miRNA | chr2:40745167-40745183 | 9.4 ± 2.4 | 33.5 ± 1 | 3.6 |
| miRNA | chr2:148331648-148331669 | 461.2 ± 63 | 1638.4 ± 129.7 | 3.6 |
| miRNA | chr4:233007-233027 | 2759.8 ± 88 | 9608.1 ± 564.1 | 3.5 |
| miRNA | chr3:76659772-76659792 | 219.8 ± 31.9 | 725.3 ± 80.2 | 3.3 |
| miRNA | chr1:102457663-102457684 | 5255.7 ± 209.1 | 15151.5 ± 1241 | 2.9 |
| miRNA | chr8:2001620-2001641 | 154.6 ± 49.6 | 424.6 ± 8.1 | 2.7 |
| miRNA | chr3:85102244-85102265 | 898.4 ± 20.4 | 2238.4 ± 52.7 | 2.5 |
| miRNA | chr19:5824134-5824155 | 79.4 ± 4.2 | 187.5 ± 8.9 | 2.4 |
| miRNA | chr26:1442955-1442976 | 11.8 ± 1 | 25.5 ± 3.8 | 2.2 |
| miRNA | chr17:8431792-8431812 | 1310.7 ± 123.6 | 2561 ± 161.5 | 2.0 |
| miRNA | chr23:5248432-5248450 | 1382.7 | 2630.2 ± 3.1 | 1.9 |
| miRNA | chr4:58651749-58651768 | 60.9 | 105.5 ± 4 | 1.7 |
| miRNA | chr1:152248637-152248658 | 49.1 | 82.5 ± 2.8 | 1.7 |
Table of significantly cardiac-enriched miRNA transcripts. Indicated are transcript name (EnsemblTranscript | 5p/3p | miRNA name | (anti)sense), genomic location, average expression values ± SD and the fold difference between samples. The '~' indicates miRNA transcripts for which the genomic positions were not present in the Ensembl database and thus were inferred from the complementary position of the hairpin structure. # indicates that the difference in expression level was confirmed by qPCR.
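The fold difference column can be illustrated with a small sketch. It assumes the fold difference is the HT-s mean divided by the EM-s mean, with 'inf' reported when the embryo mean is zero; the function name `fold_difference` is hypothetical. Because the table shows rounded means, values recomputed this way may differ slightly from the published column for some rows.

```python
import math


def fold_difference(em_mean, ht_mean):
    """Ratio of heart tube to whole embryo expression (assumed definition)."""
    if em_mean == 0:
        return math.inf  # shown as 'inf' in the table
    return ht_mean / em_mean


# Mean expression values copied from three rows of the table above.
rows = [
    ("chr2:105670407-105670428", 0.0, 9.0),
    ("chr2:148337300-148337321", 1452.0, 7275.3),
    ("chr17:8431792-8431812", 1310.7, 2561.0),
]

for location, em_mean, ht_mean in rows:
    print(location, round(fold_difference(em_mean, ht_mean), 1))
```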
Figure 5. qPCR confirmation of sequencing data. MicroRNA expression levels for both 5p and 3p transcripts from single precursor transcripts in HH16 whole chicken embryo and heart tube. 5S rRNA was used as an internal control to normalize gene expression. Left and right columns represent miSeq and qPCR derived expression levels, respectively. The bars represent mean expression levels ± SD. * indicates a significant difference in gene expression relative to whole embryo.
Figure 6. Globaltest analysis results. A: Example covariate plot for the miRNA-125b-5p transcript displaying the individual contributions of its 203 isomirs to the overall test statistic for differential expression. Significantly contributing transcripts are indicated by the dark line in the hierarchical tree. B: Zoomed section of the hierarchical tree depicting the subset of 23 significantly contributing isomir sequences. Positions and the detected nucleotide of variations are indicated, e.g., 17T# indicates that nucleotide 17 in the reference sequence was a 'T'. C: Isomir expression plots between the heart and embryo showing uniform differential isomir expression patterns for mir-499-3p, mir-125b-5p, mir-20b-5p and mir-219-3p, respectively.