Christian B Matranga, Kristian G Andersen, Sarah Winnicki, Michele Busby, Adrianne D Gladden, Ryan Tewhey, Matthew Stremlau, Aaron Berlin, Stephen K Gire, Eleina England, Lina M Moses, Tarjei S Mikkelsen, Ikponmwonsa Odia, Philomena E Ehiane, Onikepe Folarin, Augustine Goba, S Humarr Kahn, Donald S Grant, Anna Honko, Lisa Hensley, Christian Happi, Robert F Garry, Christine M Malboeuf, Bruce W Birren, Andreas Gnirke, Joshua Z Levin, Pardis C Sabeti.
Abstract
We have developed a robust RNA sequencing method for generating complete de novo assemblies with intra-host variant calls of Lassa and Ebola virus genomes in clinical and biological samples. Our method uses targeted RNase H-based digestion to remove contaminating poly(rA) carrier and ribosomal RNA. This depletion step improves both the quality of data and quantity of informative reads in unbiased total RNA sequencing libraries. We have also developed a hybrid-selection protocol to further enrich the viral content of sequencing libraries. These protocols have enabled rapid deep sequencing of both Lassa and Ebola virus and are broadly applicable to other viral genomics studies.
Year: 2014 PMID: 25403361 PMCID: PMC4262991 DOI: 10.1186/PREACCEPT-1698056557139770
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Figure 1. RNase H selective depletion of poly(rA) carrier from Lassa samples. (A) Native polyacrylamide gel depicting library PCR and side products of LASV preparations with poly(rA) carrier present (middle) or depleted (right panel). No free poly(rA) was present in control library (left). (B) Median base qualities per MiSeq cycle of poly(rA)-contaminated LASV libraries (solid line) and control (no carrier observed in library, dashed) from FastQC report. Both read 1 and read 2 of paired end reads are merged in the library BAM file and the quality scores are shown at each base. (C) Schematic of carrier RNA selective depletion and DNase treatment of oligo (dT).
Figure 2. Depletion of rRNA from human LASV isolates. (A) Rarefaction analysis of LASV sample (ISTH2016) from an rRNA-depleted (gray) or control (undepleted, blue) preparation. Data are fit (dashed line) to the Michaelis-Menten formula, in which the projected saturation value equals V (see Materials and methods). (B) LASV genomic coverage from a LASV sample (ISTH0073) from an rRNA-depleted (gray) or control (blue) preparation. L, S segment, Z, L, NP, GPC: boundaries of each LASV genomic segment with specified genes encoded on each segment. (C) Starting overall content (RNA input) and enrichment of unique LASV (Library content) upon rRNA depletion from nine different clinical isolates.
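The saturation fit mentioned in panel (A) can be sketched as follows. This is a minimal illustration and not the authors' code: the function name `fit_saturation`, the search strategy (closed-form V for each candidate K, followed by a one-dimensional search over K), and the synthetic data points are all assumptions made for this example. It fits unique reads = V · depth / (K + depth), where V is the projected saturation value.

```python
import math


def _v_and_sse(k, xs, ys):
    """For a fixed K, return the least-squares V and the residual sum of squares."""
    f = [x / (k + x) for x in xs]
    v = sum(y * fi for y, fi in zip(ys, f)) / sum(fi * fi for fi in f)
    sse = sum((y - v * fi) ** 2 for y, fi in zip(ys, f))
    return v, sse


def fit_saturation(xs, ys, grid_points=200):
    """Fit y = V * x / (K + x); xs must be positive. Returns (V, K)."""
    # Coarse search over log(K), spanning well beyond the sampled depths.
    lo = math.log(min(xs) * 1e-3)
    hi = math.log(max(xs) * 1e3)
    grid = [lo + (hi - lo) * i / (grid_points - 1) for i in range(grid_points)]
    errs = [_v_and_sse(math.exp(g), xs, ys)[1] for g in grid]
    best = errs.index(min(errs))
    a = grid[max(best - 1, 0)]
    b = grid[min(best + 1, grid_points - 1)]

    # Golden-section refinement inside the bracketing interval.
    phi = (math.sqrt(5) - 1) / 2
    for _ in range(100):
        c = b - phi * (b - a)
        d = a + phi * (b - a)
        if _v_and_sse(math.exp(c), xs, ys)[1] < _v_and_sse(math.exp(d), xs, ys)[1]:
            b = d
        else:
            a = c
    k = math.exp((a + b) / 2)
    v, _ = _v_and_sse(k, xs, ys)
    return v, k


# Synthetic example (not data from the paper): true V = 1000, K = 50.
depths = [10, 20, 50, 100, 200, 500, 1000]
unique = [1000 * x / (50 + x) for x in depths]
v_hat, k_hat = fit_saturation(depths, unique)
```

In a rarefaction setting, `depths` would be the subsampled read counts and `unique` the number of unique viral reads observed at each depth; the fitted V then estimates how many unique reads the library could yield at saturation.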
Figure 3. Depletion of rRNA from rodent and macaque LASV isolates. (A) Depletion of rRNA (top) and unique LASV (bottom) enrichment from Mastomys natalensis spleen and (B) various tissues from cynomolgus macaque (day 12 post LASV infection). Numbers over fraction unique reads represent fold-enrichment in LASV content after rRNA depletion.
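The fold-enrichment numbers in this caption are ratios of unique LASV read fractions. A minimal sketch of that arithmetic, assuming the ratio is taken as depleted over control (the function name and the example fractions are hypothetical, not values from the figure):

```python
def fold_enrichment(fraction_depleted, fraction_control):
    """Ratio of the unique viral read fraction after depletion to the fraction before."""
    if fraction_control <= 0:
        raise ValueError("control fraction must be positive")
    return fraction_depleted / fraction_control


# Hypothetical fractions of unique LASV reads in each library.
fold = fold_enrichment(0.02, 0.005)
```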
LASV genome coverage from standard RNA-seq and hybrid selection libraries
| Sample | | | | | | | | |
|---|---|---|---|---|---|---|---|---|
| G090 | 5.2 | 1 | 0.28 | No | 1.2 | 20 | 19.25 | Yes |
| G2230 | 1.3 | 2 | 7.73 | No | 1.2 | 1 | 24.84 | No |
| G733 | 6.9 | 85 | 17.18 | Yes | 1.3 | 527 | 636.71 | Yes |
| G771 | 24.5 | 65 | 3.55 | Yes | 2.5 | 14 | 12.56 | Yes |
| ISTH0073 | 35.0 | 115 | 3.86 | Yes | 1.5 | 208 | 197.28 | Yes |
| ISTH0230 | 7.3 | 4 | 0.33 | No | 1.3 | 6 | 4.28 | Yes |
| ISTH1137 | 8.1 | 18 | 2.86 | Yes | 8.0 | 47 | 6.84 | Yes |
| ISTH2020 | 8.9 | 28 | 5.26 | Yes | 1.2 | 53 | 78.84 | Yes |
| ISTH2025 | 40.2 | 13 | 0.60 | Yes | 1.2 | 30 | 43.83 | Yes |
| ISTH2050 | 6.9 | 20 | 3.44 | Yes | 1.2 | 18 | 41.94 | Yes |
| LM032 | 14.9 | 121 | 8.99 | Yes | 12.3 | 1,003 | 88.18 | Yes |
| LM222 | 6.3 | 6 | 0.96 | Yes | 2.6 | 390 | 158.73 | Yes |
| Z002 | 5.8 | 0 | 0.08 | No | 1.1 | 23 | 26.09 | Yes |
a: Average base coverage per 1 million reads. Successful LASV genome assembly required >1× coverage of 90% of the LASV ORFs. Coverage metrics are based on unique, non-duplicated LASV reads. G series: Sierra Leone clinical isolates (4). ISTH series: Nigeria clinical isolates (6). LM and Z series: Mastomys natalensis isolates. Other metrics, including average (×) coverage and % genome coverage at >1×, are included in Additional file 1: Table S2.
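The two metrics named in the footnote can be illustrated with a short sketch. It is an interpretation under stated assumptions, not the authors' pipeline: the function names are hypothetical, the per-base depths are toy values, and the depth threshold is exposed as a parameter because the footnote does not say whether ">1×" means strictly more than one read or at least one.

```python
def coverage_per_million(mean_depth, total_reads):
    """Average base coverage normalized to 1 million sequenced reads."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return mean_depth / (total_reads / 1_000_000)


def fraction_covered(depths, min_depth=1):
    """Fraction of positions whose read depth is at least min_depth."""
    if not depths:
        raise ValueError("depths must not be empty")
    return sum(1 for d in depths if d >= min_depth) / len(depths)


def assembly_passes(depths, min_depth=1, min_fraction=0.90):
    """Apply the footnote's criterion: enough of the ORF covered at the threshold."""
    return fraction_covered(depths, min_depth) >= min_fraction


# Toy per-base depths over a 20-base region (hypothetical, not from the table).
well_covered = [0, 3, 5, 5, 4, 2, 1, 1, 6, 7, 8, 8, 9, 4, 3, 2, 2, 1, 2, 5]
sparse = [0, 0, 0, 1, 0, 2, 0, 0, 1, 0, 0, 0, 3, 0, 0, 0, 1, 0, 0, 0]

ok = assembly_passes(well_covered)   # 19 of 20 positions covered
fail = assembly_passes(sparse)       # 5 of 20 positions covered
```

Depths would be computed from unique, non-duplicated reads, as the footnote specifies.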
Figure 4. Hybrid selection of LASV. Frequencies of intra-host variants (iSNVs) observed in (A) human (G733) and (B) rodent (LM032) in standard and hybrid selected libraries. Data fit to a linear regression with y-axis intercepts set at 0. r: Pearson correlation value.
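A regression with the intercept fixed at 0 and a Pearson correlation, as described in this caption, can be computed as in the sketch below. The function names and the iSNV frequencies are hypothetical and chosen only for illustration; the formulas are the standard least-squares slope through the origin and the standard Pearson r.

```python
import math


def slope_through_origin(xs, ys):
    """Least-squares slope of y = b * x (intercept fixed at 0)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical iSNV frequencies for the same variants in two libraries.
standard = [0.05, 0.10, 0.20, 0.40]
hybrid = [0.06, 0.09, 0.21, 0.41]

slope = slope_through_origin(standard, hybrid)
r = pearson_r(standard, hybrid)
```

A slope near 1 together with r near 1 would indicate that hybrid selection preserves the variant frequencies seen in the standard library.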
Figure 5. Depletion of rRNA from EBOV-Sierra Leone clinical samples. (A) Percentage rRNA (left) and unique EBOV content (right) with (gray) and without (blue) rRNA depletion in four individual clinical serum isolates (G3676-2, G3677-1, G3677-2, G3682-1). (B) Average EBOV genome coverage with (gray) and without (blue) rRNA depletion from four individual isolates with standard deviation (black). N, VP35, VP40, GP, VP30, VP24, L: boundary for each gene in the EBOV genome. Positions and variant allele of two iSNVs (in G3676-2 only) observed after rRNA depletion are depicted.