Chengran Zhou, Shanlin Liu, Wenhui Song, Shiqi Luo, Guanliang Meng, Chentao Yang, Hua Yang, Jinmin Ma, Liang Wang, Shan Gao, Jian Wang, Huanming Yang, Yun Zhao, Hui Wang, Xin Zhou.
Abstract
RNA alternative splicing (AS) is an important post-transcriptional mechanism enabling single genes to produce multiple proteins. It has been well demonstrated that viruses deploy host AS machinery for viral protein production. However, knowledge of viral AS is limited to a few disease-causing viruses in model species. Here we report a novel approach to characterizing viral AS using whole-transcriptome datasets from host species. Two insect transcriptomes (Acheta domesticus and Planococcus citri) generated in the 1,000 Insect Transcriptome Evolution (1KITE) project were used as a proof of concept for the new pipeline. Two closely related densoviruses (Acheta domesticus densovirus, AdDNV, and Planococcus citri densovirus, PcDNV; Ambidensovirus, Densovirinae, Parvoviridae) were detected and analyzed for AS patterns. The results suggested that although the two viruses shared major AS features, dramatic AS divergences were observed. Detailed analysis of the splicing junctions showed that clusters of AS events occurred in two regions of the virus genome, demonstrating that transcriptome analysis could yield valuable insights into viral splicing. When applied to large-scale transcriptomics projects with diverse taxonomic sampling, our new method is expected to rapidly expand our knowledge of RNA splicing mechanisms for a wide range of viruses.
Year: 2018 PMID: 29459752 PMCID: PMC5818608 DOI: 10.1038/s41598-018-21190-7
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Analysis framework. (A) Analysis framework; (B) Detailed analytical pipeline. Virus detection and viral expression analyses: this pipeline was designed to detect and obtain viral sequences from transcriptome datasets; all in-house Perl scripts used in the pipeline are available online (https://github.com/linzhi2013/Virusfishing).
Figure 2. Genome coverage and annotations of AdDNV and PcDNV. Genome coverage of (A) AdDNV and (B) PcDNV. Log2 scale of read density was based on genomic sequences of AdDNV and PcDNV. Vertical bars highlight mutation sites against the reference sequences. Annotations of (C) AdDNV and (D) PcDNV. Coverage (Y-axis) of each nucleotide position (X-axis) was plotted for AdDNV_1KITE and PcDNV_1KITE. Six reading frames and previously described genes were represented using information provided by NCBI, including: start/stop codons (short blue/red vertical bars), transcription directions (black arrows, from top to bottom: forward reading frames +1, +2, +3 and reverse reading frames −1, −2, −3), ORFs (solid gray boxes). Virus genes (blue boxes), proteins (red boxes) and conserved motifs (black boxes) were represented according to the NCBI annotations. BWA mapping profiles (green lines), TopHat2 mapping profiles (purple lines) and TopHat2 gap mapping profiles (yellow lines; the numbers of both splicing and non-splicing reads, corresponding to splicing junctions) were represented according to the mapping results. AdDNV introns reported in existing studies include: In (nt 223 to 855), Ia (nt 4403 to 4758), Ib (nt 4403 to 4544) and II (nt 4260 to 4434).
Detected introns of AdDNV and PcDNV.
| Species | ID | Supporting reads | Location | Direction | Length (nt) | Intron type | Note |
|---|---|---|---|---|---|---|---|
| AdDNV | AdDNV_I1 | 88 | 223..855 | + | 633 | GT-AG | A5SS, RI |
| AdDNV | AdDNV_I2 | 3 | 431..855 | + | 425 | GT-AG | A5SS, RI |
| AdDNV | AdDNV_I3 | 7 | 4245..4533 | − | 289 | GT-AG | A3SS |
| AdDNV | AdDNV_I4 | 135 | 4260..4434 | − | 175 | GT-AG | RI |
| AdDNV | AdDNV_I5 | 18 | 4403..4533 | − | 131 | GT-AG | A3SS |
| PcDNV | PcDNV_I1 | 43 | 217..879 | + | 663 | GT-AG | A5SS, RI |
| PcDNV | PcDNV_I2 | 275 | 221..879 | + | 659 | GT-AG | A5SS, RI |
| PcDNV | PcDNV_I3 | 1 | 287..879 | + | 593 | GT-AG | A5SS, RI |
| PcDNV | PcDNV_I4 | 20 | 304..879 | + | 576 | GT-AG | A5SS, RI |
| PcDNV | PcDNV_I5 | 10 | 689..879 | + | 191 | GT-AG | A5SS, RI |
| PcDNV | PcDNV_I6 | 3 | 710..879 | + | 170 | GT-AG | A5SS, RI |
| PcDNV | PcDNV_I7 | 4 | 770..879 | + | 110 | GT-AG | A5SS, RI |
| PcDNV | PcDNV_I8 | 1 | 1188..1299 | + | 112 | GT-AG | RI |
| PcDNV | PcDNV_I9 | 3 | 2721..2820 | − | 100 | GT-AG | A3SS, RI |
| PcDNV | PcDNV_I10 | 2 | 2740..2820 | − | 81 | GT-AG | A3SS, RI |
| PcDNV | PcDNV_I11 | 1 | 3721..3906 | − | 186 | GT-AG | A3SS, RI |
| PcDNV | PcDNV_I12 | 291 | 3824..3897 | − | 74 | GT-AG | A5SS, RI |
| PcDNV | PcDNV_I13 | 44 | 3824..3906 | − | 83 | GT-AG | A5SS, A3SS, RI |
| PcDNV | PcDNV_I14 | 3 | 4198..4480 | − | 283 | GT-AG | A3SS |
| PcDNV | PcDNV_I15 | 1 | 4249..4340 | − | 92 | GT-AG | A5SS, RI |
| PcDNV | PcDNV_I16 | 1994 | 4249..4423 | − | 175 | GT-AG | A5SS, A3SS |
| PcDNV | PcDNV_I17 | 7 | 4249..4480 | − | 232 | GT-AG | A5SS, A3SS, SE |
| PcDNV | PcDNV_I18 | 3 | 4281..4423 | − | 143 | GT-AG | A3SS |
| PcDNV | PcDNV_I19 | 1 | 4341..4423 | − | 83 | GT-AG | A3SS |
| PcDNV | PcDNV_I20 | 429 | 4403..4480 | − | 78 | GT-AG | A3SS, RI, SE |
| PcDNV | PcDNV_I21 | 2 | 4775..4852 | − | 78 | GT-AG | A5SS, RI |
| PcDNV | PcDNV_I22 | 6 | 4775..4898 | − | 124 | GT-AG | A5SS, RI |
| PcDNV | PcDNV_I23 | 1699 | 4775..4958 | − | 184 | GT-AG | A5SS, RI |
| PcDNV | PcDNV_MI1 | 3 | 4249..4340;4403..4480 | − | 92;78 | GT-AG;GT-AG | SE |
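
The length values in the intron table follow from its location coordinates when these are read as 1-based and inclusive (end − start + 1). A minimal Python sketch of that arithmetic, usable as a consistency check on the table; the function name `intron_lengths` is illustrative and is not part of the authors' pipeline:

```python
def intron_lengths(location: str) -> list[int]:
    """Return the length of each intron named in a location string.

    Coordinates are treated as 1-based and inclusive, so an intron spanning
    start..end covers end - start + 1 bases. Multi-intron events list their
    introns separated by ';' (as in the PcDNV_MI1 row).
    """
    lengths = []
    for span in location.split(";"):
        start, end = (int(x) for x in span.strip().split(".."))
        if end < start:
            raise ValueError(f"end before start in {span!r}")
        lengths.append(end - start + 1)
    return lengths


print(intron_lengths("223..855"))               # AdDNV_I1 -> [633]
print(intron_lengths("4249..4340;4403..4480"))  # PcDNV_MI1 -> [92, 78]
```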
Figure 3. Splicing profiles of AdDNV and PcDNV. (A) Detected splicing junctions of AdDNV_1KITE. (B) Detected splicing junction models of PcDNV_1KITE: Solid gray areas represented the TopHat2 mapping profiles and each color-coded block represented a splicing junction. Red and purple blocks were forward and reverse junctions, respectively. The edge of each block represented the coverage of supporting reads and the length of a block represented the location of a splicing event. The number near each block was the coverage of supporting reads. The middle bridge showed the intron region from the splicing event. The block thickness represented the frequency (the number of supporting reads) of the intron. Splice site compositions for donor sites, branch sites and acceptor sites of all GT-AG type introns in AdDNV_1KITE (C) and PcDNV_1KITE (D) were displayed using WebLogo. The overall height of each stack indicated the sequence conservation at that position, measured in bits. Proteins mediating the GT-AG splicing were labelled as snRNP (small nuclear ribonucleoproteins) and SR (splicing regulatory proteins). (E) Log2 scale of read density of introns in the genome alignment: The Y-axis showed the expression levels (the number of reads) of intron-related splicing events. Introns with forward junctions (red labels, at NS region) and reverse junctions (blue labels, at VP region) of PcDNV_1KITE (top half) and AdDNV_1KITE (bottom half) were shown in the genome alignment. Multiple splicing events (orange labels) were also displayed.
Viral gene products and their expression levels.
| Species | Name | Regions | Splicing sites involved | Nucleotide length (nt) | Effective length | FPKM (RSEM) | Relative expression level (%) | Product characteristics | NR best hit overview | Putative gene products |
|---|---|---|---|---|---|---|---|---|---|---|
| AdDNV | AdDNV_NS_ORF1 | 225..866 | none | 642 | 403 | 18362.89 | 5.59 | Known | NS3 (AdDNV) | nonstructural protein NS3 |
| AdDNV | AdDNV_NS_ORF1_I2 | join(225..430, 856) | AdDNV_I2 with depth 3 | 207 | 1 | 0 | 0.00 | Truncation (C-terminal) | NS3 (AdDNV) | nonstructural protein |
| AdDNV | AdDNV_NS_ORF2 | 856..2586 | none | 1731 | 1492 | 328484.48 | 100.00 | Known | NS1 (AdDNV) | nonstructural protein NS1 with rolling-circle replication motif, Walker/NTPase motif and Parvo_NS1 region |
| AdDNV | AdDNV_NS_ORF3 | 875..1735 | none | 861 | 622 | 0 | 0.00 | Known | NS2 (AdDNV) | nonstructural protein NS2 |
| AdDNV | AdDNV_VP_ORF4 | c(2605..4398) | none | 1794 | 1555 | 65660.96 | 19.99 | Known | NS2 (AdDNV) | structural protein with Denso_VP4 region |
| AdDNV | AdDNV_VP_ORF5 | c(4424..5230) | none | 807 | 568 | 28846.15 | 8.78 | Known | putative structural protein (AdDNV_gp5) | structural protein 2 with Parvo_coat_N and PLA2 motif regions |
| AdDNV | AdDNV_VP_ORF5_I5 | c(join(4398..4402, 4534..5230)) | AdDNV_I5 with depth 18 | 702 | 463 | 8673.58 | 2.64 | Truncation (C-terminal); Non-synonymous mutation (G233V) | putative structural protein (AdDNV_gp5) | structural protein with Parvo_coat_N and PLA2 motif regions |
| AdDNV | AdDNV_VP_ORF6_I4 | c(join(2605..4259, 4435..5230)) | AdDNV_I4 with depth 135 | 2451 | 2212 | 164855.44 | 50.19 | Known; ORF shift (C-terminal); Non-synonymous mutation (E266Q) | structural protein VP1 (AdDNV) | structural protein VP1 with PLA2 motif, Parvo_coat_N and Denso_VP4 regions |
| AdDNV | AdDNV_VP_ORF6_I3 | c(join(2605..4244, 4534..5230)) | AdDNV_I3 with depth 7 | 2337 | 2098 | 7443.55 | 2.27 | Deletion | structural protein VP1 (AdDNV) | structural protein with PLA2 motif, Parvo_coat_N and Denso_VP4 regions |
| PcDNV | PcDNV_NS_ORF1 | 160..873 | none | 714 | 469 | 17344.87 | 6.42 | Known | NS3 (PcDNV) | nonstructural protein NS3 |
| PcDNV | PcDNV_NS_ORF2 | 810..2516 | none | 1707 | 1462 | 15339.25 | 5.68 | Known | NS1 (PcDNV) | nonstructural protein NS1 with Parvo_NS1 region |
| PcDNV | PcDNV_NS_ORF2_I8 | join(810..1187, 1300..1701) | PcDNV_I8 with depth 1 | 780 | 535 | 59.63 | 0.02 | ORF shift (C-terminal) | NS1 (PcDNV) | nonstructural protein |
| PcDNV | PcDNV_NS_ORF3 | 880..1701 | none | 822 | 577 | 0 | 0.00 | Novel | Hypothetical protein MPH 12776 | nonstructural protein NS2 |
| PcDNV | PcDNV_NS_ORF6_I1 | join(160..216, 880..1701) | PcDNV_I1 with depth 43 | 879 | 634 | 18452.28 | 6.83 | ORF shift (C-terminal) | putative nonstructural protein (PcDNV, PcdVgp4) | nonstructural protein |
| PcDNV | PcDNV_NS_ORF6_I4 | join(160..303, 880..1701) | PcDNV_I4 with depth 20 | 966 | 721 | 814.32 | 0.30 | ORF shift (C-terminal) | putative nonstructural protein (PcDNV, PcdVgp4) | nonstructural protein |
| PcDNV | PcDNV_NS_ORF7_I2 | join(160..220, 880..2516) | PcDNV_I2 with depth 275 | 1698 | 1453 | 182580.84 | 67.63 | Combination | NS1 (PcDNV) | nonstructural protein with Parvo_NS1 region |
| PcDNV | PcDNV_NS_ORF7_I3 | join(160..286, 880..2516) | PcDNV_I3 with depth 1 | 1764 | 1519 | 108.68 | 0.04 | Combination | NS1 (PcDNV) | nonstructural protein with Parvo_NS1 region |
| PcDNV | PcDNV_NS_ORF7_I5 | join(160..688, 880..2516) | PcDNV_I5 with depth 10 | 2166 | 1921 | 729.65 | 0.27 | Combination | NS1 (PcDNV) | nonstructural protein with Parvo_NS1 region |
| PcDNV | PcDNV_NS_ORF7_I6 | join(160..709, 880..2516) | PcDNV_I6 with depth 3 | 2187 | 1942 | 218.03 | 0.08 | Combination | NS1 (PcDNV) | nonstructural protein with Parvo_NS1 region |
| PcDNV | PcDNV_NS_ORF7_I7 | join(160..769, 880..2516) | PcDNV_I7 with depth 4 | 2247 | 2002 | 177.05 | 0.07 | Combination; Mutation (D204Y) | NS1 (PcDNV) | nonstructural protein with Parvo_NS1 region |
| PcDNV | PcDNV_VP_ORF4 | c(2531..4402) | none | 1872 | 1627 | 107736.5 | 39.91 | Known | putative structural protein (PcDNV, PcdVgp2) | structural protein with Denso_VP4 region |
| PcDNV | PcDNV_VP_ORF4_I9 | c(join(2602..2720, 2821..4402)) | PcDNV_I9 with depth 3 | 1701 | 1456 | 212.52 | 0.08 | ORF shift (C-terminal) | putative structural protein (PcDNV, PcdVgp2) | structural protein |
| PcDNV | PcDNV_VP_ORF4_I10 | c(join(2531..2739, 2821..4402)) | PcDNV_I10 with depth 2 | 1791 | 1546 | 0 | 0.00 | Deletion | putative structural protein (PcDNV, PcdVgp2) | structural protein |
| PcDNV | PcDNV_VP_ORF4_I11 | c(join(2531..3720, 3907..4402)) | PcDNV_I11 with depth 1 | 1686 | 1441 | 3190.9 | 1.18 | Deletion | putative structural protein (PcDNV, PcdVgp2) | structural protein |
| PcDNV | PcDNV_VP_ORF4_I12 | c(join(3789..3823, 3898..4402)) | PcDNV_I12 with depth 291 | 540 | 295 | 319.37 | 0.12 | ORF shift (C-terminal) | putative structural protein (PcDNV, PcdVgp2) | structural protein |
| PcDNV | PcDNV_VP_ORF4_I13 | c(join(3789..3823, 3907..4402)) | PcDNV_I13 with depth 44 | 531 | 286 | 210.38 | 0.08 | ORF shift (C-terminal) | putative structural protein (PcDNV, PcdVgp2) | structural protein |
| PcDNV | PcDNV_VP_ORF4_I15 | c(join(4206..4248, 4341..4402)) | PcDNV_I15 with depth 1 | 105 | 0 | 0 | 0.00 | ORF shift (C-terminal) | putative structural protein (PcDNV, PcdVgp2) | structural protein |
| PcDNV | PcDNV_VP_ORF5 | c(4392..5222) | none | 831 | 586 | 0 | 0.00 | Known | putative structural protein (PcDNV, PcdVgp1) | structural protein with PLA2 motif and Parvo_coat_N regions |
| PcDNV | PcDNV_VP_ORF5_I19 | c(join(4336..4340, 4424..5222)) | PcDNV_I19 with depth 1 | 804 | 559 | 0 | 0.00 | Truncation (C-terminal) | putative structural protein (PcDNV, PcdVgp1) | structural protein with PLA2 motif and Parvo_coat_N regions |
| PcDNV | PcDNV_VP_ORF5_I20(MI1) | c(join(4392..4402, 4481..5222)) | PcDNV_I20 with depth 429; or PcDNV_MI1 with depth 3 | 753 | 508 | 1659.65 | 0.61 | Deletion | putative structural protein (PcDNV, PcdVgp1) | structural protein with PLA2 motif and Parvo_coat_N regions |
| PcDNV | PcDNV_VP_ORF5_I21 | c(join(4392..4774, 4853..5222)) | PcDNV_I21 with depth 2 | 753 | 508 | 119.93 | 0.04 | Deletion | putative structural protein (PcDNV, PcdVgp1) | structural protein with PLA2 motif and Parvo_coat_N regions |
| PcDNV | PcDNV_VP_ORF8_I14 | c(join(2531..4197, 4481..5222)) | PcDNV_I14 with depth 3 | 2409 | 2164 | 614.74 | 0.23 | Combination | putative structural protein (PcDNV, PcdVgp2) | structural protein with PLA2 motif, Parvo_coat_N and Denso_VP4 regions |
| PcDNV | PcDNV_VP_ORF8_I16 | c(join(2531..4248, 4424..5222)) | PcDNV_I16 with depth 1994 | 2517 | 2272 | 185726.27 | 68.79 | Combination; Mutation (E267K) | putative structural protein (PcDNV, PcdVgp2) | structural protein with PLA2 motif, Parvo_coat_N and Denso_VP4 regions |
| PcDNV | PcDNV_VP_ORF8_I17 | c(join(2531..4248, 4481..5222)) | PcDNV_I17 with depth 7 | 2460 | 2215 | 166.07 | 0.06 | Combination | putative structural protein (PcDNV, PcdVgp2) | structural protein with PLA2 motif, Parvo_coat_N and Denso_VP4 regions |
| PcDNV | PcDNV_VP_ORF9_I18 | c(join(4177..4280, 4424..5222)) | PcDNV_I18 with depth 3 | 903 | 658 | 103.55 | 0.04 | ORF shift (C-terminal) | putative structural protein (PcDNV, PcdVgp1) | structural protein with PLA2 motif and Parvo_coat_N regions |
| PcDNV | PcDNV_VP_ORF10_I22 | c(join(4481..4774, 4899..5222)) | PcDNV_I22 with depth 6 | 618 | 373 | 203.45 | 0.08 | ORF shift (C-terminal) | putative structural protein (PcDNV, PcdVgp1) | structural protein |
| PcDNV | PcDNV_VP_ORF10_I23 | c(join(4481..4774, 4959..5222)) | PcDNV_I23 with depth 1699 | 558 | 313 | 269981.32 | 100.00 | ORF shift (C-terminal) | putative structural protein (PcDNV, PcdVgp1) | structural protein |
Note:
c: abbreviation of complement.
ORF shift: the open reading frame had a novel reading frame pattern produced by splicing, which was different from previously reported genes.
Relative expression level: the FPKM value of one gene divided by the FPKM value of the highest expressed gene of the same virus.
Parvo_NS1 region: AdDNV_1KITE: nt 2119 to 2433, reading frame +1; PcDNV_1KITE: nt 2034 to 2381, reading frame +3.
Denso_VP4 region: AdDNV_1KITE: nt 3882 to 2674, reading frame −2; PcDNV_1KITE: nt 3952 to 2672, reading frame −1.
Parvo_coat_N region: AdDNV_1KITE: nt 4750 to 4613, reading frame −1; PcDNV_1KITE: nt 4748 to 4650, reading frame −3.
PLA2 motif region: AdDNV_1KITE: nt 4684 to 4649, reading frame −1; PcDNV_1KITE: nt 4682 to 4647, reading frame −3.
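
The relative expression level defined in the note above (the FPKM of a gene divided by the FPKM of the most highly expressed gene of the same virus) can be sketched in a few lines of Python. The function name and dictionary layout are assumptions made for illustration; the FPKM values are copied from the AdDNV rows of the gene products table.

```python
def relative_expression(fpkm_by_gene: dict[str, float]) -> dict[str, float]:
    """Express each gene's FPKM as a percentage of the highest FPKM.

    The input should hold genes of ONE virus only, because the note defines
    the reference as the highest-expressed gene of the same virus.
    """
    top = max(fpkm_by_gene.values())
    if top <= 0:
        raise ValueError("no expressed gene to normalise against")
    return {gene: round(100.0 * fpkm / top, 2) for gene, fpkm in fpkm_by_gene.items()}


# FPKM (RSEM) values for a few AdDNV gene products, taken from the table.
addnv_fpkm = {
    "AdDNV_NS_ORF1": 18362.89,
    "AdDNV_NS_ORF2": 328484.48,
    "AdDNV_VP_ORF4": 65660.96,
    "AdDNV_VP_ORF6_I4": 164855.44,
}

for gene, percent in relative_expression(addnv_fpkm).items():
    print(f"{gene}\t{percent:.2f}")
```

Run on these four rows, the sketch gives 5.59, 100.00, 19.99 and 50.19, matching the percentages listed in the table.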
Figure 4. Inferred viral gene products. Viral gene products were annotated according to viral genome positions. The NS genes were represented in forward direction (Panels A and C) and the VP genes were represented in reverse direction (Panels B and D). For the splicing products, numbers of the detected splicing reads/non-splicing reads (over the intron) were listed next to the gene ID (covering the donor and receptor junctions). Numbers of the non-splicing reads of the donor (d) and receptor (r) sites were also labeled. Positions of start codons, stop codons and amino acids at splicing junctions were shown in the reading frames of forward (NS) and reverse (VP) polarities.
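
The genome positions of the gene products are written in the gene products table as location strings such as `join(225..430, 856)` or `c(join(2605..4259, 4435..5230))`, where `c` abbreviates complement. As a rough illustration, the Python sketch below parses that notation and sums the exon segments to recover the nucleotide length. The helper names are hypothetical and the parser handles only the forms that appear in the table; it is not the authors' code.

```python
import re


def parse_location(location: str) -> tuple[bool, list[tuple[int, int]]]:
    """Split a location string into (is_complement, [(start, end), ...])."""
    text = location.replace(" ", "")
    is_complement = text.startswith("c(") and text.endswith(")")
    if is_complement:
        text = text[2:-1]  # drop the c( ... ) wrapper
    if text.startswith("join(") and text.endswith(")"):
        text = text[5:-1]  # drop the join( ... ) wrapper
    segments = []
    for part in text.split(","):
        match = re.fullmatch(r"(\d+)(?:\.\.(\d+))?", part)
        if match is None:
            raise ValueError(f"unrecognised segment: {part!r}")
        start = int(match.group(1))
        # A bare position such as "856" is a one-base segment.
        end = int(match.group(2)) if match.group(2) else start
        segments.append((start, end))
    return is_complement, segments


def spliced_length(location: str) -> int:
    """Total length of all segments, using 1-based inclusive coordinates."""
    _, segments = parse_location(location)
    return sum(end - start + 1 for start, end in segments)


print(spliced_length("join(225..430, 856)"))              # AdDNV_NS_ORF1_I2 -> 207
print(spliced_length("c(join(2605..4259, 4435..5230))"))  # AdDNV_VP_ORF6_I4 -> 2451
```

Both results agree with the nucleotide lengths listed for those products, which makes this a quick way to cross-check region coordinates against reported lengths.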
RT-PCR summary.
| Number | Name | Length (nt) | Designed PCR product length (bp) | PCR gel results | Splicing detected by Sanger | Detected junctions | Primers |
|---|---|---|---|---|---|---|---|
| 1 | AdDNV_NS_ORF1 | 642 | 631 | 600–700 bp | √ | | AdDNV_ORF1_F1, _R1 |
| 2 | AdDNV_NS_ORF1_I2 | 207 | 207 | near 200 bp | √ | I2 | AdDNV_ORF1_F1, AdDNV_ORF1_I2_R1 |
| 3 | AdDNV_NS_ORF2 | 1731 | 1704 | near 2 kb | √ | | AdDNV_ORF2_F1, _R1 |
| 4 | AdDNV_NS_ORF3 | 861 | 842 | 700 bp–1 kb | √ | | AdDNV_ORF3_F1, _R1 |
| 5 | AdDNV_VP_ORF4 | 1794 | 1794 | near 2 kb | √ | | AdDNV_ORF4_F1, _R1 |
| 6 | AdDNV_VP_ORF5 | 807 | 807 | 700 bp–1 kb | √ | | AdDNV_ORF5_F1, _R1 |
| 7 | AdDNV_VP_ORF5_I5 | 702 | 702 | 600 bp–1 kb | √ | I5 | AdDNV_ORF5_F1, AdDNV_ORF5_I5_R1 |
| 8 | AdDNV_VP_ORF6_I3 | 2451 | 2337 | 2–3 kb | not detected | I4 | AdDNV_ORF6_F1, AdDNV_ORF4_R1 |
| 8 | AdDNV_VP_ORF6_I4 | 2337 | 2337 | 2–3 kb | √ | I4 | AdDNV_ORF6_F1, AdDNV_ORF4_R1 |
| 9 | PcDNV_NS_ORF6_I1 | 879 | 966 | near 1 kb | √ | I1 | PcDNV_ORF6_F1, _R1 |
| 9 | PcDNV_NS_ORF6_I4 | 966 | 966 | near 1 kb | not detected | I1 | PcDNV_ORF6_F1, _R1 |
| 10 | PcDNV_NS_ORF7_I2 | 1698 | 1660 | near 2 kb | √ | I2 | PcDNV_ORF7_F1, _R1 |
| 10 | PcDNV_NS_ORF7_I3 | 1764 | 1660 | near 2 kb | not detected | I2 | PcDNV_ORF7_F1, _R1 |
| 10 | PcDNV_NS_ORF7_I5 | 2166 | 1660 | near 2 kb | not detected | I2 | PcDNV_ORF7_F1, _R1 |
| 10 | PcDNV_NS_ORF7_I6 | 2187 | 1660 | near 2 kb | not detected | I2 | PcDNV_ORF7_F1, _R1 |
| 10 | PcDNV_NS_ORF7_I7 | 2247 | 1660 | near 2 kb | not detected | I2 | PcDNV_ORF7_F1, _R1 |
| 11 | PcDNV_VP_ORF4 | 1872 | 1872 | near 2 kb | √ | | PcDNV_ORF4_F1, _R1 |
| 11 | PcDNV_VP_ORF4_I10 | 1791 | 1872 | near 2 kb | not detected | | PcDNV_ORF4_F1, _R1 |
| 11 | PcDNV_VP_ORF4_I11 | 1686 | 1872 | near 2 kb | not detected | | PcDNV_ORF4_F1, _R1 |
| 12 | PcDNV_VP_ORF8_I14 | 2409 | 2517 | 2–3 kb | not detected | I16 | PcDNV_ORF8_F1, PcDNV_ORF4_R1 |
| 12 | PcDNV_VP_ORF8_I16 | 2517 | 2517 | 2–3 kb | √ | I16 | PcDNV_ORF8_F1, PcDNV_ORF4_R1 |
| 12 | PcDNV_VP_ORF8_I17 | 2460 | 2517 | 2–3 kb | not detected | I16 | PcDNV_ORF8_F1, PcDNV_ORF4_R1 |
| 13 | PcDNV_VP_ORF10_I22 | 618 | 558 | 500–900 bp | not detected | I23 | PcDNV_ORF10_F1, _R1 |
| 13 | PcDNV_VP_ORF10_I23 | 558 | 558 | 500–900 bp | √ | I23 | PcDNV_ORF10_F1, _R1 |