Fang Wang1, Yu Sun2, Jishou Ruan3, Rui Chen4, Xin Chen2, Chengjie Chen5, Jan F Kreuze6, ZhangJun Fei7, Xiao Zhu8, Shan Gao2.
Abstract
Small RNA sequencing (sRNA-seq) can be used to detect viruses in infected hosts without prior knowledge of the virus or specialized sample preparation. The sRNA-seq method was first used for viral detection and identification in plants, and later in invertebrates and fungi. However, its use for detecting mammalian or human viruses remains controversial. In this study, we used 931 sRNA-seq runs from the NCBI SRA database to detect and identify viruses in human cells or tissues, particularly in some clinical samples. Six viruses (HPV-18, HBV, HCV, HIV-1, SMRV, and EBV) were detected in 36 runs. Four of these viruses were consistent with the annotations from the previous studies. HIV-1 was found in clinical samples that had not been reported as HIV-positive, and SMRV was found in Diffuse Large B-Cell Lymphoma cells for the first time. In conclusion, these results suggest that sRNA-seq can be used to detect viruses in mammals and humans.
Year: 2016 PMID: 27066498 PMCID: PMC4811048 DOI: 10.1155/2016/2596782
Source DB: PubMed Journal: Biomed Res Int Impact factor: 3.411
The 42 previous studies from the SRA database.
| Study ID | Runs | Sample source | Disease |
|---|---|---|---|
| DRP000998 | 3 | Whole saliva, salivary exosome | Healthy |
| ERP001908 | 63 | Tongue, laryngopharynx, oropharynx | HNSCC |
| ERP004592 | 23 | Prefrontal cortex | Huntington's disease |
| SRP001381 | 3 | HeLa cell line | HPV18(+) |
| SRP002118 | 14 | Hek293T cell line | NA |
| SRP002272 | 15 | Liver | HBV(+), HCV(+), HCC |
| SRP002326 | 38 | Cervical tumor | Cervical cancer |
| SRP002402 | 3 | Sperm | Healthy |
| SRP007825 | 67 | Skin | Psoriasis |
| SRP008258 | 2 | Hek293, HeLa cell line | NA |
| SRP009246 | 4 | Primary human fibroblast | NA |
| SRP014020 | 20 | Thyroid tumor | Follicular thyroid adenoma |
| SRP017809 | 4 | Dorsolateral prefrontal cortex | Healthy |
| SRP017979 | 4 | Colorectal tumor | Colorectal cancer |
| SRP018255 | 35 | Plasma, serum, placenta | Healthy |
| SRP021130 | 20 | Cerebral cortex | FTLD, PSP, BHS, DLB, Alzheimer's disease |
| SRP021193 | 40 | Heart | NIC, IC |
| SRP021911 | 12 | Cumulus granulosa cell, mural granulosa cell | NA |
| SRP021924 | 5 | Brain frontal cortex | NA |
| SRP022043 | 70 | Blood | Alzheimer's disease |
| SRP022054 | 26 | Sigma, liver, coecum, colon ascendens, lymph node | Colorectal cancer |
| SRP026081 | 2 | Penicillium marneffei | NA |
| SRP026558 | 2 | PBMC | Osteopetrosis |
| SRP026562 | 11 | Prefrontal cortex | Alzheimer's disease |
| SRP027589 | 42 | Serum | Breast cancer |
| SRP028291 | 78 | ACA, ACC tumor, adrenal tissue | ACA, ACC |
| SRP028738 | 16 | MiRQC, serum, liver | NA |
| SRP029599 | 9 | FFPE, serum | Nonkeratinizing NPC, NPC |
| SRP032650 | 4 | Serum | Latent PTB, PTB |
| SRP032953 | 12 | Alpha cell, beta cell, whole islet | Type 2 diabetes mellitus |
| SRP033505 | 3 | Plasma | Healthy |
| SRP033566 | 185 | Connective tissue, plasma, neuronal tissue, primary cell, cardiac muscle, epithelium, skeletal muscle | DCM, IC |
| SRP034547 | 4 | Primary fibroblast | Microcephaly |
| SRP034586 | 24 | Serum, PBMC | Healthy |
| SRP034590 | 14 | Plasma | NA |
| SRP034654 | 12 | Tensor fascia lata, quadricep vastus, vastus externe, rhomboid, iliopsoas | FSHD |
| SRP034698 | 8 | Skin, lymph node | MCC, SCC, melanoma, BCC |
| SRP040421 | 12 | Exosome in human semen | Healthy |
| SRP041082 | 2 | Seminal fluid | Prostate cancer |
| SRP046046 | 12 | Lymphoblastoid | DLBCL, Burkitt's lymphoma, EBV(+) |
| SRP046234 | 2 | Breast epithelium | Triple negative breast cancer |
| SRP048290 | 6 | Platelet | Healthy |
“Study ID” is unique to each high-throughput project in the NCBI SRA database. ACA: adrenal cortical adenoma, ACC: adrenal cortical carcinoma, BCC: Basal Cell Carcinoma, BHS: bilateral hippocampal sclerosis, DCM: Dilated Cardiomyopathy, DLB: dementia with Lewy bodies, DLBCL: Diffuse Large B-Cell Lymphoma, FSHD: Facioscapulohumeral Muscular Dystrophy, FTLD: frontotemporal lobar dementia, HCC: HBV-related hepatocellular carcinoma, HNSCC: Head and Neck Squamous Cell Carcinoma, IC: Ischemic Cardiomyopathy, MCC: Merkel Cell Carcinoma, NIC: Nonischemic Cardiomyopathy, NPC: nasopharyngeal carcinoma, PBMC: Peripheral Blood Mononuclear Cell, PSP: Progressive Supranuclear Palsy, PTB: Pulmonary Tuberculosis, and SCC: Squamous Cell Carcinoma.
HBV and HCV from the SRP002272 study.
| Run ID | Sample source | Reference | Cov (%) | Depth |
|---|---|---|---|---|
| SRR039611 | Human Normal Liver Tissue | NA | NA | NA |
| SRR039612 | Human Normal Liver Tissue | NA | NA | NA |
| SRR039613 | Human Normal Liver Tissue | NA | NA | NA |
| SRR039614 | HBV-Infected Liver Tissue | JQ688405 | 423 (13.2) | 3.0 |
| SRR039615 | Severe Chronic Hepatitis B Liver Tissue | NA | NA | NA |
| SRR039616 | HBV(+) Distal Tissue | NA | NA | NA |
| SRR039617 | HBV(+) Adjacent Tissue | NA | NA | NA |
| SRR039618 | HBV(+) Side Tissue | NA | NA | NA |
| SRR039619 | HBV(+) HCC Tissue | NA | NA | NA |
| SRR039620 | HBV(+) Adjacent Tissue | JQ688404 | 1756 (54.6) | 6.0 |
| SRR039621 | HBV(+) HCC Tissue | GQ475344 | 321 (10) | 1.5 |
| SRR039622 | HCV(+) Adjacent Tissue | D85516 | 1032 (10.8) | 1.8 |
| SRR039623 | HCV(+) HCC Tissue | GU133617 | 805 (8.3) | 8.0 |
| SRR039624 | HBV(−) HCV(−) Adjacent Tissue | NA | NA | NA |
| SRR039625 | HBV(−) HCV(−) HCC Tissue | NA | NA | NA |
“Run ID” is unique to each high-throughput fastq file in the NCBI SRA database. “Reference” is the NCBI GenBank accession number of the reference genome. “Cov (%)” gives the genome coverage as the number of covered bases, with the percentage of the reference genome in parentheses; “Depth” is the average depth. “Side Tissue” is tissue close to the border between the tumor tissues and the normal tissues, 0–2 cm from the tumor tissues. “Adjacent Tissue” is normal tissue 2–5 cm from the tumor tissues. “Distal Tissue” is normal tissue at least 10 cm from the tumor tissues. “SRR039619” should have contained HBV, but it was not found by our pipeline.
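To make the “Cov (%)” and “Depth” columns concrete, the following is a minimal sketch of how such values could be computed from read alignments. It is not the authors' pipeline: the function name `coverage_and_depth`, the 1-based inclusive interval input, and the choice to average depth over covered positions only (rather than over the whole genome) are assumptions made for illustration.

```python
from typing import Iterable, Tuple


def coverage_and_depth(
    alignments: Iterable[Tuple[int, int]], genome_length: int
) -> Tuple[int, float, float]:
    """Return (covered bases, percent of genome covered, average depth).

    `alignments` holds (start, end) positions of aligned reads on the
    reference genome, 1-based and inclusive (an assumed input format).
    Average depth is taken over covered positions only, which is an
    assumption; the source does not state which definition was used.
    """
    depth = [0] * genome_length
    for start, end in alignments:
        # Clip each alignment to the reference, then count every base it spans.
        for pos in range(max(start, 1), min(end, genome_length) + 1):
            depth[pos - 1] += 1

    covered = sum(1 for d in depth if d > 0)
    cov_pct = 100.0 * covered / genome_length
    mean_depth = sum(depth) / covered if covered else 0.0
    return covered, cov_pct, mean_depth


# Two overlapping 4-base reads on a hypothetical 10-base reference.
covered, cov_pct, mean_depth = coverage_and_depth([(1, 4), (3, 6)], 10)
print(f"{covered} ({cov_pct:.1f}) depth {mean_depth:.2f}")
```

Averaging over covered positions keeps a low-coverage virus from being reported with a near-zero depth. Dividing by `genome_length` instead would give the genome-wide mean.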
SMRV and EBV from the SRP046046 study.
| Run ID | Sample source | Reference | Cov (%) | Depth |
|---|---|---|---|---|
| SRR1563015 | DLBCL | M23385 | 8714 (99.2) | 146.1 |
| SRR1563017 | DLBCL Exosome | M23385 | 8732 (99.4) | 494.5 |
| SRR1563018 | EBV(+) BL | KC207813 | 2765 (1.6) | 29.2 |
| SRR1563056 | EBV(+) BL Exosome | KC207813 | 33107 (19.3) | 9.6 |
| SRR1563057 | EBV(−) BL | NA | NA | NA |
| SRR1563058 | EBV(−) BL Exosome | NA | NA | NA |
| SRR1563059 | EBV(+) LCL | KC207813 | 13757 (8) | 358.2 |
| SRR1563060 | EBV(+) LCL Exosome | M80517 | 7444 (4) | 288.8 |
| SRR1563061 | EBV(+) LCL | M80517 | 18688 (10.2) | 151.1 |
| SRR1563062 | EBV(+) LCL Exosome | KC207814 | 7931 (4.6) | 198.2 |
| SRR1563063 | EBV(+) LCL | M80517 | 37898 (20.6) | 52.8 |
| SRR1563064 | EBV(+) LCL Exosome | M80517 | 57850 (31.4) | 17.6 |
“Run ID” is unique to each high-throughput fastq file in the NCBI SRA database. “Reference” is the NCBI GenBank accession number of the reference genome. “Cov (%)” gives the genome coverage as the number of covered bases, with the percentage of the reference genome in parentheses; “Depth” is the average depth.
Figure 1. Nucleotide polymorphism, hotspots, and siRNA duplexes of HIV-1. The x-axis represents positions on the HIV-1 reference genome (GenBank: M19921). The y-axis represents the read counts at each position in the data SRR941591. The dots in the top black box mark positions with polymorphic nucleotides. Panels #1, #2, and #3 show the size distributions of positive- and negative-strand viral reads in hotspot 1 (779–810 bp), hotspot 2 (2,017–2,045 bp), and hotspot 3 (12,006–12,044 bp). The read counts of 21 bp, 22 bp, 23 bp, and 24 bp siRNA duplexes are given in parentheses.
Figure 2. Distribution of total and HIV-1 viral read lengths on both strands. The x-axis represents read length. The y-axis represents the read counts of each length in the data SRR941591. “HIV-1 ×100 reads” denotes the count of reads that align to the HIV-1 reference genome (GenBank: M19921), multiplied by 100.
Figure 3. The predicted miRNAs of EBV. The EBV detected in data SRR1563063 is represented using the reference genome (GenBank: M80517) in this study. The sequence of the predicted mature miRNA is shown in lowercase letters. (a) The secondary structures of the miRNA were predicted using RNAfold. (b) The first repeating unit (50578–50702) contains the predicted mature miRNA (50624–50646). This mature miRNA is repeated 12 times in 13 repeated units.