Sarah Mollerup1, Maria Asplund1, Jens Friis-Nielsen2, Kristín Rós Kjartansdóttir1, Helena Fridholm1, Thomas Arn Hansen1, José Alejandro Romero Herrera2,3, Christopher James Barnes1, Randi Holm Jensen1, Stine Raith Richter1, Ida Broman Nielsen1, Carlotta Pietroni1, David E Alquezar-Planas1, Alba Rey-Iglesia1, Pernille V S Olsen1, Ewa Rajpert-De Meyts4, Line Groth-Pedersen5, Christian von Buchwald6, David H Jensen6, Robert Gniadecki7, Estrid Høgdall8, Jill Levin Langhoff8, Imre Pete9, Ildikó Vereczkey9, Zsolt Baranyai10, Karen Dybkaer11, Hans Erik Johnsen12, Torben Steiniche13, Peter Hokland14, Jacob Rosenberg15, Ulrik Baandrup16, Thomas Sicheritz-Pontén2,17, Eske Willerslev1, Søren Brunak2,3, Ole Lund2, Tobias Mourier1, Lasse Vinner1, Jose M G Izarzugaza2, Lars Peter Nielsen18, Anders Johannes Hansen1.
Abstract
BACKGROUND: Viruses and other infectious agents cause more than 15% of human cancer cases. High-throughput sequencing-based studies of virus-cancer associations have mainly focused on cancer transcriptome data.
Keywords: cancer; enrichment; human; virome
Year: 2019 PMID: 31253993 PMCID: PMC6743825 DOI: 10.1093/infdis/jiz318
Source DB: PubMed Journal: J Infect Dis ISSN: 0022-1899 Impact factor: 5.226
Samples and Datasets Included in the Study
| Sample Type | Sample Material | Samples (n) | Total DNA | Total RNA | Virion Enrichment: DNA | Virion Enrichment: RNA | Circular DNA Enrichment | Capture: Retrovirus DNA | Capture: Retrovirus mRNA | Capture: Vert. Virus DNA | Capture: mRNA | Datasets (n) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Basal cell carcinoma (cutaneous) | Tumor biopsies | 11 | 11 | 11 | 11 | 4 | 6 | 11 | 54 | |||
| Mycosis fungoides (cutaneous) | Tumor biopsies | 11 | 11 | 11 | 11 | 10 | 10 | 11 | 64 | |||
| Melanoma (cutaneous) | Tumor biopsies | 10 | 10 | 10 | 10 | 8 | 10 | 48 | ||||
| Oral cancer | Tumor biopsies | 10 | 9 | 10 | 10 | 10 | 10 | 49 | ||||
| Oral healthy | Healthy tissue | 1 | 1 | 1 | 2 | |||||||
| Vulvar cancer | Tumor biopsies | 3 | 3 | 3 | 3 | 3 | 12 | |||||
| Bladder cancer | Tumor biopsies | 7 | 7 | 7 | 5 | 7 | 26 | |||||
| Bladder cancer urine | Urine | 10 | 2 | 10 | 4 | 16 | ||||||
| Colon cancer | Tumor biopsies | 16 | 12 | 11 | 3 | 3 | 6 | 6 | 41 | |||
| Colon healthy | Healthy tissue | 2 | 2 | 2 | ||||||||
| Breast cancer | Tumor biopsies | 20 | 20 | 19 | 17 | 20 | 15 | 91 | ||||
| Testicular cancer | Tumor biopsies | 20 | 5 | 20 | 20 | 45 | ||||||
| AML | Bone marrow (sorted cells) | 9 | 6 | 9 | 9 | 7 | 31 | |||||
| B-CLL | Blood/bone marrow (sorted cells) | 9 | 8 | 9 | 9 | 8 | 9 | 8 | 51 | |||
| BCP-ALL | Bone marrow | 8 | 8 | 8 | 8 | 24 | ||||||
| CML | Bone marrow (sorted cells) | 10 | 10 | 10 | 10 | 10 | 10 | 50 | ||||
| T-ALL | Bone marrow (nonsorted/sorted cells) | 11 | 9 | 11 | 11 | 9 | 40 | |||||
| DLBCL | Cell lines | 5 | 5 | 3 | 3 | 11 | ||||||
| Lymphoblastic lymphoma | Cell lines | 1 | 1 | 1 | 1 | 3 | ||||||
| Multiple myeloma | Cell lines | 6 | 6 | 2 | 2 | 10 | ||||||
| Colon cancer blood | Blood | 8 | 8 | 8 | ||||||||
| Colon cancer ascites | Ascites | 1 | 1 | 1 | 2 | |||||||
| Breast cancer ascites | Ascites | 1 | 1 | 1 | 1 | 1 | 1 | 5 | ||||
| Ovarian cancer ascites | Ascites | 5 | 5 | 4 | 3 | 3 | 5 | 20 | ||||
| Pancreatic cancer ascites | Ascites | 2 | 2 | 2 | 1 | 5 | ||||||
| NTC | 19 | 18 | 5 | 1 | 7 | 50 | ||||||
| Total (without NTC) | 197 | 107 | 72 | 143 | 146 | 114 | 33 | 6 | 75 | 14 | 710 | |
Abbreviations: AML, acute myeloid leukemia; B-CLL, B-cell chronic lymphocytic leukemia; BCP-ALL, B-cell precursor acute lymphoblastic leukemia; CML, chronic myelogenous leukemia; DLBCL, diffuse large B-cell lymphoma; DNA, deoxyribonucleic acid; mRNA, messenger ribonucleic acid; NTC, nontemplate control; RNA, ribonucleic acid; T-ALL, T-lineage acute lymphoblastic leukemia; Vert., vertebrate.
Figure 1. Laboratory methods and analysis pipeline. (Top) Schematic illustration of the laboratory methods used. Total deoxyribonucleic acid (DNA) or ribonucleic acid (RNA) was sequenced, or samples were exposed to one of the indicated enrichment methods before sequencing. (Bottom) Schematic illustration of the data analysis pipeline; de novo assembled contigs and human-depleted reads were analyzed with BLASTn and/or BLASTx/DIAMOND. Human viral hits were investigated in silico, and the reads were mapped to a database of selected viral reference genomes. *Applies to the majority of the datasets (see Methods).
Figure 2. Viruses detected from BLASTnx of contigs and read mapping. (Top) The number of contigs detected across cancer types (horizontal axis), indicated by color (right legend). (Bottom) The fraction of viral reads in parts per million (ppm) detected across cancer types (horizontal axis), indicated by color (right legend). Only confirmed viral hits are included in both panels. AML, acute myeloid leukemia; B-CLL, B-cell chronic lymphocytic leukemia; BCP-ALL, B-cell precursor acute lymphoblastic leukemia; CML, chronic myeloid leukemia; NTC, nontemplate control; T-ALL, T-lineage acute lymphoblastic leukemia.
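The ppm unit in the bottom panel can be illustrated with a minimal sketch. It assumes that the fraction is taken as viral reads over the total number of reads in a dataset; the record does not state which read count serves as the denominator, so both the function name `viral_ppm` and that choice are assumptions made for illustration only.

```python
def viral_ppm(viral_reads: int, total_reads: int) -> float:
    """Return the number of viral reads per million reads.

    Assumption: total_reads is the denominator the figure uses;
    the source does not specify this.
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if viral_reads < 0 or viral_reads > total_reads:
        raise ValueError("viral_reads must lie between 0 and total_reads")
    # Multiply before dividing to keep the arithmetic exact for small counts.
    return viral_reads * 1_000_000 / total_reads


# Hypothetical dataset: 25 viral reads among 2,000,000 reads.
print(viral_ppm(25, 2_000_000))  # 12.5
```

Expressing the fraction per million reads makes datasets of different sequencing depth comparable on one axis.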
Virus-Positive Samples From the Read Mapping Analysis
| Sample Type | Samples (n) | | | | | | | |
|---|---|---|---|---|---|---|---|---|
| Basal cell carcinoma | 11 | 8 | 3 | 5 | 10 | 1 | ||
| Mycosis fungoides | 11 | 7 | 6 | 8 | 9 | 2 | ||
| Melanoma | 10 | 7 | 3 | 6 | 8 | 1 | ||
| Oral cancer | 10 | 7 | 4 | 9 | 2 | 5 | ||
| Oral healthy | 1 | 1 | 1 | 1 | ||||
| Vulvar cancer | 3 | 2 | 1 | 3 | ||||
| Bladder cancer | 7 | 2 | 1 | 6 | 2 | 3 | ||
| Bladder cancer urine | 10 | 2 | 5 | 4 | 1 | 5 | ||
| Colon cancer | 16 | 2 | ||||||
| Breast cancer | 20 | 3 | 6 | 3 | 1 | 1 | ||
| Testicular cancer | 20 | 2 | 1 | 2 | ||||
| AML | 9 | 1 | 1 | 1 | 1 | 1 | ||
| B-CLL | 9 | 1 | 3 | 3 | ||||
| BCP-ALL | 8 | 1 | 2 | |||||
| CML | 10 | 1 | 3 | 1 | 7 | 1 | ||
| T-ALL | 11 | 1 | 1 | 1 | ||||
| Colon cancer blood | 8 | 2 | ||||||
| Colon cancer ascites | 1 | 1 | 1 | |||||
| Ovarian cancer ascites | 5 | 1 | 1 | |||||
| Pancreatic cancer ascites | 2 | 1 | ||||||
| Total no. of samples | 43 | 30 | 55 | 38 | 38 | 5 | 1 | |
| Total no. of sample types | 13 | 9 | 15 | 12 | 15 | 4 | 1 |
Abbreviations: AML, acute myeloid leukemia; B-CLL, B-cell chronic lymphocytic leukemia; BCP-ALL, B-cell precursor acute lymphoblastic leukemia; CML, chronic myeloid leukemia; T-ALL, T-lineage acute lymphoblastic leukemia.
Notes: The number of samples positive for a given viral family is shown for each sample type. Extended counts are shown in Supplementary Table S8. Only confirmed viral hits are included.
Figure 3. Human papillomaviruses (HPVs) identified in skin and mucosal cancers. Genome coverage (%) for the different HPV types found in samples of skin and mucosal cancers, indicated by color (right legend) (the full dataset is shown in Supplementary Figure S3). Only confirmed viral hits are included.
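Genome coverage (%) can be read as the share of reference positions covered by at least one mapped read. The sketch below shows one common way to compute it from mapped read intervals; it is not taken from the study. The function name, the half-open 0-based interval convention, and the example genome length are all hypothetical.

```python
from typing import Iterable, Tuple


def genome_coverage_percent(
    intervals: Iterable[Tuple[int, int]], genome_length: int
) -> float:
    """Percentage of genome positions covered by at least one interval.

    Intervals are half-open, 0-based (start, end) pairs; this convention
    is an assumption of the sketch, not something stated in the source.
    """
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    covered = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        # Clamp each interval to the reference boundaries.
        start, end = max(start, 0), min(end, genome_length)
        if start >= end:
            continue
        if cur_end is None or start > cur_end:
            # Gap before this interval: close the previous merged block.
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent: extend the current merged block.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        covered += cur_end - cur_start
    return 100.0 * covered / genome_length


# Hypothetical 8000-bp genome with three mapped reads, two of them overlapping.
reads = [(0, 150), (100, 250), (4000, 4150)]
print(genome_coverage_percent(reads, 8000))  # 5.0
```

Merging overlapping intervals before summing ensures that positions covered by several reads are counted once.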
Figure 4. Species co-occurrence network. Network inference between the viruses grouped at species level. Nodes represent viral species, with diameters proportional to the total number of occurrences of a species (ranging from 4 to 40) and colored segments representing the proportions of sample types in which a virus occurred. Green tones represent skin-associated sample types, red/pink tones represent mucosal sample types, blue tones represent sample types originating from blood (mainly leukemias), orange/yellow tones represent other tissue, and gray tones represent ascitic fluid. AADvA, adeno-associated dependoparvovirus A; AML, acute myeloid leukemia; B-CLL, B-cell chronic lymphocytic leukemia; BCP-ALL, B-cell precursor acute lymphoblastic leukemia; BetaPV, Betapapillomavirus; CML, chronic myelogenous leukemia; GammaPV, Gammapapillomavirus; HHV, human herpesvirus; HPyV1, human polyomavirus 1 (BKV); MCPyV, Merkel cell polyomavirus; MicroTTV, micro torque teno virus; PgvA, Pegivirus A; P.erythPV1, primate erythroparvovirus 1 (parvovirus B19); SENV, SEN virus; T-ALL, T-lineage acute lymphoblastic leukemia; TTmidiV, torque teno midi virus; TTV, torque teno virus; Uncl Anello, Unclassified Anellovirus.
Datasets Positive for a Given Viral Family for the Laboratory Methods Applied
| Method | Datasets (n) | Contigs | Reads | Contigs | Reads | Contigs | Reads | Contigs | Reads | Contigs | Reads |
|---|---|---|---|---|---|---|---|---|---|---|---|
| All Samples | |||||||||||
| Vert. virus capt. DNA | 75 | 3 | 31 | 1 | 3 | 10 | 42 | 19 | 32 | 5 | 15 |
| Circular DNA | 114 | 5 | 5 | 4 | 4 | 1 | 3 | 6 | 6 | 13 | 13 |
| Virion DNA | 143 | 6 | 21 | 4 | 24 | 1 | 9 | 2 | 4 | 12 | |
| Virion RNA | 146 | 1 | 13 | 1 | 4 | 1 | 5 | 6 | 15 | ||
| Retrovirus capt. DNA | 33 | 1 | 6 | 1 | |||||||
| Total DNA | 107 | 2 | 1 | 6 | 3 | 3 | |||||
| Total RNA | 72 | 1 | 3 | 2 | 1 | ||||||
| Samples Processed With All 4 Methods | |||||||||||
| Vert. virus capt. DNA | 58 | 2 | 22ᵃ | 7 | 30ᵇ | 15 | 23ᶜ | 4 | 11 | ||
| Circular DNA | 58 | 4 | 4 | 1 | 2 | 6 | 6 | 6 | 8 | ||
| Virion DNA | 58 | 4 | 14ᵈ | 3 | 15 | 5 | 1 | 3 | 11 | ||
| Virion RNA | 58 | 1 | 12 | 2 | 1 | 3 | 5 | 11 | |||
Abbreviations: capt., capture; DNA, deoxyribonucleic acid; RNA, ribonucleic acid; Vert., vertebrate.
Notes: The numbers of datasets positive based on contig BLASTnx (leftmost column for each viral family) and read mapping (rightmost column for each viral family) are shown. The top part of the table shows the numbers for all datasets; the bottom part shows the numbers for datasets from samples processed with all 4 enrichment methods. Only the 5 most frequently detected families are shown, and only confirmed viral hits are included. Nontemplate controls are excluded.
ᵃ P = 9.5 × 10⁻⁵ vs circular DNA enrichment.
ᵇ P = 5.1 × 10⁻⁷ vs virion enrichment DNA (nonsignificant at contig level, P = .061).
ᶜ P = 4.6 × 10⁻⁴ vs circular DNA enrichment (nonsignificant at contig level, P = .052).
ᵈ P = .019 vs circular DNA enrichment.