Michael P Wilczek1, Aiden M C Pike1, Sophie E Craig1,2, Melissa S Maginnis1,2, Benjamin L King1,2.
Abstract
JC polyomavirus (JCPyV) is the causative agent of the fatal, incurable neurological disease progressive multifocal leukoencephalopathy (PML). The virus is present in most of the adult population as a persistent, asymptomatic infection in the kidneys. During immunosuppression, JCPyV reactivates and invades the central nervous system. Mutations within the hypervariable region of the viral genome are a main predictor of disease outcome. In patients with PML, JCPyV undergoes genetic rearrangements in the noncoding control region (NCCR). These rearrangements influence transcription factor binding to the NCCR, which orchestrates viral gene transcription. This study examines 989 NCCR sequences from patient isolates deposited in GenBank to determine the frequency of mutations based on patient isolation site and disease status. The transcription factor binding sites (TFBS) were also analyzed to understand how these rearrangements could influence viral transcription. The number of TFBS was significantly higher in PML samples than in non-PML samples. Additionally, TFBS that could promote JCPyV infection were more prevalent in samples isolated from the cerebrospinal fluid (CSF) than in samples from other locations. Collectively, this research describes the extent of mutations in the NCCR that alter TFBS and how they correlate with disease outcome.
Keywords: JC polyomavirus; NCCR; PML; mutations; transcription factors; viral genome
Year: 2022 PMID: 35628509 PMCID: PMC9144386 DOI: 10.3390/ijms23105699
Source DB: PubMed Journal: Int J Mol Sci ISSN: 1422-0067 Impact factor: 6.208
Summary of the 989 NCCR Sequences.
| Tissue Source | Primary Disease State | Total | Secondary Disease | Number of Cases |
|---|---|---|---|---|
| Brain ( | PML | 26 (81%) | HIV | 11 (34%) |
| | | | MS a | 1 (3%) |
| | | | WAS 1 | 1 (3%) |
| | | | HIGM 2 | 1 (3%) |
| | | | N/A | 12 (38%) |
| | JCPyVE 3 | 1 (3%) | H/O Lung Cancer | 1 (3%) |
| | GCN b | 1 (3%) | MS | 1 (3%) |
| | N/A | 4 (13%) | N/A | 4 (13%) |
| Plasma/serum/PBMC (i.e., blood) ( | PML | 91 (82%) | HIV | 7 (6%) |
| | | | MS | 81 (73%) |
| | | | N/A | 3 (3%) |
| | Consistent with PML | 7 (6%) | HIV | 7 (6%) |
| | N/A | 13 (12%) | HIV | 5 (5%) |
| | | | N/A | 8 (7%) |
| CSF ( | PML | 195 (90%) | HIV | 46 (21%) |
| | | | HIV/MS | 2 (1%) |
| | | | RA 4 | 1 (1%) |
| | | | SLE c | 9 (4%) |
| | | | MS | 80 (37%) |
| | | | AML d | 15 (7%) |
| | | | ALL e | 7 (3%) |
| | | | CLL f | 4 (2%) |
| | | | NHL | 4 (2%) |
| | | | WM 5 | 8 (4%) |
| | | | Other # | 14 (6%) |
| | | | N/A | 5 (2%) |
| | Consistent with PML | 6 (3%) | HIV | 6 (3%) |
| | Suspected of PML | 14 (6%) | HIV | 2 (1%) |
| | | | MS | 1 (1%) |
| | | | N/A | 11 (5%) |
| | N/A | 2 (1%) | HIV | 1 (1%) |
| | | | N/A | 1 (1%) |
| Kidney ( | JCPyVAN 6 | 2 (100%) | N/A | 2 (100%) |
| Kidney; Urine ( | JCPyVAN 6 | 3 (100%) | N/A | 3 (100%) |
| Brain; Kidney ( | N/A | 6 (100%) | N/A | 6 (100%) |
| CSF; Plasma ( | Consistent with PML | 2 (100%) | HIV | 2 (100%) |
| Urine ( | PML | 78 (14%) | HIV | 4 (1%) |
| | | | MS | 74 (13%) |
| | JCPyVAN 6 | 1 (0%) | N/A | 1 (0%) |
| | | 4 (1%) | kidney transplant and subsequent antibody-mediated rejection | 4 (1%) |
| | Healthy | 179 (15%) | N/A | 179 (32%) |
| | No PML | 25 (%) | Stable Kidney Transplant | 25 (4%) |
| | N/A | 279 (65%) | HIV | 21 (4%) |
| | | | SLE | 8 (1%) |
| | | | MS | 12 (2%) |
| | | | RA | 1 (0%) |
| | | | N/A | 236 (42%) |
| No tissue reported ( | PML | 47 (94%) | HIV | 3 (6%) |
| | | | MS | 44 (88%) |
| | N/A | 3 (6%) | N/A | 3 (6%) |
1 Wiskott-Aldrich syndrome; 2 Hyper IgM syndrome; 3 JC Virus encephalopathy; 4 Rheumatoid arthritis; 5 Waldenstrom macroglobulinemia; 6 JC Virus-associated nephropathy; a Multiple sclerosis; b Granule cell neuronopathy; c Systemic lupus erythematosus; d Acute myeloid leukemia; e Acute lymphoblastic leukemia; f Chronic lymphocytic leukemia; # Other includes: Primary Immunodeficiency Syndrome (3); Sarcoidosis (3); Psoriasis (1); Leukemia (1); Lymphoma (1); HCV-related liver disease (5); % HG764413 has plasma, urine, CSF, and kidney listed as the tissue.
Figure 1. Cladogram of 989 NCCR sequences labeled with the sample isolation site and the disease status of the patients. The 989 JCPyV NCCR sequences were aligned with Clustal Omega using the EMBOSS package. The circular cladogram was created using the R/ggtree package; its colors correspond to the site of sample isolation and the disease status of individual patients.
Summary of the NCCR Block Codes by Tissue and PML Disease Status.
| NCCR Block Code | Total # | Number of PML Patient Samples (Total Samples) | |||||
|---|---|---|---|---|---|---|---|
| CSF | Urine | Blood | Brain | Other | Not Specified | ||
|
| 592 | 35 (36) | 75 (490) | 35 (44) | 1 (2) | 0 (6) | 13 (14) |
|
| 74 | 20 (23) | 1 (1) | 36 (38) | 2 (4) | 0 (1) | 7 (7) |
|
| 64 | 13 (13) | 2 (50) | 0 (1) | |||
|
| 28 | 19 (22) | 0 (1) | 2 (3) | 2 (2) | ||
|
| 22 | 9 (14) | 5 (6) | 2 (2) | |||
|
| 21 | 9 (9) | 10 (10) | 1 (2) | |||
|
| 17 | 9 (10) | 0 (1) | 2 (2) | 4 (4) | ||
|
| 13 | 2 (2) | 0 (10) | 0 (1) | |||
|
| 11 | 6 (7) | 0 (1) | 2 (2) | 0 (1) | ||
|
| 10 | 10 (10) | |||||
|
| 10 | 7 (7) | 3 (3) | ||||
|
| 10 | 5 (5) | 5 (5) | ||||
|
| 10 | 2 (2) | 0 (1) | 0 (2) | 5 (5) | ||
|
| 9 | 2 (4) | 2 (2) | 2 (2) | 0 (1) | ||
|
| 8 | 0 (8) | |||||
|
| 7 | 5 (5) | 1 (1) | 1 (1) | |||
|
| 5 | 4 (5) | |||||
|
| 4 | 3 (3) | 0 (1) | ||||
|
| 4 | 2 (2) | 0 (2) | ||||
|
| 4 | 1 (2) | 0 (1) | 0 (1) | |||
|
| 4 | 1 (1) | 0 (2) | 0 (1) | |||
|
| 4 | 4 (4) | |||||
|
| 3 | 3 (3) | |||||
|
| 3 | 3 (3) | |||||
|
| 3 | 1 (1) | 1 (1) | 1 (1) | |||
|
| 3 | 2 (2) | 0 (1) | ||||
|
| 3 | 1 (1) | 1 (1) | 1 (1) | |||
|
| 2 | 2 (2) | |||||
|
| 2 | 2 (2) | |||||
|
| 2 | 1 (1) | 0 (1) | ||||
|
| 2 | 2 (2) | |||||
|
| 2 | 1 (1) | 1 (1) | ||||
|
| 1 | 1 (1) | |||||
|
| 1 | 1 (1) | |||||
|
| 1 | 1 (1) | |||||
|
| 1 | 1 (1) | |||||
|
| 1 | 1 (1) | |||||
|
| 1 | 1 (1) | |||||
|
| 1 | 1 (1) | |||||
|
| 1 | 1 (1) | |||||
|
| 1 | 1 (1) | |||||
|
| 1 | 1 (1) | |||||
|
| 1 | 1 (1) | |||||
|
| 1 | 1 (1) | |||||
|
| 1 | 1 (1) | |||||
|
| 1 | 1 (1) | |||||
|
| 1 | 0 (1) | |||||
|
| 1 | 0 (1) | |||||
|
| 1 | 0 (1) | |||||
|
| 1 | 0 (1) | |||||
|
| 1 | 1 (1) | |||||
|
| 1 | 1 (1) | |||||
|
| 1 | 1 (1) | |||||
|
| 1 | 1 (1) | |||||
|
| 1 | 1 (1) | |||||
|
| 1 | 1 (1) | |||||
|
| 1 | 1 (1) | |||||
|
| 1 | 1 (1) | |||||
|
| 1 | 1 (1) | |||||
|
| 1 | 1 (1) | |||||
|
| 1 | 0 (1) | |||||
|
| 1 | 0 (1) | |||||
|
| 1 | 0 (1) | |||||
|
| 1 | 1 (1) | |||||
|
| 1 | 0 (1) | |||||
# = number of sequences analyzed.
Figure 2. Transcription factor binding sites that are known to activate JCPyV infection are more prevalent in the NCCR of viral isolates from the brain, plasma, and especially the CSF of patients who have PML or AIDS. The R package ‘TFBSTools’ was used to determine the TFBS for each block, using the 2020 JASPAR database. Known TFBS that are associated with JCPyV transcription and replication are illustrated as a balloon plot drawn with ‘ggplot2’. TFBS are faceted by the 6 blocks (A–F), disease status, and tissue source: urine (A), blood (B), brain (C), and CSF (D). The size of each shape represents the normalized frequency of the TFBS (i.e., the number of times the TFBS is present) [(# of TFBS/Block)/(# of sequences)] in each of the 6 groups/locations, and the color represents the activity that correlates with JCPyV infection. Sequences isolated from the blood represent serum, plasma, and PBMC sequences. Unadjusted p values were determined by Fisher’s exact test, comparing the normalized frequency of TFBS (size of shapes) in the Urine (Healthy) group to that in each of the other groups.
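The normalized frequency in the caption, [(# of TFBS/Block)/(# of sequences)], can be illustrated with a minimal Python sketch. This is not the authors' R code: the function name `normalized_tfbs_frequency`, the input layout (one list of TFBS hits per sequence for a single block), and the toy hits are assumptions made only for illustration.

```python
from collections import Counter


def normalized_tfbs_frequency(hits_per_sequence):
    """Return {TF name: (# of hits in the block) / (# of sequences)}.

    hits_per_sequence: one list per sequence, holding the TF names
    predicted in a single NCCR block (hypothetical input layout).
    """
    n_sequences = len(hits_per_sequence)
    if n_sequences == 0:
        return {}
    # Count every predicted site, including repeats within one sequence.
    counts = Counter(tf for hits in hits_per_sequence for tf in hits)
    return {tf: count / n_sequences for tf, count in counts.items()}


# Made-up hits for one block in three sequences of one tissue/disease group.
toy_block_hits = [["SPIB", "MEIS1"], ["SPIB"], ["SPIB", "SPIB", "OLIG3"]]
freqs = normalized_tfbs_frequency(toy_block_hits)
```

Because repeated sites within one sequence are counted, a normalized frequency can exceed 1, which is what a rearranged NCCR with duplicated blocks would be expected to produce.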
Figure 3. Blocks C, D, and F of the JCPyV NCCR have the largest variation in the frequency of TFBS in sequences isolated from the brain, plasma, and CSF of diseased individuals compared to sequences isolated from the urine of healthy individuals. The overall distribution of the normalized frequency of TFBS [(# of TFBS/Block)/(# of sequences)] was plotted by disease status and tissue source using the R package ‘ggplot2’. Adjusted p values were determined using either the chi-square test with Bonferroni adjustment or Fisher’s exact test with Benjamini-Hochberg (BH) adjustment. The choice between these tests was determined by group size; each test compared the normalized frequencies of TFBS (y-axis) within each block in the Urine (Healthy) group to those in each of the other groups.
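The two p value adjustments named in the caption can be sketched in plain Python as follows. The raw p values are invented for illustration, and the function names are hypothetical; the authors' own analysis was done in R and is not reproduced here.

```python
def bonferroni(p_values):
    """Multiply each p value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


raw = [0.01, 0.04, 0.03, 0.005]  # made-up unadjusted p values
bonf = bonferroni(raw)
bh = benjamini_hochberg(raw)
```

Bonferroni controls the family-wise error rate and is the more conservative of the two; BH controls the false discovery rate, so its adjusted values are never larger than the Bonferroni ones for the same input.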
Figure 4. The number and frequency of TFBS in the NCCR, specifically in Block C, are higher in non-urine locations of diseased patients. The top 10% of TFBS are illustrated in a heat map, faceted by tissue source and block location. The heat map is colored by the normalized frequency of TFBS in each group.
Block sequences and criteria used to locate them in each sequence.
| Block Letter | Nucleotide Sequence | Maximum Mismatch Value |
|---|---|---|
| Block “a” | CCTGTATATATAAAAAAAAGGGAAGG | 9 |
| Block “b” | AGGGAGGAGCTGGCTAAAACTG | 8 |
| Block “c” | GATGGCTGCCAGCCAAGCATGAGCTCATACCTAGGGAGCCAACCAGCTGACAGCC | 27 |
| Block “d” | AGAGGGAGCCCTGGCTGCATGCCACTGGCAGTTATAGTGAAACCCCTCCCATAGTCCTTAATCACA | 31 |
| Block “e” | AGTAAACAAAGCACAAGG | 1 |
| Block “f” | GGAAGTGGAAAGCAGCCAAGGGAACATGTTTTGCGAGCCAGAGCTGTTTTGGCTTGTCACCAGCTGGCCAGT | 31 |
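One way to read the table above is as a sliding-window search in which a block is accepted when it differs from a window of the sequence by no more than the maximum mismatch value. The Python sketch below assumes substitution-only (Hamming) matching and returns only the single best window; both are assumptions, since the table does not describe the authors' script, and the toy sequence is invented.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_block(sequence, block, max_mismatch):
    """Return (start, mismatches) for the best window, or None if no
    window is within max_mismatch substitutions of the block."""
    best = None
    for start in range(len(sequence) - len(block) + 1):
        window = sequence[start:start + len(block)]
        d = hamming(window, block)
        if d <= max_mismatch and (best is None or d < best[1]):
            best = (start, d)
    return best


BLOCK_E = "AGTAAACAAAGCACAAGG"  # Block "e" from the table, maximum mismatch 1
# Invented sequence: Block "e" with one substitution, padded on both sides.
toy_sequence = "TTTT" + "AGTAAACAAAGCACTAGG" + "CCCC"
hit = find_block(toy_sequence, BLOCK_E, max_mismatch=1)
```

A rearranged NCCR may contain a block more than once or with insertions and deletions, so a fuller version would need to report every qualifying window and tolerate gaps; this sketch does neither.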
Summary comparing 100 NCCR sequences from the automated script to the initial, manual analysis.
| Description of Error | Frequency of Error (%) | Mean (SD) (Block Code or Base Pairs) | % of Error out of the Average Length of the NCCR with Error (SD) |
|---|---|---|---|
| Block code less than 85% accurate | 13% | 77.7% (±4.2%) | N/A |
| Block code greater than 100% (larger blocks from the initial analysis were interpreted as smaller and different block codes) | 6% | 123.5% (±26%) | N/A |
| Autogenerated block annotation had nucleotides counted in two sequential blocks, occurring when the start of a block occurs before the end of the previous | 63% | 2.93 (±1.88) | 0.998% (±0.0807%) |
| Predicted block lacks at least one base pair that was counted in the manual annotation | 93% | 8.84 (±9.12) | 3.07% (±3.29%) |
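The double-counting error in the third row of the table (the start of a block falling before the end of the previous block) can be detected with a short check like the one below. The `(label, start, end)` annotation layout with an exclusive end coordinate, the function name, and the toy coordinates are assumptions for illustration, not the format of the authors' script.

```python
def sequential_overlaps(annotations):
    """Return (label_1, label_2, shared_bp) for each pair of consecutive
    blocks whose coordinates overlap.

    annotations: list of (label, start, end) tuples in sequence order,
    with end exclusive (hypothetical layout).
    """
    overlaps = []
    for (l1, s1, e1), (l2, s2, e2) in zip(annotations, annotations[1:]):
        shared = min(e1, e2) - max(s1, s2)
        if shared > 0:
            overlaps.append((l1, l2, shared))
    return overlaps


# Invented annotation: block "b" starts 2 bp before block "a" ends.
toy_annotation = [("a", 0, 26), ("b", 24, 46), ("c", 46, 101)]
found = sequential_overlaps(toy_annotation)
total_shared = sum(bp for _, _, bp in found)
percent_of_length = 100 * total_shared / toy_annotation[-1][2]
```

Dividing the shared base pairs by the annotated length gives a percentage comparable in spirit to the last column of the table, although the table's figures are averages over the affected sequences.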
Figure 5. Rearrangements of the NCCR and changes in TFBS. The archetype strain is predominantly detected in the urine of healthy patients (blue/green virus) with numerous TFBS (blue) that inhibit replication. Rearrangements in diseased patients (red virus) cause an increase in TFBS (green) that enhance viral replication of PML-type strains. TFBS such as SPIB (gray) or possibly OLIG3 (yellow), which arise from rearrangements in the NCCR, enhance cellular tropism. Additionally, TFBS for MEIS1, FOX family transcription factors (FOX), and HOX family transcription factors (HOX) (yellow) can be observed following NCCR rearrangements in diseased patients but not in the NCCR from healthy individuals.