Nicolás M Suárez, Kunda G Musonda, Eric Escriva, Margaret Njenga, Anthony Agbueze, Salvatore Camiolo, Andrew J Davison, Ursula A Gompels.
Abstract
BACKGROUND: In developed countries, human cytomegalovirus (HCMV) is a major pathogen in congenitally infected and immunocompromised individuals, where multiple-strain infection appears linked to disease severity. The situation is less documented in developing countries. In Zambia, breast milk is a key route for transmitting HCMV and carries higher viral loads in human immunodeficiency virus (HIV)-infected women. We investigated HCMV strain diversity.
Keywords: human cytomegalovirus; bioinformatics; breast milk; high-throughput sequencing; target enrichment; viral genomics
Year: 2019 PMID: 31050737 PMCID: PMC6667993 DOI: 10.1093/infdis/jiz209
Source DB: PubMed Journal: J Infect Dis ISSN: 0022-1899 Impact factor: 5.226
Characteristics of Donors, Samples, and Datasets
| Donorᵃ | HIV Status | Breast Sample | Weeks Postpartum | HCMV Load, ge/mLᵇ | Dataset | Strainsᶜ Detected |
|---|---|---|---|---|---|---|
| 158 | Negative | Left | 16 | 818 244 | 158L16 | 2 |
| 166 | Negative | Left | 16 | 282 252 | 166L16 | 1 |
| 193 | Negative | Right | 16 | 215 217 | 193R16 | 1 |
| 232 | Negative | Right | 4 | 470 150 | 232R4 | 2 |
| 239 | Negative | Right | 4 | 4 752 875 | 239R4 | 1 |
| 263 | Negative | Left | 4 | 5 285 775 | 263L4 | 1 |
| 280 | Negative | Right | 4 | 7 505 536 | 280R4 | 1 |
| 141 | Positive | Right | 16 | 319 888 | 141R16 | 2 |
| 154 | Positive | Left | 16 | 2 195 856 | 154L16 | 2 |
| 173 | Positive | Right | 16 | 349 532 | 173R16 | 5 |
| 174 | Positive | Left | 16 | 371 027 | 174L16 | 3 |
| 174 | Positive | Right | 16 | 642 516 | 174R16 | 3 |
| 181 | Positive | Left | 4 | 1 972 365 | 181L4 | 2 |
| 243 | Positive | Left | 16 | 643 895 | 243L16 | 2 |
| 243 | Positive | Right | 4 | 65 511 020 | 243R4 | 1 |
| 243 | Positive | Right | 16 | 795 092 | 243R16 | 1 |
| 248 | Positive | Right | 4 | 441 679 | 248R4 | 1 |
| 258 | Positive | Right | 4 | 610 613 | 258R4 | 2 |
| 259 | Positive | Left | 16 | 2 366 193 | 259L16 | 3 |
| 259 | Positive | Right | 16 | 5 053 047 | 259R16 | 3 |
| 264 | Positive | Left | 16 | 388 519 | 264L16 | 2 |
| 277 | Positive | Right | 16 | 250 377 | 277R16 | 2 |
| 278 | Positive | Left | 16 | 3 751 776 | 278L16 | 2 |
| 278 | Positive | Right | 4 | 294 246 272 | 278R4 | 2 |
| 278 | Positive | Right | 16 | 4 370 800 | 278R16 | 2 |
| 281 | Positive | Right | 4 | 20 291 530 | 281R4 | 2 |
| 283 | Positive | Right | 16 | 274 391 | 283R16 | 1 |
| 288 | Positive | Right | 4 | 31 574 022 | 288R4 | 3 |
Abbreviations: HCMV, human cytomegalovirus; HIV, human immunodeficiency virus; ge, genomic equivalent.
ᵃ Donor IDs in bold and underlined are sequential or from paired tissue samples.
ᵇ Median loads are higher in HIV-positive than in HIV-negative donors, and higher at week 4 than at week 16, as shown previously [3].
ᶜ Numbers of strains detected are from Table 3; only those meeting the quality thresholds noted are in bold, with the original data in Supplementary Tables 1 and 3.
ᵈ Met all quality thresholds.
ᵉ Met all quality thresholds except that unique fragment coverage depth was 10–20 rather than ≥20 reads/nt.
Genotypes and Haplotypes Assigned to Datasets
| Donor | Dataset | Strainsᵃ | Genotypesᵇ: RL5A | RL6 | RL12 | RL13 | UL1 | UL9 | UL11 | UL73 | UL74 | UL120 | UL146 | UL139 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 193 | 193R16ᶜ,ᵈ | 1 | 1 | 3 | 8 | 8 | 8 | 4 | 1 | 3A | 1B | 2B | 12 | 3 |
| 141 | 141R16 | 2 | 1, 6 | 2, 3 | 2, 4B | 2 | 2, 4 | 2, 3 | 5, 6 | 2, 3A | 1B, 2B | 4B | 2, 12 | 3, 8 |
| 173 | 173R16 | 5 | 1, 2, 6 | 2, 3, 4 | 1B, 3, 4B, 6, 8 | 1, 6, 8 | 1, 6, 8 | 2, 4, 6, 7, 9 | 1, 3, 5 | 1, 3A, 4A, 4B | 1A, 1B, 3, 4 | 1A, 4B | 3, 7, 9, 12 | 1A, 1B, 3, 7 |
| 174 | 174L16 | 3 | 2 | 2, 4 | 1B, 6 | 1, 6, 10 | 1, 4, 6 | 4, 6 | 1, 4 | 1, 2, 4A | 2B, 3, 4 | 2B, 3A | 7, 9 | 5, 7 |
| 174 | 174R16 | 3 | 1, 2 | 1, 2, 3, 4 | 1B, 6 | 1, 6, 10 | 1, 6 | 4, 6, 7 | 1, 4 | 2, 4A | 1B, 2B, 3 | 2B, 3A | 7, 9 | 5, 7 |
| 243 | 243L16ᶜ | 2 | 1 | 6 | 2, (10) | (1), 2 | 2 | 3 | 6 | 4B | 4 | 2A | 8, 9 | 7 |
| 243 | 243R4ᶜ,ᵈ | 1 | 1 | 6 | 2 | 2 | 2 | 3 | (1), 6 | 4B | 4 | 2A | 8 | 7 |
| 243 | 243R16ᶜ | 1 | 1 | 6 | 2 | 2 | 2 | 3 | 6 | 4B | 4 | 2A | 8 | 7 |
| 248 | 248R4ᶜ,ᵈ | 1 | 1 | 1 | 4A | 4A | 4 | 1 | 1 | 3B | 2A | 4B | [5] | 4, (7) |
| 258 | 258R4ᶜ | 2 | 6 | 3 | 3 | 3 | 3 | 3 | 6 | 3A, (4A) | 1B, (3) | (2A), 2B | (2), 9 | 5 |
| 259 | 259L16 | 3 | 1, 2 | 2, 3, 4 | 1A, 6, 8 | 1, 6, 8 | 1, 6, 8 | 1, 3, 6 | 1, 4, 6 | 1, 2, 4B | 1A, 2B, 4 | 4B | 1, 9, 12 | 2, 8 |
| 259 | 259R16 | 3 | 1, 2 | 2, 3, 4 | 1A, 6, 8 | 1, 6, 8 | 1, 6, 8 | 1, 3, 6 | 1, 4, 6 | 1, 2, 4B | 1A, 2B, 4 | 4B | 1, 9, 12 | 2, 8 |
| 264 | 264L16 | 2 | 1 | 1, 2 | 8, 10 | 8, 10 | 10 | 8 | 4, 7 | 3A, 4B | 4 | 1A, 3A | 10 | 3 |
| 277 | 277R16 | 2 | 1 | 3 | 4A, 6 | 4A, 6 | 4, 6 | 6, 9 | 1, 4 | 1, 4A | 1A, 3 | 3A, 4A | 1, 9 | 3, 4 |
| 278 | 278L16ᶜ | 2 | (2), 6 | 3, (4) | (1A), 9 | (1), [9] | (1), 9 | (1), 9 | (1), 6 | 3A, (4D) | 1B, (5) | (3A), 4B | 8, (9) | 2, (2) |
| 278 | 278R4ᶜ | 2 | (2), 6 | 3, (4) | (1A), 9 | (1), [9] | (1), 9 | (1), 9 | (1), 6 | 3A, (4D) | 1B, (5) | (3A), 4B | 8, (9) | 2, (2) |
| 278 | 278R16ᶜ | 2 | (2), 6 | 3, (4) | (1A), 9 | (1), [9] | (1), 9 | (1), 9 | (1), 6 | 3A, (4D) | 1B, (5) | (3A), 4B | 8, (9) | 2, (2) |
| 281 | 281R4ᶜ | 2 | (1), 6 | 3 | 1A, (6) | 1 | 1, (10) | 4, (7) | 1 | 4D | (1B), 5 | 4B | 11 | 5, (7) |
| 283 | 283R16ᶜ,ᵈ | 1 | [2] | 4 | 4B | [4B] | 4 | 2 | 5 | 3A, (3B) | 1B | 3A | 1 | 5 |
| 288 | 288R4 | 3 | 1, 2 | 3, 4 | 6, 7 | 6, 8 | 5, 6, 8 | 4, 6, 9 | 1 | 1, 4B, 4D | 1A, 4, 5 | 2B, 4B | 1, 3 | 4, 5 |
ᵃ Determined using long motifs for 12 genes (Supplementary Table 1).
ᵇ Genotype (G) prefix omitted; round brackets indicate an assigned minority genotype; multiple genotypes with none in round brackets indicate that majority and minority genotypes could not be distinguished; square brackets indicate a single mismatch in the motif.
ᶜ Datasets from which haplotypes were assigned.
ᵈ Datasets from which complete genome sequences were derived.
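The bracket notation used in the genotype cells above can be read mechanically. The following is a minimal sketch, not the authors' tooling: the `GenotypeCall` class and `parse_genotype_cell` function are hypothetical names introduced here, and the token pattern (digits plus an optional letter A to D) is an assumption based only on the values that appear in the table.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class GenotypeCall:
    genotype: str          # e.g. "4B"; the "G" prefix is omitted, as in the table
    minority: bool         # True when written in round brackets, e.g. "(10)"
    single_mismatch: bool  # True when written in square brackets, e.g. "[9]"


# Assumed token shape: optional opening bracket, digits, optional letter A-D,
# optional closing bracket. This covers the values shown in the table only.
_TOKEN = re.compile(r"^(\(|\[)?([0-9]+[A-D]?)(\)|\])?$")
_VALID_BRACKETS = {(None, None), ("(", ")"), ("[", "]")}


def parse_genotype_cell(cell: str) -> list[GenotypeCall]:
    """Split one table cell such as "(1), [9]" into individual genotype calls."""
    calls = []
    for token in cell.split(","):
        token = token.strip()
        match = _TOKEN.match(token)
        if match is None:
            raise ValueError(f"unrecognised genotype token: {token!r}")
        opener, genotype, closer = match.groups()
        if (opener, closer) not in _VALID_BRACKETS:
            raise ValueError(f"mismatched brackets in token: {token!r}")
        calls.append(GenotypeCall(genotype, opener == "(", opener == "["))
    return calls


if __name__ == "__main__":
    # RL13 cell for dataset 278L16: a minority genotype and a single-mismatch call.
    for call in parse_genotype_cell("(1), [9]"):
        print(call)
```

A cell with several unbracketed values, such as "1, 6, 8", parses to calls with both flags false, which corresponds to the case where majority and minority genotypes could not be distinguished.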
Short Motif Sequences in UL73 and UL74
| Gene | Positionᵃ | Genotype | Motif Sequenceᵇ | Sequences, No.ᶜ | Occurrences, No.ᶜ | Frequency, %ᶜ |
|---|---|---|---|---|---|---|
| UL73 | 5′ | G1 | GCGTATCAACTACC | 121 | 121 | 100 |
| UL73 | 5′ | G2 | GTGTGTCGACGAGT | 53 | 53 | 100 |
| UL73 | 5′ | G3A | GCGTGTCAACAAGC | 104 | 104 | 100 |
| UL73 | 5′ | G3B | GTGTATCAACGGTA | 47 | 47 | 100 |
| UL73 | 5′ | G4A | GCACCTTAACAACC | 114 | 113 | 99 |
| UL73 | 5′ | G4B | ACACCTCAACGACC | 55 | 55 | 100 |
| UL73 | 5′ | G4C | GCACCTCAACAACC | 39 | 38 | 97 |
| UL73 | 5′ | G4D | ACGCCTCAACAACC | 93 | 92 | 99 |
| UL74 | 5′ | G1A | AAACGACWATTT | 47 | 43 | 91 |
| UL74 | 5′ | G1B | AAAAGGATATCT | 60 | 60 | 100 |
| UL74 | 5′ | G1C | AAAGGGAACCTT | 19 | 19 | 100 |
| UL74 | 5′ | G2A | AACCTATTCCTT | 27 | 27 | 100 |
| UL74 | 5′ | G2B | AGAGCGACATAT | 38 | 38 | 100 |
| UL74 | 5′ | G3 | CGAGCCAGGATT | 66 | 64 | 97 |
| UL74 | 5′ | G4 | AAACAGGTGATT | 19 | 19 | 100 |
| UL74 | 5′ | G5 | TGTCTACATCAT | 38 | 38 | 100 |
| UL74 | C | G1A | CCTTGTGGTACTG | 47 | 47 | 100 |
| UL74 | C | G1B | TCTTGCGGTACGG | 60 | 60 | 100 |
| UL74 | C | G1C | TCTTGTGGTACAG | 19 | 19 | 100 |
| UL74 | C | G2A | TCGTGTGGCGCAG | 27 | 27 | 100 |
| UL74 | C | G2B | CCTTGCGGTACAG | 38 | 38 | 100 |
| UL74 | C | G3 | TCTTGTGGCACTG | 66 | 66 | 100 |
| UL74 | C | G4 | TCCTGTGGYACGA | 19 | 19 | 100 |
| UL74 | C | G5 | CCTTGYGGCACAG | 38 | 38 | 100 |
| UL74 | 3′ | G1A | TATTACTACCGCC | 47 | 47 | 100 |
| UL74 | 3′ | G1B | TGTTACTACCACC | 60 | 60 | 100 |
| UL74 | 3′ | G1C | GGTTACCACCAGC | 19 | 19 | 100 |
| UL74 | 3′ | G2A | TGTTACCACCACC | 27 | 27 | 100 |
| UL74 | 3′ | G2B | TGTTACAACCACC | 38 | 38 | 100 |
| UL74 | 3′ | G3 | TGCTACCACCACT | 66 | 66 | 100 |
| UL74 | 3′ | G4 | TCCTATTGTCCCA | 19 | 19 | 100 |
| UL74 | 3′ | G5 | TGCTACCGCTGCT | 38 | 38 | 100 |
ᵃ 5′, toward the 5′ end of the protein-coding region; C, in the central part of the protein-coding region; 3′, toward the 3′ end of the protein-coding region. In reference strain Merlin, the UL73 5′ motif is located at 104–117 nucleotides (nt) in UL73 (408 nt; G4D), and the UL74 5′, C, and 3′ motifs are located at 206–217, 443–454, and 906–918 nt, respectively, in UL74 (1419 nt; G5).
ᵇ UL73 and UL74 are transcribed rightward and leftward, respectively, in the human cytomegalovirus genome; the sequences are presented 5′-3′ in relation to the direction of transcription. International Union of Pure and Applied Chemistry (IUPAC) nucleotide codes are used.
ᶜ The total number of sequences in the 243 genome set plus single-gene sequences, followed by the number and percentage of these sequences possessing the motif; one UL74 intergenic recombinant (BE/23/2010) was excluded. This provides a measure of motif sensitivity.
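The motif sensitivity reported in the table (occurrences divided by sequences) can be illustrated with a short sketch. This is an assumption-laden example rather than the authors' pipeline: `motif_to_regex` and `motif_sensitivity` are hypothetical helper names, the toy sequences are invented for illustration, and only one strand is searched.

```python
import re

# Standard IUPAC nucleotide codes expressed as regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def motif_to_regex(motif: str) -> re.Pattern:
    """Compile a motif such as 'AAACGACWATTT' into a regex honouring IUPAC codes."""
    return re.compile("".join(IUPAC[base] for base in motif.upper()))


def motif_sensitivity(motif: str, sequences: list[str]) -> tuple[int, float]:
    """Return (number of sequences containing the motif, percentage of sequences)."""
    pattern = motif_to_regex(motif)
    hits = sum(1 for seq in sequences if pattern.search(seq.upper()))
    return hits, 100.0 * hits / len(sequences)


if __name__ == "__main__":
    # Hypothetical toy sequences; the real reference sets are not reproduced here.
    toy_sequences = [
        "GGTCCTGTGGCACGATT",  # matches the UL74 C motif for G4 with Y = C
        "TCCTGTGGTACGAAAC",   # matches with Y = T
        "TCCTGTGGAACGAAAC",   # no match: A is not covered by Y
    ]
    hits, percent = motif_sensitivity("TCCTGTGGYACGA", toy_sequences)
    print(hits, round(percent))
```

Because the motifs are given in the direction of transcription and UL74 is transcribed leftward, a search against genome-orientation sequences or unoriented reads would also need to test the reverse complement; that step is left out of this sketch.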
Figure 1. Unrooted phylogenetic trees for UL73 and UL74 based on amino acid sequences derived from 243 genome sequences, and a summary of genotypic linkages and frequencies. The site coverage cutoff value was 95%, leaving 134 sites in the UL73 tree (log likelihood, –1117.87) and 435 sites in the UL74 tree (log likelihood, –4153.86). Branch point robustness was inferred from 100 bootstrap replicates, and values of <70% are denoted by filled circles. Genotype branches are collapsed, and the numbers of substitutions per site are shown by the scale. The UL73 sequence of strain HAN and the UL74 sequence of strain BE/23/2010 did not fall into the genotypes. The linkages between UL73 and UL74 genotypes are listed, followed by the frequencies of UL73 genotypes in the 243 genome sequences plus 383 single-gene sequences (243plus; 626 in total; Table 2), and the frequencies of the deduced linkages in the samples (milk; 26 in total; Table 3). The frequency of each genotype in the milk set was not significantly different (P > .05) from that in the 243plus set, as determined by random subsampling analysis (10 000 samplings of 26 genotypes from the set of 626). *Although no examples of this linkage were present in the datasets at levels in excess of the thresholds, at least one patient (258) was infected at subthreshold levels by a relevant strain (Supplementary Table 1).
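The random subsampling analysis mentioned in the Figure 1 legend (10 000 samplings of 26 genotypes from the set of 626) could be sketched as follows. The reference counts are the UL73 "Sequences, No." values from the short-motif table, which sum to 626. The test statistic is not described in the legend, so the two-sided comparison of deviations from the expected count used here is an assumption, `subsampling_p_value` is a hypothetical name, and the observed milk count in the example is invented.

```python
import random

# UL73 genotype counts in the 243plus set (sum = 626), taken from the motif table.
UL73_REFERENCE_COUNTS = {
    "G1": 121, "G2": 53, "G3A": 104, "G3B": 47,
    "G4A": 114, "G4B": 55, "G4C": 39, "G4D": 93,
}


def subsampling_p_value(reference_counts, genotype, observed,
                        sample_size=26, n_samplings=10_000, seed=0):
    """Empirical two-sided P value for an observed genotype count.

    Draws `n_samplings` random subsamples of `sample_size` genotypes (without
    replacement) from the reference set and reports the fraction in which the
    genotype count deviates from its expectation at least as much as `observed`.
    """
    pool = [g for g, n in reference_counts.items() for _ in range(n)]
    expected = sample_size * reference_counts[genotype] / len(pool)
    observed_deviation = abs(observed - expected)
    rng = random.Random(seed)
    extreme = 0
    for _ in range(n_samplings):
        count = rng.sample(pool, sample_size).count(genotype)
        if abs(count - expected) >= observed_deviation:
            extreme += 1
    return extreme / n_samplings


if __name__ == "__main__":
    # Hypothetical observation: 7 of 26 milk genotypes are G1.
    print(subsampling_p_value(UL73_REFERENCE_COUNTS, "G1", observed=7))
```

Under this scheme a genotype frequency in the milk set would be called significantly different from the reference set only if the returned value fell below .05.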
Figure 2. UL73 and UL74 genotypes in milk samples collected from the left (L) and right (R) breasts of 4 human immunodeficiency virus–infected donors at 16 weeks postpartum (Table 1). The inner and outer rings show the results obtained using short and long motifs, respectively. Short motif 3′ was used for UL74 (Table 2). The color key for genotypes is shown at the foot. Reads that did not meet the inclusion criteria for genotyping are shown as “other.”