The HIV/AIDS epidemic in Russia is growing, with approximately 100,000 people infected annually. Molecular epidemiology can provide insight into the structure and dynamics of the epidemic. However, its applicability in Russia is limited by the weakness of genetic surveillance, as viral genetic data are only available for <1 per cent of cases. Here, we provide a detailed description of the HIV-1 epidemic for one geographic region of Russia, Oryol Oblast, by collecting and sequencing viral samples from about a third of its known HIV-positive population (768 out of 2,157 patients). We identify multiple introductions of HIV-1 into Oryol Oblast, resulting in eighty-two transmission lineages that together comprise 66 per cent of the samples. Most introductions are of subtype A (315/332), the predominant HIV-1 subtype in Russia, followed by CRF63 and subtype B. Bayesian analysis estimates the effective reproduction number Re for subtype A at 2.8 [1.7-4.4], in line with a growing epidemic. The frequency of CRF63 has been growing more rapidly, with the median Re of 11.8 [4.6-28.7], in agreement with recent reports of this variant rising in frequency in some regions of Russia. In contrast to the patterns described previously in European and North American countries, we see no overrepresentation of males in transmission lineages; meanwhile, injecting drug users are overrepresented in transmission lineages. This likely reflects the structure of the HIV-1 epidemic in Russia dominated by heterosexual and, to a smaller extent, people who inject drugs transmission. Samples attributed to men who have sex with men (MSM) transmission are associated with subtype B and are less prevalent than expected from the male-to-female ratio for this subtype, suggesting underreporting of the MSM transmission route. Together, our results provide a high-resolution description of the HIV-1 epidemic in Oryol Oblast, Russia, characterized by frequent interregional transmission, rapid growth of the epidemic, and rapid displacement of subtype A with the recombinant CRF63 variant.
The HIV/AIDS epidemic in Russia is growing, with approximately 100,000 people infected annually. Molecular epidemiology can provide insight into the structure and dynamics of the epidemic. However, its applicability in Russia is limited by the weakness of genetic surveillance, as viral genetic data are only available for <1 per cent of cases. Here, we provide a detailed description of the HIV-1 epidemic for one geographic region of Russia, Oryol Oblast, by collecting and sequencing viral samples from about a third of its known HIV-positive population (768 out of 2,157 patients). We identify multiple introductions of HIV-1 into Oryol Oblast, resulting in eighty-two transmission lineages that together comprise 66 per cent of the samples. Most introductions are of subtype A (315/332), the predominant HIV-1 subtype in Russia, followed by CRF63 and subtype B. Bayesian analysis estimates the effective reproduction number Re for subtype A at 2.8 [1.7-4.4], in line with a growing epidemic. The frequency of CRF63 has been growing more rapidly, with the median Re of 11.8 [4.6-28.7], in agreement with recent reports of this variant rising in frequency in some regions of Russia. In contrast to the patterns described previously in European and North American countries, we see no overrepresentation of males in transmission lineages; meanwhile, injecting drug users are overrepresented in transmission lineages. This likely reflects the structure of the HIV-1 epidemic in Russia dominated by heterosexual and, to a smaller extent, people who inject drugs transmission. Samples attributed to men who have sex with men (MSM) transmission are associated with subtype B and are less prevalent than expected from the male-to-female ratio for this subtype, suggesting underreporting of the MSM transmission route. Together, our results provide a high-resolution description of the HIV-1 epidemic in Oryol Oblast, Russia, characterized by frequent interregional transmission, rapid growth of the epidemic, and rapid displacement of subtype A with the recombinant CRF63 variant.
HIV-1 poses a substantial threat to public health in Russia. In 2019, the estimated prevalence of HIV in Russia was the highest among European countries, and Russia was responsible for the majority of new HIV infections in Europe and Central Asia (Roser and Ritchie 2018). Since its initial introduction in the late 1980s, HIV-1 has been rapidly spreading across Russia due to widespread injecting drug usage (Bobkov et al. 2004) and poor public awareness (The Lancet HIV 2016). In 2019, 97,176 new cases of HIV-1 were registered in Russia (In Russian 2019a). Its prevalence in the same year was estimated at 0.75 per cent on the basis of registered cases (In Russian 2019a); the actual prevalence is likely higher. Currently, the epidemic predominantly develops through heterosexual transmission (HET). In 2019, 63.9 per cent, 33.0 per cent, and 2.2 per cent of all new cases with reported transmission routes could be attributed to heterosexual, injecting drug-associated, and homosexual routes of transmission, respectively (In Russian 2019a). Low antiretroviral therapy (ART) coverage contributes to poor epidemic control. In 2019, only 48.5 per cent of the registered people living with HIV-1 in Russia were receiving therapy (In Russian 2019b).Molecular surveillance can provide insights into pathways and the rate of disease spread in an ongoing epidemic. However, the Russian diversity of HIV-1 remains poorly studied. A comprehensive description of the countrywide epidemic by methods of molecular epidemiology in Russia is hindered by the fact that genetic data on HIV-1 are available for less than 1 per cent of the infected population. Coverage provided by molecular epidemiology studies of HIV-1 in specific regions of Russia is also low (Thomson et al. 2009; Dukhovlinova et al. 2015; Gashnikova et al. 2016; Dmitry et al. 2019; Murzakova et al. 2019).Here, we report a detailed molecular epidemiology analysis of the HIV-1 epidemic in a single region of Russia by sampling a large fraction of its HIV-positive population. We focused on Oryol Oblast (https://en.wikipedia.org/wiki/Oblast), a region with a population of 736,483 in the southwestern part of the Central Federal District of Russia (https://en.wikipedia.org/wiki/Federal_district) (Supplementary Fig. S1). As of 2019, there were 2,157 registered HIV-positive people in Oryol Oblast (Supplementary Fig. S2); we collect and analyse HIV-1 genetic data from 768 patients, thus covering over a third of the registered epidemic. Using a phylogenetic analysis, we infer eighty-two imports of HIV-1 into Oryol Oblast that were further transmitted within this region forming transmission lineages, as well as 250 imports that did not result in observed onward transmission, indicating unhindered spread between regions of Russia. Transmission lineages were enriched in injecting drug users but not in males, reflecting the demographic properties of the epidemic. The epidemic is predominated by subtype A (87 per cent of our dataset) followed by the recombinant CRF63 variant (7 per cent). Using phylodynamic analysis, we show that subtype A is responsible for the moderate growth of the epidemic with Re of 2.8 [1.7–4.4], while CRF63 demonstrates a much higher growth rate with Re of 11.8 [4.6–28.7] and should be closely monitored.
Methods
Data collection and ethics
Patients were enrolled in the study between 1 January 2018 and 30 June 2019. Over this period, 681 blood samples were collected from HIV-infected people living in the Oryol city and the remainder of Oryol Oblast by the local AIDS centre through a routine surveillance programme and through regular check-ups of the registered HIV-infected people. Additionally, we included the 241 samples obtained between 31 March 2014 and 2 November 2019 in the course of a study on drug resistance, for a total of 922 samples from Oryol Oblast. HIV-1 RNA for sequencing was obtained from blood plasma left after the viral load analysis. Demographic, clinical, and epidemiological data for participants were obtained from their medical records. The assumed route of infection was recorded by interviewing the patients. Written informed consent was obtained from all subjects. The study was approved by the Local Ethics Committee of the Central Research Institute of Epidemiology (Protocol 93).
Sequencing
Sequencing was performed between 29 June 2019 and 30 March 2021. The sequence of the pol region covering the protease gene and part of the reverse transcriptase gene was obtained either by Sanger or by next-generation sequencing (NGS). In both cases, RNA was isolated from blood plasma using phenol–chloroform extraction. For Sanger sequencing, the AmpliSens HIV-Resist-Seq (CRIE, Russia) kit for in vitro diagnostics was used according to the manufacturer’s instructions. For NGS, a two-step amplification procedure was used. The first step of amplification was combined with reverse transcription. Amplification was performed according to the following protocol: 45°C—30 min; 95°C—15 min; thirty cycles: 95°C—30 s, 50°C—30 s, 72°C—1 min 30 s; 72°C—5 min. During this stage, a DNA fragment of approximately 1.5 kb was amplified (positions 2,074–3,539 in the reference HIV-1 strain HXB2, GenBank K03455
). The second step of amplification was performed in four independent tubes for each sample. Amplification produced four overlapping DNA amplicons that ranged in size from 427 to 586 bp. This approach made it possible to simplify library preparation and eliminated the need for DNA fragmentation. After purification with Sera-Mag Magnetic Speedbeads (GE Healthcare Biosciences) magnetic particles, the amplified fragments were mixed in equal proportions. After barcoding, NGS was performed on Illumina Miseq machine (Illumina, USA) with the MiSeq Reagent Kit V3 (600 cycles). Totally, out of the 681 samples collected in this study, 562 samples were sequenced using NGS, and the remaining 119 using Sanger technology. The 241 Oryol Oblast samples obtained prior to this study were also sequenced using Sanger technology.
Iterative consensus calling
The diversity of Russian HIV-1 differs significantly from the widely used HXB2 reference, complicating variant calling. Furthermore, the amplification of the pol fragment via four slightly overlapping amplicons prevented the use of de novo assembly tools like IVA (Hunt et al. 2015). To address these issues, we developed a custom consensus calling pipeline (described in Supplementary File 1 and available on https://github.com/noranekonobokkusu/HIV-in-Oryol) and applied it to each of the 562 samples sequenced on the Illumina platform.
Dataset preparation
To obtain data on HIV-1 diversity in Russia beyond Oryol Oblast, we downloaded the 14,365 sequences of HIV-1 collected in Russia from GenBank on 16 August 2021 (‘HIV-1’ AND ‘Russia’ query). Of those, we selected the 8,560 sequences that produced an at least 800-bp hit of blastn against the target pol fragment of at least one reference sequence from HIV Sequence Compendium (2018). We then additionally filtered out the 1,491 sequences that were sampled outside Russia and processed by Russian research groups, thus erroneously matching the ‘Russia’ query, and the twenty-six samples from Oryol Oblast already present in our dataset, leaving us with 7,043 samples. GenBank metadata was parsed using a custom python script for 6,252 samples; for the remaining 791 samples, fuller metadata was provided by our collaborators.Together with the 922 sequences collected in Oryol Oblast, our Russian dataset comprised 7,965 sequences. Among the 922 Oryol samples, we identified ninety-four patients who had more than one sample sequenced. For each of these patients, we marked all samples except the earliest one for exclusion later in the pipeline.
Sequence alignment and processing
Sequences were putatively aligned against the HXB2 reference pol region using mafft (Katoh et al. 2002) (option—auto) and cropped to include only the coding part of the pol fragment (HXB2 coords 2252-3539). Additionally, we filtered out the sequences that either (a) had insertions relative to the HXB2 reference longer than 50 bp or (b) were shorter than 1,100 bp, leaving us with 6,356 sequences. These sequences were further aligned more accurately by the HMM-align algorithm of HIValign (HIValign), allowing for up to ten codons to compensate for a frameshift. From this alignment, we excluded 154 sequences carrying premature stop codons, frameshifts, or more than ten Ns (missing data characters). Sanger sequences were somewhat shorter than sequences produced by NGS; to make sequences comparably informative, we excluded codons with more than 5 per cent of gaps or Ns from the alignment. The resulting alignment of 1,113 sites (positions 2,253–3,365 in the HXB2 reference) contained 6,202 sequences, including 864 samples from Oryol Oblast.
Subtyping and drug resistance mutation annotation
We used the SierraPy client (Stanford HIVDB) to assign HIV subtypes and predict drug resistance mutations (DRMs). A small fraction of samples were assigned to mixed variants (e.g. CRF02+A and A+B). For further analyses, we kept only those samples corresponding to the three genetic variants most abundant in Oryol Oblast: A, B, and CRF63. Each of these three variants showed monophyly on the Russian phylogeny, indicating that they can be identified robustly from a short genomic fragment. Importantly, subtype A in fact corresponds to the Russia-specific sub-subtype A6, which is genetically different from the most widespread sub-subtype A1 (Foley et al. 2016). We kept the broader ‘A’ identifier in the text for consistency with the results produced by the SierraPy subtyping tool that we used.To estimate the contribution of sites associated with drug resistance mutations (DRMs), we additionally defined a set of relevant DRM sites as the twenty-three codons that were reported to carry DRMs in at least 5 per cent of our samples and repeated some of our analyses on an alignment with these sites masked.
Phylogenetic analyses
To validate subtyping and check the locations of samples from repeatedly sequenced patients, we reconstructed a putative Russian-wide phylogeny for the 6,202 samples using Fasttree2 (Price, Dehal, and Arkin 2010) (double-precision release).For the final dataset, we kept only the sequences of the earliest samples for all patients whose samples were sequenced more than once. We also excluded the three pairs of samples coming from the same patient but separated by more than 10 per cent genetic distance (Supplementary Fig. S3), leaving us with 768 Oryol and 5,328 non-Oryol sequences; the flowchart in Supplementary Fig. S4 summarizes the dataset preparation procedure. We then used Fastree2 to reconstruct phylogenies for A, B, and CRF63 variants and analysed them separately using Treetime (Sagulenko, Puller, and Neher 2018) as follows: First, the trees were rerooted to maximize the temporal signal. Next, we reconstructed the ancestral nucleotide states of internal nodes and collapsed the branches that did not carry any mutations, thus converting the tree to non-binary. On the resulting non-binary tree, we then determined the geographical states of the internal nodes using a two-state (Oryol vs. non-Oryol) migration model. Finally, on the basis of sequences with a known year and month of sampling (later referred to as ‘complete sampling dates’), we inferred the dates of internal nodes that were used as estimates for the dates of last common ancestors (LCA) of the inferred transmission lineages (as mentioned later).F
st values were computed in R using the pairwise WCfst function of the hierfstat package based on the alignments and geographic labels of the samples.
Identification of imports
We identified imports by the depth-first search of the largest clades that (1) included at least 80 per cent of Oryol Oblast samples, (2) had a bootstrap support of at least 0.8, and (3) had the LCA that had an at least 80 per cent posterior probability of being an Oryol node. For imports resulting in just a single sequence (Oryol Oblast ‘singletons’), criteria (2) and (3) were ignored. Imports resulting in at least two sequences were denoted as ‘transmission lineages’. To study the dependency of the number of inferred transmission lineages on the number of Oryol and non-Oryol sequences available, we randomly subsampled different numbers of Oryol or non-Oryol samples in 1,000 trials, inferring in each trial the resulting number of imports using criterion (1) above.
Bayesian phylodynamics
In order to infer the epidemiological parameters of the subtype A sub-epidemic, we used a recently developed implementation of a birth–death model in BEAST2 that jointly analyses multiple independent clades and infers a single set of epidemiological parameters shared across all clades (Müller et al. 2021). This implementation is capable of aggregating clades spanning different time intervals and setting any arbitrary time points where epidemiological parameters can change in a piecewise-constant fashion, allowing us to use priors informed by data. As birth–death models require at least one parameter to be fixed or defined in a narrow range, we put a strict prior on sampling proportion. To obtain it, we selected samples that were sequenced soon (in less than a year) after the initial diagnosis which we denote as Dataset I (comprising 228 sequences, including 197 sequences of subtype A). In Dataset I, most samples were sequenced in 2018 and 2019, eight samples were sequenced in 2014–7, and none before 2014. We thus put a four-dimensional prior on the sampling proportion which was allowed to change on the first day of 2014, 2018, and 2019, with the mean equal to the number of samples sequenced in a time interval divided by the number of new diagnoses in this interval based on the reported case counts (irrespective of their subtype; see Supplementary Table S1). We then used these priors on both Datasets I and II composed of all samples (664 sequences of subtype A). In contrast to the sampling proportion, the reproduction number and the rate of becoming uninfectious were assumed to be time-independent and thus unidimensional. We used a relaxed lognormal uncorrelated clock with ulcd.mean and ulcd.std shared across all imports. For the ulcd.mean prior, we used the normal distribution with a mean equal to the median rate inferred in Patiño-Galindo and González-Candelas (2017) for the pol region of subtype A (0.0015 substitutions/site/year) and sigma equal to 0.001. All other priors were kept default. The priors used in the multi-tree birth–death analysis are summarized in Supplementary Table S2. The analysis was run for 250-million steps; we discarded the first 10 per cent steps as burn-in and ensured that all ESS values were higher than 200.We used the same set of priors to infer the epidemiological parameters of the largest clades of variants A and CRF63. CRF63 is a relatively young subtype; we could not find independent estimates of its evolutionary rate in the literature and thus used the estimate produced in Patiño-Galindo and González-Candelas (2017) for its ancestral recombinant variant CRF02 (0.0008 substitutions/site/year). Using a twofold higher value (0.0016, which is close to the estimate produced for the second parent of CRF63, subtype A) as a prior mean did not affect results qualitatively. The parameters were inferred using the BDSKY (Stadler et al. 2013) package of BEAST2 (Bouckaert et al. 2019). The skylinetools package (Louis du Plessis skylinetools) was used to define appropriate time intervals for sampling proportion. Priors used in this analysis are summarized in Supplementary Table S3. The analysis was run for 100-million steps; we discarded the first 10 per cent steps as burn-in.The logistic growth dynamics for the same two clades was inferred in BEAST v1.10.4 (Drummond and Rambaut 2007). Priors are provided in Supplementary Table S4. The analysis was run for 100-million steps; we discarded the first 10 per cent steps as burn-in.Convergence of the produced MCMC trajectories was assessed using Tracer (Rambaut et al. 2018).
EpiEstim
We used the EpiEstim package (Cori et al. 2013) implemented in R to infer the dynamics of Re from the reported incidence data. We constructed the distribution of serial intervals based on the estimates of HIV-1 transmission rates at different stages of infection inferred by Hollingsworth, Anderson, and Fraser (2008). As our incidence data are provided per year, we converted the reported transmission rates to be year-wise. The dynamics of Re was inferred using a 5-year sliding window.
Analysis of transmission lineages
We tested whether the two categories of samples, male gender and people who inject drugs (PWIDs), are overrepresented in transmission lineages compared to singletons. For this purpose, we reshuffled the gender or transmission route labels of Oryol Oblast samples 1,000 times and obtained the distribution of the expected number of samples of the tested category belonging to transmission lineages and the expected number of transmission lineages carrying such samples.To test whether transmission lineages are preferably seeded by PWIDs, we constructed two match-paired datasets. First, we sorted all transmission lineages by the earliest diagnosis in the lineage. Then, in this sorted list, we marked lineages as PWID- and HET-founded based on the earliest sample and selected time-matched pairs of PWID- and HET-founded lineages such that in a consecutive series of lineages with the same founder type, just the single earliest lineage was taken; this ensured that PWID-founded lineages could not be overrepresented due to the population structure being mostly comprised of PWIDs in the 1990s. We compared the time between the date of the LCA and the date of the first diagnosis in a lineage using the paired Wilcoxon test.Second, we sorted all lineages by the median date of diagnosis, selected the ones with both HET and PWID samples, and compared the dates of diagnosis of the earliest HET in odd lineages and the earliest PWID in even lineages, again using the paired Wilcoxon test.
Results
The Oryol epidemic is largely constituted by subtype A of HIV-1
We sequenced the pol region fragment of 858 HIV-1 samples obtained from 768 unique patients in Oryol Oblast (‘Oryol dataset’). This dataset covers more than a third of the registered HIV-positive population in Oryol Oblast and represents an unbiased and well-annotated dataset. Subtype composition in the Oryol dataset was more homogeneous compared to the non-Oryol Russian dataset represented by GenBank samples (‘non-Oryol dataset’; 5,328 sequences) (Fig. 1). Subtype A was the dominant subtype in Oryol Oblast (87.2 per cent), similar to the rest of Russia (Bobkov et al. 2004); it was followed by CRF63 (7.20 per cent) and subtype B (2.49 per cent). The transmission route was reported for 96 per cent of the samples in the Oryol dataset, compared to less than 50 per cent in the non-Oryol dataset; in both datasets, HET was the most prevalent, followed by transmission associated with PWID and men who have sex with men (MSM), although the fraction of PWID and MSM samples was higher in the non-Oryol samples. The male-to-female ratio was the same in both datasets (1.30), although sex has been reported for only 62 per cent of non-Oryol samples. The fractions of sexes and reported transmission routes in the Oryol dataset were representative of those in the Oryol Oblast as a whole as reported by the Oryol AIDS centre (Supplementary Fig. S2).
Figure 1.
Summary of the Oryol and non-Oryol datasets. The plots show the distribution of samples across subtypes (A), transmission routes (B), sexes (C), and sampling years (D). The seven subtypes most frequent in Russia are shown in (A). ‘Other’ in (B) corresponds to the reported transmission routes that could not be attributed to HET, PWID, or MSM (e.g. mother-to-child transmission). When several sequences were available per patient, the earliest sequence was used. Only sequences with complete sampling dates are shown in (D). The line in (D) shows the number of HIV cases registered per year in Oryol Oblast.
Summary of the Oryol and non-Oryol datasets. The plots show the distribution of samples across subtypes (A), transmission routes (B), sexes (C), and sampling years (D). The seven subtypes most frequent in Russia are shown in (A). ‘Other’ in (B) corresponds to the reported transmission routes that could not be attributed to HET, PWID, or MSM (e.g. mother-to-child transmission). When several sequences were available per patient, the earliest sequence was used. Only sequences with complete sampling dates are shown in (D). The line in (D) shows the number of HIV cases registered per year in Oryol Oblast.The phylogenetic tree reconstructed for the combined Oryol and non-Oryol dataset had separate clades corresponding to the three most abundant variants—subtypes A and B and variant CRF63, indicating that subtyping was mostly unambiguous (Fig. 2). In subsequent analysis, we focused on these three variants, covering a total of 739 Oryol Oblast samples from distinct patients.
Figure 2.
Phylogeny of the combined Russian HIV-1 dataset. When more than one sample per patient was available, only the earliest sample was used. Yellow dots mark Oryol Oblast samples. The two largest clades analysed separately, A and CRF63, are indicated with arrows. See Supplementary Fig. S3 for the phylogeny including all samples from repeatedly sequenced patients.
Phylogeny of the combined Russian HIV-1 dataset. When more than one sample per patient was available, only the earliest sample was used. Yellow dots mark Oryol Oblast samples. The two largest clades analysed separately, A and CRF63, are indicated with arrows. See Supplementary Fig. S3 for the phylogeny including all samples from repeatedly sequenced patients.
HIV-1 has been imported into Oryol Oblast hundreds of times
To understand the interregional transmission routes of HIV-1, we first reconstructed separate Russia-wide phylogenetic trees for subtypes A and B and CRF63. Overall, the Oryol dataset samples of each of the three variants were rather scattered across the Russian phylogeny (Fst = 0.01, 0.03, and 0.40 for variants A, B, and CRF63, respectively), supporting intensive transmission between regions. Still, many Oryol samples clustered on the phylogeny with other Oryol samples, indicating that many of the infections occurred within the region (Fig. 2).Using ancestral state reconstruction, we identified introductions of HIV-1 into Oryol Oblast as described in Section 2. We attributed 489 of the 739 Oryol samples to a total of eighty-two imports each resulting in one or more inferred transmissions within Oryol Oblast (‘Oryol transmission lineages’). The remaining 250 sequences each resulted from its own import (‘singletons’) for a total of 332 imports (Fig. 3). СRF63 was the most homogeneous variant with 94.5 per cent (52/55) of the samples resulting from a single import, in agreement with its substantially higher Fst compared to subtypes A and B. We found seven non-Oryol samples descendant from Oryol transmission lineages, indicating exports. Notably, the inferred numbers of imports and exports cannot be compared directly: in particular, we likely undercount exports as the sampled diversity outside the region is low.
Figure 3.
The number of Oryol sequences per import. The inset plot is a zoom-in of the imports of subtype A of size 5 and more.
The number of Oryol sequences per import. The inset plot is a zoom-in of the imports of subtype A of size 5 and more.The number of imports is robust to the number of non-Oryol sequences available, already reaching a plateau when a comparable number of Oryol and non-Oryol sequences is used (Supplementary Fig. S5A). This means that we estimate the minimal number of imports resulting in the sampled Oryol diversity robustly (at ∼332). Conversely, the number of imports depends strongly on the number of Oryol samples available, indicating that additional imports will likely be uncovered as more Oryol samples are obtained (Supplementary Fig. S5B).
Early imports disproportionally contributed to the epidemic in the region
To better understand the dynamics of transmission lineages, we dated the LCA of each lineage in subtype A and CRF63 as described in Section 2 (Fig. 4). The reconstructed LCAs for individual lineages dated between 1996 and 2018, indicating that the genetic diversity within the currently sampled transmission lineages has accumulated over the decades. Lineages with earlier LCAs tended to have earlier dates of diagnosis of the earliest patients in the lineage (. S6), validating the inferred LCA dates. In fifty-five out of eighty-two lineages, the first positive immunoblot of the lineage was obtained after this lineage was established based on the LCA date estimate, indicating that this lineage has already spread prior to detection. In sixteen cases, a single positive immunoblot predated the estimated LCA, suggesting transmission of a non-basal variant from a diagnosed individual. Finally, in the remaining eleven cases, two or more of the earliest diagnoses in a transmission lineage had earlier dates than the reconstructed LCA of this lineage; these cases can be explained by statistical uncertainty in LCA dating and/or by errors in LCA dating due to variation in substitution rates between branches, e.g. due to changes in the ART status (see Section 4). On average, the first positive immunoblot for a lineage was obtained 0.91 years after this lineage was established based on the LCA date estimate (. S7).
Figure 4.
Temporal dynamics of Oryol HIV-1 lineages. Each horizontal line corresponds to a transmission lineage (with ‘NODE’ prefix) or a singleton (with ‘leaf’ prefix). Individual patients are shown as empty diamonds at the date of diagnosis, with the colour indicating the reported transmission route; only samples with complete dates of diagnosis are shown. Lineages are sorted by the earliest date of diagnosis in a lineage; samples with unknown diagnosis dates, and lineages consisting exclusively of such samples, are not shown. Cyan asterisks show the lineage LCA dates estimated by Treetime. For subtype A, purple asterisks and lines show the lineage LCA dates and 95 per cent HPDs estimated by the multi-tree birth–death analysis. For subtype A, transmission lineages with at least four Oryol samples are shown. See . S9 for all subtype A lineages and singletons. Histograms show the distribution of transmission routes over time. The density plot (for subtype A) shows the distribution of LCA estimates produced by Treetime and BEAST2.
Temporal dynamics of Oryol HIV-1 lineages. Each horizontal line corresponds to a transmission lineage (with ‘NODE’ prefix) or a singleton (with ‘leaf’ prefix). Individual patients are shown as empty diamonds at the date of diagnosis, with the colour indicating the reported transmission route; only samples with complete dates of diagnosis are shown. Lineages are sorted by the earliest date of diagnosis in a lineage; samples with unknown diagnosis dates, and lineages consisting exclusively of such samples, are not shown. Cyan asterisks show the lineage LCA dates estimated by Treetime. For subtype A, purple asterisks and lines show the lineage LCA dates and 95 per cent HPDs estimated by the multi-tree birth–death analysis. For subtype A, transmission lineages with at least four Oryol samples are shown. See . S9 for all subtype A lineages and singletons. Histograms show the distribution of transmission routes over time. The density plot (for subtype A) shows the distribution of LCA estimates produced by Treetime and BEAST2.The lineages that were established early were also larger (linear regression P-value for the LCA date is 10–4; Fig. 6A), resulting in a disproportionate number of infections. Indeed, 50 per cent of the lineages established before May 2010 were responsible for 70 per cent of the observed cases (. S8).
Figure 6.
The comparison of the largest lineages of subtype A and CRF63. (A) Lineages with earlier LCAs tend to carry more Oryol samples. The regression line is shown for subtype A; the regression analysis for all samples also shows a negative dependency (the P-value of Pearson’s regression is 10−4 for all lineages and 3.5 × 10−8 for subtype A lineages). (B) Maximum-likelihood phylogenies of the largest lineages of subtypes A and CRF63. The colour indicates the reported transmission route, and the shape reflects the presence of ART at some point during the infection or its absence throughout. The first and the second dates for each sample correspond to the date of sampling and the date of diagnosis, respectively.
Epidemiological parameters of the subtype A sub-epidemic
Since subtype A represents 87 per cent of all HIV-1 cases in Oryol Oblast in our dataset, we focused on this subtype to estimate the dynamics of the HIV-1 epidemic in this region. We used BEAST2 to infer epidemiological parameters of the subtype A sub-epidemic by simultaneously analysing all transmission lineages and singletons of subtype A. The recently developed multi-tree implementation of BEAST2 (Müller et al. 2021) allows treating separate lineages as realizations of the same epidemiological process whose parameters can be jointly inferred. We modelled the subtype A sub-epidemic as a birth–death process with a constant time-independent reproductive number, constant rate of becoming noninfectious, and time-dependent multidimensional sampling proportion upon which we put a strong prior (see Section 2). The prior on the sampling proportion was estimated based on a set of samples that were sequenced less than a year after the initial diagnosis. We used this prior both for the dataset comprising only early infections (Dataset I) and for the full set of samples (Dataset II); the results did not differ drastically (Fig. 5, left vs. right column). The median effective reproductive number (Re) was 2.8 and 2.6 for Datasets I and II, respectively; the corresponding rates of becoming uninfectious were 0.37 and 0.20, implying the total duration of infection of 2.7 and 5 years, respectively. The 95 per cent HPD (highest posterior density) for Re inferred from the full Dataset II is strictly higher than 2, implying a growing epidemic. Re estimated from the reported yearly incidence by EpiEstim is also above 1, although it is lower than that obtained using the birth–death model (Supplementary Fig. S10).
Figure 5.
Epidemiological parameters inferred for the subtype A sub-epidemic. Dataset I, samples collected within a year of HIV-1 diagnosis (197 sequences); Dataset II, all samples (664 sequences). For reproduction number and becoming uninfectious rate, histograms are provided. For multi-dimensional sampling proportion, medians (solid line) and 50 per cent and 95 per cent HPDs (dark grey and light grey, respectively) are shown.
Epidemiological parameters inferred for the subtype A sub-epidemic. Dataset I, samples collected within a year of HIV-1 diagnosis (197 sequences); Dataset II, all samples (664 sequences). For reproduction number and becoming uninfectious rate, histograms are provided. For multi-dimensional sampling proportion, medians (solid line) and 50 per cent and 95 per cent HPDs (dark grey and light grey, respectively) are shown.
Bayesian phylogenetic analysis indicates the rapid growth of the CRF63 lineage in Oryol Oblast
The major CRF63 lineage (NODE435 in Fig. 4) is largely transmitted through PWID (PWID is the reported transmission route for 66 per cent of the cases, compared to 31 per cent of the epidemic as a whole; Fig. 6B). This lineage is unexpectedly large for its age (Fig. 6A), indicating rapid spread through the PWID population. Indeed, in less than 10 years, it has reached the same size as the largest lineage of subtype A (NODE690 in Fig. 4), which has circulated for at least two decades. Consistently, the phylogeny of the CRF63 samples is characterized by a more recent LCA and a higher fraction of multiple merger events compared to the phylogeny of the subtype A samples obtained at similar times (Fig. 6B), suggesting a more rapid spread of the CRF63 lineage. To test whether the two lineages indeed differ in their dynamics, we used two approaches. First, we utilized the coalescent approach to infer the growth rate of both lineages assuming logistic growth (Fig. 7A; Supplementary Fig. S11). Second, we used the BDSKY model to directly infer epidemiological parameters, i.e. the reproduction number and the rate of becoming uninfectious (Fig. 7B). Importantly, for the largest clade of subtype A, BDSKY produced Re estimates similar to those of a multi-tree BEAST2 implementation for multiple introductions of A (Fig. 5), indicating the robustness of Re estimates. Both the coalescent and the BDSKY models suggest a higher growth rate of the CRF63 lineage compared to the largest lineage of subtype A (Fig. 7).
Figure 7.
Phylodynamic inferences for the two transmission lineages shown in Fig. 6. (A) 95 per cent HPD of the growth rate parameter in the coalescent logistic growth model. (B) 95 per cent HPD of the reproductive number and the rate of becoming uninfectious inferred by the BDSKY model.
The comparison of the largest lineages of subtype A and CRF63. (A) Lineages with earlier LCAs tend to carry more Oryol samples. The regression line is shown for subtype A; the regression analysis for all samples also shows a negative dependency (the P-value of Pearson’s regression is 10−4 for all lineages and 3.5 × 10−8 for subtype A lineages). (B) Maximum-likelihood phylogenies of the largest lineages of subtypes A and CRF63. The colour indicates the reported transmission route, and the shape reflects the presence of ART at some point during the infection or its absence throughout. The first and the second dates for each sample correspond to the date of sampling and the date of diagnosis, respectively.Phylodynamic inferences for the two transmission lineages shown in Fig. 6. (A) 95 per cent HPD of the growth rate parameter in the coalescent logistic growth model. (B) 95 per cent HPD of the reproductive number and the rate of becoming uninfectious inferred by the BDSKY model.
No evidence for a preferred mechanism of transmission at the origin of lineages
In the late 1990s to the early 2000s, most newly reported HIV-1 infections were among PWIDs (Supplementary Fig. S12). Consistently, we find that most of the early lineages were first sampled in PWIDs (Supplementary Fig. S13), suggesting that they were founded by them. However, we see no evidence that PWIDs were more likely to be the originators of lineages when we control for the dates of lineage origins. Indeed, in the controlled matched-pair datasets, there is no difference in time from the LCA to the earliest diagnosis between PWID- and HET- founded clusters, nor is there a preference for the transmission route of the earliest sample in the lineage (Supplementary Fig. 14). This means that the high prevalence of PWIDs in lineage origins reflects the temporal shift in the outbreak composition in our data rather than higher inherent spreading by PWIDs.
Distribution of gender and transmission route categories across import lineages
We next studied whether transmission lineages in subtype A are associated with transmission route and/or sex categories by comparing lineages and singletons (see Section 2). We did not observe any overrepresentation of males within lineages (Fig. 8, left column), in contrast to results obtained previously for other countries (World Bank Publications 2011; Lubelchek et al. 2015; Dennis et al. 2018; Paraskevis et al. 2019). However, PWIDs were overrepresented within lineages, with more samples belonging to lineages than expected by chance (Fig. 8, right column). Furthermore, there were fewer lineages carrying PWIDs than expected randomly (P = 0.008); in other words, a PWID was more likely to fall into a lineage with another PWID. No such difference was observed for males (P = 0.230).
Figure 8.
PWIDs, but not males, are overrepresented within lineages. The histogram shows the expected distribution of the number of sequences belonging to lineages and the number of lineages carrying males (left) or PWIDs (right) within subtype A in 1,000 permutations of subtype A samples with reshuffled sex/transmission route labels. Vertical lines correspond to observed values, accompanied by P-values. For variants B and CRF63, see Supplementary Fig. S15.
PWIDs, but not males, are overrepresented within lineages. The histogram shows the expected distribution of the number of sequences belonging to lineages and the number of lineages carrying males (left) or PWIDs (right) within subtype A in 1,000 permutations of subtype A samples with reshuffled sex/transmission route labels. Vertical lines correspond to observed values, accompanied by P-values. For variants B and CRF63, see Supplementary Fig. S15.
MSM transmission route appears to be underreported in Oryol Oblast
The lack of overrepresentation of males in transmission lineages is perhaps unsurprising given the heterosexual nature of the Russian HIV-1 epidemic. Still, we might expect an excess of MSM reported transmission in subtype B, which is mainly associated with the MSM transmission route worldwide (Beyrer et al. 2012) and in Russia (Kazennova et al. 2017; Lebedev et al. 2019). Indeed, the male-to-female ratio in subtype B is much higher than that in subtype A where we do not expect an MSM-associated bias (16/3 vs. 369/296, Fisher’s exact test, P = 0.0169), and subtype B carries five out of six MSM-associated samples in our dataset (Fig. 9).
Figure 9.
Phylogenetic tree of subtype B in Oryol Oblast. Sampling dates/diagnosis dates are shown at tips.
Phylogenetic tree of subtype B in Oryol Oblast. Sampling dates/diagnosis dates are shown at tips.Is MSM transmission adequately reported in our data? Underreporting of this transmission route can be estimated from the sex ratio among those samples for which other transmission routes are reported. Among the fourteen subtype B samples with the non-MSM-reported transmission route, only three came from females. Based on the sex ratio in subtype A, the number of males expected from this number of females is ∼4. In fact, however, eleven males are observed. While the difference from the expected sex ratio is not statistically significant (Fisher’s exact test, P = 0.1054), if confirmed, such an excess of males would correspond to a ∼2.4-fold underreporting of the MSM transmission route.
Discussion
Since its beginning in the 1980s, the HIV-1 epidemic in Russia has been growing for multiple reasons including poor public awareness, stigmatization of the key risk groups, and insufficient funding (Balabanova et al., 2006; Tkatchenko-Schmidt et al. 2008; Katz et al. 2013; Kelly et al. 2014). Molecular epidemiology is informative of the characteristics of the epidemic, such as its genetic composition and reproductive number; it can also shed light on details of individual outbreaks, e.g. by identifying an increased transmission risk among a certain group or revealing a rapidly growing transmission cluster. However, molecular epidemiology methods work poorly when sampling is low, which is the case for the Russian epidemic where less than 1 per cent of reported cases is accompanied by genetic data. Moreover, HIV-1 genetic data available in GenBank are partially obtained through target studies focusing on specific transmission routes (Masharsky et al. 2010; Dukhovlinova et al. 2015, 2018; Kazennova et al. 2017) or outbreaks (Bobkov et al. 1994), making it unrepresentative of the epidemic as a whole.Unfortunately, given the magnitude of the HIV-1 epidemic in Russia, it is infeasible to rapidly achieve sufficient genetic coverage for the entire country. Here, we instead considered a single geographic region of Russia. We focused on Oryol Oblast, a relatively small region of Russia with a total population of 736,483 (In Russian. Витрина статистических данных;) and a registered HIV-positive population of 2,157 (Supplementary Figs S1 and S2) as of 2019. The registered HIV-1 incidence is lower than that in Russia as a whole (0.29 per cent vs. 0.73 per cent). The relatively small size of the HIV-positive population allowed us to attain representative coverage of the local epidemic.The dataset analysed in this work comprises sequences collected from 768 patients, or 36 per cent of the registered HIV-positive population. Compared to non-Oryol Russian sequences available from GenBank, the Oryol dataset is unbiased, better annotated, and more up-to-date (Fig. 1). It also agrees well with the official sequence-independent statistics reported by the Oryol Regional Center for AIDS (Supplementary Fig. S2). The fraction of PWIDs in our dataset is slightly below that in the official statistics (31.2 per cent vs. 37.0 per cent), which might be explained by lower adherence to AIDS centre visits and/or lower survival among the members of this risk group (May et al. 2015). In line with the official statistics, our dataset also captures the change in the predominant transmission route from PWID to HET in the 2000s, followed by an abrupt increase in the fraction of PWIDs since 2014 when a novel designer drug became widespread (In Russian. Increased Incidence of HIV-1 among IDUs in Oryol Oblast) (Supplementary Figs S2 and S12). The male-to-female ratio is also similar to that in the official statistics (1.7 vs. 1.5) and is close to the countrywide ratio of 1.6 (as of 31 December 2019; In Russian 2019a).The subtype composition in Oryol Oblast is more homogeneous compared to non-Oryol GenBank (Fig. 1A; Supplementary Fig. S16) due to interregional differences and/or targeting of specific subtypes in previous studies. Still, both in Oryol Oblast and in Russia in general, the most abundant subtype is A which has historically dominated the HIV-1 epidemic in Russia (Bobkov et al. 2004).For the three most abundant variants (A, B, and CRF63), we used the reconstructed maximum-likelihood phylogenetic trees to infer imports into Oryol Oblast. Our approach is based on the assumption that all imports are of Russian origin. While some imports could also directly come from other countries, such cases are probably rare as transborder travel is expected to be much less intense than within-country travel.The inferred number of transmission lineages was robust to the number of non-Oryol sequences used, meaning that we have successfully resolved the sequenced genetic diversity of HIV-1 in Oryol Oblast (Supplementary Fig. S5A). However, this number was strongly dependent on the amount of Oryol samples available: it did not saturate as we increased the number of Oryol samples in the analysis (Supplementary Fig. S5B). This implies that the diversity of HIV-1 in Oryol Oblast is higher than can be captured by sequencing of a third of the available population, and/or that there is a constant import of novel HIV-1 variants into the region.Lineages that were established early contained more samples, implying that early imports into the region significantly contributed to the current epidemic. Early imports were mostly associated with PWIDs; however, we did not find evidence for preferred seeding of lineages by PWIDs (Supplementary Fig. S14), suggesting that the prevalence of different transmission routes was shaped by social factors rather than the biology of transmission.Nearly two-thirds of all samples were attributed to transmission lineages. The remaining ‘singleton’ sequences did not result in observable transmission within Oryol. A fraction of singletons could actually correspond to non-Oryol residents. Indeed, in Russia, the HIV data collected by an AIDS centre are household-based, meaning that patients are assigned to centres on the basis of their household registration. The attribution of sequences comprising Oryol transmission lineages as actually coming from Oryol Oblast residents is probably more reliable.Using phylodynamic approaches, we characterized the epidemiological parameters of HIV-1 spread. These analyses have several limitations. First, we assumed that the evolutionary rate is independent of the age of the infection and the presence of therapy. Differences in infection duration and therapy between patients could have affected the estimated LCA dates. For example, the rate of evolution can be reduced by ART (Drummond, Forsberg, and Rodrigo 2001), pushing the LCA estimates to the present. Such biases could lead to some unexpected LCA datings that we observe. For example, in several instances (e.g. NODE619 and NODE2359 in subtype A or NODE184 in subtype B), two or more of the earliest diagnoses in a transmission lineage had earlier dates than the reconstructed LCA of this lineage, which is impossible as LCA by definition must correspond to the earliest transmission between samples of the lineage. Such discrepancy could have been caused by differences in the ART status between sampled individuals; indeed, while ∼40 per cent of our dataset was on therapy, one of the two earliest samples in both NODE619 and NODE2359 lineages (of subtype A) and both patients in the NODE184 lineage (of subtype B) receive(d) therapy (Fig. 4). In theory, it may be possible to account for differences between patients in viral evolution rates, e.g. by inferring two distinct evolutionary rates, but the amount of variation in duration and adherence to therapy would be hard to account for.Second, birth–death models are unidentifiable unless at least one of the parameters is fixed or strongly constrained. We put a strict prior on the sampling proportion, defined as the fraction of all infections being sampled. Sampling density was strongly non-uniform across years, with the vast majority of samples collected over just 2 years (2018–9); however, many of these patients became infected years ago. We thus constructed a four-dimensional prior on sampling proportion based on the subset of ‘rapidly sequenced’ samples (Dataset I, see Section 2) and used it for datasets that also included old infections and for which we could not make an informative assumption about sampling. While the parameters estimated from different datasets were similar (Fig. 5), those estimated on the basis of Dataset I should probably be considered more reliable.Third, birth–death models assume that sampling of an infection results in its ‘death’ due to host recovery or change of behaviour. Unfortunately, this assumption usually does not hold for long-term infectious diseases such as HIV-1, especially in countries like Russia where treatment is available to less than 50 per cent of HIV-positive people (In Russian 2019b). This may make our inferences about the reproduction number and the rate of becoming uninfectious under- and overestimated, respectively.Fourth, the obtained phylodynamic estimates are relevant for the identified part of the Oryol Oblast epidemic. The unidentified part of the epidemic, i.e. associated with non-registered cases (which may comprise up to 20 per cent in Russia; Continuum of HIV Care - Monitoring Implementation of the Dublin Declaration - 2018 Progress Report 2019), can grow more rapidly due to the lack of awareness of the HIV-1 diagnosis (Funk et al. 2009) and a definite absence of therapy among these people.Fifth, our analyses in the main text use all sites, including those of DRMs. Changes at such sites are frequently recurrent between patients and therefore may obscure phylogenetic analyses. Although DRMs were reported not to bias the composition of transmission lineages (Hué et al. 2004), they were shown to affect some analyses, e.g. the exact reconstructed history of transmissions (Lemey et al. 2005). To address this, we repeated the maximum-likelihood tree reconstruction, the definition of transmission lineages (Supplementary Fig. S17), the birth–death analysis on subtype A (Supplementary Table S5), and the comparison of the two largest clades in variants A and CRF63 analyses (Supplementary Fig. S18) on DRM-masked alignments. We found that the inclusion of DRM sites did not affect our conclusions.With these limitations in mind, we assessed the characteristics of the epidemic. As subtype A currently prevails in Oryol Oblast, epidemiological parameters of its sub-epidemic should tone the dynamics of HIV-1 in the region. We jointly inferred Re and the rate of becoming uninfectious for all imports of subtype A. The estimates produced from both Datasets I and II covering the subtype A sub-epidemic consistently inferred Re above 1 (median values are 2.8 and 2.6; Fig. 5), implying a growing epidemic. For the above-mentioned reasons, these values should be considered as lower boundaries; the actual growth rate is probably higher. The SIR (susceptible-infected-recovered) model-based R0 estimate obtained for Russia suggests a somewhat higher reproduction number (4.0–5.8) (Sokolov and Sokolova 2020), in agreement with the ART coverage in Oryol Oblast being higher than in the country overall (Supplementary Fig. S1B). The rate of becoming uninfectious is estimated as 0.37 and 0.20 for Datasets I and II, which is equivalent to the total duration of infection of 2.7 and 5 years, respectively; longer infections in Dataset II probably reflect the fact that this dataset contains a higher fraction of older infections with poorer access to ART compared to Dataset I.Independently, we inferred the Re dynamics from case count data using EpiEstim (Supplementary Fig. S10). These estimates of Re were lower, compared to the birth–death model, although still strictly higher than 1. There can be at least two reasons for this discrepancy. First, reliable incidence data are only available starting from 2000, and the sliding window approach implemented in EpiEstim does not allow us to obtain Re estimates for years before 2005. If Re in the 1990s and the early 2000s was in fact higher than that later in the epidemic, the EpiEstim estimate would not be reflective of the epidemic as a whole and would be an underestimate. Second, Re can be biased by heterogeneity in the sampling procedure; for instance, if the epidemic shifts to the heterosexual population but infected heterosexuals are diagnosed more slowly, the count-based Re estimate would be lowered.Besides subtype A, we separately studied CRF63 that recently evolved in the Siberian part of Russia and has been spreading across the regions of Russia since then (Baryshev, Bogachev, and Gashnikova 2014). We show that in Oryol Oblast, it is mostly represented by a single introduction event that resulted in fifty-two identified infections. This transmission lineage was unexpectedly large for its age, motivating us to compare epidemiological dynamics of subtype A and CRF63. We compared the two largest clades of these variants using two phylodynamic approaches. First, we inferred the growth rate of both clades assuming logistic growth; second, we inferred the reproduction number and the rate of becoming uninfectious using a birth–death model and a sampling prior used for multi-tree subtype A analysis. Both analyses indicate a higher growth rate of the CRF63 clade.Beyond Oryol Oblast, CRF63 has been rapidly expanding in several Russian regions in recent years and has become the dominant variant among new infections in some of them (Gashnikova et al. 2015, 2017; Rudometova et al. 2021). The differences in the rate of spread of A and CRF63 could result from biological and/or epidemiological differences between these two variants. CRF63 has originated from recombination between CRF02 and the major Russian strain A6 (Baryshev, Bogachev, and Gashnikova 2014; Shcherbakova et al. 2014), and its success could be due to the advantages of CRF02 (Fischetti et al. 2004; Palm et al. 2013, but see Easterbrook et al. 2010). Indeed, while the biological properties of CRF63 have not yet been studied extensively, this strain has been reported to be antigenically distinct from A6 (In Russian. Study of Properties of Serum Neutralizing HIV-Infected Patients with Non-progressors Diseases of HIV Isolates of Different Genetic Variants). Further experimental studies are required to test the possible biological differences between CRF63 and A6.Nevertheless, the rapid growth of CRF63 is probably more likely to result from its characteristic epidemiology. First and foremost, the CRF63 clade carries a much higher fraction of PWIDs. The activity aimed at limiting transmission among PWIDs in Oryol/Russia is mainly focused on education and rehabilitation; substitution therapy programmes are not implemented (Idrisov et al. 2017; Meylakhs, Aasland, and Grønningsæter 2017; Cepeda et al. 2018). The spread of the CRF63 clade in the late 2010s (Fig. 4) agrees with the increase in the fraction of infections in PWIDs among all infections at this period (Fig. S12). Better control of new infections in PWIDs could have curbed the spread of this clade. Second, a smaller fraction of CRF63-infected patients were reported to receive therapy (16 per cent vs. 40 per cent in our dataset; Fig. 6). Both these factors facilitate transmission. Third, as CRF63 was introduced into Oryol Oblast much later than subtype A, infections associated with it are expected to be younger even when controlling for the year of diagnosis, and younger infections are associated with higher transmission (Hollingsworth, Anderson, and Fraser 2008). Overall, the rapidly growing CRF63 transmission cluster in Oryol Oblast merits close monitoring.Outside CRF63, the distribution of transmission routes is also informative about the structure of the epidemic. Within the predominant subtype A, we did not observe excessive clustering among males such as has been reported in many countries where the MSM-associated subtype B is prevalent (World Bank Publications 2011; Lubelchek et al. 2015; Dennis et al. 2018; Paraskevis et al. 2019). This might be explained by a different structure of the HIV-1 epidemic in Russia. Compared to those areas where most new diagnoses come from MSM contacts resulting in a majority of the HIV-positive population being male, HIV-1 in Russia heavily affects the general population through HET contacts as well as the PWIDs, making the male-to-female ratio less biased (Fig. 1; In Russian 2019a).By contrast, we do observe a clustering of PWIDs in transmission lineages of subtype A. Part of this effect could arise from a possible fraction of non-Oryol residents outside transmission lineages (as mentioned earlier), e.g. if non-Oryol residents are less likely to be PWIDs. However, we also observe that PWIDs are more likely to co-occur within the same transmission lineages, and this bias cannot have a purely geographic nature. Therefore, the observed PWID clustering probably results from increased transmission within enclosed communities of drug users, although this partially may also come from PWID being the major transmission route in the early years of the epidemic in Russia.The MSM transmission route was rarely (0.8 per cent) reported in our dataset. Still, five of six MSM cases belong to subtype B in agreement with the historical association of subtype B with this route of transmission. The prevalence of males with the non-MSM-reported transmission route in subtype B suggests underreporting of MSM in this subtype.In our dataset, only 39 per cent of patients were on therapy; this fraction is lower than that reported by the Oryol AIDS center (65 per cent). This is because a significant fraction of our dataset corresponds to recently diagnosed patients who have not started receiving therapy yet. Prior to 2018, according to government regulations, therapy was only provided to patients with CD4 counts below 500 (In Russian 2014). While post-2018 recommendations propose therapy for all patients (In Russian 2016), the funding allocated in 2019 (In Russian 2019c) and in 2020 (In Russian 2020) only covered therapy for 43 per cent and 46 per cent of HIV-positive patients in Russia, and this number has actually declined to 34 per cent in 2021 (In Russian 2021). Our finding of large Re is in line with the insufficient effectiveness of the policies currently in place to curb the HIV-1 epidemic in Russia.Taken together, the results presented in this work offer the most in-depth molecular-based characteristic of the HIV-1 sub-epidemic within a region of Russia. Multiple lines of evidence indicate that the HIV-1 epidemic in Oryol Oblast is clearly growing, mainly due to subtype A. We provide evidence that the frequency of the recently introduced CRF63 grows faster compared to subtype A and suggest that the proportion of this variant in the Oryol epidemic may continue to increase. CRF63 should be a high-priority target for molecular surveillance and experimental studies in Oryol Oblast.Click here for additional data file.
Authors: Nadezhda B Rudometova; Nadezhda S Shcherbakova; Dmitry N Shcherbakov; Elena V Mishenova; Elena Delgado; Alexander A Ilyichev; Larisa I Karpenko; Michael M Thomson Journal: AIDS Res Hum Retroviruses Date: 2021-04-07 Impact factor: 2.205
Authors: Alexey E Masharsky; Elena N Dukhovlinova; Sergei V Verevochkin; Olga V Toussova; Roman V Skochilov; Jeffrey A Anderson; Irving Hoffman; Myron S Cohen; Ronald Swanstrom; Andrei P Kozlov Journal: J Infect Dis Date: 2010-06-01 Impact factor: 5.226
Authors: Javier A Cepeda; Ksenia Eritsyan; Peter Vickerman; Alexandra Lyubimova; Marina Shegay; Veronika Odinokova; Leo Beletsky; Annick Borquez; Matthew Hickman; Chris Beyrer; Natasha K Martin Journal: Lancet HIV Date: 2018-07-20 Impact factor: 12.767
Authors: Nicola F Müller; Cassia Wagner; Chris D Frazar; Pavitra Roychoudhury; Jover Lee; Louise H Moncla; Benjamin Pelle; Matthew Richardson; Erica Ryke; Hong Xie; Lasata Shrestha; Amin Addetia; Victoria M Rachleff; Nicole A P Lieberman; Meei-Li Huang; Romesh Gautom; Geoff Melly; Brian Hiatt; Philip Dykema; Amanda Adler; Elisabeth Brandstetter; Peter D Han; Kairsten Fay; Misja Ilcisin; Kirsten Lacombe; Thomas R Sibley; Melissa Truong; Caitlin R Wolf; Michael Boeckh; Janet A Englund; Michael Famulare; Barry R Lutz; Mark J Rieder; Matthew Thompson; Jeffrey S Duchin; Lea M Starita; Helen Y Chu; Jay Shendure; Keith R Jerome; Scott Lindquist; Alexander L Greninger; Deborah A Nickerson; Trevor Bedford Journal: Sci Transl Med Date: 2021-05-03 Impact factor: 17.956
Authors: Natalya M Gashnikova; Ekaterina M Astakhova; Mariya P Gashnikova; Evgeniy F Bocharov; Svetlana V Petrova; Olga A Pun'ko; Alexander V Popkov; Aleksey V Totmenin Journal: Biomed Res Int Date: 2016-11-13 Impact factor: 3.411
Authors: Mariya V Sivay; Lada V Maksimenko; Irina P Osipova; Anastasiya A Nefedova; Mariya P Gashnikova; Dariya P Zyryanova; Vasiliy E Ekushov; Alexei V Totmenin; Tatyana M Nalimova; Vladimir V Ivlev; Dmitriy V Kapustin; Larisa L Pozdnyakova; Sergey E Skudarnov; Tatyana S Ostapova; Svetlana V Yaschenko; Olga I Nazarova; Aleksander S Chernov; Tatyana N Ismailova; Rinat A Maksutov; Natalya M Gashnikova Journal: Front Microbiol Date: 2022-08-31 Impact factor: 6.064