Viviana E Ré, Andrés C A Culasso, Silvia Mengarelli, Adrián A Farías, Fabián Fay, María B Pisano, Osvaldo Elbarcha, Marta S Contigiani, Rodolfo H Campos.
Abstract
Hepatitis C virus genotype 2 subtype 2c (HCV-2c) is detected at low prevalence in many countries, with the exception of Southern Europe and Western Africa. The current epidemiology of HCV in Argentina, a low-prevalence country, shows the expected low prevalence of this subtype. However, HCV-2c is the most prevalent subtype in the central province of Córdoba. Cruz del Eje (CdE), a small rural city in this province, has an HCV infection prevalence of 5%, with 90% of the samples classified as HCV-2c. In other locations of Córdoba Province (OLC), where HCV prevalence is lower, HCV-2c was recorded in about 50% of the samples. In the phylogenetic analysis, samples from Córdoba Province consistently formed a monophyletic group with HCV-2c sequences from all the countries where HCV-2c has been sequenced. The phylogeographic analysis showed an overall association between geographical traits and phylogeny; these associations were significant (α = 0.05) for Italy, France, Argentina (places other than Córdoba), Martinique, CdE and OLC. The coalescent analysis of samples from CdE, OLC and France yielded a time to the most recent common ancestor (tMRCA) of about 140 years, and the demographic reconstruction showed a "lag" phase in the viral population until 1880, followed by exponential growth until 1940. These results were also obtained when each geographical area was analyzed separately, suggesting that HCV-2c entered Córdoba Province during the migration process, mainly from Europe, which is compatible with the history of Argentina in the early 20th century. This also suggests that HCV-2c spread in Europe and South America almost simultaneously, possibly as a result of the advances in medical technology during the first half of the 20th century.
Year: 2011 PMID: 21611129 PMCID: PMC3097208 DOI: 10.1371/journal.pone.0019471
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
HCV Subtype 2c E2 Primers.
| Name | Sequence | Start | End | Round |
| 2c-ES | | 1298 | 1319 | First |
| 2c-EA | | 2204 | 2183 | First |
| 2c-IS | | 1435 | 1452 | Second |
| 2c-IA | | 2111 | 2093 | Second |
*Positions relative to BEBE1 Subtype 2c HCV reference genome (accession # D50409).
Number and Location of Sequences.
| Location | NS5B sequences | E2 sequences | Mean Age of Patients (SEM) |
| Cruz del Eje | 49 | 22 | 66.15 (1.52) |
| Other Locations of Córdoba | 26 | 15 | 49.38 (2.63) |
*Mean age calculated from the patients' ages in 2004. SEM: Standard Error of the Mean.
Figure 1. Maximum likelihood tree for the NS5B region constructed using TVM+Gamma+I as the model of nucleotide substitution, with parameters suggested by ModelTest 3.7 (PhyML software).
Black bullets: sequences from the Genotype Reference data set. Hollow bullets: sequences from the HCV-2c data set. Light gray bullets: sequences from the CdE data set. Dark gray bullets: sequences from the OLC data set. Numbers above branches: bootstrap values over 100 pseudoreplicates. Scale bar represents substitutions per site.
Summary of association index for phylogeographic analysis.
| Statistic | Observed mean | Observed lower 95% CI | Observed upper 95% CI | Null mean | Null lower 95% CI | Null upper 95% CI | Significance level |
| AI | 6.065 | 4.999 | 7.129 | 12.734 | 11.934 | 13.472 | |
| PS | 46.047 | 42.000 | 50.000 | 74.969 | 72.451 | 77.384 | |
| MC (CdE) | 10.658 | 6.000 | 18.000 | 3.063 | 2.552 | 4.028 | |
| MC (OLC) | 4.811 | 3.000 | 5.000 | 2.072 | 1.590 | 2.867 | |
| MC (Martinique) | 2.135 | 1.000 | 3.000 | 1.019 | 1.000 | 1.067 | |
| MC (France) | 4.002 | 2.000 | 7.000 | 1.934 | 1.442 | 2.609 | |
| MC (Argentina) | 2.227 | 2.000 | 4.000 | 1.051 | 1.000 | 1.238 | |
| MC (Italy) | 2.191 | 2.000 | 3.000 | 1.240 | 1.028 | 1.943 | |
| MC (Estonia) | 1.218 | 1.000 | 2.000 | 1.011 | 1.000 | 1.041 | 1.000 |
| MC (Germany) | 1.181 | 1.000 | 2.000 | 1.091 | 1.001 | 1.659 | 1.000 |
| MC (Russia) | 1.030 | 1.000 | 1.000 | 1.006 | 1.000 | 1.009 | 1.000 |
| MC (Canada) | 1.001 | 1.000 | 1.000 | 1.008 | 1.000 | 1.036 | 1.000 |
| MC (Lithuania) | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
Results of Bayesian Tip-association Significance testing (BaTS). The Association Index (AI) and Parsimony Score (PS) test the global association between a trait and tree topology, taking into account the uncertainty in the phylogenetic reconstruction. The Monophyletic Clade (MC) index tests whether each individual trait is associated with the phylogeny. The observed mean and its 95% confidence interval (lower and upper CI) were obtained by analyzing the trees sampled during the MCMC (Bayesian phylogenetic reconstruction). The null mean and its confidence interval were obtained after randomly distributing the traits across the phylogeny (100 replicates). The significance level is the p value of the hypothesis test for equality between the observed index and that expected under no association.
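The randomization idea behind the MC test can be illustrated with a minimal sketch. This is not the BaTS implementation: it uses a single hypothetical toy tree written as nested tuples instead of a posterior sample of trees, and the tip names, the function names `mc_statistic` and `permutation_p_value`, and the one-sided p value (the fraction of shuffles that reach the observed value) are all assumptions made for illustration.

```python
import random


def _scan(node, labels, trait):
    """Return (largest pure clade size, number of tips, is_pure) for a subtree."""
    if isinstance(node, str):  # a tip is represented by its name
        pure = labels[node] == trait
        return (1 if pure else 0), 1, pure
    best, size, pure = 0, 0, True
    for child in node:
        child_best, child_size, child_pure = _scan(child, labels, trait)
        best = max(best, child_best)
        size += child_size
        pure = pure and child_pure
    if pure:  # every tip below this node carries the trait
        best = size
    return best, size, pure


def mc_statistic(tree, labels, trait):
    """Size of the largest clade whose tips all share the given trait."""
    return _scan(tree, labels, trait)[0]


def permutation_p_value(tree, labels, trait, n_replicates=100, seed=0):
    """Shuffle trait labels across tips and compare null MC values to the observed one."""
    rng = random.Random(seed)
    observed = mc_statistic(tree, labels, trait)
    names = list(labels)
    values = list(labels.values())
    hits = 0
    for _ in range(n_replicates):
        rng.shuffle(values)
        shuffled = dict(zip(names, values))
        if mc_statistic(tree, shuffled, trait) >= observed:
            hits += 1
    return observed, hits / n_replicates


# Hypothetical toy data: six tips, two traits.
toy_tree = ((("a", "b"), "c"), (("d", "e"), "f"))
toy_labels = {"a": "CdE", "b": "CdE", "c": "CdE", "d": "OLC", "e": "OLC", "f": "CdE"}

observed_mc, p_value = permutation_p_value(toy_tree, toy_labels, "CdE")
print(observed_mc, p_value)
```

In the analysis summarized in the table, the observed statistic is averaged over the trees sampled during the MCMC, which is how the uncertainty in the phylogeny enters the test; the sketch omits that step.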
Summary of tMRCA estimations using the NS5B region.
| Dataset | Mean tMRCA (SEM) | Median tMRCA | HPD-95%(lower) | HPD-95%(upper) |
| CdE | 115.51 (0.44) | 111.12 | 68.85 | 168.54 |
| OLC | 111.71 (0.52) | 107.77 | 69.99 | 162.62 |
| Fr | 102.12 (0.49) | 99.37 | 66.21 | 143.79 |
| CdE+OLC | 141.15 (0.55) | 135.87 | 87.62 | 207.06 |
| CdE+OLC+Fr | 143.46 (0.43) | 138.46 | 88.71 | 210.06 |
tMRCA estimates for the NS5B data sets. Results correspond to Bayesian Skyline Plots run under the GTR+Γ+I model of nucleotide substitution with a relaxed clock model (uncorrelated lognormal) and a prior nucleotide substitution rate of 5×10⁻⁴ substitutions per site per year. CdE: Cruz del Eje sequences; OLC: Other Locations of Córdoba sequences; Fr: GenBank HCV-2c NS5B sequences from France; CdE+OLC: combined data set containing CdE and OLC sequences; CdE+OLC+Fr: combined data set containing CdE, OLC and Fr sequences; tMRCA: time to the most recent common ancestor (years before present); HPD-95%: 95% highest posterior density interval (years before present); SEM: Standard Error of the Mean.
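As a rough worked example of how these figures relate to calendar dates and to genetic divergence, the sketch below converts a tMRCA in years before present into a calendar year and multiplies it by the prior rate. The reference year 2004 is an assumption borrowed from the year used for patient ages in the sequence table, since the caption does not state the tip date; the function names are hypothetical, and the divergence calculation assumes a constant rate, which is simpler than the relaxed clock actually used.

```python
PRIOR_RATE = 5e-4          # substitutions per site per year (from the caption)
REFERENCE_YEAR = 2004      # ASSUMPTION: "present" taken as the year used for patient ages

# Mean tMRCA and 95% HPD bounds (years before present) copied from the table.
TMRCA = {
    "CdE": (115.51, 68.85, 168.54),
    "OLC": (111.71, 69.99, 162.62),
    "Fr": (102.12, 66.21, 143.79),
    "CdE+OLC": (141.15, 87.62, 207.06),
    "CdE+OLC+Fr": (143.46, 88.71, 210.06),
}


def calendar_year(years_before_present, reference_year=REFERENCE_YEAR):
    """Convert an age in years before present into a calendar year."""
    return reference_year - years_before_present


def expected_root_to_tip(rate, years):
    """Expected substitutions per site accumulated along one lineage at a constant rate."""
    return rate * years


for name, (mean_age, hpd_lower, hpd_upper) in TMRCA.items():
    print(
        f"{name}: root at about {calendar_year(mean_age):.0f} "
        f"(HPD {calendar_year(hpd_upper):.0f} to {calendar_year(hpd_lower):.0f}), "
        f"expected divergence {expected_root_to_tip(PRIOR_RATE, mean_age):.3f} subs/site"
    )
```

Note that the upper HPD bound in years before present maps to the earlier calendar year, which is why the bounds are swapped when printing.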
Figure 2. Bayesian Skyline Plots for demographic reconstruction using NS5B sequences.
X axis: date in calendar years (AD); Y axis: estimated effective number of infections. Bold dashed line: median time to the most recent common ancestor (tMRCA). Light dashed line: upper 95% HPD of the tMRCA. Bold line: mean effective number of the viral population. Blue lines: upper and lower 95% HPD of the effective number of the viral population. Analyzed data sets: A. CdE+OLC: samples from Cruz del Eje and other locations of Córdoba Province; B. CdE: samples from Cruz del Eje; C. OLC: samples from other locations of Córdoba Province; D. FR: samples from South Western France (Cantaloube et al., 2008).
Summary of E2 substitution rate estimations.
| Dataset | Mean Substitution Rate (×10⁻³ s/s/y) | Standard Error of mean | Median Substitution Rate | HPD-95% (lower) | HPD-95% (upper) |
| OLC | 1.86 | 1.16 | 1.77 | 0.96 | 2.96 |
| CdE | 1.19 | 0.63 | 1.14 | 0.64 | 1.83 |
| CdE+OLC | 1.97 | 1.61 | 1.87 | 1.02 | 3.11 |
Substitution rates for the E2 data sets. Results correspond to Bayesian Skyline Plots run under the appropriate model of nucleotide substitution: GTR+Γ+I for the CdE+OLC and CdE data sets and HKY+Γ+I for the OLC data set. A relaxed clock model (uncorrelated lognormal) was used. The tMRCA for each data set was set as a prior with a gamma distribution (shape; scale): 18.03; 7.41 for the CdE+OLC data set; 17.45; 6.36 for the OLC data set; 16.42; 6.46 for the CdE data set. CdE: Cruz del Eje E2 sequences; OLC: Other Locations of Córdoba E2 sequences; CdE+OLC: sequences from the CdE and OLC data sets; s/s/y: substitutions per site per year; HPD-95%: 95% highest posterior density interval.
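To see what these gamma priors imply, recall that a gamma distribution with shape k and scale θ has mean kθ and standard deviation √k·θ. The short sketch below applies those formulas to the three (shape; scale) pairs in the caption, assuming the parameters are in years; the dictionary and function names are hypothetical. The resulting prior means are roughly 134, 111 and 106 years, the same order of magnitude as the NS5B tMRCA estimates reported for the corresponding data sets.

```python
import math

# (shape, scale) pairs copied from the caption; units assumed to be years.
TMRCA_GAMMA_PRIORS = {
    "CdE+OLC": (18.03, 7.41),
    "OLC": (17.45, 6.36),
    "CdE": (16.42, 6.46),
}


def gamma_mean_sd(shape, scale):
    """Mean and standard deviation of a gamma distribution in shape/scale form."""
    mean = shape * scale
    sd = math.sqrt(shape) * scale
    return mean, sd


for name, (shape, scale) in TMRCA_GAMMA_PRIORS.items():
    mean, sd = gamma_mean_sd(shape, scale)
    print(f"{name}: prior mean {mean:.1f} years, prior SD {sd:.1f} years")
```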