Literature DB >> 24086404

Multiple genomic recombination events in the evolution of saffold cardiovirus.

Lili Ren1, Yan Xiao, Jianguo Li, Lan Chen, Jing Zhang, Guy Vernet, Jianwei Wang.   

Abstract

BACKGROUND: Saffold cardiovirus (SAFV) is a new human cardiovirus with 11 identified genotypes. Little is known about the natural history and pathogenicity of SAFVs. METHODOLOGY/PRINCIPAL
FINDINGS: We sequenced the genome of five SAFV-1 strains which were identified from fecal samples taken from children with viral diarrhea in Beijing, China between March 2006 and November 2007, and analyzed the phylogenetic and phylodynamic properties of SAFVs using the genome sequences of every known SAFV genotypes. We identified multiple recombination events in our SAFV-1 strains, specifically recombination between SAFV-2, -3, -4, -9, -10 and the prototype SAFV-1 strain in the VP4 region and recombination between SAFV-4, -6, -8, -10, -11 and prototype SAFV-1 in the VP1/2A region. Notably, recombination in the structural gene VP4 is a rare event in Cardiovirus. The ratio of nonsynonymous substitutions to synonymous substitutions indicates a purifying selection of the SAFV genome. Phylogenetic and molecular clock analysis indicates the existence of at least two subclades of SAFV-1 with different origins. Subclade 1 includes two strains isolated from Pakistan, whereas subclade 2 includes the prototype strain and strains isolated in China, Pakistan, and Afghanistan. The most recent common ancestor of all SAFV genotypes dates to the 1710s, and SAFV-1, -2, and -3 to the 1940s, 1950s, and 1960s, respectively. No obvious relationship between variation and pathogenicity exists in the critical domains of the CD and EF loops of viral capsid proteins or the multi-functional proteins L based on amino acid sequence identity comparison between SAFV genotypes.
CONCLUSIONS/SIGNIFICANCE: Our findings suggest that intertypic recombination plays an important role in the diversity of SAFVs, highlighting the diversity of the five strains with the previously described SAFV-1 strains.

Entities:  

Mesh:

Substances:

Year:  2013        PMID: 24086404      PMCID: PMC3781130          DOI: 10.1371/journal.pone.0074947

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

The newly identified Saffold cardiovirus (SAFV) is a member of the Cardiovirus genus and belongs to the Picornaviridae family [1]. SAFV is the member of the Theilovirus species, which includes Theiler’s murine encephalomyelitis virus (TMEV), Theiler-like rat virus (TRV), and Vilyuisk human encephalomyelitis virus (VHEV) [1]–[3]. Similar to the genomes of other cardioviruses, the SAFV genome is a positive, single-stranded RNA of approximately 8,050 nucleotides (nts). The genome contains a single reading frame, encoding the leader (L) protein, four capsid proteins (VP1 to VP4), and seven nonstructural proteins (2A, 2B, 2C, 3A, 3B, 3C and 3D) [1]. Unlike other members of the Theilovirus and Encephalomyocarditis virus (EMCV) species, SAFV shows high genetic diversity [1], [4]–[7]. The prototype strain of SAFV, designated as SAFV genotype 1 (SAFV-1), was identified in 2007 using genome sequencing [1]. It was first isolated in 1981, from a female child with fever of unknown origin in California, USA [1]. Since then, SAFVs have been identified in human stool and respiratory samples, and 11 genotypes have been documented based on phylogenetic analysis of the VP1 gene [1], [8]. A large-scale serological survey indicated that SAFV infection occurs early in life [9]. The wide detection of SAFVs in humans and serological study indicates that SAFVs are authentic human cardioviruses. Etiological studies have shown the difference in geographical distribution of SAFV genotypes. SAFV-2 and -3 are the predominant genotypes found in stool and respiratory specimens in America, Asia, and Europe [8]; whereas SAFV-4-11 are primarily found in stool samples from south Asia (Pakistan and Afghanistan) [4]. Although SAFV-1 was the first SAFV strain identified, it was not reported again until 2009, when a SAFV-1 epidemic was discovered in stool and respiratory samples from patients in China [10]–[12]. More recently, SAFV-1 has been isolated in samples from Pakistan and Afghanistan. Analysis of the SAFV genomes has identified recombination events between different SAFV genotypes. For example, recombination between the SAFV-5 strain Pak5152 and the SAFV-6 strain Pak6572 indicates that different genotypes interact during the transmission and spread of SAFVs [4], [13], [14]. Although SAFV is the first human cardiovirus identified, little is known about its natural history and it is unclear why there was a 20-year hiatus between its first isolation and subsequent detection. A powerful tool for tracing the origin and evolution of a virus is phylogenetic and phylodynamic analysis of viral sequences. Data obtained using such methods have contributed to the surveillance of viral spread and drug resistance as well as the identification of strains as vaccine candidates [15], [16]. However, the analysis of the SAFV genome is still insufficient and existing data has been based on very limited SAFV sequences that do not include all known SAFV genotypes. Genomic and phylodynamic analysis based on all available genotypes with updated SAFV genome sequences is needed to provide insight into the natural history and pathogenicity of SAFV. In this study, we employed phylogenetic and phylodynamic analysis using five additional SAFV-1 near-full-length genome sequences identified together with previously reported sequences of SAFV 1-11 to analyze the phylogenetic relationship as well as genome features of SAFV. We also used the VP1 sequences to estimate the divergence time and evolution rate of SAFV.

Results

Genome sequencing of SAFV-1 strains

To characterize the SAFV genomes, five near full-length genomes of SAFV-1 strains (BCHGL352, BCHGL362, BCHGL365, BCHGL368 and BCHGL371) were amplified and sequenced. The genome sequences include the entire coding regions of L, P1, P2, and P3 genes as well as the 3’UTR and partial 5’UTR genes, for a total length of 7,851 to 7,868 nts. The G+C content was 43.2%. These five genomes were aligned with other full-length or near-full-length genomes of SAFVs using clustal W implemented in Mega 4.0, and the 7,341nt fragment (corresponding to nt 290 to 7,360 of reference strain NC009448) was used for further analysis. We found that the five strains share nucleotide (nt) identities of 99.6–99.8% and amino acid (aa) identities of 99.3-99.7% between each other, and 88.9% and 97–97.2% nucleotide and aa identities with the prototype SAFV-1 strain (NC009448), respectively (Table 1). The SAFV-1 strains share less than 81% aa identity with the other genotypes in the VP1 gene region. However, the amino acid sequences of the five SAFV-1 strains were 100% identical to several strains of SAFV-2, -3, -4 and -10 in the VP4 gene and more than 94% identical in the P2 and P3 genes.
Table 1

Pair-wise sequence identity between amino acids of SAFV-1 genotypes identified in this study, prototype SAFV-1, SAFV-1 Pak3079, and other genotypes.

% identityb
SAFV-1SAFV-2SAFV-3SAFV-4SAFV-5SAFV-6SAFV-7SAFV-8SAFV-9SAFV-10SAFV-11
RegionBCHGL362BCHGL365BCHGL368BCHGL371NC009448Pak3097
5'-UTR a 99.810099.899.894.6–94.995.0–95.392.4–96.490.9–94.995.3–95.692.9–94.994.4–94.693.4–93.693.2–93.495.6–95.893.3–93.694.4–94.6
Polyprotein99.499.599.499.297.0–97.297.8–98.190.0–91.889.2–89.992.5–92.888.9–89.488.4–89.389.6–90.089.1–89.589.4–89.889.2–89.689.0–89.4
Leader protein10010010010084.584.578.8–95.784.5–94.391.585.9–87.380.2–81.678.887.390.192.985.9
P110099.810099.896.6–96.797.3–97.480.3–82.377.1–77.984.0–84.175.5–75.876.0–76.367.4–67.676.8–77.076.8–77.076.8–77.076.7–76.8
VP410010010010097.298.688.8–10097.2–10010098.698.694.497.298.610097.2
VP210010010099.696.2–96.698.5–98.882.8–84.377.8–78.982.8–83.279.7–80.477.4–78.278.5–78.976.0–76.378.2–78.577.8–78.277.8–78.2
VP310099.510010098.2–98.699.1–99.584.3–84.784.8–86.589.1–89.577.1–78.481.0–81.487.0–87.485.2–85.784.8–85.285.7–86.184.4–84.8
VP110010010010098.998.574.9–76.366.1–67.281.067.0–68.165.2–66.668.366.163.663.264.7
P298.998.998.498.496.5–97.697.4–98.494.3–98.895.8–97.996.9–97.996.7–98.193.9–98.297.2–98.297.2–98.297.9–98.996.7–97.796.9–97.9
2A10010010098.596.4–97.898.5–10095.0–10094.3–97.895.7–97.195.7–97.191.5–98.596.4–97.897.1–98.597.8–99.295.7–97.197.8–99.2
2B10099.210010097.6–98.497.6–98.492.9–99.296.0–98.498.4–99.297.6–98.496.8–98.497.6–98.497.6–98.497.6–98.496.8–97.696.0–96.8
2C98.198.497.197.795.2–97.495.8–98.193.3–99.094.6–98.495.8–98.195.8–98.795.5–98.496.5–98.796.2–98.497.1–99.396.2–98.495.8–98.1
P399.299.599.599.197.0–97.598.0–98.596.1–98.296.3–98.297.3–97.896.9–97.997.5–98.297.3–97.896.9–97.397.0–97.596.9–97.397.2–97.6
3AB10010010010098.199.098.1–10097.1–99.098.198.1–99.098.1–99.098.199.099.098.199.0
3C99.010010010098.198.1–98.694.4–98.695.8–98.698.6–99.097.2–98.198.6–99.098.6–99.096.7–97.297.2–97.698.1–98.698.1–98.6
3D99.199.199.198.396.9–97.797.5–98.395.5–98.395.2–97.796.1–96.995.5–97.596.3–97.596.1–96.996.1–96.996.1–96.995.5–96.395.8–96.6

Nucleic acid sequence identities between the SAFV-1 strains described here (BCHGL352 as query strain) and prototype SAFV-1, SAFV-1 Pak3079 and other genotypes.

The identities listed in NC009448, Pak3097, SAFV-2-11 are calculated between all the five strains identified in this study and the according strains.

Nucleic acid sequence identities between the SAFV-1 strains described here (BCHGL352 as query strain) and prototype SAFV-1, SAFV-1 Pak3079 and other genotypes. The identities listed in NC009448, Pak3097, SAFV-2-11 are calculated between all the five strains identified in this study and the according strains.

Phylogenetic analysis

To address the genetic relationships among different genotypes further, we carried out phylogenetic analysis of the full-length genomes and the P1 region of SAFVs. The results confirm that our five SAFV-1 strains are clustered with the prototype SAFV-1 (NC009448) (Fig. 1). The results of the analysis on the individual regions, including VP4, VP2, VP3 and VP1, also support this conclusion (data not shown).
Figure 1

Phylogenetic analysis of different SAFV genotypes.

Genome map of SAFVs (A). The trees of complete genomes (B), 5′-UTR (C), L(D), P1 (E), P2 (F), and P3 (G) regions were constructed using the neighbor-joining method. Triangles represent the prototype SAFV-1 strain. Circles represent SAFV-1 strains identified in this study.

Phylogenetic analysis of different SAFV genotypes.

Genome map of SAFVs (A). The trees of complete genomes (B), 5′-UTR (C), L(D), P1 (E), P2 (F), and P3 (G) regions were constructed using the neighbor-joining method. Triangles represent the prototype SAFV-1 strain. Circles represent SAFV-1 strains identified in this study. However, in the cases of the 5’UTRs, the L protein, and the P2 and the P3 regions, the five SAFV-1 strains were separated from the prototype strain of SAFV-1 in the phylogenetic tree, forming a different lineage (Fig. 1). The results of the phylogenetic analysis of the individual nonstructural genes (2A, 2B, 2C, 3AB, 3C and 3D) were consistent with those of the P2 and P3 regions (data not shown). Taken together, these data indicate that our SAFV-1 strains are genetically distinct from the prototype strain. As VP1 gene is responsible for genotype classification in Picornaviruses [4], to investigate the diversity of SAFV-1, we used our SAFV-1 strains and all of the SAFV VP1 sequences available in GenBank to construct a phylogenetic tree (Fig. 2). It shows that at least two distinct subclades exist within SAFV-1 strains. Subclade 1 (in red) includes only two strains, which were isolated from Pakistan. Most SAFV-1 strains identified belong to subclade 2 (in blue), which includes the prototype strain, the strains identified by our group, the Pakistan strains, the Afghanistan strains and other strains identified in China by other groups. In subclade 2, the strains identified in China clustered together, while the prototype and the strains identified in Pakistan and Afghanistan clustered, indicating different evolution characteristics. The mean pairwise p distances of nt and aa in the two subclades are 0.22 and 0.073, respectively.
Figure 2

Phylogenetic analysis of the VP1 gene sequence of SAFVs.

The trees were constructed using the neighbor-joining method. Each strain is labeled by number of genotype, accession number and strain name. Genotype I (SAFV-1) is highlighted in color, sublade 1 in blue, and subclade 2 in red. The triangle represents the prototype SAFV-1 strain. Circles represent the SAFV-1 strains identified in this study.

Phylogenetic analysis of the VP1 gene sequence of SAFVs.

The trees were constructed using the neighbor-joining method. Each strain is labeled by number of genotype, accession number and strain name. Genotype I (SAFV-1) is highlighted in color, sublade 1 in blue, and subclade 2 in red. The triangle represents the prototype SAFV-1 strain. Circles represent the SAFV-1 strains identified in this study.

Recombination analysis

The inconsistencies in the branching orders of the SAFV-1 clusters in the phylogenetic trees suggest that the SAFV-1 genotype was probably generated by a recombination event. To assess these potential recombination events, bootscanning-plot analyses were performed. For each bootscanning plot, the genome sequences of our SAFV-1 strain were used as query sequence. The prototype SAFV-1 sequence, two reference sequences positioned closely to the query sequence in the phylogenetic tree and a distantly related genome were used as the outgroup sequence. A representative result of bootscanning in which the BCHGL352 (JX122400) strain was used as a query sequence (Fig. 3), shows that in the regions of the 5′-UTR and L genes, BCHGL352 is most closely related to SAFV-2 (S4 strain, JN652231), SAFV-3 (BCH115 strain, GU943514), SAFV-4 (Pak3164 stain, AB747251), SAFV-9 (Pak2457 strain, AB747256), and SAFV-10 (Pak2325 stain, AB747257). Additionally, a potential recombination site was found in VP4 region. In the P1 region, BCGL352 was most closely related to the prototype SAFV-1 (NC009448). Around the boundary of the P1 and P2 gene regions, we found another potential recombination site between the prototype SAFV-1 (NC009448) and SAFV-4 (Pak3164 stain, AB747251), SAFV-6 (Pak6572 stain, FJ463617), SAFV-8 (Pak3486 stain, AB747255), SAFV-10 (Pak2325 stain, AB747257), and SAFV-11 (Pak2338 stain, AB747258). Similar bootscanning results were obtained when any one of the five SAFV-1 sequences was used as a query strain (data not shown).
Figure 3

Recombination analyses of SAFV-1.

Bootscanning plot analysis of the SAFV-1 sequence identified in this study in comparison to those of other genotypes. The graph generated by using BCHGL352 as representative query sequence and prototype strain sequence, a reference sequence positioned closely to the query sequence in the phylogenetic tree and two distantly related genomes were used as the outgroup sequences. The bootstrap value is 100 for a window of 200 bp. Arrows indicate putative recombination breakpoints.

Recombination analyses of SAFV-1.

Bootscanning plot analysis of the SAFV-1 sequence identified in this study in comparison to those of other genotypes. The graph generated by using BCHGL352 as representative query sequence and prototype strain sequence, a reference sequence positioned closely to the query sequence in the phylogenetic tree and two distantly related genomes were used as the outgroup sequences. The bootstrap value is 100 for a window of 200 bp. Arrows indicate putative recombination breakpoints. The potential recombination events were further confirmed using the GARD program and the KH test. Several potential recombination breakpoints were found in the five SAFV-1 strains. According to the regions identified by the bootscanning results, statistically significant recombination breakpoints are located at nt 1,078 and nt 3,651 (p = 0.0006), nt 1,078 and nt 3,703 (p = 0.0006), nt 1,078 and nt 3,632 (p = 0.0006), nt 1,078 and nt 3,682 (p = 0.0006), nt 1,078 and nt 3,703 (p = 0.0006) in the genome when BCHGL352, 362, 365, 368, and 371 are used as query sequences, respectively. The recombination sites were mapped in the VP4 and 2A genome regions. However, neither the SAFV-2 strains of the BR/118/2006 (EU681177), HTMV (NC010810), and UC6 (GU595289) isolates from Germany and the US nor the SAFV-3 stains of Pak3641 (AB747250), BCH1031 (GU943513), and NL1999 (HM181996) isolates from Pakistan, China, and the Netherlands appear to be involved in the recombination of the VP4 gene region. Taken together, these findings suggest that our five SAFV-1 strains might derive from complicated recombination events between the prototype strain and multiple genotypes that circulate in nature.

Variability analysis

The functional domain in the capsid and the L protein play an important role in the persistence and pathogenicity of cardioviruses [8]. To determine if the described SAFV-1 strain has characteristics distinct from the prototype strains, other genotypes, and animal cardioviruses, we analyzed the main immunogenetic domains located in the viral capsid proteins, including the CD loop in VP1, the EF loop in VP2, and the L protein. All the SAFV-1 strains identified in this study have the same sequence in the CDEF loops. In the CD loops, the aa sequence of the five SAFV-1 strains is 100% identical to that of the prototype strain and 29.4% to 100% among all known SAFVs and from 7.5% to 26.1% identical to that of animal cardioviruses. In the EF loops, the aa sequence of the five strains is 93% identical to that of the prototype strain, with aa sequence identity ranging from 35.4% to 100% among all known SAFVs and from 12.5% to 25.0% among animal cardioviruses (Fig. 4A and B).
Figure 4

Alignment of amino acid sequences of the major surface structures among all the genotypes of SAFV, together with homologues from other theiloviruses and EMCVs, including the VP1 CD loop (A), VP2 EF loop (B), L protein (C), and L* protein (D).

Periods indicate identity to the first sequence on the top, and dashes indicate amino acid deletions.

Alignment of amino acid sequences of the major surface structures among all the genotypes of SAFV, together with homologues from other theiloviruses and EMCVs, including the VP1 CD loop (A), VP2 EF loop (B), L protein (C), and L* protein (D).

Periods indicate identity to the first sequence on the top, and dashes indicate amino acid deletions. The L protein of our SAFV-1 strains contains the conserved putative zinc finger domain and the acidic domain, but has one more serine amino acid in the Ser/Thr phosphorylation domain than the prototype SAFV-1 strain. The five SAFV-1 strains we identified also contain a sequence of ME/DWTD/NLP in the carboxy terminal, which is referred to as the Theilo domain and conserved in theiloviruses (Fig. 4C). Studies have shown that the variations in the L protein may affect its pathogenesis [17]. Two strains (JN209931 and JN209932) were identified in aseptic samples from two patients with invasive SAFV infections, which were attributed to the viral pathogenesis of these strains [18]. To investigate the potential pathogenesis of our SAFV-1 strains, we compared the sequences of the functional domains or the L protein with the JN209931 and JN209932 strains. We found no significant difference between the aa in the four domains of these strains and the domains of other strains from different genotypes, indicating that the invasive SAFV strains might not be related to specific genome sequences in the functional domains. In addition, compared with the prototype strain, which has a 57 aa L* protein, our SAFV-1 strains have a 34 aa L* protein translated from a putative opening reading frame initiated from ACG [19], whereas some strains of SAFV-2 have only a 24 aa L* protein (Fig. 4D).

Evolutionary analysis of SAFV genotypes

To evaluate the selection pressure on the SAFV genes during evolution, we estimated the mean dN/dS values of all codons among branches of the SAFVs. We found that in the P1, P2, and P3 genome regions, the estimated values of mean dN/dS are 0.0473, 0.0346, and 0.0419, respectively for all genotypes, suggesting a purifying selection. The Bayesian relaxed molecular clock method with uncorrelated lognormal distribution was used to infer the evolutionary behavior of SAFVs. The mean nucleotide substitution rate of the VP1 gene was calculated to be 3.63 (2.29–4.99)×10−3 substitutions/site/year for the SAFVs. Molecular clock analysis using the VP1 gene showed that the mean time of the most recent common ancestor (tMRCA) of all SAFVs was 1,694 (95% highest posterior density [HPDs], 1,568 to 1,835). The tMRCA of SAFV-1 was 1,941 (HPDs, 1,910 to 1,967); that of SAFV-2 was 1,947 (HPDs, 1,912 to 1,977); and that of SAFV-3 was 1,964 (HPDs, 1,938 to 1,985) (Fig. 5). The tMRCA of other SAFVs is described in the MCMC tree, but the analysis of the evolutionary rate and tMRCA may not be reliable as the number of VP1 gene sequences available for other genotypes is limited. Consistent with results of the phylogenetic analysis of the VP1 gene, results of the Bayesian MCMC analysis of the VP1 gene also suggest that at least two distinct subclades of SAFV-1 strains exist.
Figure 5

Estimated time at which the most recent common ancestor of SAFV genotypes emerged.

The phylogeny of time-scale was summarized from MCMC phylogenies of the VP1 gene and the data was analyzed using a relaxed-clock model with uncorrelated exponential distribution in BEAST (V 1.6.1).

Estimated time at which the most recent common ancestor of SAFV genotypes emerged.

The phylogeny of time-scale was summarized from MCMC phylogenies of the VP1 gene and the data was analyzed using a relaxed-clock model with uncorrelated exponential distribution in BEAST (V 1.6.1).

Discussion

To assess the natural history of SAFVs, we sequenced the nearly complete genomes of five SAFV-1 strains and analyzed the phylogenetic and phylodynamic features of SAFVs using all the sequences of each SAFV genotype available in GenBank. Although all of the SAFV-1 strains have high identity in the P1 gene region, we found that our five SAFV-1 strains showed phylogenetic relationships distinct from those of the prototype stain of SAFV-1 in other gene regions. Bootscanning and GARD analysis showed that complicated recombination events occurred between the prototype SAFV-1 strain and SAFV-2, -3, -4, -9, and -10 in the VP4 gene region, and between the prototype SAFV-1 strain and SAFV-4, -6, -8, -10 and -11 in the 2A regions, leading to the generation of our SAFV-1 strains. However, the prototype strain of SAFV-1 does not have the same recombination profile as our SAFV-1 strains. This profile is also different from that observed in other SAFV-1 strains. In the genome of a SAFV-1 strain from Pakistan (Pak3097, AB747248), a potential recombination event was found in the L/VP4 region between several strains of SAFV-2, -4, -9 and the prototype strain, but no recombination was found in the VP1/2A region (data not shown). These data indicate that recombination may play an important role in the evolution of SAFV-1 strains (including those from China and Pakistan). However, we cannot exclude other natural recombination events among these genotypes as the whole genome sequences of SAFV-1 deposited in GenBank are very limited. Viral recombination is a powerful contributor to genetic variation, adaptation to new hosts, escape from the host immune response, and emergence of newly infectious agents [20]–[24]. Genetic recombination has been documented and confirmed as a common phenomenon in the Picornaviradae genus, which includes enterovirus, apthovirus, parechovirus, cosavirus, and cardiovirus [4], [13], [14], [25]–[29]. The prototype SAFV-1 strain has been reported to be involved in a recombination event with SAFV-2 strain BR/118/2006, which led to the generation of SAFV-2 strains HTMV (NC010810), Can112051-06, and HTMV-UC6 [14]. We found that the prototype strain served as a parent strain to form the new strains of SAFV-1 by intertypic recombination with other genotypes. This finding is consistent with our molecular clock model using the VP1 gene sequences, which estimated he tMRCA of the prototype strain to be 1971, earlier than that of the recombinant strains. These data support our conclusion that the recombinant strains were derived from the prototype SAFV-1. The role of recombination in SAFV-1 evolution is also supported by the results of natural selection analysis. The values of mean dN/dS ratios are very low, ranging from 0.0346 to 0.0473 in coding regions, although SAFV contains a high substitution rate similar to that of enteroviruses [30]. The low dN/dS ratios indicate a purifying selection pressure on the SAFV genome. Our results are consistent with a previous study, in which the dN/dS ratios of different SAFV genome regions were also very low [13]. These findings indicate an important role of inter- and intratypic recombination in driving SAFV evolution. We found a rare intertypic recombination site located in the VP4 gene. No intertypic recombination sites in the structural gene region have been described for SAFV, although an intratypic recombination in the VP4-VP2 genome region between the SAFV-2 strain BR/118/2006 and strain D/VI2229/2004 has been reported [13]. Recombination in structural regions appears to be uncommon in picornaviruses, as recombination hot spots are usually found at the boundary between the structural and nonstructural regions or between nonstructural regions, such as between P2 and P3 or the 5′-UTR/L and VP1/2A [26], [31]–[33]. A few reports have described recombination events in the structural gene regions, for example, at the junction of the VP2/VP3 gene in the foot-and-mouth disease virus, between the VP4/VP2 and VP1 gene regions of coxsakie B1 virus, and in the VP1 gene region of TMEV [3], [13], [32], [34]. It is unclear if these complicated recombination events have occurred in different genotypes and if the recombination site in the VP4 gene region provides a selective advantage for virus diversification and evolution; the molecular mechanisms behind these recombination events also need to be clarified [35], [36]. Based on the results of phylogenetic analysis and MCMC tree construction of VP1, there are at least two subclades of SAFV-1. The high genetic diversity of SAFV-1 indicates a complicated or very different etiological process of SAFV-1. There is no apparent difference in geographical distribution of the two subclades. Additional sequences are needed to determine if there is a distinct sublineage of subclade 2. Because the sequences of SAFVs are only available from a short period and the sequence numbers of some genotypes are limited, the phylodynamic features of SAFVs should be further addressed by the accumulation of genome sequences from different geographical locations and over a longer time-span. However, in our study, we identified complicated recombination in SAFVs through phylogenetic and phylodynamics analysis based on whole viral genome sequences. Additionally, we are the first to identify an uncommon recombinant site in the viral structural protein VP4. Our results reveal new phylogenetic relationships between SAFV genotypes and provide insight into the evolution of SAFVs.

Materials and Methods

Samples and sequencing

In our previous study, 12 SAFV-1 positive samples were identified in fecal samples from pediatric outpatients with gastroenteritis at the Beijing Children’s Hospital in November 2007 [10]. The patients were recruited for a prospective study on viral etiology of diarrhea between March 2006 and November 2007. Written informed consent was obtained from the next of kin, caretakers, or guardians on the behalf of the minor/children participants involved in this study. This study was approved by the ethical review committee of the Institute of Pathogen Biology, Chinese Academy of Medical Sciences. Five near-full genome sequences were amplified from those samples. Viral nucleic acids were extracted using the NucliSens easyMAG platform (bioMérieux, Marcy l’Etoile, France), according to the manufacturer’s instructions. A combined random priming and oligo (dT) priming strategy was used to derive cDNA from an RNA template. The primer sets used to amplify the SAFV sequences were designed from multiple alignments of SAFV-1 (NC009448) with other genotypes of SAFV (Table 2), and the full genomic sequences were obtained using a genome walking method. PCR products were cloned into the pGEM-T Easy vector (Promega, Madison, WI, USA), and at least three positive clones were verified by sequence analysis. The entire sequences were assembled manually to produce final viral genome sequences.
Table 2

Primers used for the genome amplification of the saffold cardiovirus genotype I.

Primer namePrimer direction, sequence (5'-3')Amplified gene
CF1887 Forward, CTAATCAGAGGAAAGTCAGCAT 5′-UTR
CR9907 Reverse, GACCACTTGGTTTGGAGAAGCT
SAF720Forward, AGCTGTAGCGACCTCACAATAGC L, VP4, VP2
SAF1878Reverse, GAAGTGTCAAATTCTGGAGCCAT
SAF1857Forward, TTATGGCTCCAGAATTTGACAC VP2, VP3
SAF2446Reverse, CCAAGAAAGGTTGGGATTTTACAT
SAF2382Reverse, TACATGTGCGGAGAATTCACTGA VP3
SAF3023Forward, GTCAGCTGATGCATTATCATCTGAAAC
SAF2840Forward, TCAGAATGCCAATCTCCCCAAC VP1
SAF3718Reverse, AAAGGTCCACCCGATACATTGA
SAF3693Forward, TGAGAGTBTTYTGYCCAAGRCCHA2AB
SAF4497Reverse, ATTCCAGCCATCAAAGAYAGACAGA
SAF4433Forward, TCAGTTCTTTACCTACATAATCCGGAC 2C
SAF5578Reverse, TGATTGAGCRACYAATGTGTTCATT
SAF5448Forward, AGAGGCGTACGGTTTGTATGTAACC 3A, 3B, 3C
SAF6453Reverse, AGTGCATTGCGTACACAGCCTTT
SAF6363Forward, TWCACTAYAGAGCYACAACACAYC3D, 3′-UTR
SAF7395Reverse, CTTCTYTCTTCAAAAGCATGTCG
SAF7310Forward, TTYACAMCAGAGAATGGATTTGAC3′-UTR
SAF8051Reverse, GTTCTCATTTCCAATTAAAAGCT

Sequence collection and phylogenetic analysis

All available SAFV sequences were retrieved from GenBank (www.ncbi.nlm.nih.gov) on 22 October 2012. A total of 42 near full-length genomes, including 37 genomes deposited in GenBank and 5 genomes from this study, and 136 partial VP1 genes from GenBank were subjected to phylogenetic analysis. The background information of all the sequences including accession numbers, collection dates, and isolation areas were provided in the supplemental materials (Table S1). The sequences were aligned using Clustal W and phylogenetic trees were constructed using the neighbor-joining (NJ) method with Kimura’s two-parameter model and 1000 bootstrap pseudo-replicates implemented in the MEGA 4.0 [37]. The pair-wise sequence identities in each region were calculated for the comparison of sequence divergence using BioEdit. The mean pairwise p distances of nucleotide and animo acid sequences in VP1 gene were calculated using MEGA 4.0 [37]. To analyze the recombination events, genomes of SAFV were aligned and analyzed using SimPlot (V3.5.1, thhp://sray.med.som.jhmi.edu/SCRoftware). The bootscanning method was used in SimPlot to assess the potential recombination relationships. The window and step sizes were determined according to the length of the sequences. Recombination sites were confirmed using the Kishino-Hasegawa (KH) test in the Genetic Algorithms for Recombination Detection (GARD) program implemented in the HYPHY package accessed through the DataMonkey facility (http://www.datamonkey.org) [38]. The best-fit nucleotide substitution model for the datasets used in this study was determined using ModelTest (v 3.7) [39].

Natural selection and adaption analysis

Using the codon-based phylogenetic method implemented in CODEML (distributed in PAML, version 4) [40], the selection pressure of each coding gene of SAFV was investigated through estimating the ratio of nonsynonymous to synonymous substitutions (dN/dS).

Bayesian MCMC evolutionary analysis

The Bayesian Markov Chain Monte Carlo (Bayesian MCMC) method implemented in BEAST (v1.6.1) was used to determine the divergence time, rates of evolution, and molecular clock phylogenies [41]. Datasets of the VP1 gene were constructed to perform the Bayesian phylogenetic analysis. Based on the results of the ModelTest, the best-fit model of evolution was used under the GTR substitution model with a gamma distribution ([GTR+G]). For the Bayesian MCMC, there were 500 million generations with sampling every 1,000 generations using a relaxed molecular clock model (the uncorrelated lognormal distributed model). Constant population size and Bayesian skyline distribution were adopted for making inferences. The resulting tree of each run was annotated using Tree Annotator, implemented in Beast (v1.6.1) and the maximum clade credibility tree was visualized using Figtree software (v1.1.2).

Nucleotide sequence accession numbers

The five near-full-length nucleotide sequences determined in this study were deposited in GenBank and assigned accession numbers JX122399, JX122340, JX122341, JX122342, and JX122343. Background information of the Saffold cardiovirus (SAFV) gene sequences used in this study. (DOC) Click here for additional data file.
  40 in total

1.  Recombination patterns in aphthoviruses mirror those found in other picornaviruses.

Authors:  Livio Heath; Eric van der Walt; Arvind Varsani; Darren P Martin
Journal:  J Virol       Date:  2006-09-13       Impact factor: 5.103

2.  PAML 4: phylogenetic analysis by maximum likelihood.

Authors:  Ziheng Yang
Journal:  Mol Biol Evol       Date:  2007-05-04       Impact factor: 16.240

Review 3.  Use of phylogenetics in the molecular epidemiology and evolutionary studies of viral infections.

Authors:  Tommy Tsan-Yuk Lam; Chung-Chau Hon; Julian W Tang
Journal:  Crit Rev Clin Lab Sci       Date:  2010 Jan-Feb       Impact factor: 6.250

4.  Recombination and selection in the evolution of picornaviruses and other Mammalian positive-stranded RNA viruses.

Authors:  Peter Simmonds
Journal:  J Virol       Date:  2006-09-06       Impact factor: 5.103

5.  Genomic features and evolutionary constraints in Saffold-like cardioviruses.

Authors:  Jan Felix Drexler; Sigrid Baumgarte; Luciano Kleber de Souza Luna; Andreas Stöcker; Patrícia Silva Almeida; Tereza Cristina Medrado Ribeiro; Nadine Petersen; Petra Herzog; Célia Pedroso; Carlos Brites; Hugo da Costa Ribeiro; Anatoly Gmyl; Christian Drosten; Alexander Lukashev
Journal:  J Gen Virol       Date:  2010-02-03       Impact factor: 3.891

6.  Phylogenetic analysis of the species Theilovirus: emerging murine and human pathogens.

Authors:  Zhiguo Liang; A S Manoj Kumar; Morris S Jones; Nick J Knowles; Howard L Lipton
Journal:  J Virol       Date:  2008-09-24       Impact factor: 5.103

7.  Genetic diversity of the genus Cosavirus in the family Picornaviridae: a new species, recombination, and 26 new genotypes.

Authors:  Beatrix Kapusinszky; Tung G Phan; Amit Kapoor; Eric Delwart
Journal:  PLoS One       Date:  2012-05-16       Impact factor: 3.240

8.  Analysis of the genomic homologous recombination in Theilovirus based on complete genomes.

Authors:  Guangming Sun; Xiaodan Zhang; Maoli Yi; Shihe Shao; Wen Zhang
Journal:  Virol J       Date:  2011-09-17       Impact factor: 4.099

9.  Saffold cardioviruses of 3 lineages in children with respiratory tract infections, Beijing, China.

Authors:  Lili Ren; Richard Gonzalez; Zhengde Xie; Yan Xiao; Yongjun Li; Chunyan Liu; Lan Chen; Qingqing Yang; Guy Vernet; Glaucia Paranhos-Baccalà; Qi Jin; Kunling Shen; Jianwei Wang
Journal:  Emerg Infect Dis       Date:  2010-07       Impact factor: 6.883

10.  Serious invasive Saffold virus infections in children, 2009.

Authors:  Alex Christian Yde Nielsen; Blenda Böttiger; Jytte Banner; Thomas Hoffmann; Lars Peter Nielsen
Journal:  Emerg Infect Dis       Date:  2012-01       Impact factor: 6.883

View more
  2 in total

1.  First Occurrence of Saffold Virus in Sewage and River Water Samples in Karaj, Iran.

Authors:  Mona Aminipour; Mostafa Ghaderi; Naser Harzandi
Journal:  Food Environ Virol       Date:  2019-11-15       Impact factor: 2.778

2.  Recombination Located over 2A-2B Junction Ribosome Frameshifting Region of Saffold Cardiovirus.

Authors:  Antônio Charlys da Costa; Adriana Luchs; Flávio Augusto de Pádua Milagres; Shirley Vasconcelos Komninakis; Danielle Elise Gill; Márcia Cristina Alves Brito Sayão Lobato; Rafael Brustulin; Rogério Togisaki das Chagas; Maria de Fátima Neves Dos Santos Abrão; Cassia Vitória de Deus Alves Soares; Xutao Deng; Ester Cerdeira Sabino; Eric Delwart; Élcio Leal
Journal:  Viruses       Date:  2018-09-24       Impact factor: 5.048

  2 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.