Arnaud Bataille, Frank van der Meer, Arjan Stegeman, Guus Koch.
Abstract
Phylogenetic studies have contributed greatly to our understanding of the emergence, spread and evolution of highly pathogenic avian influenza during epidemics, but sampling of genetic data has never been detailed enough to allow mapping of the spatiotemporal spread of avian influenza viruses during a single epidemic. Here, we present genetic data of H7N7 viruses produced from 72% of the poultry farms infected during the 2003 epidemic in the Netherlands. We use phylogenetic analyses to unravel the pathways of virus transmission between farms and between infected areas. In addition, we investigate the evolutionary processes shaping viral genetic diversity and assess how they could have affected our phylogenetic analyses. Our results show that the H7N7 virus was characterized by a high level of genetic diversity driven mainly by a high neutral substitution rate, purifying selection and limited positive selection. We also identified potential reassortment in the three genes that we tested, but it had only a limited effect on the resolution of the inter-farm transmission network. Clonal sequencing analyses performed on six farm samples showed that at least one farm sample presented very complex virus diversity and was probably at the origin of chronological anomalies in the transmission network. However, most virus sequences could be grouped within clearly defined and chronologically sound clusters of infection, and some likely transmission events between farms located 0.8–13 km apart were identified. In addition, three farms were identified as the most likely sources of virus introduction into distantly located new areas. These long-distance transmission events were likely facilitated by human-mediated transport, underlining the need for strict enforcement of biosafety measures during outbreaks.
This study shows that in-depth genetic analysis of virus outbreaks at multiple scales can provide critical information on virus transmission dynamics and can be used to increase our capacity to efficiently control epidemics.
Year: 2011 PMID: 21731491 PMCID: PMC3121798 DOI: 10.1371/journal.ppat.1002094
Source DB: PubMed Journal: PLoS Pathog ISSN: 1553-7366 Impact factor: 6.823
Figure 1. Map indicating the locations of farms infected during the 2003 HPAI H7N7 epidemic.
Farms are represented by coloured dots, according to their location and inclusion in a cluster of infection. Black dots in the main map correspond to farm samples not analyzed in this study. Farm samples represented by coloured squares were used for the within-flock viral genetic analyses. To keep the figure clear, only the names of the farms mentioned in the main text are shown. All samples are described in detail in Table S1.
Mean nucleotide substitution rates and estimation of TMRCA of the H7N7 epidemic.
| | BMCMC analysis | | | |
| Gene | Mean substitution rate (×10⁻²) | Substitution rate HPD (×10⁻²) | Mean TMRCA | HPD TMRCA |
| HA | 1.18 | 0.79–1.59 | 15/01/2003 | 05/12/2002–06/02/2003 |
| NA | 1.02 | 0.65–1.42 | 25/12/2002 | 24/10/2002–09/02/2003 |
| PB2 | 0.54 | 0.34–0.74 | 20/10/2002 | 14/03/2002–13/01/2003 |
HPD, 95% highest posterior density intervals. Dates are presented in day/month/year.
Figure 2. Phylogenetic trees of H7N7 viruses.
Time-scaled phylogenies (dates on the horizontal axis) inferred using Bayesian MCMC analysis from (A) HA gene; (B) NA gene; (C) PB2 gene. Nodes supported by ≥0.7 posterior probability are indicated by a grey dot. Posterior probability values from the time-scaled BMCMC method, the MrBayes BMCMC method, and the Maximum Likelihood method (1,000 ML bootstrap replications) are shown for nodes delimiting clusters of transmission (tsBMCMC/MrBMCMC/ML; noted Cluster I–IV). The three samples with discordant phylogenies are indicated by a black square (F45), circle (F76), and triangle (F145). Nodes and branches are coloured according to the geographical origin of the farm samples. Yellow, Gelderland area; Blue, Central area; Red, Limburg area; Green, Southwest area. Fully annotated trees are available online in supplementary figures S2A–C.
Figure 3. Median-joining phylogenetic network of H7N7 viruses.
The median-joining network was constructed from the combined HA, NA and PB2 sequence data. This network includes all the most parsimonious trees linking the sequences. Each unique sequence genotype is represented by a coloured circle sized relative to its frequency in the dataset. Genotypes are coloured according to the location of the farm sample and its inclusion in a cluster of infection. Branches in black represent the shortest trees; additional branching pathways are in grey. Each node is separated by a specific number of mutations represented by grey dots. Mutations corresponding to specific amino acid changes are indicated. For genotypes containing a deletion in the NA stalk region, the type of deletion is indicated in brackets beside the name of the isolate (see Table S1 for the description of deletion types). Names of farm samples involved in likely inter-farm transmission events are in red (see Table 2). (*) positively selected amino acids linked to adaptation to mammalian hosts. G1: group of samples including F38, F54, F64, F113, F162, F194, F199; G2: F134, F160, F166; G3: F122, F161, F164, F171, F182; G4: F2, F5, F12, F21, F43, F60, F91; G5: F39, F70, F92, F129; G6: F15, F29, F37; G7: F16, F19, F52; G8: F193, F217, F223, F231; G9: F36, F68, F167, F191(d1), F205(d5), F207(d2); G10: F203(d3), F219(d3), F228(d3); G11: F197, F242, F232.
Summary of the most likely transmission events identified either from pairs of farm samples exclusively sharing the same sequence genotype, or from pairs of farm samples having sequence genotypes unambiguously linked in the network analysis.
| Identical genotypes | | | Direct network connections | | |
| Sample pair | Location | Distance (km) | Sample pair | Location | Distance (km) |
| F10-F14 | G-G | 1.1 | F59-F121 | G-G | 13.6 |
| F23-F24 | G-G | 7.4 | F94-F141 | G-G | 2.9 |
| F25-F42 | G-G | 8.2 | F102-F180 | G-G | 13.3 |
| F33-F62 | G-G | 2.1 | F103-F107 | G-G | 11.2 |
| F46-F61 | G-G | 2 | F135-F163 | G-G | 2.6 |
| F56-F74 | G-G | 12.4 | F152-F179 | G-G | 1.4 |
| F58-F71 | G-G | 4.2 | F156-F185 | G-G | 3.3 |
| F99-F130 | G-G | 1.2 | F172-F173 | G-G | 2.6 |
| F110-F157 | G-G | 1.9 | F202-F216 | L-L | 2 |
| F111-F132 | G-G | 3.1 | F207-F219 | L-L | 10.2 |
| F142-F220 | C-C | 12.3 | F224-F234 | L-L | 3.4 |
| F219-F228 | L-L | 1.1 | F229-F239 | L-L | 0.8 |
| F36-F68 | G-G | 5.7 | | | |
Probable long distance transmission events are in bold. C, Central area; G, Gelderland; L, Limburg; S, Southwest area.
Values of log-likelihood (lnL) and dN/dS for HA, NA and PB2 genes using different selection models in the CODEML analysis, and LRT tests comparing the two models.
| | M1a (nearly neutral) | | M2a (positive selection) | | LRT (M2a-M1a) | |
| Gene | lnL | dN/dS | lnL | dN/dS | 2Δl | P |
| HA | −3031.91 | 0.545 | −3028.44 | 0.736 | 6.94 | |
| NA | −2449.32 | 0.493 | −2448.46 | 0.578 | 1.72 | 0.423 |
| PB2 | −3788.85 | 0.313 | −3788.85 | 0.313 | 0 | 1.000 |
We used 2 degrees of freedom for these LRT tests, which is expected to be too conservative.
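The likelihood ratio test summarized above can be sketched in a few lines. This is a minimal illustration, not the CODEML implementation: it takes the lnL values from the table, computes 2Δl = 2(lnL_M2a − lnL_M1a), and converts it to a P value using the chi-square distribution with 2 degrees of freedom, whose upper-tail probability has the closed form exp(−x/2). The function and variable names are hypothetical.

```python
import math

# lnL values copied from the table: gene -> (M1a, M2a)
LOG_LIKELIHOODS = {
    "HA": (-3031.91, -3028.44),
    "NA": (-2449.32, -2448.46),
    "PB2": (-3788.85, -3788.85),
}


def lrt_statistic(lnl_null: float, lnl_alt: float) -> float:
    """Return 2 * (lnL_alt - lnL_null), the LRT statistic."""
    return 2.0 * (lnl_alt - lnl_null)


def chi2_sf_df2(stat: float) -> float:
    """Upper-tail chi-square probability for 2 degrees of freedom.

    For df = 2 the survival function reduces to exp(-stat / 2),
    so no statistics library is needed.
    """
    return math.exp(-stat / 2.0)


# gene -> (2*delta_l, P value)
results = {}
for gene, (lnl_m1a, lnl_m2a) in LOG_LIKELIHOODS.items():
    stat = lrt_statistic(lnl_m1a, lnl_m2a)
    results[gene] = (stat, chi2_sf_df2(stat))

for gene, (stat, p_value) in results.items():
    print(f"{gene}: 2*delta_l = {stat:.2f}, P = {p_value:.3f}")
```

Under these assumptions the computed values for NA and PB2 agree with the 2Δl and P entries listed in the table.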
Figure 4. Recombination analysis on concatenated H7N7 virus sequences.
(A) Bootscan analysis on the full dataset; (B) Bootscan analysis on the dataset with the HA codon 143 removed. The Cluster IV virus group was used as query in the analysis, with an 800 bp window size and step size of 10 bp. A schematic diagram of the concatenated HA, NA and PB2 virus segments is shown on top.
Summary of results obtained from clonal sequencing.
| Sample | N | H | % Dom | dS | dN | deletion |
| | | | | | | |
| F26 (March 6) | 52 | 18 | 65.4 | 2 | 21 | 0 |
| F26 (March 9) | 54 | 8 | 59.3 | 1 | 7 | 0 |
| F36 | 50 | 13 | 76 | 7 | 10 | |
| F167 | 56 | 15 | 73.2 | 8 | 9 | 1 (1) |
| F191 | 53 | 35 | 17 | 14 | 11 | 52 (18) |
| F193 | 53 | 28 | 39.6 | 8 | 25 | 8 (2) |
| | | | | | | |
| F191 | 27 | 21 | 18.5 | 12 | 26 | 0 |
| F193 | 12 | 7 | 50 | 0 | 7 | 0 |
N, number of clones sequenced; H, total number of sequence variants identified; % Dom, percentage of the clones with the dominant sequence variant; dS, number of synonymous substitutions; dN, number of non-synonymous substitutions; deletion, number of variants found with a deletion (number of different type of deletions).
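As a worked reading of the N and % Dom columns, the number of clones carrying the dominant variant can be recovered as N × % Dom / 100. The sketch below does this for the rows of the table; the block labels 1 and 2 are hypothetical identifiers for the two groups of rows, and the function name is illustrative.

```python
# (block, sample, N clones sequenced, % of clones with the dominant variant)
# Values copied from the table; "block" only distinguishes the two row groups.
ROWS = [
    (1, "F26 (March 6)", 52, 65.4),
    (1, "F26 (March 9)", 54, 59.3),
    (1, "F36", 50, 76.0),
    (1, "F167", 56, 73.2),
    (1, "F191", 53, 17.0),
    (1, "F193", 53, 39.6),
    (2, "F191", 27, 18.5),
    (2, "F193", 12, 50.0),
]


def dominant_clone_count(n_clones: int, pct_dominant: float) -> int:
    """Convert a percentage of clones back to a whole number of clones."""
    return round(n_clones * pct_dominant / 100.0)


# (block, sample) -> number of clones with the dominant variant
counts = {
    (block, sample): dominant_clone_count(n, pct)
    for block, sample, n, pct in ROWS
}

for (block, sample), count in counts.items():
    print(f"block {block}, {sample}: {count} dominant clones")
```

Because the percentages in the table are rounded to one decimal place, the result is rounded to the nearest whole clone.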
Figure 5. Schematic diagrams summarizing within-flock genetic diversity in 6 farm samples.
The sequence variants found by clonal sequencing of partial NA and HA genes in 6 different samples are represented by coloured circles sized relative to their frequency. The total number of clones sequenced per sample (n) is indicated. The exact number of copies of each genetic variant is indicated when >1. Variants in black correspond to the sequence originally isolated in each farm. Each variant is separated by nucleotide substitutions represented by filled black dots (non-synonymous changes) or open dots (synonymous changes), and by deletions represented by squares. The exact position of the deletion in the NA gene is indicated. The red node represents a variant similar to the sequence obtained for the F192 sample (see main text). The white node represents a potential missing variant.