Eric T Beck, Jie He, Martha I Nelson, Michael E Bose, Jiang Fan, Swati Kumar, Kelly J Henrickson.
Abstract
Thirty-nine human parainfluenza type 1 (HPIV-1) genomes were sequenced from samples collected in Milwaukee, Wisconsin from 1997 to 2010. Following sequencing, phylogenetic analyses of these sequences plus any publicly available HPIV-1 sequences (from GenBank) were performed. Phylogenetic analysis of the whole genomes, as well as individual genes, revealed that the current HPIV-1 viruses group into three different clades. Previous evolutionary studies of HPIV-1 in Milwaukee revealed that there were two genotypes of HPIV-1 co-circulating in 1991 (previously described as HPIV-1 genotypes C and D). The current study reveals that there are still two different HPIV-1 viruses co-circulating in Milwaukee; however, both groups of HPIV-1 viruses are derived from genotype C, indicating that genotype D may no longer be in circulation in Milwaukee. Analyses of genetic diversity indicate that while most of the genome is under purifying selection, some regions of the genome are more tolerant of mutation. In the 40 HPIV-1 genomes sequenced in this study, the nucleotide sequence of the L gene is the most conserved while the sequence of the P gene is the most variable. Over the entire protein coding region of the genome, 81 variable amino acid residues were observed and, as with nucleotide diversity, the P protein seemed to be the most tolerant of mutation (and contains the greatest proportion of non-synonymous to synonymous substitutions) while the M protein appears to be the least tolerant of amino acid substitution.
Year: 2012 PMID: 23029382 PMCID: PMC3459887 DOI: 10.1371/journal.pone.0046048
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Strain Information for all 40 HPIV-1 Genomes Sequenced.
| GenBank Accession | Strain name | Collection year | Collection month | Collection day | Start to end of genomes | Sample type |
| JQ901971 | HPIV-1/c35/1957 (ATCC VR-94) | 1957 | n/a | n/a | 29-15550 | I |
| JQ901975 | HPIV-1/WI/629-001/1997 | 1997 | 10 | 3 | 29-15567 | I and S |
| JQ901972 | HPIV-1/WI/629-002/1997 | 1997 | 9 | 22 | 29-15539 | I |
| JQ901976 | HPIV-1/WI/629-003/1997 | 1997 | 10 | 16 | 76-15567 | I |
| JQ901973 | HPIV-1/WI/629-004/1997 | 1997 | 9 | 24 | 77-15560 | I |
| JQ901974 | HPIV-1/WI/629-005/1997 | 1997 | 9 | 25 | 78-15555 | I |
| JQ901977 | HPIV-1/WI/629-006/1997 | 1997 | 11 | 4 | 77-15560 | I |
| JQ901979 | HPIV-1/WI/629-007/1997 | 1997 | 11 | 22 | 28-15558 | I |
| JQ901978 | HPIV-1/WI/629-008/1997 | 1997 | 11 | 22 | 28-15568 | I |
| JQ901980 | HPIV-1/WI/629-009/1997 | 1997 | 12 | 3 | 29-15565 | I |
| JQ901981 | HPIV-1/WI/629-010/1999 | 1999 | 3 | 5 | 29-15549 | I |
| JQ901982 | HPIV-1/WI/629-A31/2005 | 2005 | 11 | 9 | 28-15555 | S |
| JQ901983 | HPIV-1/WI/629-003/2007 | 2007 | n/a | n/a | 29-15552 | I |
| JQ901984 | HPIV-1/WI/629-030/2007 | 2007 | n/a | n/a | 28-15534 | I |
| JQ901985 | HPIV-1/WI/629-001/2009 | 2009 | n/a | n/a | 27-15548 | I |
| JQ901991 | HPIV-1/WI/629-D01150/2009 | 2009 | 10 | 6 | 29-15568 | S |
| JQ901987 | HPIV-1/WI/629-D00057/2009 | 2009 | 9 | 29 | 78-15485 | S |
| JQ901998 | HPIV-1/WI/629-D00387/2009 | 2009 | 10 | 22 | 78-15557 | I and S |
| JQ901989 | HPIV-1/WI/629-D00712/2009 | 2009 | 10 | 2 | 29-15567 | S |
| JQ902003 | HPIV-1/WI/629-D01145/2009 | 2009 | 11 | 23 | 100-15548 | I and S |
| JQ901996 | HPIV-1/WI/629-D01164/2009 | 2009 | 10 | 20 | 100-15446 | I and S |
| JQ901992 | HPIV-1/WI/629-D01202/2009 | 2009 | 10 | 6 | 29-15568 | S |
| JQ902006 | HPIV-1/WI/629-D01250/2009 | 2009 | 12 | 4 | 100-15539 | I and S |
| JQ902000 | HPIV-1/WI/629-D01463/2009 | 2009 | 11 | 7 | 100-15481 | I and S |
| JQ902004 | HPIV-1/WI/629-D01575/2009 | 2009 | 11 | 30 | 100-15549 | I and S |
| JQ901988 | HPIV-1/WI/629-D01662/2009 | 2009 | 9 | 29 | 29-15539 | S |
| JQ901986 | HPIV-1/WI/629-D01681/2009 | 2009 | 9 | 29 | 27-15557 | I |
| JQ902008 | HPIV-1/WI/629-D01774/2009 | 2009 | 10 | 3 | 81-15539 | I |
| JQ902001 | HPIV-1/WI/629-D01790/2009 | 2009 | 11 | 7 | 100-15482 | I and S |
| JQ901995 | HPIV-1/WI/629-D01809/2009 | 2009 | 10 | 11 | 28-15539 | I |
| JQ902007 | HPIV-1/WI/629-D02039/2009 | 2009 | 12 | 10 | 100-15548 | I and S |
| JQ902002 | HPIV-1/WI/629-D02072/2009 | 2009 | 11 | 18 | 100-15544 | I and S |
| JQ901993 | HPIV-1/WI/629-D02130/2009 | 2009 | 10 | 6 | 29-15568 | S |
| JQ901999 | HPIV-1/WI/629-D02143/2009 | 2009 | 10 | 24 | 100-15492 | I and S |
| JQ901990 | HPIV-1/WI/629-D02161/2009 | 2009 | 10 | 3 | 29-15550 | S |
| JQ902005 | HPIV-1/WI/629-D02209/2009 | 2009 | 11 | 30 | 100-15488 | I and S |
| JQ901994 | HPIV-1/WI/629-D02401/2009 | 2009 | 10 | 6 | 29-15568 | S |
| JQ901997 | HPIV-1/WI/629-D01900/2009 | 2009 | 10 | 21 | 100-15492 | I and S |
| JQ902010 | HPIV-1/WI/629-D02071/2010 | 2010 | 3 | 2 | 100-15488 | S |
| JQ902009 | HPIV-1/WI/629-D02211/2010 | 2010 | 2 | 15 | 100-15156 | I and S |
These sequences still have some gaps. Gene sequences with gaps in the coding region were not used for the coding region (ORF) analysis for that specific gene.
The sample type indicates whether the genome was sequenced from a clinical isolate (I), a clinical specimen (S) or a combination of the two (I and S).
Sequence of HPIV-1/WI/629-005/1997 has gaps at nts 8270–9308, 9520–10136, 10172–10175, 10640–10701, 11474–11883, 12475–12507 and 13196–13590.
Sequence of HPIV-1/WI/629-D00057/2009 has gaps at nts 2199–2262, 8172–8653, 9750–10223, and 11823–11926.
Sequence of HPIV-1/WI/629-D01250/2009 has gaps at nts 8271–8802.
Sequence of HPIV-1/WI/629-D01774/2009 has gaps at nts 2301–2326, 3671–3772, 4070–4253, and 11906–13627.
Sequence of HPIV-1/WI/629-D02211/2010 has gaps at nts 2995–3041, 3676–3740, 6408–7517, and 15157–15445.
HPIV-1 Primer Information for Amplification and Sequencing.
| Name | Sequence | % GC | Tm | Fragment |
| | | 37 | 56.8 | 1 |
| | | 43.5 | 55.9 | 2 |
| | | 45.8 | 56.6 | 1 & 2 |
| | | 50 | 60 | 3 |
| | | 50 | 59.3 | 3 |
| | | 50 | 58.8 | 4 |
| | | 50 | 56.9 | 4 |
| | | 50 | 59.6 | 5 |
| | | 50 | 60 | 5 |
| | | 50 | 58.9 | 6 |
| | | 50 | 56.9 | 6 |
| | | 56.5 | 58.9 | 7 |
| | | 50 | 59.9 | 7 |
| | | 50 | 58.2 | 8 |
| | | 50 | 59.9 | 8 |
| | | 50 | 59 | 9 |
| | | 50 | 56.9 | 9 |
| | | 45.8 | 57.6 | 10 |
| | | 36 | 51.5 | 10 |
| P1R278+24 | | 50 | 59 | |
| P1F710−24 | | 50 | 60.1 | |
| P1R1865+24 | | 50 | 59.7 | |
| P1F2783−24 | | 50 | 59.8 | |
| P1R3764+27 | | 48.1 | 57.8 | |
| P1F4630−24 | | 50 | 59.6 | |
| P1R5775+24 | | 45.8 | 56.6 | |
| P1F4455−24 | | 50 | 58.8 | |
| P1R6065+24 | | 50 | 59.1 | |
| P1F6803−24 | | 45.8 | 58.6 | |
| P1F7677−23 | | 40.9 | 53.9 | |
| P1R7856+26 | | 50 | 58.7 | |
| P1F8740−21 | | 47.6 | 55.9 | |
| P1F8513−24 | | 50 | 60.2 | |
| P1R8819+23 | | 43.5 | 55.7 | |
| P1F9519−23 | | 47.8 | 57.2 | |
| P1R9760+23 | | 47.8 | 57 | |
| P1R9667+22 | | 45.5 | 55.5 | |
| P1F10576−24 | | 50 | 57.3 | |
| P1R10701+23 | | 47.8 | 58.6 | |
| P1R11565+24 | | 50 | 58.8 | |
| P1F12388−24 | | 50 | 58 | |
| P1R13287+24 | | 50 | 57.6 | |
| P1F14090−25 | | 44 | 56.1 | |
| P1R15017+23 | | 43.5 | 55.8 | |
Bold font represents oligonucleotides used for amplification. All primers were used for sequencing. Primer names are constructed as follows: 1) the “P1” prefix indicates an HPIV-1 primer; 2) the “F” or “R” at the third position indicates whether the primer is in the forward or reverse orientation; 3) the number between the “F” or “R” and the plus or minus sign is the nucleotide position of the 5′ end of the oligonucleotide in the HPIV-1 genome; and 4) adding (+) or subtracting (−) the number at the end of the primer name to or from the 5′ position gives the nucleotide position of the 3′ end of the oligonucleotide.
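The naming convention described above can be decoded mechanically. The following is a minimal sketch, not the authors' tooling: the `Primer` class and `parse_primer_name` function are hypothetical names introduced here, and the parser assumes every name has the form `P1`, `F` or `R`, a 5′ position, a plus or minus sign, and an offset.

```python
import re
from dataclasses import dataclass


@dataclass
class Primer:
    """Decoded fields of a primer name (hypothetical helper type)."""
    name: str
    orientation: str   # "F" (forward) or "R" (reverse)
    five_prime: int    # nucleotide position of the 5' end
    three_prime: int   # nucleotide position of the 3' end


# Accept "+", the ASCII hyphen, and the Unicode minus sign (U+2212),
# since the table uses the latter.
_NAME_PATTERN = re.compile(r"^P1([FR])(\d+)([+\-\u2212])(\d+)$")


def parse_primer_name(name: str) -> Primer:
    """Split a name such as 'P1R278+24' into orientation and end positions."""
    match = _NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized primer name: {name!r}")
    orientation, start, sign, offset = match.groups()
    five_prime = int(start)
    delta = int(offset)
    # Rule 4 of the footnote: add or subtract the trailing number
    # from the 5' position to obtain the 3' position.
    three_prime = five_prime + delta if sign == "+" else five_prime - delta
    return Primer(name, orientation, five_prime, three_prime)


if __name__ == "__main__":
    for primer_name in ("P1R278+24", "P1F710\u221224"):
        print(parse_primer_name(primer_name))
```

Under this reading, `P1R278+24` has its 5′ end at position 278 and its 3′ end at position 302, while `P1F710−24` runs from position 710 to position 686.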
Figure 1. The phylogenetic relationships of the 40 genomes and sequence AF457102 using the BEAST program.
The phylogeny of 40 recently sequenced HPIV-1 genomes and one sequence from GenBank (AF457102) was estimated using a Bayesian Markov Chain Monte Carlo (MCMC) method with a strict molecular clock. Strain HPIV-1/c-35/1957 showed the greatest distance from the 39 1997–2010 Milwaukee viruses, which is consistent with its isolation time. Colored rectangles (labeled clade 1–3) represent the three clades of the 39 1997–2010 Milwaukee viruses. The sequence from strain HPIV-1/WI/629-007/1997 is a singleton. The scale bar shows the unit for branch age. The numbers following the underscore in each name represent the collection date in number of years since the collection date of the oldest HPIV-1 strain. To make the figure more legible, identical sequences were removed from the table.
Figure 2. The phylogenetic relationships of HN genes including 37 present sequences and 221 sequences from GenBank.
The phylogeny of 258 HN gene sequences was analyzed using a Bayesian Markov Chain Monte Carlo (MCMC) method with a strict molecular clock in the BEAST program. Colored rectangles labeled clades 1–3 represent the three clades identified in Figure 1 containing the 36 HPIV-1 sequences from the present study. Colored rectangles labeled genotypes C and D represent the HN gene sequences from HPIV-1 viruses collected in Milwaukee, WI in 1991. The number following the underscore in each name represents the collection date in years since collection date of the oldest HPIV-1 strain. Branch color corresponds to clade name/HN sequence seen in Figure 3.
Figure 3. Amino acid sequence of the HN protein across all HPIV-1 clades.
This figure shows the amino acid sequence of the HN protein across all HPIV-1 clades. Amino acid substitutions were determined with the MacClade program and the consensus HN gene sequence for each clade. The color of the branches in Figure 2 corresponds to the color of the clade name in this figure, indicating which viruses have each of the sequences listed above.
Clade-Scale Amino Acid Substitutions from the N, P, M, F and L Proteins of 40 Recently Sequenced HPIV-1 Complete Genomes (the HN protein is described in greater detail in Table 4 and is not included here).
| Protein | Clade 1 | Clade 2 | Clade 3 | Multiple substitutions |
| N | None | None | N498S | None |
| P | N56S, P123S, P125L, | G110D, D185N, P231Q, S247P, K270R, T282A, V534A | N70D, I90V, I111V, P141S, S162F, P182S, T267I, S268G, S/P274L, S321G, S324P, V534A | F8S/L |
| M | None | None | None | None |
| F | E5K | N321S | T493K, V526I, R546K | F8L/I |
| L | M718I, L1605F | N199D, I210F, V2165I | Q4L, V93I, T728A, I1176V, V1506I, N1598H, K1747R, T2209I | None |
These residues have more than two amino acid variations.
Amino Acid substitutions of the Seven Regions of HN Gene (calculated using all available sequences from GenBank).
| Region Name | Location of region | Length of the region (Residues) | No. of Substitutions in each region | Ratio of substitution/residue |
| | 1–35 | 35 | 4 | 0.114 |
| | 36–60 | 25 | 4 | 0.160 |
| | 61–191 | 131 | 9 | 0.069 |
| | 192–341 | 150 | 5 | 0.033 |
| | 342–457 | 116 | 12 | 0.103 |
| | 458–504 | 47 | 3 | 0.064 |
| | 505–575 | 71 | 11 | 0.155 |
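The last column of this table is simply the number of substitutions divided by the number of residues in the region. A short sketch of that arithmetic follows; the function name `substitution_ratio` is introduced here for illustration only, and the region boundaries and counts are copied from the table.

```python
def substitution_ratio(start: int, end: int, substitutions: int) -> tuple[int, float]:
    """Return (region length in residues, substitutions per residue)."""
    length = end - start + 1  # residue positions are inclusive
    return length, substitutions / length


# (first residue, last residue, substitutions) for the seven HN regions
hn_regions = [
    (1, 35, 4),
    (36, 60, 4),
    (61, 191, 9),
    (192, 341, 5),
    (342, 457, 12),
    (458, 504, 3),
    (505, 575, 11),
]

if __name__ == "__main__":
    for start, end, subs in hn_regions:
        length, ratio = substitution_ratio(start, end, subs)
        print(f"{start}-{end}: {length} residues, ratio {ratio:.3f}")
```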
Genetic Diversity Across the HPIV-1 Genome.
| Region | Positions | Sites analyzed | Polymorphic (Segregating) Sites | Average nucleotide differences (k) | Nucleotide Diversity (π) | Theta/Site from S |
| 5′ Leader | 1–55 | N/A | 0 | 0.000 | 0.000 | 0.000 |
| N 5′ NCR | 56–116 | 17 | 7 | 1.319 | 0.078 | 0.099 |
| N Gene | 120–1694 | 1575 | 157 | 23.444 | 0.015 | 0.024 |
| N 3′ NCR | 1695–1737 | 43 | 7 | 1.438 | 0.033 | 0.039 |
| N/P – IG Spacer | 1738–1740 | 3 | 0 | 0.000 | 0.000 | 0.000 |
| P 5′ NCR | 1741–1843 | 103 | 13 | 2.073 | 0.020 | 0.030 |
| P Gene | 1844–3550 | 1706 | 233 | 39.595 | 0.023 | 0.033 |
| P 3′ NCR | 3551–3633 | 83 | 14 | 2.454 | 0.030 | 0.041 |
| P/M – IG Spacer | 3634–3636 | 3 | 0 | 0.000 | 0.000 | 0.000 |
| M 5′ NCR | 3637–3668 | 32 | 5 | 0.383 | 0.012 | 0.038 |
| M Gene | 3669–4715 | 1046 | 113 | 17.873 | 0.017 | 0.026 |
| M 3′ NCR | 4716–4809 | 94 | 16 | 2.506 | 0.027 | 0.041 |
| M/F – IG Spacer | 4810–4812 | 3 | 0 | 0.000 | 0.000 | 0.000 |
| F 5′ NCR | 4813–5087 | 269 | 80 | 13.433 | 0.050 | 0.072 |
| F Gene | 5088–6755 | 1667 | 179 | 29.105 | 0.017 | 0.026 |
| F 3′ NCR | 6756–6843 | 88 | 17 | 2.433 | 0.028 | 0.047 |
| F/HN – IG Spacer | 6844–6846 | 3 | 0 | 0.000 | 0.000 | 0.000 |
| HN 5′ NCR | 6847–6902 | 56 | 14 | 1.663 | 0.030 | 0.060 |
| HN Gene | 6903–8630 | 1728 | 209 | 35.375 | 0.020 | 0.029 |
| HN 3′ NCR | 8631–8740 | 109 | 23 | 5.429 | 0.050 | 0.051 |
| HN/L – IG Spacer | 8741–8743 | 3 | 0 | 0.000 | 0.000 | 0.000 |
| L 5′ NCR | 8744–8771 | 28 | 1 | 0.056 | 0.002 | 0.009 |
| L Gene | 8772–15443 | 6668 | 588 | 96.384 | 0.014 | 0.021 |
| L 3′ NCR | 15444–15543 | 3 | 1 | 0.356 | 0.119 | 0.080 |
| Tail Sequence | 15544–15606 | N/A | | | | |
| WHOLE GENOME | 1–15606 | 15333 | 1677 | 275.319 | 0.018 | 0.026 |
Sites could not be analyzed due to missing sequence data.
Part of this region was not analyzed due to missing sequence data. The calculated nucleotide difference and diversity are therefore not an accurate representation of the 5′ NCR.
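The diversity columns in this table are related by simple formulas. As a hedged sketch under the standard population-genetic definitions (the source does not state which software or formulas were used): nucleotide diversity is the average number of pairwise nucleotide differences divided by the number of sites analyzed, and theta per site from S is assumed here to be Watterson's estimator. The function names and the `n_sequences` parameter are hypothetical; the number of sequences contributing to each region is not given in the table.

```python
def nucleotide_diversity(avg_pairwise_differences: float, sites: int) -> float:
    """Nucleotide diversity: average pairwise differences per site analyzed."""
    return avg_pairwise_differences / sites


def watterson_theta_per_site(segregating_sites: int, sites: int,
                             n_sequences: int) -> float:
    """Watterson's estimator per site: S / (a_n * L), a_n = sum_{i=1}^{n-1} 1/i."""
    if n_sequences < 2:
        raise ValueError("need at least two sequences")
    a_n = sum(1.0 / i for i in range(1, n_sequences))
    return segregating_sites / (a_n * sites)


if __name__ == "__main__":
    # N gene row: 23.444 average differences over 1575 sites
    print(f"{nucleotide_diversity(23.444, 1575):.3f}")
    # Whole-genome row: 275.319 average differences over 15333 sites
    print(f"{nucleotide_diversity(275.319, 15333):.3f}")
```

Dividing the average differences by the sites analyzed reproduces the tabulated diversity for the N gene (about 0.015) and for the whole genome (about 0.018). The theta column cannot be checked the same way without knowing how many sequences were compared in each region.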
Figure 4. Sliding Scale Analysis of Nucleotide Diversity of the HPIV-1 Genome.
Analysis of nucleotide diversity was done in 100 nt windows in 25 nt increments. A table showing each window and its diversity can be seen in supplementary document 1. A schematic of the HPIV-1 genome can be found at the bottom of the figure. The coding region of each gene is labeled.
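A windowed diversity scan of this kind can be sketched as follows. This is an illustrative implementation, not the program used in the study: it assumes the sequences are already aligned, uppercase, and of equal length, treats `-`, `N` and `?` as missing data, and defines diversity in a window as the mean proportion of differing sites over all sequence pairs. All function names are hypothetical.

```python
from itertools import combinations

MISSING = set("-N?")  # characters treated as missing data (an assumption)


def window_pi(chunks: list[str]) -> float:
    """Mean proportion of differing sites over all pairs of aligned chunks."""
    proportions = []
    for a, b in combinations(chunks, 2):
        compared = 0
        differences = 0
        for x, y in zip(a, b):
            if x in MISSING or y in MISSING:
                continue  # skip sites missing in either sequence
            compared += 1
            if x != y:
                differences += 1
        if compared:
            proportions.append(differences / compared)
    return sum(proportions) / len(proportions) if proportions else float("nan")


def sliding_diversity(alignment: list[str], window: int = 100,
                      step: int = 25) -> list[tuple[int, int, float]]:
    """Return (start, end, pi) per window, with 1-based inclusive positions."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("sequences must be aligned to equal length")
    results = []
    for start in range(0, length - window + 1, step):
        chunks = [seq[start:start + window] for seq in alignment]
        results.append((start + 1, start + window, window_pi(chunks)))
    return results


if __name__ == "__main__":
    # Toy alignment: two 200-nt sequences differing at one position
    toy = ["A" * 200, "A" * 100 + "C" + "A" * 99]
    for start, end, pi in sliding_diversity(toy):
        print(start, end, round(pi, 4))
```

The defaults of 100 and 25 mirror the window size and increment stated in the legend.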
Nonsynonymous (Ka) vs. Synonymous (Ks) Mutations in the Coding Region of HPIV-1 Genes.
| Region | Positions | Average Ka | Average Ks | Ka/Ks |
| N Gene | 120–1694 | 0.002 | 0.060 | 0.040 |
| P Gene | 1844–3550 | 0.014 | 0.061 | 0.233 |
| M Gene | 3669–4715 | 0.002 | 0.067 | 0.034 |
| F Gene | 5088–6755 | 0.004 | 0.062 | 0.071 |
| HN Gene | 6903–8630 | 0.011 | 0.057 | 0.185 |
| L Gene | 8772–15443 | 0.002 | 0.060 | 0.041 |
| ALL GENES | 1–15606 | 0.005 | 0.060 | 0.083 |
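The Ka/Ks column can be read with the usual rule of thumb: a ratio below 1 is consistent with purifying selection, a ratio near 1 with neutral evolution, and a ratio above 1 with positive selection. The sketch below applies that rule to the tabulated averages; the function names are hypothetical. Ratios recomputed from the rounded averages differ slightly from the tabulated ones (for the P gene, 0.014/0.061 is about 0.230 versus the tabulated 0.233), presumably because the table was computed from unrounded values.

```python
def ka_ks_ratio(ka: float, ks: float) -> float:
    """Ratio of nonsynonymous to synonymous substitution rates."""
    if ks == 0:
        raise ValueError("Ks is zero; the ratio is undefined")
    return ka / ks


def selection_regime(ratio: float) -> str:
    """Conventional interpretation of a Ka/Ks ratio."""
    if ratio < 1:
        return "purifying"
    if ratio > 1:
        return "positive"
    return "neutral"


# (gene, average Ka, average Ks) copied from the table
genes = [
    ("N", 0.002, 0.060),
    ("P", 0.014, 0.061),
    ("M", 0.002, 0.067),
    ("F", 0.004, 0.062),
    ("HN", 0.011, 0.057),
    ("L", 0.002, 0.060),
]

if __name__ == "__main__":
    for gene, ka, ks in genes:
        ratio = ka_ks_ratio(ka, ks)
        print(f"{gene}: Ka/Ks = {ratio:.3f} ({selection_regime(ratio)})")
```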