Wenfu Li, Weifeng Shi, Huijie Qiao, Simon Y W Ho, Arong Luo, Yanzhou Zhang, Chaodong Zhu.
Abstract
BACKGROUND: Since its emergence in March 2009, the pandemic 2009 H1N1 influenza A virus has posed a serious threat to public health. To trace the evolutionary path of these new pathogens, we performed a selection-pressure analysis of a large number of hemagglutinin (HA) and neuraminidase (NA) gene sequences of H1N1 influenza viruses from different hosts.
Year: 2011 PMID: 21507270 PMCID: PMC3094300 DOI: 10.1186/1743-422X-8-183
Source DB: PubMed Journal: Virol J ISSN: 1743-422X Impact factor: 4.099
Positively selected sites in hemagglutinin from viruses from different clusters
| Dataset | Description | Number of sequences | Length of alignment (bp) | Number of sites under positive selection (SLAC) | Number of sites under positive selection (FEL) | HA1 sites¹ (SLAC) | HA1 sites¹ (FEL) | HA2 sites¹ (SLAC) | HA2 sites¹ (FEL) |
|---|---|---|---|---|---|---|---|---|---|
| 1.1 | Avian strains | 75 | 1647 | 0 | 0 | | | | |
| 1.2 | North American swine strains | 196 | 1647 | 0 | 1 | | 138 | | |
| 1.3 | Eurasian swine strains | 90 | 1647 | 0 | 1 | | | | 399 |
| 1.4 | Seasonal human strains | 1404 | 1647 | 8 | 8 | 82B, 94B, 141B, 162BG, 186B, 187BR, 222BR | 82B, 94B, 160BG, 162BG, 186B, 187BR, 222BR | 451T | 451T |
| 1.5 | Pandemic 2009 human strains | 1891 | 1647 | 7 | 9 | 186B, 197, 203, 205B, 222BR, 223B, 261BT | 186B, 197, 203, 222BR, 261BT | | 411T, 451T, 460T, 530T |
¹ In these columns, B indicates that the site lies in the B-cell antigenic regions [18], G that it is a potential glycosylation site, T that it lies in the T-cell antigenic regions [19], and R that it is a receptor-binding site [11]. We use the same numbering strategy as Deem and Pan [18] and start numbering from the amino acids DTLC.
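The site labels in these tables combine a residue number with the suffix letters defined in the footnotes. As a minimal sketch of how such labels could be split for downstream use, the Python below parses one label or a comma-separated cell. The names `FLAG_MEANINGS`, `parse_site`, and `parse_site_list` are hypothetical helpers introduced here for illustration and are not part of the original analysis; the `A` flag is taken from the neuraminidase table footnote.

```python
import re

# Suffix letters as defined in the table footnotes. "A" (drug resistance)
# is used only in the neuraminidase table. This mapping is an illustrative
# assumption about how the labels might be encoded, not the authors' code.
FLAG_MEANINGS = {
    "B": "B-cell antigenic region",
    "T": "T-cell antigenic region",
    "G": "potential glycosylation site",
    "R": "receptor-binding site",
    "A": "associated with drug resistance",
}

# A label is a residue number followed by zero or more capital letters.
_SITE_RE = re.compile(r"^(\d+)([A-Z]*)$")


def parse_site(label):
    """Split a label such as '162BG' into (162, ['B', 'G'])."""
    match = _SITE_RE.match(label.strip())
    if match is None:
        raise ValueError(f"unrecognised site label: {label!r}")
    number, flags = match.groups()
    unknown = [flag for flag in flags if flag not in FLAG_MEANINGS]
    if unknown:
        raise ValueError(f"unknown flag(s) {unknown} in {label!r}")
    return int(number), list(flags)


def parse_site_list(cell):
    """Parse a comma-separated table cell into a list of (number, flags)."""
    return [parse_site(part) for part in cell.split(",") if part.strip()]
```

For example, `parse_site_list("186B,197,203,205B")` yields one `(number, flags)` pair per site, with an empty flag list for unannotated sites such as 197.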
Figure 1. Selection pressure on HA genes from different clusters. Global ω of the HA genes of each cluster. Error bars show 95% confidence intervals.
Figure 2. Selection pressure on NA genes from different clusters. Global ω of the NA genes of each cluster. Error bars show 95% confidence intervals.
Positively selected sites in neuraminidase from viruses from different clusters
| Dataset | Description | Number of sequences | Length of alignment (bp) | Number of sites under positive selection (SLAC) | Number of sites under positive selection (FEL) | Positively selected sites¹ (SLAC) | Positively selected sites¹ (FEL) |
|---|---|---|---|---|---|---|---|
| 2.1 | Avian strains | 137 | 1368 | 0 | 0 | | |
| 2.2 | North American swine strains | 171 | 1368 | 3 | 7 | 46TG, 396B, 453T | 46TG, 53T, 75, 127, 285, 339B, 453T |
| 2.3 | Eurasian swine strains | 92 | 1368 | 0 | 1 | | 75 |
| 2.4 | Seasonal human strains | 1735 | 1371 | 5 | 6 | 84, 151, 222, 275A, 382 | 84, 151, 275A, 344B, 365BG, 382 |
| 2.5 | Pandemic 2009 human strains | 1397 | 1368 | 0 | 2 | | 35, 453T |
¹ In the site columns, B indicates that the site lies in the B-cell antigenic regions [21], G that it is a potential glycosylation site, T that it lies in the T-cell antigenic regions [19], and A that it is associated with drug resistance [7]. We start numbering from the amino acids MNPN.
Sites under differential selection between isolates from seasonal human and the pandemic 2009 clusters
| Protein | Site¹ | P-value | Pandemic 2009 strains² | Seasonal human strains² |
|---|---|---|---|---|
| HA | 34BT | 0.00286 | E1886/G2/X2/K1 | E1404 |
| HA | 39 | 0.00054 | G1887/E3/R1 | G1404 |
| HA | 86B | 0.00233 | D1856/G31/N2/Y1/E1 | E1383/K19/G1/Q1 |
| HA | 94B | 0.00014 | D1885/N4/E1/X1 | Y769/H574/D38/N19/Q2/A1/R1 |
| HA | 153BT | 0.00786 | K1891 | G1132/E202/R28/K20/X18/V4 |
| HA | 160BG | 0.00073 | K1890/E1 | N1345/K27/X12/S7/T7/D2/E2/A1/I1 |
| HA | 187BR | 0.00445 | D1884/G5/X2 | D1072/X135/N132/V30/E23/G9/A2/I1 |
| HA | 197 | 0.00073 | A1851/T23/S17 | A1401/T3 |
| HA | 202B | 0.00318 | G1886/X4/W1 | V1378/L17/A5/M4 |
| HA | 203 | 0.00828 | T1341/S542/X7/A1 | S1392/T10/F1/X1 |
| HA | 224B | 0.00973 | E1885/Q1/X5 | E1342/A45/H5/P5/K4/T2/S1 |
| HA | 234 | 0.00628 | V1883/I7/L1 | L1404 |
| HA | 237B | 0.00445 | G1890/X1 | G1379/R19/E6 |
| HA | 250T | 0.00002 | V1884/I4/A2/L1 | A1404 |
| HA | 282 | 0.00653 | P1872/L10/S7/T1/X1 | P1404 |
| HA | 302B | 0.00098 | K1880/E8/N1/T1/R1 | E1403/G1 |
| HA | 339 | 0.00174 | G1888/R2/E1 | G1404 |
| HA | 374 | 0.00550 | E1608/K275/G8 | G1398/R4 |
| HA | 391 | 0.00564 | T1888/A2/I1 | T1404 |
| HA | 399 | 0.00303 | H1890/X1 | K1383/N16/R3/E1/T1 |
| HA | 430T | 0.00605 | E1889/D1/K1 | E1404 |
| HA | 473T | 0.00803 | N1890/D1 | D1001/N403 |
| HA | 527T | 0.00513 | V1873/I12/A2/L2/X2 | L1403/V1 |
| HA | 541T | 0.00052 | G1866/W12/H6/M3/Y1/-3 | G1404 |
| NA | 2 | 0.00001 | N1313/-79/I2/S1/X1 | N1532/-200/K3 |
| NA | 6 | 0.00009 | K1350/-42/N3/R3 | K1634/R14/-87 |
| NA | 35 | 0.00151 | S1392/G2/C1/I1/V1 | S1733/N1/G1 |
| NA | 52TG | 0.00510 | S1397 | R1686/K17/S17/N9/G4/X1/-1 |
| NA | 254 | 0.00188 | K1396/X1 | K1581/R154 |
| NA | 257 | 0.00119 | R1388/K8/X1 | K1735 |
¹ In this column, B indicates that the site lies in the B-cell antigenic regions [18], G that it is a potential glycosylation site, T that it lies in the T-cell antigenic regions [19], and R that it is a receptor-binding site [11]. We use the same numbering strategy for HA as Deem and Pan [18] and start numbering from the amino acids DTLC. The NA numbering starts from MNPN.
² In these columns, capital letters stand for amino acids and the numbers following them indicate the number of times they occur in the alignment. X indicates codons that are not translated properly.
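The composition notation described in the footnote above (for example `T1341/S542/X7/A1`) can be turned into counts and frequencies mechanically. The Python below is a minimal sketch under stated assumptions: `parse_composition` and `major_residue` are hypothetical helper names, and the `-` symbol that appears in some cells is assumed here to denote an alignment gap, which the footnotes do not state explicitly.

```python
def parse_composition(cell):
    """Parse 'E1886/G2/X2/K1' into {'E': 1886, 'G': 2, 'X': 2, 'K': 1}.

    Each token is one symbol (amino-acid letter, 'X' for a codon that is
    not translated properly, or '-' assumed to be a gap) followed by the
    number of sequences carrying it. Repeated symbols are summed.
    """
    counts = {}
    for token in cell.split("/"):
        token = token.strip()
        if not token:
            continue
        symbol, number = token[0], token[1:]
        if not number.isdigit():
            raise ValueError(f"malformed token: {token!r}")
        counts[symbol] = counts.get(symbol, 0) + int(number)
    return counts


def major_residue(cell):
    """Return (symbol, frequency) for the most common symbol in a cell."""
    counts = parse_composition(cell)
    total = sum(counts.values())
    symbol = max(counts, key=counts.get)
    return symbol, counts[symbol] / total
```

Applied to HA site 203, `major_residue("T1341/S542/X7/A1")` reports T as the majority residue among the pandemic 2009 sequences, while `major_residue("S1392/T10/F1/X1")` reports S among the seasonal human sequences.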
Sites in hemagglutinin under differential selection between isolates from North American swine and the pandemic 2009
| Site¹ | P-value | North American swine² | Pandemic 2009² |
|---|---|---|---|
| 31T | 0.00512 | N196 | N1879/D8/X3/S1 |
| 32T | 0.00222 | L196 | L1795/I94/X2 |
| 34BT | 0.00190 | E196 | E1886/G2/X2/K1 |
| 39 | 0.00266 | G196 | G1887/E3/R1 |
| 48B | 0.00008 | A196 | A1879/X6/T3/V1/P1/S1 |
| 154BT | 0.00719 | K195/R1 | K1885/E2/X2/N1/T1 |
| 189B | 0.00876 | Q192/E3/L1 | Q1891 |
| 197 | 0.00014 | A192/T3/S1 | A1851/T23/S17 |
| 203 | 0.00519 | S178/T17/P1 | T1341/S542/X7/A1 |
| 205B | 0.00828 | K166/R20/T8/Q2 | R1852/K32/G2/X2/T1 |
| 207B | 0.00286 | N144/S43/Y5/D4 | S1891 |
| 223BR | 0.00199 | Q196 | Q1858/R18/X15 |
| 232 | 0.00890 | T196 | T1887/A1/I1/K1/X1 |
| 263B | 0.00231 | S196 | S1889/F1/P1 |
| 304 | 0.00468 | P196 | P1875/S15/X1 |
| 306B | 0.00929 | Y196 | Y1888/H3 |
| 374 | 0.00001 | G183/R13 | E1608/K275/G8 |
| 411T | 0.00062 | V196 | V1839/I51/X1 |
| 427T | 0.00159 | V196 | V1886/I5 |
| 434T | 0.00550 | T196 | T1885/N4/X2 |
| 458T | 0.00687 | K196 | K1885/N3/R2/E1 |
| 478T | 0.00558 | S196 | S1889/G1/N1 |
| 479T | 0.00355 | V180/I16 | V1891 |
| 530T | 0.00409 | L196 | L1889/G1/X1 |
| 547T | 0.00645 | I195/-1 | I1847/-22/K8/V5/T4/T3/C1/X1 |
¹ In this column, B indicates that the site lies in the B-cell antigenic regions [18], T that it lies in the T-cell antigenic regions [19], and R that it is a receptor-binding site [11]. We use the same numbering strategy as Deem and Pan [18] and start numbering from the amino acids DTLC.
² In these columns, capital letters stand for amino acids and the numbers following them indicate the number of times they occur in the alignment. X indicates codons that are not translated properly.
Sites in neuraminidase under differential selection between isolates from Eurasian swine and the pandemic 2009
| Site¹ | P-value | Eurasian swine² | Pandemic 2009² |
|---|---|---|---|
| 35 | 0.00312 | S92 | S1392/G2/C1/I1/V1 |
| 321T | 0.00425 | I58/V34 | I1395/X2 |
| 381 | 0.00634 | T92 | T1391/I4/N2 |
| 452T | 0.00049 | T92 | T1385/-7/A2/I1/S1/N1 |
| 453T | 0.00003 | V92 | V1383/-8/M4/G2 |
¹ In this column, T indicates that the site lies in the T-cell antigenic regions [19]. The NA numbering starts from MNPN.
² In these columns, capital letters stand for amino acids and the numbers following them indicate the number of times they occur in the alignment. X indicates codons that are not translated properly.