Shunpu Zhang, Zhong Li, Kevin Beland, Guoqing Lu.
Abstract
BACKGROUND: Clustering is a common technique used by molecular biologists to group homologous sequences and study evolution. There remain issues such as how to cluster molecular sequences accurately and in particular how to evaluate the certainty of clustering results.
Keywords: Bootstrap; Certainty; Influenza A hemagglutinin (HA); Model-based clustering; Multidimensional scaling
Year: 2016 PMID: 27439701 PMCID: PMC4955158 DOI: 10.1186/s12859-016-1147-x
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1 Mardia criterion for selecting d, the number of dimensions for MDS
Fig. 2 The 3D MDS plot of highly pathogenic avian influenza (HPAI) H5N1 HA sequences
Fig. 3 The BIC values corresponding to different numbers of clusters for highly pathogenic avian influenza (HPAI) H5N1 HA sequences
Fig. 4 The Mclust results from 3D-MDS location data of highly pathogenic avian influenza (HPAI) H5N1 HA sequences
Fig. 5 The 3D representation of highly pathogenic avian influenza (HPAI) H5N1 HA sequences
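The Mardia criterion named in the Fig. 1 caption can be illustrated with a minimal sketch. It assumes the common form of the criterion, in which the share of total variation captured by the first d eigenvalues of the doubly centered distance matrix is the sum of their absolute values divided by the sum of all absolute eigenvalues. The function names, the 0.9 threshold, and the eigenvalues below are hypothetical and are not taken from the paper.

```python
def mardia_criterion(eigenvalues, d):
    """Share of total variation captured by the d largest eigenvalues.

    Absolute values are used so that small negative eigenvalues, which
    arise when the distances are not exactly Euclidean, are handled.
    """
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(abs(v) for v in ordered)
    return sum(abs(v) for v in ordered[:d]) / total


def select_dimension(eigenvalues, threshold=0.9):
    """Smallest d whose Mardia criterion reaches the threshold (assumed cutoff)."""
    for d in range(1, len(eigenvalues) + 1):
        if mardia_criterion(eigenvalues, d) >= threshold:
            return d
    return len(eigenvalues)


# Hypothetical eigenvalues for illustration only (not from the paper).
example_eigenvalues = [5.0, 3.0, 1.5, 0.3, 0.1, -0.1]
chosen_d = select_dimension(example_eigenvalues, threshold=0.9)
print(chosen_d)
```

With these made-up eigenvalues, two dimensions capture about 80 % of the variation and three capture about 95 %, so a 0.9 cutoff selects three dimensions.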
The certainties of clusters and overall clustering for highly pathogenic avian influenza (HPAI) H5N1 HA sequences
| Cluster ID | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | Overall |
|---|---|---|---|---|---|---|---|---|---|---|
| Subset-bootstrap (7.5 %) | 0.93 | 1.00 | 0.98 | 0.97 | 0.92 | 0.96 | 1.00 | 0.99 | 1.00 | 0.95 |
| Standard bootstrap method | 0.76 | 0.99 | 0.72 | 0.90 | 0.67 | 0.66 | 0.83 | 0.76 | 0.97 | 0.69 |
Fig. 6 The BIC values corresponding to different numbers of clusters for influenza H7 HA sequences
Fig. 7 The 3D representation of 10 clusters for influenza H7 HA sequences
Certainties of influenza A (H7) HA sequences assigned to a specific cluster a
| Cluster ID | Strain Name | Certainty |
|---|---|---|
| 1 | A/chicken/NJ/17206/99 | 0.88 |
| | A/Goose/New_Jersey/8600–3/98 | 0.97 |
| 2 | A/chicken/FL/90348–4/01 | 0.54 |
| | A/avian/NY/74211–2/00 | 0.98 |
| | A/chicken/Pennsylvania/143586/2002 | 0.99 |
| | A/avian/NY/81746–5/00 | 0.95 |
| | A/avian/NY/70411–12/00 | 0.99 |
| | A/unknown/NY/85161/2000 | 0.77 |
| | A/chicken/NY/1398–6/99 | 0.97 |
| | A/chicken/NY/22409–4/99 | 0.98 |
| | A/avian/NY/76247–3/00 | 0.99 |
| | A/Chicken/New_Jersey/20621/99 | 0.99 |
| | A/Chicken/NJ/16224–6/99 | 0.99 |
| 3 | A/mallard/Delaware/418/2005 | 0.96 |
| 6 | A/turkey/England/647/77 | 0.84 |
| 8 | A/swan/Shimane/42/1999 | 0.87 |
| | A/turkey/Italy/4479/2004 | 0.73 |
| | A/turkey/Italy/2856/2003 | 0.91 |
| | A/turkey/Germany–NW/R655/2009 | 0.78 |
| | A/turkey/Germany-NW/R655/2009 | 0.78 |
| | A/duck/Mongolia/47/2012 | 0.76 |
| | A/wild_goose/Dongting/PC0360/2012 | 0.80 |
| | A/duck/Fukui/160104/2012 | 0.99 |
| | A/duck/Iwate/0303001/2012 | 0.99 |
| | A/mallard/Poland/01/08 | 0.82 |
| | A/duck/Turkey/55/Cetinkaya/49/2006 | 0.90 |
| | A/teal/Crimea/2027/2008 | 0.98 |
| 9 | A/duck/Mongolia/720/2007 | 0.57 |
| | A/turkey/Italy/3337/2004 | 0.96 |
| | A/quail/Italy/3347/2004 | 0.96 |
| | A/turkey/Italy/4130/2004 | 0.84 |
| | A/turkey/Italy/3439/2004 | 0.89 |
| | A/turkey/Italy/3829/2004 | 0.97 |
| | A/turkey/Italy/3399/2004 | 0.82 |
| | A/turkey/Italy/3477/2004 | 0.87 |
| | A/turkey/Italy/3807/2004 | 0.87 |
| | A/turkey/Italy/4042/2004 | 0.82 |
| | A/turkey/Italy/2685/2003 | 0.59 |
| | A/turkey/Italy/2043/2003 | 0.62 |
| | A/duck/Italy/4609/2003 | 0.87 |
| | A/quail/Italy/4610/2003 | 0.98 |
| | A/chicken/Italy/1285/2000 | 0.98 |
| | A/duck/Denmark/53–147–8/2008 | 0.90 |
| | A/shoveler/Italy/2698–27/2006 | 0.85 |
| | A/mallard/Netherlands/22/2007 | 0.65 |
| | A/mallard/Sweden/95/2005 | 0.96 |
| | A/Mallard/Sweden/S90597/2005 | 0.73 |
| | A/chicken/England/4054/2006 | 0.96 |
| | A/tufted_duck/PT/13771/2006 | 0.82 |
| | A/mute_swan/Hungary/5973/2007 | 0.98 |
a Sequences not listed have a certainty of over 0.99
The certainties of clusters and overall clustering for influenza A (H7) HA sequences
| Cluster ID | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | Overall |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Subset bootstrap (10 %) | 0.84 | 0.85 | 0.93 | 0.73 | 0.67 | 0.43 | 0.66 | 0.40 | 0.90 | 0.98 | 0.82 |
| Standard bootstrap method | 0.72 | 0.78 | 0.83 | 0.39 | 0.65 | 0.34 | 0.59 | 0.24 | 0.78 | 0.90 | 0.67 |
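One generic way to obtain per-sequence and per-cluster certainties of the kind tabulated above is to count how often each sequence keeps its reference cluster across bootstrap replicates. The sketch below is only an illustration under that assumption; it is not the paper's subset-bootstrap or standard bootstrap procedure, and the function names and labels are hypothetical. It also assumes that cluster labels in each replicate have already been matched to the reference labels.

```python
from collections import defaultdict


def sequence_certainty(reference, replicates):
    """Fraction of replicates in which each item keeps its reference cluster."""
    n_rep = len(replicates)
    return [
        sum(rep[i] == ref for rep in replicates) / n_rep
        for i, ref in enumerate(reference)
    ]


def cluster_certainty(reference, certainties):
    """Mean per-item certainty within each reference cluster (assumed definition)."""
    grouped = defaultdict(list)
    for label, value in zip(reference, certainties):
        grouped[label].append(value)
    return {label: sum(vals) / len(vals) for label, vals in grouped.items()}


# Hypothetical labels: 4 sequences, 4 label-aligned bootstrap replicates.
reference = [1, 1, 2, 2]
replicates = [
    [1, 1, 2, 2],
    [1, 2, 2, 2],
    [1, 1, 2, 1],
    [1, 1, 2, 2],
]
per_sequence = sequence_certainty(reference, replicates)
per_cluster = cluster_certainty(reference, per_sequence)
print(per_sequence, per_cluster)
```

Averaging member certainties is only one possible cluster-level summary; an overall value could be formed the same way across all sequences.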