Yun Zhang, Brian D Aevermann, Tavis K Anderson, David F Burke, Gwenaelle Dauphin, Zhiping Gu, Sherry He, Sanjeev Kumar, Christopher N Larsen, Alexandra J Lee, Xiaomei Li, Catherine Macken, Colin Mahaffey, Brett E Pickett, Brian Reardon, Thomas Smith, Lucy Stewart, Christian Suloway, Guangyu Sun, Lei Tong, Amy L Vincent, Bryan Walters, Sam Zaremba, Hongtao Zhao, Liwei Zhou, Christian Zmasek, Edward B Klem, Richard H Scheuermann.
Abstract
The Influenza Research Database (IRD) is a U.S. National Institute of Allergy and Infectious Diseases (NIAID)-sponsored Bioinformatics Resource Center dedicated to providing bioinformatics support for influenza virus research. IRD facilitates the research and development of vaccines, diagnostics and therapeutics against influenza virus by providing a comprehensive collection of influenza-related data integrated from various sources, a growing suite of analysis and visualization tools for data mining and hypothesis generation, personal workbench spaces for data storage and sharing, and active user community support. Here, we describe the recent improvements in IRD including the use of cloud and high performance computing resources, analysis and visualization of user-provided sequence data with associated metadata, predictions of novel variant proteins, annotations of phenotype-associated sequence markers and their predicted phenotypic effects, hemagglutinin (HA) clade classifications, an automated tool for HA subtype numbering conversion, linkouts to disease event data and the addition of host factor and antiviral drug components. All data and tools are freely available without restriction from the IRD website at https://www.fludb.org.
Year: 2016 PMID: 27679478 PMCID: PMC5210613 DOI: 10.1093/nar/gkw857
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. A phylogenetic tree constructed from a combination of user-provided sequences (downloaded from GISAID) and IRD sequences, and visualized in the IRD tree viewer. Tree leaves are color-coded by subtype. The green brace indicates user-provided sequences colored by user-provided HA and NA subtype metadata.
Figure 2. Variant protein annotations in Influenza Research Database (IRD). (A) The IRD Protein Sequence Search page supports queries based on ‘classical proteins’, ‘variant proteins’ and sequence-associated metadata. (B) A portion of the Protein Sequence Search Results page from a query of PA-X, showing annotations of three PA-X variants: PA-X (+41), PA-X (+61) and PA-X (other). Selected records from this page can be input to any of the analysis tools under the ‘Run Analysis’ dropdown menu (red arrow), or downloaded to a local computer.
Variant protein annotations in IRD
| Variant Protein | Variant Protein from Complete Genomes | Percentage | Source |
|---|---|---|---|
| PB1-F2 | 19 701 | 69.8% | GenBank |
| PB1-N40 | 27 909 | 98.9% | IRD |
| PA-N155 | 28 086 | 99.3% | IRD |
| PA-N182 | 26 099 | 92.2% | IRD |
| PA-X | 27 996 | 98.9% | IRD |
| PA-X protein(+41) | 8199 | 29.0% | GenBank & IRD |
| PA-X protein(+61) | 19 721 | 69.7% | GenBank & IRD |
| PA-X protein(other) | 76 | 0.3% | GenBank & IRD |
| PA-X protein | 2834 | 10.0% | GenBank |
| M42 | 69 | 0.2% | IRD |
| NS3 | 30 | 0.1% | IRD |
Figure 3. Phenotypic Variant Type (PVT) annotation in IRD. (A) A portion of the Strain Details page for A/Nanjing/1/2013 (H7N9) shows that this human isolate carries the PVTs Influenza A_PB2_determinant-of-virulence_591(1)_591K_increased-virulence, which confers increased virulence, and Influenza A_PB2_tissue-tropism_701(1)_701D_Systemic-infection, which confers systemic infection in mouse models, but does not carry Influenza A_PB2_polymerase-activity_28(5)_28I, 274T, 526R, 553V, 607V_Decreased-polymerase-activity, which confers reduced polymerase activity. (B) The Sequence Feature Details page for the Influenza A_PB2_determinant-of-virulence_591(1)_591K_increased-virulence showing the SF metadata and variant type (VT) calculation. Within the IRD database, 88 strains, including 8 human strains, carry this PVT (VT-4). The strain count column links to all strains harboring the corresponding VT. (C) Host and subtype distribution of VT-4 from panel B.
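The PVT names in Figure 3 encode their marker residues as position-residue pairs such as 591K or 28I, 274T, 526R, 553V, 607V. A minimal sketch of how such a marker list could be checked against a protein sequence follows. It is not IRD's implementation: the function names `parse_markers` and `carries_variant` are hypothetical, and the sketch assumes positions are 1-based indices into the sequence being tested, whereas a real pipeline would first map the sequence into reference coordinates.

```python
import re
from typing import Dict

# One marker token, e.g. "591K": a 1-based position followed by one residue letter.
_MARKER = re.compile(r"^(\d+)([A-Z])$")


def parse_markers(text: str) -> Dict[int, str]:
    """Parse a comma-separated marker list such as '28I, 274T' into {28: 'I', 274: 'T'}."""
    markers: Dict[int, str] = {}
    for token in text.split(","):
        match = _MARKER.match(token.strip())
        if match is None:
            raise ValueError(f"unrecognized marker: {token!r}")
        markers[int(match.group(1))] = match.group(2)
    return markers


def carries_variant(protein: str, markers: Dict[int, str]) -> bool:
    """Return True only if every marker residue is present at its position.

    Assumption (hypothetical): positions index directly into `protein`, 1-based.
    """
    return all(
        pos <= len(protein) and protein[pos - 1] == residue
        for pos, residue in markers.items()
    )
```

A multi-position PVT is treated here as carried only when all of its listed residues match, which mirrors how the five-residue polymerase-activity PVT is described in the caption.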
Figure 4. HA subtype numbering conversion in IRD. (A) The HA Subtype Numbering Conversion Result page showing the sequence alignment and mapping table for a query H1 sequence mapped into coordinate space for other HA subtypes. The mapping table was used to map all H1 B cell epitopes to all other subtypes. (B) A schematic view of all experimentally determined H1 B cell epitopes in the HA protein. Epitopes are colored based on the average percent amino acid identity across all HA subtypes.
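The mapping table described in Figure 4 translates positions in one HA sequence into the coordinate space of another by way of a sequence alignment. The sketch below illustrates that general idea for a single gapped pairwise alignment. It is an illustration under stated assumptions rather than the IRD tool's algorithm: the function names and the toy sequences are hypothetical, and positions are counted from 1 on the ungapped sequences.

```python
from typing import Dict, List, Optional


def build_position_map(
    aligned_query: str, aligned_target: str, gap: str = "-"
) -> Dict[int, Optional[int]]:
    """Map 1-based query positions to 1-based target positions.

    Both inputs are rows of the same pairwise alignment (equal length, gaps
    written as `gap`). A query residue aligned to a gap maps to None.
    """
    if len(aligned_query) != len(aligned_target):
        raise ValueError("aligned sequences must have equal length")
    mapping: Dict[int, Optional[int]] = {}
    q_pos = 0  # residues consumed so far in the query
    t_pos = 0  # residues consumed so far in the target
    for q_char, t_char in zip(aligned_query, aligned_target):
        if t_char != gap:
            t_pos += 1
        if q_char != gap:
            q_pos += 1
            mapping[q_pos] = t_pos if t_char != gap else None
    return mapping


def convert_region(
    mapping: Dict[int, Optional[int]], start: int, end: int
) -> List[Optional[int]]:
    """Translate an inclusive query region (e.g. an epitope) into target positions."""
    return [mapping.get(pos) for pos in range(start, end + 1)]


# Toy alignment (hypothetical sequences, not real HA data).
toy_map = build_position_map("MKA-ILV", "MK-TILV")
toy_region = convert_region(toy_map, 2, 4)
```

In the toy alignment, query position 3 falls opposite a gap and therefore has no counterpart in the target, which is why the map stores `None` rather than guessing a neighboring position.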
Figure 5. Host factor component in IRD. (A) A portion of the Host Factor Experiments page, showing a list of experiments using A/Vietnam/1203/2004 human isolate (VN1203 (H5N1)) as the viral agent. (B) The Host Factor Bioset Patterns table showing the expression pattern of IFNb in this experiment. (C) A portion of the Enrichment Analysis Result page displaying the terms, the collections (GO categories in this case) and the P-values calculated by the CLASSIFI algorithm using a hypergeometric distribution function.
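Figure 5C refers to enrichment P-values computed from a hypergeometric distribution. As a rough illustration, the standard one-sided (upper-tail) hypergeometric test can be sketched as below. This is the textbook formulation and is not taken from the CLASSIFI source code; the function name, parameter names, and the toy counts in the usage line are hypothetical.

```python
from math import comb


def hypergeom_enrichment_pvalue(
    population: int, annotated: int, sample: int, overlap: int
) -> float:
    """Upper-tail hypergeometric P-value, P(X >= overlap).

    population: total number of genes considered
    annotated:  genes in the population carrying the term (e.g. a GO category)
    sample:     genes in the list being tested (e.g. one expression pattern)
    overlap:    genes in the sample that carry the term
    """
    if not (0 <= annotated <= population and 0 <= sample <= population):
        raise ValueError("annotated and sample must lie between 0 and population")
    if not (0 <= overlap <= min(annotated, sample)):
        raise ValueError("overlap must lie between 0 and min(annotated, sample)")
    total = comb(population, sample)  # all ways to draw the sample
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(overlap, min(annotated, sample) + 1)
    )
    return tail / total


# Toy usage: 10 genes, 4 annotated, 3 sampled, 2 of the sampled are annotated.
toy_p = hypergeom_enrichment_pvalue(10, 4, 3, 2)
```

Exact integer binomial coefficients are used so that the tail sum is not affected by floating-point rounding until the final division; for genome-scale counts a library routine such as a survival function from a statistics package would usually be preferred.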