Diane Eisler1, Dan Fornika1,2, Lauren C Tindale1,2, Tracy Chan1, Suzana Sabaiduc1, Rebecca Hickman1, Catharine Chambers1, Mel Krajden1,2, Danuta M Skowronski1,2, Agatha Jassem1,2, William Hsiao1,2.
Abstract
Influenza viruses continually evolve to evade population immunity, and the different lineages are assigned to clades based on shared mutations. We have developed a publicly available computational workflow, the Influenza Classification Suite, for rapid clade mapping of sequenced influenza viruses. This suite provides a user-friendly workflow implemented in Galaxy to automate clade calling and antigenic site extraction. Workflow input includes clade definition and amino acid index array files, which can be customized to identify any clades of interest. The Influenza Classification Suite provides rapid, high-resolution understanding of circulating influenza strain evolution to inform influenza vaccine effectiveness and the need for potential vaccine reformulation.
Keywords: Galaxy; Influenza; data analysis; genomics; workflow
Year: 2020 PMID: 32064792 PMCID: PMC7182599 DOI: 10.1111/irv.12722
Source DB: PubMed Journal: Influenza Other Respir Viruses ISSN: 1750-2640 Impact factor: 4.380
Description of Galaxy tools in the Influenza Classification Suite
| Galaxy tool name | Workflow description |
|---|---|
| Assign Clades | Uses a clade definition file to assign and append clade designations to sequence names in influenza FASTA files. |
| Antigenic Site Extraction | Uses an influenza subtype‐specific amino acid index array to extract antigenic amino acids from influenza sequences and output to FASTA. |
| Line List | Transforms FASTA files of influenza antigenic maps into line lists. |
| Aggregate Line List | Transforms FASTA files of influenza antigenic maps into line lists, summarizing occurrences of each sequevar. |
| Change FASTA Deflines | Changes sequence names in a FASTA file, according to old and new names specified in a text (.csv or .txt) file. |
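
The Antigenic Site Extraction and Aggregate Line List steps described in the table can be illustrated with a minimal Python sketch. This is not the suite's own code: the function names, the toy sequences, and the index array `[2, 5, 9]` are hypothetical, and positions are assumed to be 1-based.

```python
from collections import Counter

def extract_sites(sequence, positions):
    """Return the residues found at the given 1-based positions, in order."""
    return "".join(sequence[p - 1] for p in positions)

def aggregate(site_maps):
    """Count how often each distinct antigenic map (sequevar) occurs."""
    return Counter(site_maps)

# Hypothetical toy data: three short protein sequences and an index array.
sequences = {
    "strain_1": "MKTIIALSYI",
    "strain_2": "MKTIVALSYI",
    "strain_3": "MKTIIALSYI",
}
index_array = [2, 5, 9]  # assumed 1-based amino acid positions

# One antigenic map per sequence, then a summary of identical maps.
maps = {name: extract_sites(seq, index_array) for name, seq in sequences.items()}
counts = aggregate(maps.values())

for sequevar, n in counts.most_common():
    print(sequevar, n)
```

In this toy case two of the three strains share the same residues at the indexed positions, so the aggregated list has two sequevars rather than three rows.
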
Figure 1. Flowchart illustrating the Influenza Classification Suite in Galaxy. We have created an automated workflow as shown in the center diagram, although all tools can also be run independently.
Figure 2. Cladogram demonstrating how child clades evolve from parent clades over time. Underlined bolded numbers represent the depth parameter as specified in the clade definition file.
Example of clade definition file format (.csv). A template is available to use and edit in our GitHub repository.
| Clade Name | Depth | AA Position 1 | AA Identity 1 | … | AA Position N | AA Identity N |
|---|---|---|---|---|---|---|
| A | 1 | 3 | I | … | 171 | K |
| B | 1 | 3 | I | … | 171 | K |
| A1 | 2 | 3 | I | … | 225 | D |
| A2 | 2 | 3 | I | … | 160 | T |
| B1 | 2 | 3 | I | … | 160 | T |
| B2 | 2 | 3 | I | … | 188 | M |
| A2.1 | 3 | 3 | I | … | 171 | A |
| A2.2 | 3 | 24 | K | … | 188 | R |
AA, amino acid.
Depth is an integer greater than 0, defining the relative ancestry of the clades (eg, parent clade depth = 2 and child clade depth = 3).
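
As a rough illustration of how a clade definition file of this shape could drive clade calling, the sketch below parses rows of the form name, depth, then alternating position and identity pairs, and reports the deepest clade whose markers all match. It rests on several assumptions that the text above does not confirm: the toy definitions are invented and omit the header row, positions are treated as 1-based, and "deepest matching clade wins" is only one plausible reading of the depth parameter. The function names are hypothetical.

```python
import csv
import io

def parse_clade_definitions(text):
    """Parse rows of: name, depth, then (position, identity) pairs."""
    clades = []
    for row in csv.reader(io.StringIO(text)):
        cells = [c.strip() for c in row if c.strip()]
        if not cells:
            continue
        name, depth, rest = cells[0], int(cells[1]), cells[2:]
        # Pair each 1-based position with its expected amino acid.
        markers = {int(rest[i]): rest[i + 1] for i in range(0, len(rest), 2)}
        clades.append((name, depth, markers))
    return clades

def assign_clade(sequence, clades):
    """Return the deepest clade whose markers all match, or None."""
    best = None
    for name, depth, markers in clades:
        matches = all(
            pos <= len(sequence) and sequence[pos - 1] == aa
            for pos, aa in markers.items()
        )
        # Assumption: a child clade (greater depth) overrides its parent.
        if matches and (best is None or depth > best[1]):
            best = (name, depth)
    return best[0] if best else None

# Hypothetical toy definitions, header row omitted.
toy_definitions = """A,1,3,I
A1,2,3,I,5,D
A2,2,3,I,5,T
"""

clades = parse_clade_definitions(toy_definitions)
print(assign_clade("MKIGTL", clades))  # matches A and A2; A2 is deeper
print(assign_clade("MKLGTL", clades))  # no marker set matches
```

A sequence that satisfies both a parent and a child definition is reported under the child here; a sequence matching no definition is left unassigned.
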