Unitsa Sangket1, Sukanya Vijasika1, Hasnee Noh1, Wasun Chantratita2, Chonticha Klungthong3, In Kyu Yoon3, Stefan Fernandez3, Wiriya Rutvisuttinunt3.
Abstract
Influenza virus (IFV) can evolve rapidly, leading to genetic drifts and shifts that result in human and animal influenza epidemics and pandemics. The genetic shift that gave rise to the 2009 influenza A/H1N1 pandemic originated from a triple gene reassortment of avian, swine and human IFVs. Smaller genetic alterations arising from genetic drift can lead to influenza drug resistance, such as the H274Y mutation associated with oseltamivir resistance. Hence, a rapid tool to detect IFV mutations and the potential emergence of new virulent strains can better prepare us for seasonal influenza outbreaks as well as potential pandemics. Furthermore, identification of specific mutations by closely examining single nucleotide polymorphisms (SNPs) in IFV sequences is essential to classify potential genetic markers associated with potentially dangerous IFV phenotypes. In this study, we developed a novel R library called "SNPer" to analyze quantitative variants in SNPs among IFV subpopulations. The computational SNPer program was applied to three different subpopulations of published IFV genomic information. SNPer queried SNP data and grouped the SNPs into (1) universal SNPs, (2) likely common SNPs, and (3) unique SNPs. SNPer outperformed manual visualization in terms of time and labor. SNPer took only three seconds with no errors in SNP comparison events, compared with 40 hours with errors using manual visualization. The SNPer tool can accelerate the capacity to capture new and potentially dangerous IFV strains to mitigate future influenza outbreaks.
Year: 2015 PMID: 25876137 PMCID: PMC4395159 DOI: 10.1371/journal.pone.0122812
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Detected SNPs in HA gene fragment from sample #VIROAF1 (“AF1_HA.csv”).
| id | sample_id | sample_name | chr | position | score | variant_type | call_ | frequency | depth | filter |
|---|---|---|---|---|---|---|---|---|---|---|
| | | VIROAF1 | H3N2_CY121792_HA.seq | | 13099 | SNP | | 1 | 426 | LowGQ |
| | | VIROAF1 | H3N2_CY121792_HA.seq | | 16707 | SNP | | 1 | 567 | LowGQ |
| | | VIROAF1 | H3N2_CY121792_HA.seq | | 23020 | SNP | | 1 | 864 | LowGQ |
| | | VIROAF1 | H3N2_CY121792_HA.seq | | 23194 | SNP | | 1 | 872 | LowGQ |
| | | VIROAF1 | H3N2_CY121792_HA.seq | | 23154 | SNP | | 1 | 852 | LowGQ |
| | | VIROAF1 | H3N2_CY121792_HA.seq | | 17047 | SNP | | 1 | 620 | LowGQ |
| | | VIROAF1 | H3N2_CY121792_HA.seq | | 16337 | SNP | | 1 | 580 | LowGQ |
| | | VIROAF1 | H3N2_CY121792_HA.seq | | 21306 | SNP | | 1 | 791 | LowGQ |
| | | VIROAF1 | H3N2_CY121792_HA.seq | | 20528 | SNP | | 1 | 760 | LowGQ |
| | | VIROAF1 | H3N2_CY121792_HA.seq | | 20446 | SNP | | 1 | 768 | LowGQ |
Each row describes the SNP detected at one position in the HA gene fragment of VIROAF1. For instance, position 51 of the HA gene in VIROAF1 is G while the reference allele is A.
Fig 1. An example of SNPer usage.
The user loads the SNPer library into the current R environment before calling the SNPer function.
Fig 2. SNP comparison workflow.
SNPer, an R library, analyzes SNP data using RMySQL and MySQL and produces SNP comparison data as its output.
List of SNP input files ("input_files_list.csv") to be compared by SNPer.
| sp1 | sp2 | sp3 |
|---|---|---|
| AF1_HA.csv | AF2_HA.csv | AF6_HA.csv |
| AF1_M.csv | AF2_M.csv | AF6_M.csv |
| AF1_NA.csv | AF2_NA.csv | AF6_NA.csv |
| AF1_NEP.csv | AF2_NEP.csv | AF6_NEP.csv |
| AF1_NP.csv | AF2_NP.csv | AF6_NP.csv |
| AF1_PA.csv | AF2_PA.csv | AF6_PA.csv |
| AF1_PB1.csv | AF2_PB1.csv | AF6_PB1.csv |
| AF1_PB2.csv | AF2_PB2.csv | AF6_PB2.csv |
SNPs from eight fragments of sp1, sp2 and sp3 were compared by SNPer. Each SNP comparison among the three viral subpopulations was conducted on the files listed in one row. For instance, for row #1, SNPer compared the SNP data in file “AF1_HA.csv” from population 1 (sp1), file “AF2_HA.csv” from population 2 (sp2) and file “AF6_HA.csv” from population 3 (sp3).
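The row-wise pairing of input files can be sketched as follows. This is a minimal Python illustration, not SNPer's own R code: the function name `comparison_names` is hypothetical, the file list is assumed to be comma-separated with no header row, and the `_vs_` label convention is inferred from the comparison names that appear in the SNPer summary output.

```python
import csv
import io
from pathlib import Path

# Two rows of "input_files_list.csv" as listed in the table
# (assumed here to be comma-separated with no header row).
FILE_LIST = """AF1_HA.csv,AF2_HA.csv,AF6_HA.csv
AF1_M.csv,AF2_M.csv,AF6_M.csv
"""


def comparison_names(file_list_text):
    """Return one (label, files) pair per non-empty row of the file list."""
    jobs = []
    for row in csv.reader(io.StringIO(file_list_text)):
        files = [cell.strip() for cell in row if cell.strip()]
        if not files:
            continue  # skip blank lines
        # Join the file stems, e.g. AF1_HA + AF2_HA + AF6_HA.
        label = "_vs_".join(Path(name).stem for name in files)
        jobs.append((label, files))
    return jobs


jobs = comparison_names(FILE_LIST)
for label, files in jobs:
    print(label, files)
```

Each row yields one three-way comparison, so the eight rows of the list correspond to eight comparisons, one per genome fragment.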
The required table structure of the input data for SNPer computational analysis based on MySQL.
| Field | Type | Null | Key | Default | Extra |
|---|---|---|---|---|---|
| id | int(5) | NO | PRI | NULL | |
| sample_id | char(20) | YES | | NULL | |
| sample_name | char(20) | YES | | NULL | |
| chr | char(100) | YES | | NULL | |
| position | int(20) | YES | | NULL | |
| score | int(20) | YES | | NULL | |
| variant_type | char(20) | YES | | NULL | |
| call_ | char(8) | YES | | NULL | |
| frequency | int(5) | YES | | NULL | |
| depth | int(10) | YES | | NULL | |
| filter | char(20) | YES | | NULL | |
The table structure of the input data in the SNPer database can be retrieved by sql command (DESCRIBE
Fig 3. Comparison of SNPs from three different IFV subpopulations.
Area A contains universal SNPs. Areas B, C and D consist of likely common SNPs. Areas E, F and G contain unique SNPs.
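The seven areas of the diagram correspond to set operations on the three SNP lists. The sketch below is a Python illustration under stated assumptions, not SNPer's implementation (which works through MySQL queries from R): the function name `classify_snps` is hypothetical, two SNPs are assumed to be "the same" when both position and call match, and the sample data are made up for the example.

```python
def classify_snps(sp1, sp2, sp3):
    """Split the SNPs of three subpopulations into the seven Venn regions.

    Each SNP is a (position, call) tuple; matching on both fields is an
    assumption made for this sketch.
    """
    s1, s2, s3 = set(sp1), set(sp2), set(sp3)
    return {
        # Universal SNPs: present in all three subpopulations (area A).
        "universal": s1 & s2 & s3,
        # Likely common SNPs: shared by exactly two subpopulations.
        "sp1_sp2_only": (s1 & s2) - s3,
        "sp2_sp3_only": (s2 & s3) - s1,
        "sp3_sp1_only": (s3 & s1) - s2,
        # Unique SNPs: present in a single subpopulation.
        "sp1_only": s1 - s2 - s3,
        "sp2_only": s2 - s1 - s3,
        "sp3_only": s3 - s1 - s2,
    }


# Illustrative (made-up) SNP lists.
sp1 = [(405, "A->[G/G]"), (413, "G->[A/A]"), (100, "C->[T/T]"), (200, "A->[G/G]")]
sp2 = [(405, "A->[G/G]"), (413, "G->[A/A]"), (300, "G->[A/A]")]
sp3 = [(405, "A->[G/G]"), (413, "G->[A/A]"), (200, "A->[G/G]"), (400, "T->[C/C]")]

regions = classify_snps(sp1, sp2, sp3)
counts = {name: len(snps) for name, snps in regions.items()}
print(counts)
```

Because the seven regions are disjoint, every SNP is counted exactly once, which is what makes the per-region counts comparable across fragments.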
The outputs of SNPer for the eight fragments of the three different IFVs (“summary.csv”).
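| Comparison | SNPs in sp1 | SNPs in sp2 | SNPs in sp3 | Universal | Only sp1 | Only sp2 | Only sp3 | sp1 and sp2 only | sp2 and sp3 only | sp3 and sp1 only |
|---|---|---|---|---|---|---|---|---|---|---|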
| AF1_HA_vs_AF2_HA_vs_AF6_HA | 27 | 22 | 29 | 12 | 6 | 10 | 8 | 0 | 0 | 9 |
| AF1_M_vs_AF2_M_vs_AF6_M | 8 | 6 | 7 | 4 | 1 | 2 | 0 | 0 | 0 | 3 |
| AF1_NA_vs_AF2_NA_vs_AF6_NA | 13 | 12 | 16 | 5 | 3 | 6 | 5 | 0 | 1 | 5 |
| AF1_NEP_vs_AF2_NEP_vs_AF6_NEP | 5 | 4 | 5 | 2 | 0 | 2 | 0 | 0 | 0 | 3 |
| AF1_NP_vs_AF2_NP_vs_AF6_NP | 11 | 9 | 10 | 3 | 3 | 6 | 2 | 0 | 0 | 5 |
| AF1_PA_vs_AF2_PA_vs_AF6_PA | 15 | 8 | 18 | 3 | 3 | 5 | 6 | 0 | 0 | 9 |
| AF1_PB1_vs_AF2_PB1_vs_AF6_PB1 | 18 | 21 | 21 | 9 | 3 | 11 | 5 | 0 | 1 | 6 |
| AF1_PB2_vs_AF2_PB2_vs_AF6_PB2 | 31 | 16 | 26 | 11 | 7 | 5 | 2 | 0 | 0 | 13 |
Each row shows the number of SNPs of VIROAF1 (sp1), VIROAF2 (sp2) and VIROAF6 (sp3) for each fragment after comparison by SNPer. For example, in the first row, the HA fragment contains 27 SNPs in VIROAF1, 22 SNPs in VIROAF2 and 29 SNPs in VIROAF6. Twelve universal SNPs are present in all of VIROAF1, VIROAF2 and VIROAF6. There are six SNPs found only in VIROAF1, 10 SNPs found only in VIROAF2 and eight SNPs found only in VIROAF6. Nine SNPs are shared only by VIROAF6 and VIROAF1.
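The counts in each row are internally consistent: the total for a subpopulation equals the universal SNPs plus its unique SNPs plus the two pairwise-shared groups it belongs to. For the HA row, sp1 has 12 + 6 + 0 + 9 = 27. The Python sketch below checks this for all eight rows. The order of the three pairwise columns (sp1 and sp2, sp2 and sp3, sp3 and sp1) is an assumption; the text confirms only the last one, and the first two are inferred because this order is the one that makes every row add up.

```python
# Rows of "summary.csv" as reported in the table:
# (fragment, sp1, sp2, sp3, universal, only_sp1, only_sp2, only_sp3,
#  sp1_sp2, sp2_sp3, sp3_sp1)  <- pairwise column order is an assumption
SUMMARY = [
    ("HA", 27, 22, 29, 12, 6, 10, 8, 0, 0, 9),
    ("M", 8, 6, 7, 4, 1, 2, 0, 0, 0, 3),
    ("NA", 13, 12, 16, 5, 3, 6, 5, 0, 1, 5),
    ("NEP", 5, 4, 5, 2, 0, 2, 0, 0, 0, 3),
    ("NP", 11, 9, 10, 3, 3, 6, 2, 0, 0, 5),
    ("PA", 15, 8, 18, 3, 3, 5, 6, 0, 0, 9),
    ("PB1", 18, 21, 21, 9, 3, 11, 5, 0, 1, 6),
    ("PB2", 31, 16, 26, 11, 7, 5, 2, 0, 0, 13),
]


def totals_from_regions(universal, only1, only2, only3, s12, s23, s31):
    """Rebuild the per-subpopulation totals from the seven region counts."""
    return (
        universal + only1 + s12 + s31,  # sp1
        universal + only2 + s12 + s23,  # sp2
        universal + only3 + s23 + s31,  # sp3
    )


# Collect fragments whose reported totals disagree with the region counts.
mismatches = [
    fragment
    for fragment, t1, t2, t3, *regions in SUMMARY
    if totals_from_regions(*regions) != (t1, t2, t3)
]
print("inconsistent rows:", mismatches)
```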
Fig 4. The SNP composition output chart of three HA sequences from three IFV subpopulations.
Each circle represents the number of SNPs of VIROAF1 (sp1), VIROAF2 (sp2) and VIROAF6 (sp3) for the HA fragment; universal SNPs; unique SNPs; and likely common SNPs.
The allelic list of the universal SNPs found in HA fragment (“AF1_HA_vs_AF2_HA_vs_AF3_HA_sim_all.csv”).
| id | sample_id | sample_name | id | sample_id | sample_name | id | sample_id | sample_name | position | call_ |
|---|---|---|---|---|---|---|---|---|---|---|
| 8 | VIROAF1 | VIROAF1 | 5 | VIROAF2 | VIROAF2 | 6 | VIROAF6 | VIROAF6 | 405 | A->[G/G] |
| 9 | VIROAF1 | VIROAF1 | 6 | VIROAF2 | VIROAF2 | 7 | VIROAF6 | VIROAF6 | 413 | G->[A/A] |
| 11 | VIROAF1 | VIROAF1 | 9 | VIROAF2 | VIROAF2 | 11 | VIROAF6 | VIROAF6 | 482 | A->[G/G] |
| 13 | VIROAF1 | VIROAF1 | 10 | VIROAF2 | VIROAF2 | 14 | VIROAF6 | VIROAF6 | 629 | C->[T/T] |
| 14 | VIROAF1 | VIROAF1 | 11 | VIROAF2 | VIROAF2 | 15 | VIROAF6 | VIROAF6 | 640 | G->[T/T] |
| 17 | VIROAF1 | VIROAF1 | 13 | VIROAF2 | VIROAF2 | 16 | VIROAF6 | VIROAF6 | 715 | G->[A/A] |
| 19 | VIROAF1 | VIROAF1 | 16 | VIROAF2 | VIROAF2 | 21 | VIROAF6 | VIROAF6 | 973 | A->[G/G] |
| 21 | VIROAF1 | VIROAF1 | 17 | VIROAF2 | VIROAF2 | 22 | VIROAF6 | VIROAF6 | 1195 | A->[C/C] |
| 23 | VIROAF1 | VIROAF1 | 18 | VIROAF2 | VIROAF2 | 25 | VIROAF6 | VIROAF6 | 1323 | A->[G/G] |
| 24 | VIROAF1 | VIROAF1 | 19 | VIROAF2 | VIROAF2 | 26 | VIROAF6 | VIROAF6 | 1341 | T->[G/G] |
| 26 | VIROAF1 | VIROAF1 | 21 | VIROAF2 | VIROAF2 | 28 | VIROAF6 | VIROAF6 | 1606 | C->[T/T] |
| 27 | VIROAF1 | VIROAF1 | 22 | VIROAF2 | VIROAF2 | 29 | VIROAF6 | VIROAF6 | 1671 | A->[G/G] |
The twelve universal SNPs detected among the three subpopulations in the HA fragment are A405G, G413A, A482G, C629T, G640T, G715A, A973G, A1195C, A1323G, T1341G, C1606T and A1671G when compared with the influenza A/H3N2 reference (GenBank CY121792). Each row displays one SNP position.
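The short mutation names can be derived mechanically from the `position` and `call_` columns of the table. The following Python sketch shows one way to do so; the helper `to_mutation` and its regular expression are hypothetical, and the assumption is that every call has the form `REF->[ALT/ALT]` with both alleles identical, as in the rows above.

```python
import re

# Matches calls such as "A->[G/G]": reference base, then the two called alleles.
CALL_PATTERN = re.compile(r"^([ACGT])->\[([ACGT])/([ACGT])\]$")


def to_mutation(position, call):
    """Convert a position and a call such as 'A->[G/G]' into 'A405G' notation."""
    match = CALL_PATTERN.match(call.strip())
    if match is None:
        raise ValueError(f"unrecognized call format: {call!r}")
    ref, allele1, allele2 = match.groups()
    if allele1 != allele2:
        # Mixed calls are outside the scope of this sketch.
        raise ValueError(f"mixed call at position {position}: {call!r}")
    return f"{ref}{position}{allele1}"


# (position, call_) pairs copied from the universal SNP table for HA.
UNIVERSAL_HA = [
    (405, "A->[G/G]"), (413, "G->[A/A]"), (482, "A->[G/G]"),
    (629, "C->[T/T]"), (640, "G->[T/T]"), (715, "G->[A/A]"),
    (973, "A->[G/G]"), (1195, "A->[C/C]"), (1323, "A->[G/G]"),
    (1341, "T->[G/G]"), (1606, "C->[T/T]"), (1671, "A->[G/G]"),
]

mutations = [to_mutation(position, call) for position, call in UNIVERSAL_HA]
print(", ".join(mutations))
```

Rejecting mixed calls rather than guessing a notation for them keeps the sketch conservative; a real pipeline would need an explicit rule for such sites.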