Jorge Oscanoa, Lavanya Sivapalan, Emanuela Gadaleta, Abu Z Dayem Ullah, Nicholas R Lemoine, Claude Chelala.
Abstract
SNPnexus is a web-based annotation tool for the analysis and interpretation of both known and novel sequencing variations. Since its last release, SNPnexus has received continual updates to expand the range and depth of annotations provided. SNPnexus has undergone a complete overhaul of the underlying infrastructure to achieve faster computation times. The scope for data annotation has been substantially expanded to enhance biological interpretation of queried variants. This includes the addition of pathway analysis for the identification of enriched biological pathways and molecular processes. We have further expanded the range of user-directed annotation fields available for the study of cancer sequencing data. These new additions facilitate investigations into cancer driver variants and targetable molecular alterations within input datasets. New user-directed filtering options have been coupled with the addition of interactive graphical and visualization tools. These improvements streamline the analysis of variants derived from large sequencing datasets for the identification of biologically and clinically significant subsets in the data. SNPnexus is the most comprehensible web-based application currently available and this new set of updates ensures that it remains a state-of-the-art tool for researchers. SNPnexus is freely available at https://www.snp-nexus.org.
Year: 2020 PMID: 32496546 PMCID: PMC7319579 DOI: 10.1093/nar/gkaa420
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Summary of SNPnexus features present in the current version
| Feature | SNPnexus v4 |
|---|---|
| Human assembly | GRCh37 / GRCh38 |
| Variation format | Genomic coordinates (known/unknown) |
| | Chromosomal region (known) |
| | dbSNP rs# (known) |
| Supported alterations | Single base substitutions, Insertions/deletions (InDel), Block substitutions |
| | Multiple alterations |
| | IUPAC code supported |
| Batch query size | up to 100 000 |
| Batch query format | VCF or Text file in the SNPnexus variation format (See |
| Gene annotation systems | GRCh37 and GRCh38: Ensembl, RefSeq, UCSC, CCDS |
| | GRCh37: Vega, AceView, H-Invitational |
| | Coding: Synonymous / Nonsynonymous / Stop-gain/loss / Frame-shift / Peptide-shift |
| | Intronic (splice site) |
| | Non-coding, 5′/3′-UTR, up/down-stream |
| Protein deleterious effects | SIFT (known and novel) and PolyPhen (known) |
| Population data | gnomAD Exomes: AFR, AMR, EAS, FIN, NFE, OTH, SAS |
| | gnomAD Genomes: AFR, AMR, EAS, FIN, NFE, OTH |
| | 1000Genomes: AFR, AMR, EAS, EUR, SAS |
| | HapMap: ASW, CEU, CHB, CHD, GIH, HCB, JPT, LWK, MEX, MKK, TSI, YRI |
| Regulatory elements | GRCh37 and GRCh38: miRBASE, CpG islands, TarBase, microRNAs / snoRNAs / scaRNAs, Ensembl Regulatory Build, ENCODE Project, Roadmap Epigenomics |
| | GRCh37: Transcription factor binding sites, Vista enhancers, TargetScan |
| Conservation scores | GERP++ scores, PHAST |
| Disease studies | GRCh37: GAD |
| | GRCh37 and GRCh38: COSMIC, NHGRI-GWAS, ClinVar |
| Non-coding scoring | GRCh37: fitCons, EIGEN, FATHMM, GWAVA, DeepSEA, ReMM |
| | GRCh37 and GRCh38: CADD, FunSeq2 |
| Structural variations | Gain, Loss, Gain+Loss, Duplication, Deletion, Insertion, Complex, Inversion, Tandem duplication, Novel sequence insertion, Mobile element insertion, Sequence alteration |
| Pathway Analysis | Reactome pathways |
| Biological / Clinical Interpretation | Cancer genome interpreter |
| Web format | HTML, graphical |
| Filters for query set | Known / Novel variant |
| | Minor allele global frequency |
| | Genomic consequence |
| | Predicted protein effect |
| | In conserved region |
| | Known phenotype association |
| | Predicted cancer driver |
| | Genes involved in query set |
| | Pathways involved in query set |
| Filters per annotation | Filter per dbSNP or Variation ID in each annotation table |
| Exporting options | Export per annotation or all annotations |
| Exporting formats | VCF or tab-delimited text |
Figure 1. Schematic overview of the SNPnexus architecture. This version comprises a three-tiered framework: the Web Application Layer, which handles interactions with the user (submitting a query, showing results, generating visualizations and the filtering system); the Scheduler, which acts as a communication interface and as a load balancer for the server; and the Annotation Layer, which performs the annotation process on the input query set.
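The role the Scheduler plays between the other two layers can be illustrated with a minimal sketch. Everything here is hypothetical: the class names, the `pending` counter and the least-loaded dispatch policy are assumptions chosen for illustration, since the caption does not say how SNPnexus actually balances load.

```python
from dataclasses import dataclass


@dataclass
class AnnotationWorker:
    """Hypothetical stand-in for one process in the annotation layer."""
    name: str
    pending: int = 0  # number of variants currently assigned to this worker

    def annotate(self, variants):
        # Placeholder: a real worker would query the annotation databases.
        return [(variant, f"annotated-by-{self.name}") for variant in variants]


class Scheduler:
    """Hypothetical scheduler sitting between the web layer and the workers."""

    def __init__(self, workers):
        self.workers = workers

    def dispatch(self, job):
        # Least-loaded policy (an assumption): choose the worker with the
        # fewest pending variants; ties go to the first worker in the list.
        worker = min(self.workers, key=lambda w: w.pending)
        worker.pending += len(job)
        return worker

    def complete(self, worker, job):
        # Run the annotation and release the capacity the job was holding.
        results = worker.annotate(job)
        worker.pending -= len(job)
        return results


scheduler = Scheduler([AnnotationWorker("a"), AnnotationWorker("b")])
first = scheduler.dispatch(["rs1", "rs2", "rs3"])   # both idle, goes to "a"
second = scheduler.dispatch(["rs4"])                # "b" is idle, goes to "b"
third = scheduler.dispatch(["rs5"])                 # "b" still has less work
results = scheduler.complete(first, ["rs1", "rs2", "rs3"])
```

Separating `dispatch` from `complete` keeps the load counters meaningful while several jobs are in flight, which is the situation a load balancer exists to handle.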
Figure 2. Contingency table for enrichment analysis. N = number of affected genes; M (≤N) = number of affected genes in the pathway; V = number of genes in the Reactome pathway; U = number of genes in the Reactome universe.
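A contingency table of this shape is commonly evaluated with a one-sided hypergeometric test, which is equivalent to a one-sided Fisher's exact test. The sketch below uses the caption's symbols N, M, V and U; the choice of test and the function name `enrichment_p_value` are assumptions, as the caption does not name the statistic SNPnexus applies. It also assumes that every affected gene belongs to the Reactome universe.

```python
from math import comb


def enrichment_p_value(N, M, V, U):
    """Probability of seeing at least M pathway genes among N affected genes.

    N: number of affected genes (assumed to lie within the universe)
    M: number of affected genes that fall in the pathway
    V: number of genes in the pathway
    U: number of genes in the universe
    """
    if not (0 <= M <= N <= U and M <= V <= U):
        raise ValueError("expected 0 <= M <= N <= U and M <= V <= U")
    # Number of ways to draw N genes from the universe.
    total = comb(U, N)
    # Upper tail of the hypergeometric distribution: k pathway genes drawn
    # from V, and the remaining N - k genes drawn from the U - V others.
    upper = min(N, V)
    tail = sum(comb(V, k) * comb(U - V, N - k) for k in range(M, upper + 1))
    return tail / total


# Toy numbers, not taken from the paper: 10 genes in the universe, a pathway
# of 5 genes, and all 5 affected genes falling inside that pathway.
p = enrichment_p_value(N=5, M=5, V=5, U=10)
print(f"p = {p:.5f}")
```

A small p-value indicates that the pathway contains more affected genes than random sampling from the universe would explain. In practice, p-values across many pathways would also need a multiple-testing correction.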
Figure 3. Analysis of sequence variations from TCGA-IB-7651. (A) Variant filtering method applied to the queried variants. Nonsynonymous SNVs were identified and prioritized according to their oncogenic classification and targetable potential. (B) Several nonsynonymous variants were identified as potential biomarkers of response to targeted therapies for PDAC, including PARP inhibition and platinum-based chemotherapies. Pathway analysis of these variants identified enrichment within DNA damage repair (DDR) processes, including (C) DNA double-strand break response and (D) base excision repair. Higher-resolution versions of these figures and the full results for the Case Study are available at https://www.snp-nexus.org/v4/results/nar2020/.
Comparison of computational speeds for a subset of annotations between the current and previous versions of SNPnexus. Ensembl, RefSeq and COSMIC annotations are the most computationally intensive. Using an input file of 100k variants, SNPnexus now operates 16×, 15.5× and 82× faster than its predecessor for these annotations, respectively. SIFT is included to show that even for non-intensive annotations there is a noticeable improvement.
| Annotation | Number of variants | SNPnexus v.3 processing time (s) | SNPnexus v.4 processing time (s) |
|---|---|---|---|
| Ensembl | 5k | 172 | 7.36 |
| | 20k | 2170 | 65.75 |
| | 50k | 3909 | 158.19 |
| | 100k | 7941 | 486.69 |
| RefSeq | 5k | 137 | 4.28 |
| | 20k | 1405 | 35.82 |
| | 50k | 4265 | 347.31 |
| | 100k | 7283 | 469.7 |
| SIFT | 5k | 15 | 0.63 |
| | 20k | 28 | 2.96 |
| | 50k | 49 | 39.67 |
| | 100k | 59 | 46.06 |
| COSMIC | 5k | 2632 | 2.4 |
| | 20k | 4623 | 10.46 |
| | 50k | 8706 | 114.51 |
| | 100k | 14279 | 173.53 |
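
The speed-up factors quoted in the caption can be reproduced from the 100k rows of the table as the ratio of v.3 to v.4 processing time. In the short sketch below, the assignment of annotation names to the four blocks of rows is an assumption, inferred by matching each block's ratio to the figures in the caption. The names `TIMES_100K` and `speedup` are illustrative.

```python
# Processing times in seconds for 100k variants: (SNPnexus v.3, SNPnexus v.4).
# The annotation labels are an assumption inferred from the caption's figures.
TIMES_100K = {
    "Ensembl": (7941, 486.69),
    "RefSeq": (7283, 469.7),
    "SIFT": (59, 46.06),
    "COSMIC": (14279, 173.53),
}


def speedup(v3_seconds, v4_seconds):
    """Return how many times faster v.4 is than v.3."""
    return v3_seconds / v4_seconds


speedups = {name: speedup(v3, v4) for name, (v3, v4) in TIMES_100K.items()}

for name, factor in speedups.items():
    print(f"{name}: {factor:.1f}x faster")
```

The ratios for Ensembl, RefSeq and COSMIC come out at roughly 16, 15.5 and 82, in line with the caption, while the ratio for SIFT at 100k variants is much smaller.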