Dandan Huang, Yao Zhou, Xianfu Yi, Xutong Fan, Jianhua Wang, Hongcheng Yao, Pak Chung Sham, Jihui Hao, Kexin Chen, Mulin Jun Li.
Abstract
Interpreting the molecular mechanisms of genomic variations and their causal relationships with diseases and traits is an important and challenging problem in human genetics. To provide comprehensive and context-specific variant annotations for biologists and clinicians, we developed a variant annotation database called VannoPortal by systematically integrating over 4 TB of genomic/epigenomic profiles and frequently used annotation databases from various biological domains. The database has the following major features: (i) it systematically integrates 40 genome-wide variant annotations and prediction scores covering allele frequency, linkage disequilibrium, evolutionary signature, disease/trait association, tissue/cell type-specific epigenome, base-wise functional prediction, allelic imbalance and pathogenicity; (ii) it is equipped with our recently developed index system and parallel random-sweep searching algorithms for efficient management of backend databases and information extraction; (iii) it greatly expands context-dependent variant annotation by incorporating large-scale epigenomic maps and regulatory profiles (such as EpiMap) across over 33 tissue/cell types; (iv) it compiles many genome-scale base-wise prediction scores for regulatory/pathogenic variant classification beyond protein-coding regions; (v) it enables fast retrieval and direct comparison of functional evidence among linked variants using a highly interactive web panel in addition to a plain table; (vi) it introduces many visualization functions for more efficient identification and interpretation of functional variants in a single web page. VannoPortal is freely available at http://mulinlab.org/vportal.
Year: 2022 PMID: 34570217 PMCID: PMC8728305 DOI: 10.1093/nar/gkab853
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Database architecture, function structure and representative features.
Figure 2. Result page and distinctive web components of VannoPortal. (A) Conservation scores and positive selection scores in the ‘Evolution’ panel. (B) A composite viewer showing LD structure, disease/trait association tracks and evidence table in the ‘Phenotype’ panel. (C) Tissue/cell type-specific regulatory variant prioritization function in the ‘Regulatory potential’ panel. (D) Two rich tables displaying critical histone marks, chromatin states, and TF binding sites across hundreds of tissue/cell type-specific samples at the variant locus, from the Roadmap Epigenomics or EpiMap projects. (E) A circular plot showing significant 5 kb Hi-C chromatin interactions between the variant locus and its target region. (F) Real-time motif scanning table for the predicted allele-specific effect of TF binding. (G) Genome-scale pathogenic scores in the ‘Pathogenicity’ panel.
Figure 3. Supporting evidence from VannoPortal for the regulatory potential and cancer-driven roles of chr5:g.1295228:G>A (GRCh37, rs1242535815). (A) rs1242535815 overlaps active chromatin states (e.g. DNase-seq and ATAC-seq), histone marks (e.g. H3K27ac, H3K4me2, H3K4me3 and H3K9ac) and TF binding sites (e.g. POLR2A, RAD21 and SMC3) across many human tissues, particularly in cancers. (B) The rs1242535815 A allele potentially increases the binding affinity of HDAC1 and GABPA. (C) rs1242535815 is a likely pathogenic mutation supported by many genome-scale base-wise pathogenicity prediction methods. (D) rs1242535815 is a highly recurrent mutation in cancer patients. (E) rs1242535815 is a likely cancer driver mutation supported by regBase-CAN and other tools, and it could be a prognostic marker in cancer therapy.
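
The variant in Figure 3 is written in a chromosome:position:ref>alt style (chr5:g.1295228:G>A). As a minimal sketch of how such a string could be split into fields before looking it up in an annotation resource, the Python snippet below parses that notation. The `Variant` type, the `parse_variant` function and the regular expression are hypothetical helpers written for illustration; they are not part of VannoPortal, they handle only single-nucleotide substitutions, and they say nothing about the query syntax the portal itself accepts.

```python
import re
from typing import NamedTuple


class Variant(NamedTuple):
    """Hypothetical container for one single-nucleotide substitution."""
    chrom: str
    pos: int
    ref: str
    alt: str


# Assumed pattern for strings such as "chr5:g.1295228:G>A".
# Optional spaces around ">" are tolerated ("G > A").
_VARIANT_RE = re.compile(
    r"^(chr(?:[0-9]{1,2}|X|Y|M)):g\.(\d+):([ACGT])\s*>\s*([ACGT])$"
)


def parse_variant(text: str) -> Variant:
    """Split a 'chrN:g.POS:REF>ALT' string into its four fields."""
    match = _VARIANT_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised variant notation: {text!r}")
    chrom, pos, ref, alt = match.groups()
    return Variant(chrom, int(pos), ref, alt)


if __name__ == "__main__":
    # The example variant from Figure 3 (GRCh37 coordinates).
    print(parse_variant("chr5:g.1295228:G>A"))
```

Keeping the genome build (GRCh37 here) alongside the parsed fields is advisable, since the same position refers to a different base in another assembly.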