Sumaiya Iqbal, David Hoksza, Eduardo Pérez-Palma, Patrick May, Jakob B Jespersen, Shehab S Ahmed, Zaara T Rifat, Henrike O Heyne, M Sohel Rahman, Jeffrey R Cottrell, Florence F Wagner, Mark J Daly, Arthur J Campbell, Dennis Lal.
Abstract
Human genome sequencing efforts have greatly expanded, and a plethora of missense variants identified both in patients and in the general population is now publicly accessible. Interpretation of the molecular-level effect of missense variants, however, remains challenging and requires a dedicated investigation of amino acid substitutions in the context of protein structure and function. Answers to questions like 'Is a variant perturbing a site involved in key macromolecular interactions and/or cellular signaling?', or 'Is a variant changing an amino acid located at the protein core or part of a cluster of known pathogenic mutations in 3D?' are crucial. Motivated by these needs, we developed MISCAST (missense variant to protein structure analysis web suite; http://miscast.broadinstitute.org/). MISCAST is an interactive and user-friendly web server to visualize and analyze missense variants in protein sequence and structure space. Additionally, a comprehensive set of protein structural and functional features has been aggregated in MISCAST from multiple databases and displayed on structures alongside the variants to provide users with the biological context of the variant location in an integrated platform. We further made the annotated data and protein structures readily downloadable from MISCAST to foster advanced offline analysis of missense variants by a wide biological community.
Year: 2020; PMID: 32402084; PMCID: PMC7319582; DOI: 10.1093/nar/gkaa361
Source DB: PubMed; Journal: Nucleic Acids Res; ISSN: 0305-1048; Impact factor: 16.971
Figure 1. MISCAST architecture, main modules, and data flow. (A) Online resources used for collecting missense variants, gene symbols, protein sequences and structures. (B) Databases searched to aggregate gene- and residue-wise biological context annotations. (C) MISCAST development using the Shiny R package and deployment using Google Cloud services. (D) Three main visualization schemes of MISCAST. Missense variants and protein features are displayed in 1D, 2D and concurrent 1D and 3D views.
Figure 2. Selected visual and textual output of the Variant Analysis Suite track of the MISCAST web server for the discussed case study. (A) Selection of a gene (DDX3X) opens the ‘Information page’, displaying a general overview of the encoded protein. (B) 1D visualization page to explore amino acid-wise missense variants alongside biologically relevant protein feature annotations. (C) 2D visualization page to display missense variants in the context of feature annotations for the full protein sequence.
Figure 3. Illustration of MISCAST’s display of missense variants and protein feature annotations simultaneously on protein sequence (left panel) and structure (right panel). Mapping of pathogenic and population variants onto the structure of the DDX3X-encoded ATP-dependent RNA helicase, along with the highlighted (yellow) protein feature annotations, shows a potential 3D mutational hotspot (cluster of pathogenic variants) in the C-terminal helicase domain.