MaveQuest: a web resource for planning experimental tests of human variant effects.

Da Kuang^1,2,3,4, Jochen Weile^1,2,3,4, Roujia Li^1,2,3,4, Tom W Ouellette^1,2, Jarry A Barber^1,2, Frederick P Roth^1,2,3,4.

Abstract

SUMMARY: Fully realizing the promise of personalized medicine will require rapid and accurate classification of pathogenic human variation. Multiplexed assays of variant effect (MAVEs) can experimentally test nearly all possible variants in selected gene targets. Planning a MAVE study involves identifying target genes with clinical impact, and identifying scalable functional assays for that target. Here, we describe MaveQuest, a web-based resource enabling systematic variant effect mapping studies by identifying potential functional assays, disease phenotypes and clinical relevance for nearly all human protein-coding genes.
AVAILABILITY AND IMPLEMENTATION: MaveQuest service: https://mavequest.varianteffect.org/. MaveQuest source code: https://github.com/kvnkuang/mavequest-front-end/.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Year: 2020 PMID: 32251504 PMCID: PMC7320626 DOI: 10.1093/bioinformatics/btaa228

Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937

1 Introduction

Driven by the advancement of genomic sequencing technologies, and by rapid increases in the number of identified disease-related genes and variants (Brunham and Hayden, 2013), clinical genetic testing is gaining increasingly broad use. An accompanying challenge is the frequent occurrence of (often extremely rare) variants that are difficult to interpret (Blazer et al.). In ClinVar, a popular resource for submitting genetic variants seen in clinical settings, approximately 40% of all variants are missense variants (Landrum et al.). Unfortunately, the majority of missense variants in ClinVar are now classified as ‘variants of uncertain significance’ (VUS) (Starita et al.; Weile and Roth, 2018), which makes any corresponding genetic tests not ‘clinically valid’ (Hoffman-Andrews, 2017), where clinical validity is defined by the extent to which a genetic test reveals a patient’s clinical phenotype or risk (Burke, 2014; Holtzman and Watson, 1999). Many purely computational methods, such as Polyphen-2 (Adzhubei et al.), have been established for predicting the functional effect of given variants. However, experimental functional assays can detect far more disease-associated variants with high confidence than can computational approaches (Sun et al.). Functional evidence is also considered important under the American College of Medical Genetics and Genomics/Association for Molecular Pathology guidelines (Richards et al.), and thus could help shift many VUS to more clinically useful categories (e.g. pathogenic or benign). However, conventional functional assays, such as complementation (Osborn and Miller, 2007), are often resource-intensive, and results from such assays are not generally available for rare clinical VUS. Multiplexed assays of variant effect (MAVEs) provide a systematic, experimental approach to study nearly all missense variants in selected gene targets (Starita et al.).
Indeed, some variant effect maps have been shown to outperform smaller-scale validated in vitro functional assays in quantitatively predicting disease phenotypes (Sun et al.). The growing interest in MAVE studies (Weile and Roth, 2018) has presented bioinformatic challenges unique to the early planning stage. For example, to explore the clinical relevance of potential target genes and to identify scalable functional assays for these genes, information must be assembled from multiple database and literature resources. Here, we developed MaveQuest, a web-based service simplifying access to diverse aggregated information about potential functional assays, disease phenotypes and clinical relevance of genes for systematic variant effect mapping.

2 The database

The current version of the MaveQuest database curates literature for information related to 19 200 human genes from the Human Genome Organization’s Gene Nomenclature Committee collection (Braschi et al.). Of these genes, MaveQuest identified cellular phenotypes (each having the potential to enable a scalable functional assay) for 18 979 genes, disease phenotypes for 8460 genes and evidence of clinical relevance for 5203 genes. Figure 1A presents the three categories of data sources that were included in the MaveQuest database.
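To put these counts in proportion, the short sketch below computes the fraction of the 19 200 genes covered by each data category. The gene counts are taken from the paragraph above; the function and variable names are illustrative only and are not part of MaveQuest.

```python
# Gene counts as reported in the text for the current MaveQuest database.
TOTAL_GENES = 19200
GENES_WITH_DATA = {
    "cellular phenotypes": 18979,
    "disease phenotypes": 8460,
    "clinical relevance": 5203,
}


def coverage_percent(counts, total):
    """Return the percentage of genes covered by each category, to 1 decimal."""
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


if __name__ == "__main__":
    for name, pct in coverage_percent(GENES_WITH_DATA, TOTAL_GENES).items():
        print(f"{name}: {pct}% of genes")
```

Under these numbers, almost all genes have at least one candidate cellular phenotype, whereas fewer than half have an associated disease phenotype and roughly a quarter have evidence of clinical relevance.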

Fig. 1.

The architecture of MaveQuest. (A) Data from other sources were parsed and imported into the MaveQuest database and are retrieved by the API or the front-end user interface. (B) Three major components of the MaveQuest front-end service.

The first data category points the user to potential functional assays. The GenomeCRISPR (Rauscher et al.), GenomeRNAi (Schmidt et al.) and Online Gene Essentiality (Chen et al.) databases provide leads for human cell-based phenotypes that could form the basis of a scalable assay. The Human Reference Interactome Mapping project (Luck et al.) provides information on assays to identify variants that ablate specific protein interactions or generally reduce protein folding or stability. Data from the InParanoid (Sonnhammer and Östlund, 2015), P-POD (Heinicke et al.) and Alliance of Genome Resources (Howe et al.) databases identify orthologs in non-human species, with links that allow the user to explore whether there are phenotypes associated with disruption of these orthologous genes that might be complemented by human genes to yield a scalable functional assay. The second category provides data on disease phenotypes with which the query gene has been associated. ClinVar (Landrum et al.) provides clinically interpreted variants reported for the query gene, which we can visualize to highlight regions enriched for pathogenic or benign variants, together with secondary structures, protein domains and families extracted from the InterPro (Mitchell et al.) and UniProt (UniProt Consortium, 2019) databases. To enable users to further evaluate the clinical significance of query genes, the Online Mendelian Inheritance in Man (Hamosh et al.), Orphanet (INSERM, 1997), COSMIC Cancer Gene Census (Sondka et al.) and PharmGKB (Whirl-Carrillo et al.) databases summarize disease- and/or drug-related phenotypes, their mode of inheritance and, in some cases, molecular mechanisms.
The third category contains sequencing panels from three clinical genetic testing providers, Invitae, Ambry and GeneDx, who have each contributed many variant interpretations to ClinVar. The presence of a query gene in clinical genetic sequencing panels from multiple providers suggests clinical interest.
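The multiple-provider heuristic described for the third category can be sketched as a simple membership count. This is only an illustration of the idea, not MaveQuest's implementation: the panel contents and gene names below are placeholders rather than real provider data, and the threshold of two providers is an assumption.

```python
# PLACEHOLDER panel contents for illustration only; these are not real
# gene lists from Invitae, Ambry or GeneDx.
PANELS = {
    "Invitae": {"GENE_A", "GENE_B"},
    "Ambry": {"GENE_A"},
    "GeneDx": {"GENE_A", "GENE_C"},
}


def providers_listing(gene, panels):
    """Return the sorted names of providers whose panels include the gene."""
    return sorted(name for name, genes in panels.items() if gene in genes)


def has_multi_provider_support(gene, panels, min_providers=2):
    """True if the gene appears in panels from at least `min_providers` providers.

    The default threshold of 2 is an assumption made for this sketch.
    """
    return len(providers_listing(gene, panels)) >= min_providers


if __name__ == "__main__":
    for gene in ("GENE_A", "GENE_B"):
        print(gene, providers_listing(gene, PANELS),
              has_multi_provider_support(gene, PANELS))
```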

3 The application programming interface

The application programming interface (API) serves as an intermediary between the database and the front-end web application. The API, based on the RESTful standard (Richardson et al.), can be accessed directly using any common programming language. The API currently provides six functions (Supplementary Table S1) that could be further integrated with other MAVE resources as they emerge, e.g. MaveDB (Esposito et al.).
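As an illustration of what direct programmatic access to a RESTful service of this kind might look like, the following Python sketch builds a request URL and decodes a JSON response using only the standard library. The base URL is the service address given in the abstract; the endpoint path `api/gene/{symbol}` and the assumption of a JSON response body are hypothetical, since the six actual functions are documented only in Supplementary Table S1.

```python
import json
from urllib.parse import quote, urljoin
from urllib.request import urlopen

# Service address as given in the paper's availability statement.
BASE_URL = "https://mavequest.varianteffect.org/"

# HYPOTHETICAL endpoint path, used for illustration only. The real routes
# are listed in Supplementary Table S1 and may differ.
GENE_PATH_TEMPLATE = "api/gene/{symbol}"


def build_gene_url(symbol, base_url=BASE_URL):
    """Build the request URL for one gene symbol, percent-encoding the symbol."""
    path = GENE_PATH_TEMPLATE.format(symbol=quote(symbol, safe=""))
    return urljoin(base_url, path)


def fetch_gene(symbol, opener=urlopen):
    """Fetch and decode a JSON record for a gene.

    `opener` defaults to urllib's urlopen; passing a stub makes the function
    testable without network access. A JSON response is assumed.
    """
    with opener(build_gene_url(symbol)) as response:
        return json.load(response)


if __name__ == "__main__":
    # Only prints the URL that would be requested; no network call is made.
    print(build_gene_url("CBS"))
```

Keeping URL construction separate from the network call means the request logic can be checked offline and the path template swapped once the real routes are known.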

4 The front-end web application

The front-end interface, which enables queries related to functional assays, disease phenotypes and clinical interest, contains three components (Fig. 1B). The first component, which also serves as the starting page, is a search panel that allows users to look up genes using identifiers. The second component is the gene summary page, which lists cell-based phenotypes, disease phenotypes and evidence of clinical interest for each query gene. When the user has searched for partial matches, this information is included for all matching genes. Users can select a specific gene to bring up a detail page. This is the third component, which contains all data in the database associated with that gene. The detail page includes an overview of variants in the ClinVar database when available, displaying the distribution of single-nucleotide variants along the protein sequence. Secondary structures, protein domains and families are also visualized so that users can identify potential ‘variational hotspots’ (i.e. regions where variants are enriched). This feature is particularly useful for studying large proteins, allowing prioritization of regions harboring more pathogenic or benign variants.
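The notion of a ‘variational hotspot’ can be made concrete with a sliding-window count of variant positions along the protein. The sketch below is one simple way to do this and is not the method used by MaveQuest, which presents the distribution visually; the window size, step size and example positions are arbitrary assumptions.

```python
def window_counts(positions, protein_length, window=50, step=10):
    """Count variants in sliding windows along a protein.

    `positions` are 1-based residue positions of variants. Returns a list of
    (start, end, count) tuples. Window and step sizes are arbitrary defaults.
    """
    last_start = max(1, protein_length - window + 1)
    results = []
    for start in range(1, last_start + 1, step):
        end = min(start + window - 1, protein_length)
        count = sum(1 for p in positions if start <= p <= end)
        results.append((start, end, count))
    return results


def densest_window(positions, protein_length, window=50, step=10):
    """Return the first window with the highest variant count."""
    return max(window_counts(positions, protein_length, window, step),
               key=lambda w: w[2])


if __name__ == "__main__":
    # Made-up variant positions on a hypothetical 400-residue protein.
    example_positions = [5, 12, 101, 105, 110, 118, 300]
    start, end, count = densest_window(example_positions, 400)
    print(f"Densest window: residues {start}-{end} with {count} variants")
```

In practice one would run such a count separately for pathogenic and benign variants, since the text distinguishes regions enriched for each.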

References (28 in total)

1. Genetic tests: clinical validity and clinical utility.

Authors: Wylie Burke
Journal: Curr Protoc Hum Genet Date: 2014-04-24

2. Pharmacogenomics knowledge for personalized medicine.

Authors: M Whirl-Carrillo; E M McDonagh; J M Hebert; L Gong; K Sangkuhl; C F Thorn; R B Altman; T E Klein
Journal: Clin Pharmacol Ther Date: 2012-10 Impact factor: 6.875

3. The COSMIC Cancer Gene Census: describing genetic dysfunction across all human cancers.

Authors: Zbyslaw Sondka; Sally Bamford; Charlotte G Cole; Sari A Ward; Ian Dunham; Simon A Forbes
Journal: Nat Rev Cancer Date: 2018-11 Impact factor: 60.716

4. Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology.

Authors: Sue Richards; Nazneen Aziz; Sherri Bale; David Bick; Soma Das; Julie Gastier-Foster; Wayne W Grody; Madhuri Hegde; Elaine Lyon; Elaine Spector; Karl Voelkerding; Heidi L Rehm
Journal: Genet Med Date: 2015-03-05 Impact factor: 8.822

5. GenomeCRISPR - a database for high-throughput CRISPR/Cas9 screens.

Authors: Benedikt Rauscher; Florian Heigwer; Marco Breinig; Jan Winter; Michael Boutros
Journal: Nucleic Acids Res Date: 2016-10-26 Impact factor: 16.971

6. UniProt: a worldwide hub of protein knowledge.

Authors: UniProt Consortium
Journal: Nucleic Acids Res Date: 2019-01-08 Impact factor: 16.971

7. A proactive genotype-to-patient-phenotype map for cystathionine beta-synthase.

Authors: Song Sun; Jochen Weile; Marta Verby; Yingzhou Wu; Yang Wang; Atina G Cote; Iosifina Fotiadou; Julia Kitaygorodsky; Marc Vidal; Jasper Rine; Pavel Ješina; Viktor Kožich; Frederick P Roth
Journal: Genome Med Date: 2020-01-30 Impact factor: 11.117

8. ClinVar: public archive of interpretations of clinically relevant variants.

Authors: Melissa J Landrum; Jennifer M Lee; Mark Benson; Garth Brown; Chen Chao; Shanmuga Chitipiralla; Baoshan Gu; Jennifer Hart; Douglas Hoffman; Jeffrey Hoover; Wonhee Jang; Kenneth Katz; Michael Ovetsky; George Riley; Amanjeev Sethi; Ray Tully; Ricardo Villamarin-Salomon; Wendy Rubinstein; Donna R Maglott
Journal: Nucleic Acids Res Date: 2015-11-17 Impact factor: 16.971

9. Genenames.org: the HGNC and VGNC resources in 2019.

Authors: Bryony Braschi; Paul Denny; Kristian Gray; Tamsin Jones; Ruth Seal; Susan Tweedie; Bethan Yates; Elspeth Bruford
Journal: Nucleic Acids Res Date: 2019-01-08 Impact factor: 16.971

10. A reference map of the human binary protein interactome.

Authors: Katja Luck; Dae-Kyum Kim; Luke Lambourne; Kerstin Spirohn; Bridget E Begg; Wenting Bian; Ruth Brignall; Tiziana Cafarelli; Francisco J Campos-Laborie; Benoit Charloteaux; Dongsic Choi; Atina G Coté; Meaghan Daley; Steven Deimling; Alice Desbuleux; Amélie Dricot; Marinella Gebbia; Madeleine F Hardy; Nishka Kishore; Jennifer J Knapp; István A Kovács; Irma Lemmens; Miles W Mee; Joseph C Mellor; Carl Pollis; Carles Pons; Aaron D Richardson; Sadie Schlabach; Bridget Teeking; Anupama Yadav; Mariana Babor; Dawit Balcha; Omer Basha; Christian Bowman-Colin; Suet-Feung Chin; Soon Gang Choi; Claudia Colabella; Georges Coppin; Cassandra D'Amata; David De Ridder; Steffi De Rouck; Miquel Duran-Frigola; Hanane Ennajdaoui; Florian Goebels; Liana Goehring; Anjali Gopal; Ghazal Haddad; Elodie Hatchi; Mohamed Helmy; Yves Jacob; Yoseph Kassa; Serena Landini; Roujia Li; Natascha van Lieshout; Andrew MacWilliams; Dylan Markey; Joseph N Paulson; Sudharshan Rangarajan; John Rasla; Ashyad Rayhan; Thomas Rolland; Adriana San-Miguel; Yun Shen; Dayag Sheykhkarimli; Gloria M Sheynkman; Eyal Simonovsky; Murat Taşan; Alexander Tejeda; Vincent Tropepe; Jean-Claude Twizere; Yang Wang; Robert J Weatheritt; Jochen Weile; Yu Xia; Xinping Yang; Esti Yeger-Lotem; Quan Zhong; Patrick Aloy; Gary D Bader; Javier De Las Rivas; Suzanne Gaudet; Tong Hao; Janusz Rak; Jan Tavernier; David E Hill; Marc Vidal; Frederick P Roth; Michael A Calderwood
Journal: Nature Date: 2020-04-08 Impact factor: 49.962
