Alexandre Renaux, Sofia Papadimitriou, Nassim Versbraegen, Charlotte Nachtegael, Simon Boutry, Ann Nowé, Guillaume Smits, Tom Lenaerts.
Abstract
A tremendous amount of DNA sequencing data is being produced around the world with the ambition to capture in more detail the mechanisms underlying human diseases. While numerous bioinformatics tools exist that allow the discovery of causal variants in Mendelian diseases, little to no support is provided to do the same for variant combinations, an essential task for the discovery of the causes of oligogenic diseases. ORVAL (the Oligogenic Resource for Variant AnaLysis), which is presented here, provides an answer to this problem by focusing on generating networks of candidate pathogenic variant combinations in gene pairs, as opposed to isolated variants in unique genes. This online platform integrates innovative machine learning methods for combinatorial variant pathogenicity prediction with visualization techniques, offering several interactive and exploratory tools, such as pathogenic gene and protein interaction networks, a ranking of pathogenic gene pairs, as well as visual mappings of the cellular location and pathway information. ORVAL is the first web-based exploration platform dedicated to identifying networks of candidate pathogenic variant combinations, with the sole ambition of helping to uncover oligogenic causes for patients who cannot rely on the classical disease analysis tools. ORVAL is available at https://orval.ibsquare.be.
Year: 2019 PMID: 31147699 PMCID: PMC6602484 DOI: 10.1093/nar/gkz437
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. ORVAL flowchart highlighting the major components of the platform. (A) Users can submit variants using a Variant Call Format (VCF) file or a tab-delimited variant list. The variants can be filtered with predefined criteria or by using a gene panel. (B) Once submitted, variants are processed by a pipeline that first applies the selected filters, then generates all di-, tri- and tetra-allelic variant combinations and annotates them using public bioinformatics resource data, and then predicts which variant combinations may be disease-causing candidates with the VarCoPP predictor. Finally, these candidate variant combinations are aggregated at the gene level to build an oligogenic network. (C) By selecting a specific digenic variant combination, users can run a predictor to obtain the digenic effect (DE) probabilities and can get an interpretation of the VarCoPP prediction based on its features. (D) It is also possible to interact with the oligogenic network to filter and explore specific oligogenic signatures. A dedicated page shows how the selected gene set maps to multiple cross-references to give an insight into the biological context.
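The combination step in panel (B) can be illustrated with a minimal Python sketch. It is not ORVAL's implementation. The input layout, the gene and variant names, and the allele-counting rule are assumptions made for illustration: a heterozygous variant counts as one allele, a homozygous variant as two, and two heterozygous variants in the same gene are treated as a two-allele set, so every gene pair yields combinations of two to four alleles.

```python
from collections import defaultdict
from itertools import combinations, product

# Hypothetical filtered variants: (gene, variant_id, zygosity).
variants = [
    ("GENE_A", "a1", "het"),
    ("GENE_A", "a2", "het"),
    ("GENE_B", "b1", "hom"),
    ("GENE_C", "c1", "het"),
]

def per_gene_allele_sets(variants):
    """Return, for each gene, the variant sets it can contribute and their allele counts."""
    by_gene = defaultdict(list)
    for gene, variant_id, zygosity in variants:
        by_gene[gene].append((variant_id, zygosity))

    allele_sets = {}
    for gene, gene_variants in by_gene.items():
        options = []
        # A single variant: 1 allele if heterozygous, 2 if homozygous (assumed rule).
        for variant_id, zygosity in gene_variants:
            options.append(((variant_id,), 2 if zygosity == "hom" else 1))
        # Two heterozygous variants in the same gene: treated here as 2 alleles.
        hets = [v for v, z in gene_variants if z == "het"]
        for v1, v2 in combinations(hets, 2):
            options.append(((v1, v2), 2))
        allele_sets[gene] = options
    return allele_sets

def digenic_combinations(variants):
    """Yield every variant combination across every pair of distinct genes."""
    allele_sets = per_gene_allele_sets(variants)
    for g1, g2 in combinations(sorted(allele_sets), 2):
        for (vs1, n1), (vs2, n2) in product(allele_sets[g1], allele_sets[g2]):
            yield {"genes": (g1, g2), "variants": vs1 + vs2, "alleles": n1 + n2}

combos = list(digenic_combinations(variants))
for combo in combos:
    print(combo)
```

In this sketch each resulting dictionary would be the unit handed to annotation and prediction. The actual pipeline may apply additional constraints that the caption does not describe.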
Figure 2. Examples of the main output figures of ORVAL. (A) An interactive oligogenic network built from all gene pairs having at least one predicted candidate combination. The edges are coloured based on a pathogenicity score (the highest Classification Score (CS) for a pair). The genes can be filtered out manually or based on their centrality. The edges can be pruned based on the pathogenicity score. (B) A protein–protein interaction network where the central nodes circled in purple represent the proteins from a selected oligogenic module and the external nodes are the first-level interactors. Direct interactions (e.g. FNDC9-PROKR2) are coloured in purple. A pie chart showing the protein cellular locations is used to highlight the corresponding nodes in the network (here, secretory-pathway). (C) A tree map representing the Reactome ontology, sized proportionally to the number of mapped genes from the oligogenic module and coloured according to the level in the ontology hierarchy. In this example, the most represented pathways are part of the Signal Transduction ontology. (D) An S-plot representing the classification of all digenic combinations as being neutral (in blue) or potentially disease-causing (from orange to dark red) depending on the predicted VarCoPP CS and Support Score (SS). The table on the right shows the gene pair, variants and prediction scores. (E) A boxplot chart, displayed for a specific digenic combination, showing the contribution of each predictive feature to the disease-causing class (in red) or the neutral class (in blue). (F) A spider plot, displayed for a specific digenic combination, showing the probabilities for each class of digenic effect predicted by the DE Predictor, with the highest probability for True Digenic.
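The gene-level aggregation behind panel (A) can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not ORVAL's code. The gene names, the scores, the 0 to 1 score range and the 0.7 pruning threshold are hypothetical, and node degree stands in for whichever centrality measure the platform uses.

```python
from collections import defaultdict

# Hypothetical candidate predictions: (gene_1, gene_2, classification_score).
predictions = [
    ("GENE_A", "GENE_B", 0.91),
    ("GENE_A", "GENE_B", 0.62),
    ("GENE_A", "GENE_C", 0.55),
    ("GENE_B", "GENE_C", 0.78),
]

def build_edges(predictions):
    """Keep one edge per gene pair, weighted by the highest CS seen for that pair."""
    edges = {}
    for g1, g2, cs in predictions:
        pair = tuple(sorted((g1, g2)))  # undirected edge: order of genes is irrelevant
        edges[pair] = max(cs, edges.get(pair, 0.0))
    return edges

def prune_edges(edges, min_score):
    """Drop edges whose score falls below the chosen threshold."""
    return {pair: cs for pair, cs in edges.items() if cs >= min_score}

def degree(edges):
    """Count the edges attached to each gene (a simple centrality proxy)."""
    counts = defaultdict(int)
    for g1, g2 in edges:
        counts[g1] += 1
        counts[g2] += 1
    return dict(counts)

edges = build_edges(predictions)
strong = prune_edges(edges, min_score=0.7)  # arbitrary threshold for illustration
print(strong)
print(degree(strong))
```

Taking the maximum per pair mirrors the caption's statement that an edge is coloured by the highest CS for that pair. Pruning and degree-based filtering then operate on the reduced edge set rather than on individual variant combinations.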