Artem Lysenko1, Keith Anthony Boroevich1, Tatsuhiko Tsunoda1,2,3.
Abstract
BACKGROUND: Refinement of candidate gene lists to select the most promising candidates for further experimental verification remains an essential step between high-throughput exploratory analysis and the discovery of specific causal genes. Given the qualitative and semantic complexity of biological data, successfully addressing this challenge requires development of flexible and interoperable solutions for making the best possible use of the largest possible fraction of all available data.
Keywords: Biological network analysis; Cytoscape; DIAMOnD; Gene prioritisation; Random walk
Year: 2017 PMID: 28694847 PMCID: PMC5501438 DOI: 10.1186/s13040-017-0141-9
Source DB: PubMed Journal: BioData Min ISSN: 1756-0381 Impact factor: 2.522
Fig. 1 Performance evaluation of multi-threaded implementation of random walk with restart gene prioritization algorithm on systems with 16 and 54 CPU cores
Fig. 2 The main interface of the Arete Cytoscape app (left) and visualization of DIAMOnD gene ranking generated using annotation and filtering functionality of the app (right)
Additional metrics used for integrative analysis example and informal descriptions of what properties they capture
| Metric name | Property captured |
|---|---|
| Eccentricity | Overall remoteness from all other nodes |
| Transitivity | Density of interlinks among immediate neighbors |
| Betweenness | Network “choke points” with high proportion of shortest paths going through them |
| Tissue-specific expression | Ubiquitous versus tissue-specific expression |
| k-core number | Location in a dense core versus network periphery |
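The informal descriptions in the table above can be made concrete with a small sketch. The code below is an illustration only, not the Arete implementation: the toy graph `toy` and the function names are hypothetical, the graph is assumed to be undirected and connected, and "transitivity" is interpreted here as the per-node local clustering coefficient. Betweenness and tissue-specific expression are omitted for brevity.

```python
from collections import deque

def bfs_distances(adj, source):
    """Shortest-path hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist

def eccentricity(adj, node):
    """Largest shortest-path distance from node (remoteness from other nodes)."""
    return max(bfs_distances(adj, node).values())

def local_transitivity(adj, node):
    """Fraction of pairs of neighbors of node that are themselves linked."""
    nbrs = list(adj[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if nbrs[j] in adj[nbrs[i]]
    )
    return 2.0 * links / (k * (k - 1))

def core_numbers(adj):
    """k-core number per node, by repeatedly peeling the lowest-degree node."""
    degree = {n: len(nbrs) for n, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        node = min(remaining, key=degree.get)
        k = max(k, degree[node])  # core level never decreases while peeling
        core[node] = k
        remaining.remove(node)
        for nbr in adj[node]:
            if nbr in remaining:
                degree[nbr] -= 1
    return core

# Hypothetical toy network: a triangle (A, B, C) with a tail C - D - E.
toy = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}

for gene in sorted(toy):
    print(gene, eccentricity(toy, gene), local_transitivity(toy, gene), core_numbers(toy)[gene])
```

In this toy network the triangle nodes sit in the 2-core while the tail nodes D and E fall in the 1-core periphery, which matches the "dense core versus network periphery" reading of the k-core number.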
Fig. 3 Comparison of different gene prioritization approaches offered in the Arete app. a Box plot of ROC AUC scores of leave-one-out, 3-fold and 5-fold cross-validation for 69 different sets of disease-related genes. b Percentages of genes associated with multiple diseases in our reference set. Bottom row shows corresponding fold-enrichment statistics for the four quartiles of ranked gene lists profiled using 3-fold (c), 5-fold (d) and leave-one-out (e) cross-validation schemes
Fig. 4 a and b show example ROC curves for two diseases: obstructive lung disease and psoriasis, respectively. c Comparison of ROC-AUC scores for 69 individual diseases using DIAMOnD and RWR approaches. Scores of canonical versions are shown in blue and scores where those methods were combined with additional data using the iRF approach are shown in red. Each point represents a set of genes for a particular disease
Fig. 5 Performance evaluation of Arete methods on transcriptomics data, which profiled relapse during multiple sclerosis progression. a ROC-AUC curves; b fold-enrichment for each quartile of a reference list