Yi-An Chen, Lokesh P Tripathi, Takeshi Fujiwara, Tatsuya Kameyama, Mari N Itoh, Kenji Mizuguchi.
Abstract
Biological data analysis is the key to new discoveries in disease biology and drug discovery. The rapid proliferation of high-throughput 'omics' data has created a need for tools and platforms that allow researchers to combine and analyse different types of biological data and obtain biologically relevant knowledge. We had previously developed TargetMine, an integrative data analysis platform for target prioritisation and broad-based biological knowledge discovery. Here, we describe the newly modelled biological data types and the enhanced visual and analytical features of TargetMine. These enhancements include an enhanced coverage of gene-gene relations, small molecule metabolite to pathway mappings, an improved literature survey feature, and in silico prediction of gene functional associations such as protein-protein interactions and global gene co-expression. We also describe two usage examples on trans-omics data analysis and extraction of gene-disease associations using MeSH term descriptors. These examples demonstrate how the newer enhancements in TargetMine contribute to a more expansive coverage of the biological data space and can help interpret genotype-phenotype relations. TargetMine with its auxiliary toolkit is available at https://targetmine.mizuguchilab.org. The TargetMine source code is available at https://github.com/chenyian-nibio/targetmine-gradle.
Keywords: data mining; data warehouse; drug discovery; gene prioritisation; integrative data analysis; knowledge discovery; multi-omics data analysis
Year: 2019; PMID: 31649722; PMCID: PMC6794636; DOI: 10.3389/fgene.2019.00934
Source DB: PubMed; Journal: Front Genet; ISSN: 1664-8021; Impact factor: 4.599
Key enhancements and updates in TargetMine since the last published iteration (2016).
| Data types, data models and features | New and/or enhanced data types and features | Existing data types and features |
|---|---|---|
| | KEGG relations | Combined PPI repository from iRefIndex and BioGRID, literature |
| | KEGG COMPOUND-pathway mapping | KEGG COMPOUND |
| | ClinVar variations | GWAS data from NHGRI |
| | MeSH term descriptors | NCBI PubMed links |
| | ∼400,000 human and mouse TF-target annotations from ENCODE | Amadeus; ORegAnno; HTRIdb |
| | Filter PPIs by HCDPs | Include multiple interaction types |
| | Dendrogram of hierarchically assembled associations with distances | Two-colour grid of squares |
Figure 1. Assessing the potential benefits of including GCE data in target prioritisation with TargetMine. GCE, gene co-expression; HCDP, high-quality direct physical PPIs; IPC, integrated pathway clusters; PPI, protein–protein interactions.
Figure 2. The inclusion of HCDPs filtered by gene co-expression (GCE-HCDP) to generate extended gene sets led to an overall improved target prioritisation performance when compared with the inclusion of unfiltered HCDPs and un-extended gene sets.
The inclusion of HCDPs filtered by gene co-expression (GCE-HCDP) to generate extended gene sets led to an overall improved target prioritisation performance when compared with the inclusion of unfiltered HCDPs and un-extended gene sets.
| Original test | 0.211 | |
| +HCDP | 0.327 | |
| +Co-exp-HCDP | 0.341 | |
| +HCDP | 5.77×10⁻¹⁸ | |
| +Co-exp-HCDP | 5.93×10⁻²² | 6.14×10⁻²⁰ |
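
The three conditions compared above (un-extended gene set, extension with unfiltered HCDPs, extension with co-expression-filtered HCDPs) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `extend_gene_set`, the gene identifiers, the co-expression scores and the `min_coexpression` threshold are all hypothetical, and the record above does not state the filtering criteria actually used in TargetMine.

```python
from typing import Dict, FrozenSet, Iterable, Optional, Set, Tuple


def extend_gene_set(
    seed_genes: Iterable[str],
    hcdp_pairs: Iterable[Tuple[str, str]],
    coexpression: Dict[FrozenSet[str], float],
    min_coexpression: Optional[float] = None,
) -> Set[str]:
    """Extend a seed gene set with direct interaction partners.

    Hypothetical illustration: if min_coexpression is None, every HCDP
    partner of a seed gene is added ("+HCDP"); otherwise a partner is
    added only when its co-expression score with the seed gene reaches
    the threshold ("+Co-exp-HCDP").
    """
    seeds = set(seed_genes)
    extended = set(seeds)
    for a, b in hcdp_pairs:
        # Interactions are undirected, so test both orientations.
        for seed, partner in ((a, b), (b, a)):
            if seed not in seeds or partner in seeds:
                continue
            if min_coexpression is not None:
                score = coexpression.get(frozenset((seed, partner)))
                # Pairs with no score, or a score below the cut-off, are dropped.
                if score is None or score < min_coexpression:
                    continue
            extended.add(partner)
    return extended


# Toy data (made up for illustration only).
seeds = {"G1", "G2"}
hcdps = [("G1", "G3"), ("G2", "G4"), ("G5", "G6")]
scores = {frozenset(("G1", "G3")): 0.8, frozenset(("G2", "G4")): 0.1}

unfiltered = extend_gene_set(seeds, hcdps, scores)
filtered = extend_gene_set(seeds, hcdps, scores, min_coexpression=0.5)
```

With this toy input, the unfiltered extension adds both G3 and G4, while the filtered extension keeps only G3, whose co-expression score with its seed gene exceeds the assumed cut-off of 0.5.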