Etienne Z Gnimpieba1,2, Menno S VanDiermen3, Shayla M Gustafson3, Bill Conn3, Carol M Lushbough3,2.
Abstract
Bioinformatics and computational biology play a critical role in bioscience and biomedical research. As researchers design their experimental projects, one major challenge is to find the most relevant bioinformatics toolkits that will lead to new knowledge discovery from their data. The Bio-TDS (Bioscience Query Tool Discovery Systems, http://biotds.org/) has been developed to assist researchers in retrieving the most applicable analytic tools by allowing them to formulate their questions as free text. The Bio-TDS is a flexible retrieval system that affords users from multiple bioscience domains (e.g. genomic, proteomic, bio-imaging) the ability to query over 12 000 analytic tool descriptions integrated from well-established community repositories. One of the primary components of the Bio-TDS is the ontology and natural language processing workflow for annotation, curation, query processing, and evaluation. The Bio-TDS's scientific impact was evaluated using sample questions posed by researchers retrieved from Biostars, a site focusing on biological data analysis. The Bio-TDS was compared to five similar bioscience analytic tool retrieval systems, with the Bio-TDS outperforming the others in terms of relevance and completeness. The Bio-TDS offers researchers the capacity to associate their bioscience question with the most relevant computational toolsets required for the data analysis in their knowledge discovery process.
Year: 2016 PMID: 27924016 PMCID: PMC5210639 DOI: 10.1093/nar/gkw940
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Bio-TDS discovery system overview.
Supporting materials available at http://biotds.org/help/supporting.xhtml
| S1 | BETS Specification description and manipulation |
| S2 | Resources extraction and semi-automatic curation |
| S3 | TONER: Tools ontology-based annotation |
| S4 | BioQueryTool query processing workflow and programmatic access |
| S5 | BioQueryTool Evaluation and comparison |
An ‘NS’ value for a given evaluation criterion (Precision, Recall, …) indicates that too few data points were available (more than 40% of data points missing relative to the dataset size) to compute an accurate, meaningful value. This is due to a low retrieval rate in the related repository (e.g. no results returned for the query).
Bio-TDS evaluation and comparison overview
| Criteria a | Bio-TDS | BLD | ELIXIR | GALAXY | SeqAnswer |
|---|---|---|---|---|---|
| | | 0.0131 | 0.0087 | 0.0043 | 0.1484 |
| | | NS | NS | NS | NS |
| | | 0.0000 | 0.0000 | 0.0036 | 0.0339 |
| | | NS | NS | NS | NS |
| | | NS | 0.1441 | 0.1310 | 0.6899 |
| | | 0.0427 | NS | NS | NS |
| | | 0.0200 | 0.0696 | 0.0518 | 0.2327 |
| | | 0.0801 | NS | NS | NS |
a Evaluation criteria: MRR = Mean Retrieval Rate; MAP = Mean Average Precision; MAR = Mean Average Recall; MAF = Mean Average F-measure. NS: not significant result. User query type: + free-text query; ++ keyword query.
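
As an illustration of how criteria of this kind are commonly computed, the following minimal Python sketch uses the textbook definitions of average precision, recall, and F-measure for a ranked result list, together with an 'NS' rule based on the 40% missing-data threshold described above. The function names, the sample tool identifiers, and the use of `None` to represent a query with no usable result are assumptions made for this example; the exact formulas used in the Bio-TDS evaluation may differ.

```python
from typing import Optional, Sequence, Set


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Average of precision@k over the ranks k at which a relevant tool appears."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for k, tool in enumerate(ranked, start=1):
        if tool in relevant:
            hits += 1
            total += hits / k
    return total / len(relevant)


def recall(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Fraction of the relevant tools that appear anywhere in the result list."""
    if not relevant:
        return 0.0
    return len(set(ranked) & relevant) / len(relevant)


def f_measure(precision: float, rec: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + rec == 0:
        return 0.0
    return 2 * precision * rec / (precision + rec)


def mean_or_ns(values: Sequence[Optional[float]],
               max_missing: float = 0.40) -> Optional[float]:
    """Mean over the available per-query values.

    Returns None (reported as 'NS') when more than `max_missing` of the
    queries have no value, e.g. because the repository returned no result.
    """
    if not values:
        return None
    present = [v for v in values if v is not None]
    missing_fraction = 1 - len(present) / len(values)
    if missing_fraction > max_missing:
        return None
    return sum(present) / len(present)


# Hypothetical per-query results for one repository.
per_query_ap = [
    average_precision(["toolA", "toolX", "toolB"], {"toolA", "toolB"}),
    None,  # query returned nothing
    average_precision(["toolC"], {"toolC"}),
]
map_score = mean_or_ns(per_query_ap)
print("NS" if map_score is None else round(map_score, 4))
```

Averaging only over queries that returned results, and reporting 'NS' once too many are missing, avoids presenting a mean that rests on a handful of data points.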