Zeeshan Ahmed, Saman Zeeshan, Thomas Dandekar.
Abstract
Biomedical images are helpful sources for scientists and practitioners in drawing significant hypotheses, exemplifying approaches and describing experimental results in published biomedical literature. In recent decades, there has been an enormous increase in the production and publication of heterogeneous biomedical images, which creates a need for bioimaging platforms that extract features and analyze the text and content of biomedical images in order to implement effective information retrieval systems. In this review, we summarize technologies related to data mining of figures. We describe and compare the potential of different approaches in terms of their developmental aspects, methodologies used, results produced, accuracies achieved and limitations. Our comparative conclusions include current challenges for bioimaging software with selective image mining, embedded text extraction and processing of complex natural language queries.
Year: 2016 PMID: 27538578 PMCID: PMC4990152 DOI: 10.1093/database/baw118
Source DB: PubMed Journal: Database (Oxford) ISSN: 1758-0463 Impact factor: 3.451
Figure 1. MEDLINE citation count. This figure shows the enormous increase in the citation count at MEDLINE over the last six decades. The count for 2015 is not yet complete. The graphed statistics are taken from the official MEDLINE website of the US National Library of Medicine (http://www.nlm.nih.gov/bsd/medline_cit_counts_yr_pub.html) and are attached in the supplementary material.
Figure 2. Concept of information extraction from published scientific and biomedical literature. This figure gives an overview of the processes involved in information extraction from scientific and biomedical literature, including data retrieval (getting text and figures from biomedical archives using NLP queries), information extraction (text mining and image processing) and presentation of the integrated data.
Figure 3. Concept of information extraction from published scientific and biomedical literature. This figure gives an overview of the processes involved in information extraction from scientific and biomedical literature, including data retrieval (getting text and figures from biomedical archives using NLP queries), information extraction (text mining and image processing) and presentation of the integrated data.
Methods implementing image segmentation in IR
| Method | Description | Limitations |
|---|---|---|
| Thresholding or Binarization ( | This image segmentation method creates binary versions of grayscale images for image analysis. Various methods (e.g. point-dependent techniques, region-dependent techniques, local thresholding, multithresholding, histogram thresholding ( | An incorrectly set threshold can lead to under- or over-segmentation of objects ( |
| Clustering | This method is widely applied in different fields (e.g. information retrieval, bioimaging and medicine) to understand large-scale complex data (text, images etc.), with uses in pattern recognition, speech analysis and information retrieval ( | It is difficult to fix the number of clusters in advance when grouping objects, and clustering consumes extensive computational time. |
| High Dimensional Indexing (HDI) ( | Many HDI techniques have been proposed for large-scale content-based image retrieval. They are categorized into dimension reduction [embedded dimension, Karhunen–Loeve transform (KLT), low-rank singular value decomposition (SVD) etc.] and multi-dimensional indexing (bucketing algorithm, priority k-d tree, quad-tree, K-D-B tree, hB-tree, R-tree etc.) techniques ( | Blind dimension reduction might not bring good results during embedded dimension reduction. |
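The thresholding row above can be illustrated with a minimal sketch. It assumes Otsu's method as one example of histogram thresholding; the review does not say which variant any of the surveyed tools use. The function names `otsu_threshold` and `binarize` and the 4x4 toy image are hypothetical and chosen only for illustration.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes between-class variance.

    Pixels with value <= t are treated as background, pixels > t as foreground.
    This is Otsu's method, used here as an assumed example of histogram
    thresholding.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg = 0    # number of pixels at or below t
    sum_bg = 0  # sum of gray levels at or below t
    for t in range(levels):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        w_fg = total - w_bg
        if w_bg == 0:
            continue  # no background pixels yet
        if w_fg == 0:
            break     # every pixel is already background
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(image, t):
    """Map each pixel to 1 (foreground) if it exceeds t, else 0 (background)."""
    return [[1 if p > t else 0 for p in row] for row in image]


# Hypothetical toy image: a bright 2x2 object on a dark background.
image = [
    [10, 12, 11, 13],
    [12, 200, 210, 11],
    [10, 205, 198, 12],
    [11, 13, 10, 12],
]
flat = [p for row in image for p in row]

t = otsu_threshold(flat)
mask = binarize(image, t)
object_pixels = sum(map(sum, mask))

# Badly chosen thresholds illustrate the limitation named in the table.
too_low = sum(map(sum, binarize(image, 5)))     # everything becomes foreground
too_high = sum(map(sum, binarize(image, 250)))  # the object disappears
```

In this toy case the automatically chosen threshold separates the four bright pixels from the background, while a threshold set too low or too high marks either every pixel or no pixel as foreground, which is the over- and under-segmentation the table warns about.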
Comparative analysis of bioimaging informatics approaches
| Approach | Methodology category (image mining, text mining, or image and text mining) | Domain category (open or specific) | Web link |
|---|---|---|---|
| Fiji ( | Image mining | Specific for electron microscopy data | |
| Particle swarm optimization ( | Image and text mining | Open for all kinds of images | Not publicly available online |
| Figure panel classification ( | Image mining | Open for all kinds of images | Not publicly available online |
| Analyzing axis diagrams ( | Image and text mining | Open for all kinds of axis diagrams | Not publicly available online |
| Automatic categorization of biomedical images ( | Image mining | Open for all kinds of flow charts, experimental, graph and mixed images | Not publicly available online |
| Yale Image Finder ( | Image mining | Open for all kinds of biomedical images | |
| Hybrid framework ( | Image and text mining | Specific for protein-protein interaction images | Not publicly available online |
| Low-level feature extraction with ontology ( | Image and text mining | Open for all kinds of clinical health care images | Not publicly available online |
| Mining images for the detection and analysis of gel diagrams ( | Image and text mining | Specific for protein-protein interaction images | Not publicly available online |
| Mining pathway diagrams ( | Image and text mining | Specific for pathway analysis images | |
| Edge-based image feature descriptor ( | Image mining | Open for all kinds of health care images | Not publicly available online |
| Integrating image data into biomedical text categorization ( | Image and text mining | Open for all kinds of biomedical images | Not publicly available online |