Yan Yan, Xu-Cheng Yin, Chun Yang, Sujian Li, Bo-Wen Zhang.
Abstract
Deep learning techniques, e.g., Convolutional Neural Networks (CNNs), have been widely applied to research in information retrieval and natural language processing. However, few research efforts have addressed semantic indexing with deep learning. The use of semantic indexing in the biomedical literature has been limited for several reasons. For instance, MEDLINE citations contain a large number of semantic labels from automatically annotated MeSH terms, and for a great deal of the literature, only the title and the abstract are readily available. In this paper, we propose a Boltzmann Convolutional neural network framework (B-CNN) for biomedical semantic indexing. In our hybrid learning framework, the CNN adaptively handles document features that have sequence relationships and captures context information accordingly; the Deep Boltzmann Machine (DBM) merges global information (the entity in each document) and local information through its training with undirected connections. Additionally, we have designed a hierarchical coarse-to-fine indexing structure for learning and classifying documents, and a novel feature extension approach with word sequence embedding and Wikipedia categorization. Comparative experiments were conducted for semantic indexing of biomedical abstract documents; these experiments verified the encouraging performance of our B-CNN model.
Year: 2018 PMID: 30048461 PMCID: PMC6061982 DOI: 10.1371/journal.pone.0197933
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
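The hierarchical coarse-to-fine indexing idea mentioned in the abstract can be illustrated with a small sketch. This is not the authors' B-CNN; it only shows the general pattern of first picking a coarse label cluster and then ranking the fine labels inside it. The cluster names, the three-dimensional embeddings, and the function names (`coarse_to_fine`, `centroid`, `cosine`) are all hypothetical and chosen for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def centroid(vectors):
    """Component-wise mean of a list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def coarse_to_fine(doc_vec, clusters, top_k=2):
    """Pick the closest coarse cluster, then rank its fine labels.

    clusters maps a cluster name to {label: embedding}.
    """
    # Coarse step: compare the document with each cluster centroid.
    best = max(
        clusters,
        key=lambda name: cosine(doc_vec, centroid(list(clusters[name].values()))),
    )
    # Fine step: rank only the labels inside the selected cluster.
    members = clusters[best]
    ranked = sorted(members, key=lambda lab: cosine(doc_vec, members[lab]), reverse=True)
    return best, ranked[:top_k]


# Toy label embeddings (assumed values, not taken from the paper).
CLUSTERS = {
    "chemistry": {
        "Flavin Mononucleotide": [0.9, 0.1, 0.0],
        "Kinetics": [0.7, 0.3, 0.1],
    },
    "optics": {
        "Fluorescence": [0.1, 0.9, 0.2],
        "Light": [0.0, 0.8, 0.4],
    },
}

doc = [0.2, 0.9, 0.3]  # hypothetical document embedding
cluster, labels = coarse_to_fine(doc, CLUSTERS)
print(cluster, labels)
```

Restricting the fine step to one cluster keeps the number of label comparisons small when the label set is large, which is the usual motivation for a coarse-to-fine design.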
An example PubMed article with manually annotated MeSH terms.
| Journal: Photochemistry and photobiology |
| Year: 1983 |
| Title: Kinetics of bacterial bioluminescence and the fluorescent transient. |
| Abstract: The addition of FMNH(2), to Vibrio harveyi luciferase at 2° |
| MeSH terms: “Flavin Mononucleotide” “Fluorescence” “Kinetics” “Luciferases Luminescence” |
Fig 1. CNNs-based hybrid learning framework.
Fig 2. Deep Boltzmann machines.
Fig 3. Layer’s nodes sampling process.
Fig 4. Training process.
Fig 5. Testing process.
Fig 6. Words mapping with Wikipedia.
Datasets details.
| Dataset | Samples | Classes | Type | Field |
|---|---|---|---|---|
| dataset 1 | 9,666 | 39 | multi-class | biomedicine |
| dataset 2 | 1,000 | 168 | multi-class | |
| dataset 3 | 1,000,000 | 150 | multi-label | |
| | | 2,000 | multi-label | |
| dataset 4 | 18,828 | 20 | multi-class | newsgroups |
| | | 4 | multi-class | |
Classification results (%) on dataset 1.
| Method | DSE | |||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| MiP | MiR | Mi | MaP | MaR | Ma | MiP | MiR | Mi | MaP | MaR | Ma | |
| 21.90 | 31.56 | 25.86 | 22.12 | 39.45 | 28.35 | 21.90 | 31.56 | 25.86 | 22.12 | 39.45 | 28.35 | |
| 55.15 | 54.77 | 54.96 | 56.28 | 55.89 | 56.08 | 50.98 | 50.98 | 50.98 | 51.49 | 50.98 | 51.23 | |
| 57.44 | 57.04 | 57.24 | 58.02 | 57.61 | 57.82 | 47.43 | 47.00 | 47.21 | 47.05 | 46.63 | 46.84 | |
| 53.47 | 55.86 | 54.64 | 55.70 | 55.31 | 55.50 | 50.53 | 49.93 | 50.23 | 50.64 | 50.43 | 50.53 | |
| 54.98 | 53.54 | 54.25 | 54.54 | 55.19 | 54.86 | 50.56 | 46.52 | 48.46 | 50.97 | 46.89 | 48.85 | |
| 62.03 | 62.93 | 62.66 | 62.61 | 62.63 | 52.27 | 52.11 | 52.19 | 52.22 | 52.06 | 52.14 | ||
| 63.95 | 59.62 | 61.71 | 62.69 | 62.10 | 62.40 | 50.66 | 50.20 | 50.43 | 50.56 | 50.10 | 50.33 | |
| 60.65 | 58.36 | 59.48 | 61.89 | 59.55 | 60.70 | 63.20 | 61.05 | 62.11 | 66.93 | 63.75 | 65.30 | |
| 63.56 | 61.93 | 62.73 | 63.00 | 61.80 | 62.40 | 65.98 | 64.36 | 65.16 | 68.19 | 64.94 | 66.52 | |
| 54.84 | 52.39 | 53.59 | 55.12 | 52.91 | 53.99 | 58.96 | 57.56 | 58.25 | 59.08 | 58.26 | 58.67 | |
| 63.30 | ||||||||||||
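The column abbreviations above appear to denote micro- and macro-averaged precision, recall, and F-measure (MiP, MiR, Mi, MaP, MaR, Ma); that reading is an assumption, since the page does not define them. Under that assumption, a minimal sketch of how such values are computed from per-class counts follows. The per-class counts are made up, and taking the macro F-measure as the harmonic mean of MaP and MaR is also an assumption, although it agrees with rows where all six values are present (for example, 56.28 and 55.89 give 56.08).

```python
def f_measure(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


def micro_macro(counts):
    """counts maps each class to (true positives, false positives, false negatives)."""
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())

    # Micro-averaging pools the counts of all classes before dividing.
    mi_p = tp / (tp + fp) if (tp + fp) else 0.0
    mi_r = tp / (tp + fn) if (tp + fn) else 0.0

    # Macro-averaging computes the ratio per class, then takes the plain mean.
    per_p = [t / (t + f) if (t + f) else 0.0 for t, f, _ in counts.values()]
    per_r = [t / (t + n) if (t + n) else 0.0 for t, _, n in counts.values()]
    ma_p = sum(per_p) / len(per_p)
    ma_r = sum(per_r) / len(per_r)

    return {
        "MiP": mi_p, "MiR": mi_r, "Mi": f_measure(mi_p, mi_r),
        "MaP": ma_p, "MaR": ma_r, "Ma": f_measure(ma_p, ma_r),
    }


# Hypothetical counts for two classes: (tp, fp, fn).
scores = micro_macro({"A": (8, 2, 2), "B": (1, 1, 3)})
print({name: round(100 * value, 2) for name, value in scores.items()})
```

Micro-averaging weights every decision equally, so frequent classes dominate; macro-averaging weights every class equally, so rare classes count as much as common ones.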
Classification results (%) on dataset 2.
| Method | DSE | |||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| MiP | MiR | Mi | MaP | MaR | Ma | MiP | MiR | Mi | MaP | MaR | Ma | |
| 25.57 | 36.78 | 30.17 | 25.27 | 36.78 | 30.17 | 25.57 | 36.78 | 30.17 | 25.27 | 36.78 | 30.17 | |
| 47.27 | 46.94 | 47.11 | 47.04 | 46.00 | 46.52 | 40.06 | 39.66 | 39.86 | 40.10 | 39.70 | 39.90 | |
| 48.74 | 49.71 | 49.22 | 48.79 | 49.76 | 49.27 | 40.96 | 40.59 | 40.77 | 40.87 | 40.14 | 40.50 | |
| 46.79 | 46.46 | 46.62 | 45.71 | 44.97 | 45.34 | 40.68 | 40.28 | 40.48 | 39.46 | 38.67 | 39.06 | |
| 45.81 | 45.90 | 45.86 | 45.03 | 45.12 | 45.08 | 38.11 | 35.06 | 36.63 | 38.30 | 36.05 | 37.14 | |
| 52.01 | 51.96 | 51.99 | 49.93 | 50.92 | 50.42 | 42.28 | 42.16 | 42.22 | 44.52 | 41.31 | 42.86 | |
| 51.41 | 52.90 | 52.14 | 51.51 | 52.27 | 39.54 | 39.19 | 39.37 | 39.47 | 40.09 | 39.78 | ||
| 52.00 | 50.51 | 51.25 | 51.39 | 47.32 | 49.27 | 56.01 | 54.23 | 55.11 | 53.21 | 50.43 | 51.78 | |
| 54.70 | 52.57 | 53.61 | 53.85 | 50.07 | 51.89 | 58.60 | 56.19 | 57.37 | 55.87 | 52.50 | 54.13 | |
| 47.30 | 43.51 | 45.32 | 46.96 | 43.08 | 44.94 | 49.50 | 50.50 | 49.99 | 48.95 | 48.93 | 48.94 | |
| 52.51 | ||||||||||||
Similarity measure (%) on dataset 1 and dataset 2.
| Method | dataset 1 | dataset 2 | ||||||
|---|---|---|---|---|---|---|---|---|
| DSE | DSE | |||||||
| MiS | MaS | MiS | MaS | MiS | MaS | MiS | MaS | |
| 40.56 | 41.94 | 40.56 | 41.94 | 42.08 | 42.08 | 42.08 | 42.08 | |
| 52.61 | 53.24 | 50.49 | 50.62 | 48.60 | 48.32 | 45.40 | 45.41 | |
| 53.90 | 54.24 | 48.64 | 48.47 | 49.62 | 49.64 | 45.78 | 45.66 | |
| 52.45 | 52.91 | 50.12 | 50.27 | 48.37 | 47.77 | 45.65 | 45.07 | |
| 52.22 | 52.56 | 49.28 | 49.47 | 48.01 | 47.65 | 44.09 | 44.32 | |
| 57.44 | 57.23 | 51.12 | 51.09 | 51.01 | 50.21 | 46.39 | 46.69 | |
| 56.68 | 57.08 | 50.22 | 50.17 | 51.10 | 51.17 | 45.19 | 45.36 | |
| 55.25 | 56.00 | 56.90 | 59.06 | 50.64 | 49.68 | 52.70 | 50.93 | |
| 57.30 | 57.08 | 58.94 | 59.93 | 51.89 | 51.00 | 53.99 | 52.18 | |
| 51.87 | 52.09 | 54.50 | 54.75 | 47.80 | 47.63 | 50.00 | 49.48 | |
Fig 7. Measures (precision, recall, F1, similarity) on dataset 3 (C = 150).
Fig 8. Measures (precision, recall, F1, similarity) on dataset 3 (C = 2000).
Fig 9. F1-measure curves on dataset 3.
Fig 10. Coarse clusters of label embedding.
Fig 11. ROC curves on all datasets.
Fig 12. Significance test on all datasets.
Comparison of F measure (%) among the eight algorithms selected in the experimental study.
The ranks are used in the computation of the Friedman test.
| Data set | LDA | SVM | NB | LR | HC | CNN+DBN | DBC | B-CNN |
|---|---|---|---|---|---|---|---|---|
| | 46.52 (6) | 49.27 (4) | 45.34 (7) | 45.08 (8) | 55.11 (2) | 54.13 (3) | 48.94 (5) | 57.39 (1) |
| | 51.43 (6) | 49.50 (7) | 46.44 (8) | 54.86 (4) | 62.65 (3) | 66.52 (2) | 53.45 (5) | 68.19 (1) |
| | 56.08 (6) | 57.82 (5) | 55.50 (8) | 55.54 (7) | 65.30 (3) | 69.97 (2) | 58.67 (4) | 72.02 (1) |
| | 66.48 (5) | 63.97 (6) | 59.01 (7) | 56.89 (8) | 71.04 (4) | 76.03 (2) | 74.03 (3) | 76.95 (1) |
| Average rank | 5.75 | 5.5 | 6 | 6.75 | 5 | 3 | 3.5 | 1 |
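As a rough illustration of how ranks feed into the Friedman test, the sketch below ranks the eight algorithms within each data set (rank 1 for the highest F-measure), averages the ranks per algorithm, and evaluates the standard Friedman chi-square statistic. The F-measure values are copied from the table above; the function names are hypothetical, and the mean ranks recomputed this way need not coincide with the printed summary row for every algorithm.

```python
ALGORITHMS = ["LDA", "SVM", "NB", "LR", "HC", "CNN+DBN", "DBC", "B-CNN"]

# F-measure (%) per data set, in the column order of the table above.
F_MEASURES = [
    [46.52, 49.27, 45.34, 45.08, 55.11, 54.13, 48.94, 57.39],
    [51.43, 49.50, 46.44, 54.86, 62.65, 66.52, 53.45, 68.19],
    [56.08, 57.82, 55.50, 55.54, 65.30, 69.97, 58.67, 72.02],
    [66.48, 63.97, 59.01, 56.89, 71.04, 76.03, 74.03, 76.95],
]


def rank_descending(scores):
    """Rank 1 = highest score; tied scores share the mean of their positions."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of scores tied with position i.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def friedman(table):
    """Return (mean rank per column, Friedman chi-square statistic)."""
    n, k = len(table), len(table[0])
    rank_rows = [rank_descending(row) for row in table]
    mean_ranks = [sum(row[j] for row in rank_rows) / n for j in range(k)]
    chi2 = 12 * n / (k * (k + 1)) * (
        sum(r * r for r in mean_ranks) - k * (k + 1) ** 2 / 4
    )
    return mean_ranks, chi2


mean_ranks, chi2 = friedman(F_MEASURES)
for name, rank in zip(ALGORITHMS, mean_ranks):
    print(f"{name}: {rank}")
print("Friedman chi-square:", round(chi2, 4))
```

The statistic is compared against a chi-square distribution with k - 1 degrees of freedom (here 7) to decide whether the algorithms differ significantly; with only four data sets, exact tables or a corrected statistic may be preferable.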
Fig 13. F1-measure curves on dataset 4 (C = 20).
Classification results (%) on dataset 4 (C = 4).
| Model | 20 News |
|---|---|
| ClassifyLDA-EM [ | 93.60% |
| RCNN [ | 96.49% |
| CNN [ | 94.79% |
| B-CNN | 95.13% |