Tian Kang, Shirui Zou, Chunhua Weng.
Abstract
PICO (Population/problem, Intervention, Comparison, and Outcome) is widely adopted for formulating clinical questions to retrieve evidence from the literature, and it plays a crucial role in Evidence-Based Medicine (EBM). This paper contributes a scalable deep learning method for extracting PICO statements from randomized controlled trial (RCT) articles. An LSTM-CRF model was trained on a small set of richly annotated PubMed abstracts. Initializing the model with parameters pretrained on a large related corpus significantly improved its performance while using only a minimal feature set. The method minimizes the need for laborious feature handcrafting and, by reusing related corpora to pretrain a deep neural network, avoids the need for large shared annotated datasets.
Keywords: Evidence-Based Medicine; Natural Language Processing; Randomized Controlled Trial
Year: 2019 PMID: 31437911 PMCID: PMC6852618 DOI: 10.3233/SHTI190209
Source DB: PubMed Journal: Stud Health Technol Inform ISSN: 0926-9630
Figure 1 - Overview of the PICO recognition tool development. We compared two ways of training the LSTM-CRF model (in blue): 1) with random initialization of the model parameters (green, on the left); 2) with pretraining, in which a model with the same architecture is first trained on the EBM-NLP corpus to provide a better parameter initialization.
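The two initialization options in Figure 1 can be illustrated with a minimal, framework-free sketch. It is not the authors' implementation: the parameter names, sizes, and the `init_random` / `init_from_pretrained` helpers are hypothetical, and the "pretrained" values are random stand-ins for weights that would be learned on EBM-NLP.

```python
import random

def init_random(shapes, seed=0):
    """Randomly initialize every parameter; `shapes` maps name -> number of weights."""
    rng = random.Random(seed)
    return {name: [rng.uniform(-0.1, 0.1) for _ in range(size)]
            for name, size in shapes.items()}

def init_from_pretrained(shapes, pretrained, seed=0):
    """Start from random values, then copy every pretrained parameter
    whose name and size match the target model."""
    params = init_random(shapes, seed)
    for name, values in pretrained.items():
        if name in params and len(values) == len(params[name]):
            params[name] = list(values)  # copy so later updates do not alias
    return params

# Hypothetical parameter names and sizes, for illustration only.
shapes = {"embedding": 6, "lstm": 8, "crf_transitions": 4}

# Stand-in for weights learned on the EBM-NLP corpus (assumption: random here).
pretrained = init_random(shapes, seed=42)

scratch = init_from_pretrained(shapes, {}, seed=1)       # option 1: random init
warm = init_from_pretrained(shapes, pretrained, seed=1)  # option 2: pretrained init
```

Because both models share one architecture, every parameter matches by name and size, so the warm-started model begins training exactly where the pretrained one stopped.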
Figure 2 - Example of our annotation in brat
Figure 3 - Detailed architecture of the base model. It is used both for pretraining on the EBM-NLP corpus and for training the PICO recognition model.
Descriptive statistics of the annotated corpora
| Entity class: P. | Entity class: I. (+C.) | Entity class: O. | Attribute class: Qualifier | Attribute class: Measure |
|---|---|---|---|---|
| 1185 | 2027 | 2140 | 766 | 904 |
| 0.916 | 0.844 | 0.727 | 0.955 | 0.954 |
Model performance in different training settings.
| No Pre + Raw (Best) | No Pre + Raw (Ave.) | No Pre + BIO (Best) | No Pre + BIO (Ave.) | Pre + Raw (Best) | Pre + Raw (Ave.) | Pre + BIO (Best) | Pre + BIO (Ave.) |
|---|---|---|---|---|---|---|---|
| 0.78 | 0.76 | 0.84 | 0.85 | 0.93 | 0.87 | 0.86 | 0.83 |
| 0.62 | 0.63 | 0.68 | 0.66 | 0.80 | 0.70 | 0.71 | 0.7 |
| 0.69 | 0.66 | 0.75 | 0.74 | 0.78 | 0.73 | | |
| 0.54 | 0.52 | 0.58 | 0.53 | 0.74 | 0.61 | 0.63 | 0.63 |
| 0.53 | 0.51 | 0.56 | 0.52 | 0.74 | 0.56 | 0.64 | 0.61 |
| 0.53 | 0.52 | 0.57 | 0.54 | 0.74 | 0.58 | | |
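The "Raw" and "BIO" columns refer to two label encodings. As a hedged illustration, assuming "Raw" means one plain class label per token (the table does not spell this out), the sketch below converts such labels into BIO tags. The `to_bio` helper and the example sentence are hypothetical; the class abbreviations follow the tag names used in the detailed evaluation table.

```python
def to_bio(raw_labels, outside="O"):
    """Convert per-token class labels into BIO tags.

    A token opens a new entity (B-) when its label differs from the
    previous token's label; otherwise it continues the entity (I-).
    Note: two adjacent entities of the same class cannot be told apart
    from raw labels alone, so they are merged into one span here.
    """
    bio = []
    previous = outside
    for label in raw_labels:
        if label == outside:
            bio.append(outside)
        elif label != previous:
            bio.append("B-" + label)
        else:
            bio.append("I-" + label)
        previous = label
    return bio

# Hypothetical sentence and labels, for illustration only.
tokens = ["Adults", "with", "asthma", "received", "inhaled", "budesonide"]
raw = ["Pop.", "Pop.", "Pop.", "O", "Int.", "Int."]

for token, tag in zip(tokens, to_bio(raw)):
    print(f"{token}\t{tag}")
```

The BIO form makes entity boundaries explicit, which is the information a sequence tagger needs to separate neighbouring mentions.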
Detailed evaluation for one set in PICO/attribute
| B-Pop. | I-Pop. | B-Int. | I-Int. | B-Out. | I-Out. |
|---|---|---|---|---|---|
| 0.82 | 0.84 | 0.82 | 0.78 | 0.88 | 0.85 |
| 0.68 | 0.65 | 0.70 | 0.50 | 0.75 | 0.42 |
| 0.75 | 0.74 | 0.75 | 0.61 | 0.81 | 0.56 |

| B-Mea. | I-Mea. | B-Qua. | I-Qua. |
|---|---|---|---|
| 0.77 | 0.85 | 0.91 | 0.5 |
| 0.65 | 0.65 | 0.60 | 0.17 |
| 0.71 | 0.74 | 0.72 | 0.25 |
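The three value rows under each tag header carry no labels here. One plausible reading, stated as an assumption rather than a fact from the source, is that they hold precision, recall, and F1 in that order. The sketch below recomputes F1 as the harmonic mean of the first two rows and measures how far it falls from the third; the variable names are hypothetical.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Values copied from the table; the row roles are an assumption.
tags = ["B-Pop.", "I-Pop.", "B-Int.", "I-Int.", "B-Out.", "I-Out.",
        "B-Mea.", "I-Mea.", "B-Qua.", "I-Qua."]
assumed_precision = [0.82, 0.84, 0.82, 0.78, 0.88, 0.85, 0.77, 0.85, 0.91, 0.5]
assumed_recall = [0.68, 0.65, 0.70, 0.50, 0.75, 0.42, 0.65, 0.65, 0.60, 0.17]
reported_third_row = [0.75, 0.74, 0.75, 0.61, 0.81, 0.56, 0.71, 0.74, 0.72, 0.25]

# Absolute gap between the recomputed F1 and the reported third-row value.
gaps = {
    tag: abs(f1_score(p, r) - reported)
    for tag, p, r, reported in zip(
        tags, assumed_precision, assumed_recall, reported_third_row
    )
}

for tag, gap in gaps.items():
    print(f"{tag}: gap {gap:.3f}")
```

Under this reading every recomputed value lies within 0.01 of the third row, which is consistent with all three rows having been rounded to two decimals.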
Figure 4 - Sample output for our PICO extraction method