Abstract
BACKGROUND: The practice of evidence-based medicine (EBM) requires clinicians to integrate their expertise with the latest scientific research. This is becoming increasingly difficult as the number of published articles grows. There is a clear need for better tools to improve clinicians' ability to search the primary literature. Randomized clinical trials (RCTs) are the most reliable source of evidence documenting the efficacy of treatment options. This paper describes the retrieval of key sentences from abstracts of RCTs as a step towards helping users find relevant facts about the experimental design of clinical studies.
Year: 2009 PMID: 19208256 PMCID: PMC2657779 DOI: 10.1186/1472-6947-9-10
Source DB: PubMed Journal: BMC Med Inform Decis Mak ISSN: 1472-6947 Impact factor: 2.796
Headings of Structured Abstracts
| Class | Example Heading Names |
|---|---|
| Aim | Goals, Objective, Purpose, Hypothesis, Introduction, Background, Context, Rationale |
| Intervention | Interventions, Interventions of the Study |
| Participants | Population, Patients, Subjects, Sample |
| Outcome Measures | Primary Outcome Parameters, Main Variables, Measures, Measurements, Assessments |
| Method | Materials, Study Design, Setting, Procedures, Process, Methodology, Research Design |
| Results | Results, Findings, Outcomes, Main Outcomes and Results |
| Conclusion | Conclusion, Conclusion and Clinical Relevance, Clinical Implications, Discussion, Interpretation |
Examples of headings in structured abstracts that are mapped to equivalence classes for our classification purposes.
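The mapping from heading names to equivalence classes can be sketched as a simple lookup. This is a minimal illustration that assumes exact matching after light normalization; the names `HEADING_CLASSES`, `normalize_heading`, and `map_heading` are hypothetical, and the dictionary holds only the example headings listed in the table, not the full mapping used in the study.

```python
# Hypothetical lookup built only from the example headings in the table above.
HEADING_CLASSES = {
    "Aim": ["Goals", "Objective", "Purpose", "Hypothesis", "Introduction",
            "Background", "Context", "Rationale"],
    "Intervention": ["Interventions", "Interventions of the Study"],
    "Participants": ["Population", "Patients", "Subjects", "Sample"],
    "Outcome Measures": ["Primary Outcome Parameters", "Main Variables",
                         "Measures", "Measurements", "Assessments"],
    "Method": ["Materials", "Study Design", "Setting", "Procedures", "Process",
               "Methodology", "Research Design"],
    "Results": ["Results", "Findings", "Outcomes", "Main Outcomes and Results"],
    "Conclusion": ["Conclusion", "Conclusion and Clinical Relevance",
                   "Clinical Implications", "Discussion", "Interpretation"],
}


def normalize_heading(heading):
    """Lowercase, drop a trailing colon, and collapse whitespace (an assumed normalization)."""
    return " ".join(heading.strip().rstrip(":").lower().split())


# Invert the table: normalized heading name -> equivalence class.
HEADING_TO_CLASS = {
    normalize_heading(name): cls
    for cls, names in HEADING_CLASSES.items()
    for name in names
}


def map_heading(heading):
    """Return the equivalence class for a heading, or None if it is not in the table."""
    return HEADING_TO_CLASS.get(normalize_heading(heading))
```

Headings that do not appear in the lookup return `None`, so a caller can decide whether to skip such sentences or handle them separately.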
Four Way Cross-Validation Sentence Classification Results on Structured Abstracts: Using CRFs
| Class | Precision (no windowed features) | Recall (no windowed features) | F-score (no windowed features) | Precision (windowed features) | Recall (windowed features) | F-score (windowed features) |
|---|---|---|---|---|---|---|
| Accuracy | 93.53% | | | 94.23% | | |
| Aim | 0.97 | 0.97 | 0.97 | 0.98 | 0.98 | 0.98 |
| Method | 0.93 | 0.92 | 0.92 | 0.94 | 0.93 | 0.93 |
| Results | 0.92 | 0.93 | 0.92 | 0.93 | 0.94 | 0.93 |
| Conclusion | 0.95 | 0.95 | 0.95 | 0.96 | 0.95 | 0.95 |
Sentence classification using CRFs into four major rhetorical roles. Results report using 15-fold cross validation for a system that uses no windowed features versus a system that uses windowed features. For this set, there were 13,610 abstracts, 156 k sentences.
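The record does not define "windowed features". One common reading, assumed here purely for illustration, is that each sentence's feature set is extended with the features of its neighboring sentences, tagged by relative position. The function name `add_window_features`, the `window` parameter, and the `+1:`/`-1:` prefix convention are all hypothetical.

```python
def add_window_features(sentence_features, window=1):
    """Extend each sentence's features with those of its neighbors.

    sentence_features: list of sets of feature strings, one set per sentence,
    in the order the sentences appear in a single abstract.
    Assumption: a "window" means the `window` sentences before and after.
    """
    n = len(sentence_features)
    windowed = []
    for i in range(n):
        feats = set(sentence_features[i])
        for offset in range(-window, window + 1):
            j = i + offset
            if offset == 0 or not (0 <= j < n):
                continue  # skip the sentence itself and out-of-range neighbors
            # Prefix neighbor features with their signed offset, e.g. "-1:" or "+1:".
            feats.update(f"{offset:+d}:{f}" for f in sentence_features[j])
        windowed.append(feats)
    return windowed
```

With `window=1`, the first sentence of an abstract only gains `+1:` features, since it has no predecessor.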
Four Way Cross-Validation Sentence Classification Results on Structured Abstracts: Using SVMs
| Class | Precision (no windowed features) | Recall (no windowed features) | F-score (no windowed features) | Precision (windowed features) | Recall (windowed features) | F-score (windowed features) |
|---|---|---|---|---|---|---|
| Accuracy | 82.88% | | | 84.82% | | |
| Aim | 0.84 | 0.77 | 0.80 | 0.86 | 0.80 | 0.83 |
| Method | 0.83 | 0.88 | 0.86 | 0.86 | 0.89 | 0.87 |
| Results | 0.85 | 0.87 | 0.86 | 0.85 | 0.89 | 0.87 |
| Conclusion | 0.76 | 0.68 | 0.72 | 0.79 | 0.72 | 0.75 |
Sentence classification using SVMs into four major rhetorical roles. Results report using 15-fold cross validation for a system that uses no windowed features versus a system that uses windowed features.
Five Way Cross-Validation Sentence Classification Results on Structured Abstracts: Using CRFs
| Class | Precision (no windowed features) | Recall (no windowed features) | F-score (no windowed features) | Precision (windowed features) | Recall (windowed features) | F-score (windowed features) |
|---|---|---|---|---|---|---|
| 1575 abs, 21.2 k sents | Accuracy = 86.30% | | | Accuracy = 87.99% | | |
| Aim | 0.99 | 0.99 | 0.99 | 1.00 | 0.99 | 1.00 |
| Method | 0.85 | 0.84 | 0.85 | 0.87 | 0.85 | 0.86 |
| Results | 0.79 | 0.84 | 0.82 | 0.82 | 0.87 | 0.84 |
| Conclusion | 0.93 | 0.92 | 0.93 | 0.94 | 0.93 | 0.93 |
| 1740 abs, 22.9 k sents | Accuracy = 95.17% | | | Accuracy = 95.10% | | |
| Aim | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 |
| Method | 0.97 | 0.97 | 0.97 | 0.96 | 0.97 | 0.97 |
| Results | 0.95 | 0.96 | 0.95 | 0.94 | 0.96 | 0.95 |
| Conclusion | 0.94 | 0.93 | 0.94 | 0.94 | 0.93 | 0.93 |
| 2280 abs, 29.8 k sents | Accuracy = 86.74% | | | Accuracy = 88.43% | | |
| Aim | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 |
| Method | 0.85 | 0.86 | 0.85 | 0.87 | 0.87 | 0.87 |
| Results | 0.81 | 0.84 | 0.83 | 0.83 | 0.87 | 0.85 |
| Conclusion | 0.93 | 0.93 | 0.93 | 0.94 | 0.93 | 0.94 |
Sentence classification using CRFs into five classes, for each of the three classification problems. Results report using 15-fold cross validation for a system that uses no windowed features versus a system that uses windowed features.
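A 15-fold cross-validation split can be sketched as follows. This is an illustration only: it assumes the folds partition whole abstracts (so that all sentences of one abstract stay in the same fold), which the captions do not state, and the function name `k_fold_splits` and the fixed `seed` are hypothetical.

```python
import random


def k_fold_splits(abstract_ids, k=15, seed=0):
    """Yield (train_ids, test_ids) pairs for k-fold cross-validation.

    Assumption: splitting is done at the abstract level, so sentences
    from one abstract never appear in both training and test data.
    """
    ids = list(abstract_ids)
    random.Random(seed).shuffle(ids)          # deterministic shuffle
    folds = [ids[i::k] for i in range(k)]     # k roughly equal folds
    for i in range(k):
        test = folds[i]
        train = [a for j, fold in enumerate(folds) if j != i for a in fold]
        yield train, test
```

For the 1575-abstract set, this gives 15 test folds of 105 abstracts each, with every abstract used for testing exactly once.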
Five Way Cross-Validation Sentence Classification Results on Structured Abstracts: Using SVMs
| Class | Precision (no windowed features) | Recall (no windowed features) | F-score (no windowed features) | Precision (windowed features) | Recall (windowed features) | F-score (windowed features) |
|---|---|---|---|---|---|---|
| 1575 abs, 21.2 k sents | Accuracy = 79.33% | | | Accuracy = 84.04% | | |
| Aim | 0.90 | 0.84 | 0.87 | 0.97 | 0.97 | 0.97 |
| Method | 0.80 | 0.82 | 0.81 | 0.84 | 0.82 | 0.83 |
| Results | 0.75 | 0.80 | 0.77 | 0.78 | 0.83 | 0.80 |
| Conclusion | 0.79 | 0.74 | 0.77 | 0.88 | 0.87 | 0.87 |
| 1740 abs, 22.9 k sents | Accuracy = 85.14% | | | Accuracy = 91.12% | | |
| Aim | 0.90 | 0.83 | 0.86 | 0.96 | 0.95 | 0.96 |
| Method | 0.86 | 0.91 | 0.88 | 0.93 | 0.95 | 0.94 |
| Results | 0.86 | 0.88 | 0.87 | 0.91 | 0.92 | 0.92 |
| Conclusion | 0.79 | 0.72 | 0.75 | 0.87 | 0.87 | 0.87 |
| 2280 abs, 29.8 k sents | Accuracy = 80.64% | | | Accuracy = 85.20% | | |
| Aim | 0.89 | 0.82 | 0.86 | 0.96 | 0.95 | 0.95 |
| Method | 0.82 | 0.85 | 0.84 | 0.85 | 0.85 | 0.85 |
| Results | 0.76 | 0.81 | 0.79 | 0.79 | 0.84 | 0.82 |
| Conclusion | 0.82 | 0.85 | 0.84 | 0.88 | 0.87 | 0.87 |
Sentence classification using SVMs into five classes, for each of the three classification problems. Results report using 15-fold cross validation for a system that uses no windowed features versus a system that uses windowed features.
Example Intervention, Outcome Measure and Participant Sentences
| Class | Example Sentence |
|---|---|
| Intervention | Patients received either diltiazem, 240 mg/day, or amlodipine, 5 mg/day, for 2 weeks followed by diltiazem, 360 mg/day, or amlodipine, 10 mg/day, for 2 weeks. |
| Intervention | Participants were tested under two single-dose treatment conditions: placebo and citalopram (20 mg). |
| Intervention | Patients with node-positive (1–3) breast cancer were assigned to open-label epirubicin/vinorelbine (EV), epirubicin/vinorelbine and sequential paclitaxel (EV/T), epirubicin/cyclophosphamide (EC) or epirubicin/cyclophosphamide plus sequential paclitaxel (EC/T) therapy. |
| Outcome Measure | Standard treadmill exercise testing was the primary efficacy assessment. Patients also recorded incidence of angina attacks and use of glyceryl trinitrate spray. |
| Outcome Measure | Arterial-coronary sinus differences of substrates were measured before cardiopulmonary bypass (CPB) and during early reperfusion. |
| Outcome Measure | The primary endpoints were overall survival (OS), relapse-free survival (RFS) and event-free survival (EFS). |
| Participants | Twenty-eight healthy postmenopausal women, 16 without, and 12 with hormone replacement therapy (HRT) participated in this randomized, double-blind, cross-over study. |
| Participants | Nineteen (19) young men, ages between 24 and 42, were enrolled in a single-center, institutional randomized, double-masked, crossover clinical trial. |
| Participants | Twenty-four Chinese adults with type 2 diabetes participated. |
Examples of sentences labeled as Intervention, Outcome Measure and Participants in Set 1.
Four Way Sentence Classification Results on Manually Annotated Abstracts
| Class | P (All Abstracts) | R (All Abstracts) | F (All Abstracts) | P (Structured Subset) | R (Structured Subset) | F (Structured Subset) | P (Unstructured Subset) | R (Unstructured Subset) | F (Unstructured Subset) |
|---|---|---|---|---|---|---|---|---|---|
| Accuracy | 94.82% | | | 98.05% | | | 87.55% | | |
| Aim | 0.98 | 0.91 | 0.94 | 0.99 | 0.99 | 0.99 | 0.96 | 0.78 | 0.85 |
| Method | 0.89 | 0.96 | 0.93 | 0.97 | 0.97 | 0.97 | 0.74 | 0.93 | 0.83 |
| Results | 0.97 | 0.94 | 0.96 | 0.98 | 0.98 | 0.98 | 0.95 | 0.85 | 0.90 |
| Conclusion | 0.97 | 0.99 | 0.98 | 0.99 | 0.99 | 0.99 | 0.94 | 0.97 | 0.95 |
Sentence classification using CRFs into four rhetorical roles on the manually annotated data set. The CRF model was trained on the set of 13.6 k structured abstracts. Precision (P), Recall (R) and F-score (F) are reported for each label over the entire data set (318 abstracts), the structured subset (211) and the unstructured subset (107).
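The per-label precision, recall and F-score values, together with overall accuracy, can be computed from gold and predicted sentence labels as in the sketch below. It assumes the F-score is the balanced harmonic mean of precision and recall (F1), which the captions do not spell out; the function names `per_label_prf` and `accuracy` are hypothetical.

```python
from collections import Counter


def per_label_prf(gold, predicted):
    """Return {label: (precision, recall, f_score)} for parallel label lists.

    Assumption: F-score is the balanced F1, 2PR / (P + R).
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1   # predicted label was wrong for this sentence
            fn[g] += 1   # gold label was missed for this sentence
    scores = {}
    for label in set(gold) | set(predicted):
        pred_total = tp[label] + fp[label]
        gold_total = tp[label] + fn[label]
        prec = tp[label] / pred_total if pred_total else 0.0
        rec = tp[label] / gold_total if gold_total else 0.0
        f = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        scores[label] = (prec, rec, f)
    return scores


def accuracy(gold, predicted):
    """Fraction of sentences whose predicted label equals the gold label."""
    correct = sum(1 for g, p in zip(gold, predicted) if g == p)
    return correct / len(gold) if gold else 0.0
```

Accuracy is computed over all sentences at once, which is why the tables report a single accuracy per data subset but separate P, R and F values per label.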
Five Way Classification Including 'Intervention' on Manually Annotated Abstracts
| Class | P (All Abstracts) | R (All Abstracts) | F (All Abstracts) | P (Structured Subset) | R (Structured Subset) | F (Structured Subset) | P (Unstructured Subset) | R (Unstructured Subset) | F (Unstructured Subset) |
|---|---|---|---|---|---|---|---|---|---|
| Accuracy | 90.14% | | | 91.39% | | | 87.35% | | |
| Aim | 0.92 | 0.97 | 0.94 | 0.94 | 0.98 | 0.96 | 0.88 | 0.95 | 0.91 |
| Method | 0.85 | 0.81 | 0.83 | 0.86 | 0.83 | 0.84 | 0.80 | 0.76 | 0.78 |
| Results | 0.91 | 0.97 | 0.92 | 0.91 | 0.95 | 0.93 | 0.89 | 0.92 | 0.90 |
| Conclusion | 0.96 | 0.94 | 0.95 | 0.98 | 0.94 | 0.96 | 0.92 | 0.94 | 0.93 |
| Accuracy | 95.24% | | | 96.45% | | | 92.51% | | |
| Aim | 0.94 | 0.99 | 0.99 | 0.96 | 1.00 | 0.98 | 0.90 | 0.96 | 0.93 |
| Method | 0.92 | 0.91 | 0.92 | 0.93 | 0.93 | 0.93 | 0.89 | 0.87 | 0.88 |
| Results | 0.98 | 0.99 | 0.99 | 0.99 | 1.00 | 0.99 | 0.96 | 0.97 | 0.96 |
| Conclusion | 0.99 | 0.99 | 0.99 | 1.00 | 1.00 | 1.00 | 0.89 | 0.87 | 0.88 |
| Accuracy | 95.60% | | | 96.45% | | | 94.55% | | |
| Aim | 0.95 | 0.98 | 0.97 | 0.96 | 1.00 | 0.98 | 0.93 | 0.97 | 0.95 |
| Method | 0.92 | 0.92 | 0.92 | 0.93 | 0.93 | 0.93 | 0.91 | 0.91 | 0.91 |
| Results | 0.98 | 0.99 | 0.99 | 0.99 | 1.00 | 0.99 | 0.97 | 0.99 | 0.98 |
| Conclusion | 0.99 | 0.99 | 0.99 | 1.00 | 1.00 | 1.00 | 0.99 | 0.98 | 0.99 |
| Accuracy | 93.89% | | | 95.02% | | | 91.34% | | |
| Aim | 0.95 | 0.97 | 0.96 | 0.96 | 0.99 | 0.97 | 0.92 | 0.93 | 0.93 |
| Method | 0.88 | 0.89 | 0.88 | 0.89 | 0.90 | 0.90 | 0.85 | 0.85 | 0.85 |
| Results | 0.99 | 0.99 | 0.99 | 1.00 | 1.00 | 1.00 | 0.96 | 0.96 | 0.97 |
| Conclusion | 0.99 | 0.99 | 0.99 | 1.00 | 1.00 | 1.00 | 0.96 | 0.98 | 0.97 |
Sentence classification using CRFs into five classes including Intervention. Results report on four systems. System 1: baseline system. System 2: feature vectors augmented with section headings from the four rhetorical roles, where they are either mapped from original headings in structured abstracts or predicted by the four class CRF model for unstructured abstracts. System 3 (oracle): feature vectors augmented with manually corrected section headings. System 4: same as System 2 except the training data is also augmented with training data from Set I. Precision (P), Recall (R) and F-score (F) are reported for each label over the entire data set (318), the structured subset (211) and unstructured subset (107).
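The heading augmentation described for System 2 can be sketched as follows. This is an illustration under stated assumptions, not the study's implementation: feature sets are represented as Python sets of strings, the `heading=` feature format is invented, and `predict_roles` stands in for the four-class CRF model as a hypothetical callable.

```python
def augment_with_headings(sentence_features, mapped_headings=None, predict_roles=None):
    """Add one rhetorical-role heading feature to each sentence's feature set.

    sentence_features: list of sets of feature strings for one abstract.
    mapped_headings: roles mapped from the original headings (structured
        abstracts), or None for an unstructured abstract.
    predict_roles: hypothetical callable standing in for the four-class
        model; it takes the feature sets and returns one role per sentence.
    """
    if mapped_headings is not None:
        roles = mapped_headings                    # structured abstract
    elif predict_roles is not None:
        roles = predict_roles(sentence_features)   # unstructured abstract
    else:
        raise ValueError("need mapped_headings or predict_roles")
    if len(roles) != len(sentence_features):
        raise ValueError("one role is required per sentence")
    return [set(f) | {f"heading={r}"} for f, r in zip(sentence_features, roles)]
```

Under this reading, the oracle setting (System 3) would correspond to passing manually corrected roles as `mapped_headings` for every abstract.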
Five Way Classification Including 'Outcome Measure' on Manually Annotated Abstracts
| Class | P (All Abstracts) | R (All Abstracts) | F (All Abstracts) | P (Structured Subset) | R (Structured Subset) | F (Structured Subset) | P (Unstructured Subset) | R (Unstructured Subset) | F (Unstructured Subset) |
|---|---|---|---|---|---|---|---|---|---|
| Accuracy | 90.29% | | | 90.95% | | | 00.00% | | |
| Aim | 0.98 | 0.97 | 0.97 | 0.98 | 0.98 | 0.98 | 0.97 | 0.94 | 0.96 |
| Method | 0.87 | 0.83 | 0.85 | 0.88 | 0.83 | 0.85 | 0.85 | 0.82 | 0.84 |
| Results | 0.89 | 0.96 | 0.92 | 0.90 | 0.97 | 0.93 | 0.88 | 0.93 | 0.90 |
| Conclusion | 0.96 | 0.93 | 0.95 | 0.98 | 0.94 | 0.96 | 0.93 | 0.93 | 0.93 |
| Accuracy | 94.22% | | | 94.89% | | | 92.70% | | |
| Aim | 0.98 | 0.98 | 0.98 | 0.98 | 0.99 | 0.99 | 0.96 | 0.95 | 0.96 |
| Method | 0.88 | 0.88 | 0.88 | 0.89 | 0.88 | 0.89 | 0.87 | 0.88 | 0.88 |
| Results | 0.97 | 0.98 | 0.98 | 0.98 | 0.99 | 0.99 | 0.95 | 0.97 | 0.96 |
| Conclusion | 0.99 | 0.99 | 0.99 | 1.00 | 1.00 | 1.00 | 0.97 | 0.97 | 0.97 |
| Accuracy | 94.77% | | | 94.89% | | | 94.07% | | |
| Aim | 0.98 | 0.99 | 0.99 | 0.98 | 0.99 | 0.99 | 0.98 | 0.98 | 0.98 |
| Method | 0.89 | 0.89 | 0.89 | 0.89 | 0.88 | 0.89 | 0.90 | 0.88 | 0.89 |
| Results | 0.98 | 0.99 | 0.98 | 0.98 | 0.99 | 0.99 | 0.96 | 0.98 | 0.97 |
| Conclusion | 0.99 | 0.99 | 0.99 | 1.00 | 1.00 | 1.00 | 0.99 | 0.98 | 0.98 |
| Accuracy | 95.60% | | | 96.71% | | | 93.09% | | |
| Aim | 0.99 | 0.98 | 0.98 | 1.00 | 1.00 | 1.00 | 0.96 | 0.95 | 0.96 |
| Method | 0.91 | 0.89 | 0.90 | 0.92 | 0.91 | 0.92 | 0.91 | 0.85 | 0.88 |
| Results | 0.99 | 0.99 | 0.99 | 1.00 | 1.00 | 1.00 | 0.96 | 0.97 | 0.96 |
| Conclusion | 0.99 | 0.99 | 0.99 | 1.00 | 1.00 | 1.00 | 0.97 | 0.98 | 0.97 |
Sentence classification using CRFs into five classes including Outcome Measure. Results report on four systems as described in Table 8. System 4 describes a system identical to System 2 except the training data is augmented with those from Set O.
Five Way Classification Including 'Participants' on Manually Annotated Abstracts
| Class | P (All Abstracts) | R (All Abstracts) | F (All Abstracts) | P (Structured Subset) | R (Structured Subset) | F (Structured Subset) | P (Unstructured Subset) | R (Unstructured Subset) | F (Unstructured Subset) |
|---|---|---|---|---|---|---|---|---|---|
| Accuracy | 90.59% | | | 90.82% | | | 90.08% | | |
| Aim | 0.96 | 0.97 | 0.97 | 0.97 | 0.98 | 0.97 | 0.95 | 0.96 | 0.95 |
| Method | 0.86 | 0.91 | 0.88 | 0.87 | 0.92 | 0.89 | 0.83 | 0.89 | 0.86 |
| Results | 0.91 | 0.94 | 0.92 | 0.90 | 0.95 | 0.93 | 0.93 | 0.92 | 0.92 |
| Conclusion | 0.96 | 0.93 | 0.95 | 0.98 | 0.93 | 0.95 | 0.93 | 0.94 | 0.94 |
| Accuracy | 94.64% | | | 94.98% | | | 93.68% | | |
| Aim | 0.98 | 0.98 | 0.98 | 0.99 | 0.99 | 0.99 | 0.97 | 0.97 | 0.97 |
| Method | 0.90 | 0.96 | 0.93 | 0.91 | 0.97 | 0.94 | 0.88 | 0.93 | 0.90 |
| Results | 0.96 | 0.98 | 0.97 | 0.96 | 0.99 | 0.98 | 0.96 | 0.96 | 0.96 |
| Conclusion | 0.99 | 0.99 | 0.99 | 1.00 | 1.00 | 1.00 | 0.97 | 0.97 | 0.97 |
| Accuracy | 94.91% | | | 94.98% | | | 94.75% | | |
| Aim | 0.98 | 0.98 | 0.98 | 0.99 | 0.99 | 0.99 | 0.97 | 0.97 | 0.97 |
| Method | 0.90 | 0.96 | 0.93 | 0.90 | 0.97 | 0.93 | 0.88 | 0.94 | 0.91 |
| Results | 0.97 | 0.97 | 0.98 | 0.96 | 0.99 | 0.98 | 0.97 | 0.98 | 0.98 |
| Conclusion | 0.99 | 0.99 | 0.99 | 1.00 | 1.00 | 1.00 | 0.99 | 0.98 | 0.98 |
| Accuracy | 94.25% | | | 94.98% | | | 94.75% | | |
| Aim | 0.99 | 0.99 | 0.99 | 1.00 | 1.00 | 1.00 | 0.97 | 0.96 | 0.97 |
| Method | 0.92 | 0.94 | 0.93 | 0.93 | 0.94 | 0.94 | 0.89 | 0.92 | 0.90 |
| Results | 0.96 | 0.98 | 0.97 | 0.95 | 1.00 | 0.97 | 0.97 | 0.95 | 0.96 |
| Conclusion | 0.99 | 0.99 | 0.99 | 1.00 | 1.00 | 1.00 | 0.95 | 0.98 | 0.97 |
Sentence classification using CRFs into five classes including Participants. Results report on four systems as described in Table 8. System 4 describes a system identical to System 2 except the training data is augmented with those from Set P.
Example Unstructured Abstract
| To determine the mechanisms underlying increased aerobic power in response to exercise training in octogenarians, we studied mildly frail elderly men and women randomly assigned to an exercise group (n = 22) who participated in a training program of 6 mo of physical therapy, strength training, and walking followed by 3 mo of more intense endurance exercise at 78% of peak heart rate or a control sedentary group (n = 24). Peak O2 consumption (V(O2 peak)) increased 14% in the exercise group (P < 0.0001) but decreased slightly in controls. Training induced 14% increase (P = 0.027) in peak exercise cardiac output (Q), determined via acetylene re-breathing, and no change in arteriovenous O2 content difference. The increase in Q was mediated by increases in heart rate (P = 0.009) and probably stroke volume (P = 0.096). ... |