Halil Kilicoglu, Sabine Bergler.
Abstract
BACKGROUND: Due to the nature of scientific methodology, research articles are rich in speculative and tentative statements, also known as hedges. We explore a linguistically motivated approach to the problem of recognizing such language in biomedical research articles. Our approach draws on prior linguistic work as well as existing lexical resources to create a dictionary of hedging cues, and extends it by introducing syntactic patterns. Furthermore, recognizing that hedging cues differ in speculative strength, we assign them weights in two ways: automatically, using the information gain (IG) measure, and semi-automatically, based on their types and centrality to hedging. Weights of hedging cues are used to determine the speculative strength of sentences.
Year: 2008 PMID: 19025686 PMCID: PMC2586760 DOI: 10.1186/1471-2105-9-S11-S10
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
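The scoring idea described in the abstract can be sketched as follows: each hedging cue carries a weight reflecting its speculative strength, a sentence's score is the sum of the weights of the cues it contains, and a threshold separates speculative from definite sentences. This is a minimal illustration; the cue words, weights, and threshold below are hypothetical, not the paper's dictionary or settings.

```python
# Hypothetical cue dictionary: weights reflect speculative strength.
# (Illustrative values only; not the published dictionary.)
HEDGE_WEIGHTS = {
    "may": 3,        # modal auxiliary
    "might": 3,      # modal auxiliary
    "suggest": 2,    # epistemic judgment verb
    "appear": 1,     # epistemic evidential verb
    "possible": 2,   # epistemic adjective
}

def speculative_score(sentence: str) -> int:
    """Sum the weights of all hedging cues found in the sentence."""
    tokens = sentence.lower().split()
    return sum(w for cue, w in HEDGE_WEIGHTS.items() if cue in tokens)

def is_speculative(sentence: str, threshold: int = 3) -> bool:
    """A sentence counts as speculative if its cue-weight sum meets the threshold."""
    return speculative_score(sentence) >= threshold
```

Varying the threshold trades precision against recall, which is the sweep reported in the evaluation tables below.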
Examples of lexical surface realizations of hedging

| Cue category | Illustrative examples |
| Modal auxiliaries | may, might, could |
| Epistemic judgment verbs | suggest, propose, speculate |
| Epistemic evidential verbs | appear, seem |
| Epistemic deductive verbs | conclude, infer |
| Epistemic adjectives | possible, probable, likely |
| Epistemic adverbs | possibly, probably, perhaps |
| Epistemic nouns | possibility, hypothesis |
Syntactic patterns and their effect on hedging strength (pattern text partially lost in extraction)

| Syntactic pattern | Effect on strength |
| <EPISTEMIC VERB> … | +1 |
| <EPISTEMIC VERB> … | +2 |
| Otherwise | -1 |
| <EPISTEMIC NOUN> followed by … | +2 |
| Otherwise | -1 |
| … | +1 |
| … | +2 |
| … | +1 |
| … | +1 |
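The adjustments in the table above can be read as a simple rule: an epistemic verb or noun is boosted (+1 or +2) when it occurs in one of the strengthening syntactic patterns, and demoted by 1 otherwise. A minimal sketch of that adjustment; clipping the result at zero is an assumption of this sketch, not stated in the table.

```python
def adjusted_weight(base_weight: int, in_strengthening_pattern: bool,
                    boost: int = 1) -> int:
    """Apply the table's +1/+2 boost when the cue occurs in a
    strengthening syntactic pattern, and the -1 demotion otherwise.
    Clipping at zero is an illustrative assumption."""
    if in_strengthening_pattern:
        return base_weight + boost
    return max(base_weight - 1, 0)
```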
Evaluation results of the baseline methods using the fruit-fly dataset
| Method | Precision | Recall | Accuracy | F-score |
| baseline1 (14 strings) | 0.79 | 0.40 | 0.82 | 0.53 |
| baseline2 (15 strings) | 0.95 | 0.43 | 0.85 | 0.60 |
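The baselines appear to be simple string matchers over small fixed lists of hedging strings (14 and 15 strings, respectively; the lists themselves are not reproduced in this record). A sketch of that kind of baseline, with a hypothetical stand-in list:

```python
# Hypothetical stand-in for the baseline's fixed hedging-string list;
# the actual 14- and 15-string lists are not given in this record.
BASELINE_STRINGS = ["suggest", "may", "might", "appear", "possible"]

def baseline_is_speculative(sentence: str) -> bool:
    """Label a sentence speculative if it contains any listed string."""
    s = sentence.lower()
    return any(cue in s for cue in BASELINE_STRINGS)
```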
Evaluation results from our system using SA weighting on the fruit-fly dataset
| Threshold | Precision | Recall | Accuracy | F-score |
| 1 | 0.68 | 0.95 | 0.88 | 0.79 |
| 2 | 0.74 | 0.94 | 0.90 | 0.83 |
| 3 | 0.85 | 0.86 | 0.93 | 0.85 |
| 4 | 0.91 | 0.71 | 0.91 | 0.80 |
| 5 | 0.92 | 0.63 | 0.89 | 0.75 |
| 6 | 0.97 | 0.40 | 0.85 | 0.57 |
| 7 | 1 | 0.19 | 0.79 | 0.33 |
Evaluation results from our system using IG weighting on the fruit-fly dataset
| Threshold | Precision | Recall | Accuracy | F-score |
| 1 | 0.66 | 0.89 | 0.86 | 0.76 |
| 1.5 | 0.81 | 0.79 | 0.90 | 0.80 |
| 2 | 0.83 | 0.69 | 0.89 | 0.75 |
| 3 | 0.88 | 0.53 | 0.87 | 0.66 |
| 5 | 0.98 | 0.25 | 0.81 | 0.40 |
| 6 | 1 | 0.13 | 0.79 | 0.24 |
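The IG weighting mentioned in the abstract assigns each cue a weight by how much observing its presence reduces uncertainty about whether a sentence is speculative. A minimal sketch of that information-gain computation; the occurrence counts used in any example are hypothetical, not corpus statistics from the paper.

```python
import math

def entropy(pos: int, neg: int) -> float:
    """Binary entropy of a pos/neg label distribution, in bits."""
    total = pos + neg
    h = 0.0
    for count in (pos, neg):
        if count:
            p = count / total
            h -= p * math.log2(p)
    return h

def information_gain(pos_with: int, neg_with: int,
                     pos_without: int, neg_without: int) -> float:
    """Entropy reduction of the speculative label from observing a cue:
    IG = H(label) - H(label | cue present/absent)."""
    total = pos_with + neg_with + pos_without + neg_without
    base = entropy(pos_with + pos_without, neg_with + neg_without)
    cond = ((pos_with + neg_with) / total) * entropy(pos_with, neg_with) \
         + ((pos_without + neg_without) / total) * entropy(pos_without, neg_without)
    return base - cond
```

A cue that occurs only in speculative sentences attains the maximal gain of 1 bit on a balanced sample; a cue that is equally common in both classes gets 0.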
Evaluation results of the baseline methods using the BMC dataset
| Method | Precision | Recall | Accuracy | F-score |
| baseline1 (14 strings) | 0.65 | 0.52 | 0.87 | 0.58 |
| baseline2 (15 strings) | 0.83 | 0.47 | 0.89 | 0.60 |
Evaluation results from our system using SA weighting on the BMC dataset
| Threshold | Precision | Recall | Accuracy | F-score |
| 1 | 0.58 | 0.96 | 0.87 | 0.73 |
| 2 | 0.66 | 0.94 | 0.90 | 0.77 |
| 3 | 0.80 | 0.85 | 0.94 | 0.82 |
| 4 | 0.83 | 0.65 | 0.92 | 0.73 |
| 5 | 0.95 | 0.56 | 0.92 | 0.70 |
| 6 | 0.97 | 0.35 | 0.89 | 0.52 |
| 7 | 0.98 | 0.21 | 0.86 | 0.35 |
Evaluation results from our system using IG weighting on the BMC dataset
| Threshold | Precision | Recall | Accuracy | F-score |
| 1 | 0.48 | 0.82 | 0.81 | 0.60 |
| 1.5 | 0.66 | 0.74 | 0.89 | 0.70 |
| 1.75 | 0.75 | 0.67 | 0.90 | 0.71 |
| 2 | 0.75 | 0.65 | 0.90 | 0.70 |
| 2.5 | 0.88 | 0.54 | 0.91 | 0.67 |
| 5 | 0.97 | 0.35 | 0.89 | 0.52 |
Recall/precision break-even point (BEP) results
| Method | BEP |
| baseline1 | 0.60 |
| baseline2 | 0.76 |
| Our system on the fruit-fly dataset with SA weighting | 0.85 |
| Our system on the fruit-fly dataset with IG weighting | 0.80 |
| Our system on the BMC dataset with SA weighting | 0.82 |
| Our system on the BMC dataset with IG weighting | 0.70 |
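The F-scores in the evaluation tables above are the harmonic mean of precision and recall, and the break-even point is where the two coincide; for example, the fruit-fly SA row at threshold 3 (P = 0.85, R = 0.86) gives F ≈ 0.85, consistent with the reported BEP of 0.85 for that configuration.

```python
def f_score(precision: float, recall: float) -> float:
    """F1: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)
```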