Nikiforos Karamanis, Ruth Seal, Ian Lewin, Peter McQuilton, Andreas Vlachos, Caroline Gasperin, Rachel Drysdale, Ted Briscoe.
Abstract
BACKGROUND: Despite increasing interest in applying Natural Language Processing (NLP) to biomedical text, whether this technology can facilitate tasks such as database curation remains unclear.
Year: 2008 PMID: 18410678 PMCID: PMC2375127 DOI: 10.1186/1471-2105-9-193
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. PaperView navigator. Screenshot of PaperBrowser's PaperView navigation mechanism, which lists automatically recognised gene names in the order in which they appear in each section.
Figure 2. EntitiesView navigator. Screenshot of PaperBrowser's EntitiesView navigation mechanism, which groups together related noun phrases such as "dpp" and "the dpp pathway".
Figure 3. Natural Language Processing pipeline. A series of NLP modules is arranged in a pipeline to produce the format that PaperBrowser displays for each curatable article.
Figure 4. Experimental condition. Screenshot showing how PaperBrowser's windows were arranged in the experimental condition.
Figure 5. Control condition. Screenshot showing how PaperBrowser's windows were arranged in the control condition.
Navigation using "Find"
| Event type | NAVS:OFF | NAVS:ON |
| All highlighting events | 45.02% (375/833) | 3.16% (26/822) |
| TEXT:BEFORE | 69.07% (67/97) | 10.99% (10/91) |
Table 1 shows the percentage of highlighting events which were navigated using "Find" (PREV:FIND) in the control (NAVS:OFF) and experimental (NAVS:ON) conditions. The "All highlighting events" row shows the percentage of navigated events over all highlighting events. The TEXT:BEFORE row shows the percentage of navigated events when the highlighted text was located further back in the article.
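The percentages in Table 1 follow directly from the counts given in parentheses. As a minimal sketch (the helper name `pct` and the dictionary layout are illustrative assumptions, not part of the original study), they can be recomputed as follows:

```python
def pct(navigated: int, total: int) -> float:
    """Percentage of events navigated with "Find", rounded to two decimals."""
    return round(100 * navigated / total, 2)

# Counts copied from Table 1: (events navigated with "Find", all events in the row)
table1 = {
    ("All highlighting events", "NAVS:OFF"): (375, 833),
    ("All highlighting events", "NAVS:ON"): (26, 822),
    ("TEXT:BEFORE", "NAVS:OFF"): (67, 97),
    ("TEXT:BEFORE", "NAVS:ON"): (10, 91),
}

for (row, condition), (navigated, total) in table1.items():
    print(f"{row} / {condition}: {pct(navigated, total):.2f}% ({navigated}/{total})")
```

The same helper applies to Table 2 if the counts are swapped for the experimental-condition figures.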
Navigation preferences
| Event type | PREV:FIND | PREV:NAVS |
| All navigated events | 10.04% (26/259) | 83.01% (215/259) |
| TEXT:BEFORE | 13.70% (10/73) | 86.30% (63/73) |
Table 2 shows the percentage of navigated events using "Find" (PREV:FIND) and PaperBrowser's navigators (PREV:NAVS) in the experimental condition (NAVS:ON). The "All navigated events" row reports the percentage over all navigated events. The TEXT:BEFORE row shows the percentage over the navigated events in which the highlighted text was located further back in the article.
Mean distance in control condition
| | PREV:FIND | PREV:NONE |
| Total mean | | |
| Counts | 375 | 458 |
| Standard deviation | 2476.82 | 1591.97 |
| CUR01 mean | | |
| Counts | 334 | 201 |
| Standard deviation | 2507.45 | 660.43 |
| CUR02 mean | | |
| Counts | 41 | 257 |
| Standard deviation | 2126.30 | 1988.33 |
Mean distance in tokens (DIST) between text marked in a navigated event (PREV:FIND) or an unnavigated event (PREV:NONE) and the text highlighted in the immediately preceding event in the control condition (NAVS:OFF). Curator-specific data are indicated by the curator's ID (CUR01 vs CUR02). Mean is calculated as the sum of all observations divided by the number of observations (counts).
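The distance measure described above can be sketched as follows. This is only an illustration under stated assumptions: the event list and its token offsets are invented, the function name `mean_distance_by_label` is hypothetical, and DIST is assumed to be the absolute difference between the token offsets of two consecutive highlighting events, which the caption does not spell out.

```python
from collections import defaultdict

# Hypothetical highlighting events in the order they occurred:
# (token offset of the highlighted text, how the curator got there).
events = [
    (120, "PREV:NONE"),
    (150, "PREV:NONE"),
    (2150, "PREV:FIND"),
    (2180, "PREV:NONE"),
    (400, "PREV:FIND"),
]

def mean_distance_by_label(events):
    """Mean DIST per label: sum of observations divided by their count."""
    sums = defaultdict(int)
    counts = defaultdict(int)
    # Pair each event with the immediately preceding one; the first event
    # has no predecessor and therefore contributes no distance.
    for (prev_offset, _), (offset, label) in zip(events, events[1:]):
        sums[label] += abs(offset - prev_offset)  # assumed unsigned distance
        counts[label] += 1
    return {label: sums[label] / counts[label] for label in sums}

print(mean_distance_by_label(events))
```

With these made-up offsets, the events reached through "Find" lie far from the preceding highlight while the unnavigated ones lie close to it, which is the kind of contrast the table is set up to report.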
Mean distance in experimental condition
| | PREV:NAVS | PREV:NONE |
| Total mean | | |
| Counts | 215 | 563 |
| Standard deviation | 2828.91 | 1012.22 |
| CUR01 mean | | |
| Counts | 180 | 243 |
| Standard deviation | 2624.99 | 174.58 |
| CUR02 mean | | |
| Counts | 35 | 320 |
| Standard deviation | 2812.57 | 1308.64 |
Mean distance in tokens (DIST) between text marked in a navigated event (PREV:NAVS) or an unnavigated event (PREV:NONE) and the text highlighted in the immediately preceding event in the experimental condition (NAVS:ON). Curator-specific data are indicated by the curator's ID (CUR01 vs CUR02).
Navigation efficiency
| | PREV:FIND | PREV:NAVS |
| Total mean | | |
| Counts | 375 | 215 |
| Standard deviation | 9.31 | 2.83 |
| CUR01 mean | | |
| Counts | 334 | 180 |
| Standard deviation | 9.24 | 2.97 |
| CUR02 mean | | |
| Counts | 41 | 35 |
| Standard deviation | 9.93 | 1.85 |
Mean number of navigation actions occurring before each navigated event using "Find" (PREV:FIND) in the control condition and PaperBrowser's navigators (PREV:NAVS) in the experimental condition. Curator-specific data are indicated by the curator's ID (CUR01 vs CUR02).
Navigation utility
| | PREV:FIND | PREV:NAVS |
| Total mean | | |
| Counts | 375 | 215 |
| Standard deviation | 2.35 | 4.17 |
| CUR01 mean | | |
| Counts | 334 | 180 |
| Standard deviation | 1.29 | 1.95 |
| CUR02 mean | | |
| Counts | 41 | 35 |
| Standard deviation | 5.90 | 8.54 |
Mean number of unnavigated events occurring after an event navigated using "Find" (PREV:FIND) in the control condition or PaperBrowser's navigators (PREV:NAVS) in the experimental condition. Curator-specific data are indicated by the curator's ID (CUR01 vs CUR02).