Behrouz Bokharaeian, Alberto Diaz, Hamidreza Chitsaz.
Abstract
MOTIVATION: Supervised biomedical relation extraction plays an important role in biomedical natural language processing, aiming to identify the relations between biomedical entities. Drug-drug interactions (DDIs), the subject of this paper, are among the most critical biomedical relations. Many methods have been developed to extract DDI relations. However, there have been few comprehensive studies of the effects of negation, complex sentences, clause dependency, and neutral candidates on DDI extraction from biomedical articles.
Year: 2016 PMID: 27695078 PMCID: PMC5047471 DOI: 10.1371/journal.pone.0163480
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. The extended unified XML format of a sentence with a negation cue in the NegDDI-DrugBank corpus.
The list of the extracted features used in the system.
| Feature category | Feature name | Type | Definition |
|---|---|---|---|
| Negation scope and cue | BothInsideNegSc | Boolean | True when both drugs are inside the negation scope |
| | BothRightNegSc | Boolean | True when both drugs are on the right side of the negation scope |
| | BothLeftSNegSc | Boolean | True when both drugs are on the left side of the negation scope |
| | OneLeftOneInsideNegSc | Boolean | True when one drug is on the left side of the negation scope and the other is inside it |
| | OneRightOneInsideNegSc | Boolean | True when one drug is on the right side of the negation scope and the other is inside it |
| | OneLeftOneRightSc | Boolean | True when one drug is on the right side of the negation scope and the other is on the left |
| | NegationCue | String | Negation cue |
| Clause dependency | AlthoughIS | Boolean | True when the sentence contains "although" |
| | WhileIS | Boolean | True when the sentence contains "while" |
| | WhenIS | Boolean | True when the sentence contains "when" |
| | BeforeIS | Boolean | True when the sentence contains "before" |
| | NowthatIS | Boolean | True when the sentence contains "now that" |
| | AssoonasIS | Boolean | True when the sentence contains "as soon as" |
| | AslongasIS | Boolean | True when the sentence contains "as long as" |
| | AnywhereIS | Boolean | True when the sentence contains "anywhere" |
| | UntilIS | Boolean | True when the sentence contains "until" |
| | OnceIS | Boolean | True when the sentence contains "once" |
| | TillIS | Boolean | True when the sentence contains "till" |
| | BecauseIS | Boolean | True when the sentence contains "because" |
| | ThoughIS | Boolean | True when the sentence contains "though" |
| | EventhoughIS | Boolean | True when the sentence contains "even though" |
| | SinceIS | Boolean | True when the sentence contains "since" |
| | ButIS | Boolean | True when the sentence contains "but" |
| | UnlessIS | Boolean | True when the sentence contains "unless" |
| | afterIS | Boolean | True when the sentence contains "after" |
| | whereasIS | Boolean | True when the sentence contains "whereas" |
| | asthoughIS | Boolean | True when the sentence contains "as though" |
| | sothatIS | Boolean | True when the sentence contains "so that" |
| | inorderthatIS | Boolean | True when the sentence contains "in order that" |
| | everywhereIS | Boolean | True when the sentence contains "everywhere" |
| | evenifIS | Boolean | True when the sentence contains "even if" |
| | RatherthanIS | Boolean | True when the sentence contains "rather than" |
| | OnlyifIS | Boolean | True when the sentence contains "only if" |
| | JustasIS | Boolean | True when the sentence contains "just as" |
| | F-StructuresDependencies | String | For every feature F of the original method that contains only tokens or subtrees: if a token or subtree X is located in an independent clause, the string X-IDC is added to this new feature; if X is located in a dependent clause, the string X-DC is added instead |
| Neutral candidate | NeutralCandRule1 | Boolean | `(.)*d1(/\|s\|()d2(.)` |
| | NeutralCandRule2 | Boolean | `d2 \|\|d1.contains(OtherNs(d2)) \|\|(d2.contains(OtherNs(d1))` |
| | NeutralCandRule3 | Boolean | `(.)*d1((\|s) (N,\|e.g.\|i.e.\|s\|DrgNaOth\|,\|))* d2(.)*` |
| | NeutralCandRule4 | Boolean | `(.)*d1(s)*,(s\|DrgNaOth\|,\|, and\|, other\|oral)*d2(.)*` |
| | NeutralCandRule5 | Boolean | `(.)*(:\|such as\|e.g.\|i.e.)(s\|DrgNaOth\|,\|and\|or\|and/or)*d1(s\|DrgNaOth\|,\|and)*d2(.)*` |
| | NeutralCandRule6 | Boolean | `(.)* (been studied)(.)*` |
| | NeutralCandRule7 | Boolean | `(.)* been investigated (.)* & (.)*(although)(.)*` |
| | NeutralCandRule8 | Boolean | `(.)* (been established)(.)*` |
| | NeutralCandRule9 | Boolean | `(.)*(studies)(.)* (performed)(.)*& (.)*(studies)(.)* (conducted)(.)*` |
| | NeutralCandRule10 | Boolean | `[(.)*][no experience][(.)*]` |
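The negation-scope and connector features in the table above lend themselves to a short illustration. The following is a minimal Python sketch, not the authors' implementation: the `Span` class, the use of character offsets, the treatment of partially overlapping mentions as "inside", and the mapping from feature names to connector phrases (shown for only a few features) are all assumptions made for this example.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """Character-offset span, end exclusive (a convention assumed for this sketch)."""
    start: int
    end: int


def position(drug: Span, scope: Span) -> str:
    """Classify a drug mention as 'left', 'inside', or 'right' of the negation scope."""
    if drug.end <= scope.start:
        return "left"
    if drug.start >= scope.end:
        return "right"
    return "inside"  # partial overlap is treated as inside (assumption)


def negation_scope_features(d1: Span, d2: Span, scope: Span) -> dict:
    """Compute the six Boolean negation-scope features for one drug pair."""
    p1, p2 = position(d1, scope), position(d2, scope)
    pair = {p1, p2}
    both = p1 if p1 == p2 else None
    return {
        "BothInsideNegSc": both == "inside",
        "BothRightNegSc": both == "right",
        "BothLeftSNegSc": both == "left",
        "OneLeftOneInsideNegSc": pair == {"left", "inside"},
        "OneRightOneInsideNegSc": pair == {"right", "inside"},
        "OneLeftOneRightSc": pair == {"left", "right"},
    }


# Hypothetical mapping from feature name to connector phrase (subset only).
CONNECTORS = {
    "AlthoughIS": "although",
    "WhileIS": "while",
    "NowthatIS": "now that",
    "AssoonasIS": "as soon as",
    "EventhoughIS": "even though",
    "UnlessIS": "unless",
}


def connector_features(sentence: str) -> dict:
    """Set each connector feature to True when its phrase occurs as whole words."""
    lowered = sentence.lower()
    return {
        name: re.search(r"\b" + re.escape(phrase) + r"\b", lowered) is not None
        for name, phrase in CONNECTORS.items()
    }
```

Exactly one of the six negation-scope features is true for any drug pair, since the two positions either coincide or form one of three unordered pairs.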
Fig 2. A sample of a negated sentence with some DDI candidates.
Fig 3. A sample of a negated sentence with a concessive clause.
Fig 4. A constituency parse tree of a sentence with a concessive dependent clause highlighted in blue and two negation cues.
Fig 5. A sample sentence with negation from NegDDI-DrugBank with neutral and distinguished false DDIs.
Fig 6. A sample of a sentence having two neutral DDI candidates.
Fig 7. A sample of a sentence including a neutral and a distinguished DDI candidate.
Fig 8. Basic components of the implemented framework.
Basic statistics of the two utilized datasets of the DDI corpus.
| MEDLINE Test | MEDLINE Train | MEDLINE Total | DrugBank Test | DrugBank Train | DrugBank Total |
|---|---|---|---|---|---|
| 33 | 142 | 175 | 158 | 572 | 730 |
| 326 | 1301 | 2308 | 973 | 5675 | 6648 |
| 426 | 1836 | 2308 | 2512 | 12,929 | 15,441 |
| 95 | 232 | 327 | 884 | 3788 | 4672 |
| 356 | 1555 | 1911 | 4381 | 22,217 | 26,598 |
| 126 | 478 | 604 | 2067 | 9215 | 11,282 |
| 14,358 | 61,525 | 75,883 | 244,658 | 1,163,072 | 1,407,730 |
| 43 | 316 | 359 | 1367 | 4558 | 5925 |
| 482 | 2033 | 1787 | 5265 | 31,432 | 36,697 |
F1-measure results for Global Context (GC), SubTree (ST), and Local Context (LC) kernel methods with and without the NCT augmenting features.
| Dataset | Negation | Connector | Test size | GC - (%) | GC +NCT (%) | GC ↑ | ST - (%) | ST +NCT (%) | ST ↑ | LC - (%) | LC +NCT (%) | LC ↑ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| DrugBank | +Negation | -Connector | 971 | 56.5 | | 5.7 | 61.0 | | 7.5 | 62.6 | | 2.6 |
| DrugBank | +Negation | +Connector | 396 | 51.7 | | 6.7 | 63.2 | | 0.2 | 58.0 | | 4.9 |
| DrugBank | -Negation | +Connector | 1,005 | 62.3 | | 3.9 | 58.6 | | 15.4 | 64.8 | | 4.8 |
| DrugBank | -Negation | -Connector | 2,893 | 64.8 | | 8.1 | 36.3 | | 2.8 | 63.9 | | 5.8 |
| DrugBank | Total | | 5,265 | 61.7 | | 6.5 | 47.1 | | 5.9 | 63.4 | | 4.9 |
| MEDLINE | +Negation | -Connector | 198 | 31.1 | | 8.1 | 18.7 | | 4 | 39.4 | | 11.2 |
| MEDLINE | +Negation | +Connector | 161 | 28.3 | | 9.9 | 17.9 | | 5.2 | 40.1 | | 10 |
| MEDLINE | -Negation | +Connector | 443 | 34.6 | | 3.8 | 18.9 | | 0.9 | 44.2 | | 4.5 |
| MEDLINE | -Negation | -Connector | 1,436 | 34.1 | | 6.5 | 18.2 | | 0.2 | 41.7 | | 3.1 |
| MEDLINE | Total | | 2,238 | 33.9 | | 5.9 | 18.4 | | 3.1 | 42.3 | | 6.1 |
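For readers less familiar with the metric reported in these tables, the F1-measure is the harmonic mean of precision and recall. The sketch below uses the standard definition; the function name and the counts in the usage note are hypothetical and are not taken from the paper, and it does not attempt to reproduce any averaging scheme the authors may have used.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Standard F1: harmonic mean of precision and recall.

    tp, fp, fn are the counts of true positives, false positives,
    and false negatives for the positive (interaction) class.
    """
    if tp == 0:
        return 0.0  # precision and recall are both zero (or undefined)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

For example, with hypothetical counts `tp=50, fp=25, fn=25`, precision and recall are both 2/3, so `f1_score(50, 25, 25)` is also 2/3.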
F1-measure results for the global context kernel with combination of different feature sets: Negation scope and cue (N), Clause dependency (C), and neuTral candidate (T).
| Category | Global Context (%) | |||||||||
|---|---|---|---|---|---|---|---|---|---|---|
| - | +N | +C | +T | +NC | +CT | +NT | +NCT | |||
| +Negation | -Connector | 56.6 | 54.9 | 58.6 | 66.2 | 57.8 | 59.8 | 62.1 | ||
| +Connector | 51.7 | 52.2 | 52.9 | 59.7 | 52.3 | 58.2 | 58.0 | |||
| -Negation | -Connector | 64.7 | 64.8 | 64.8 | 71.8 | 64.8 | 71.9 | 71.9 | ||
| +Connector | 62.3 | 62.3 | 65.3 | 65.3 | 63.7 | 65.7 | 65.9 | |||
| Total | 61.7 | 61.3 | 62.9 | 68.6 | 62.4 | 67.5 | 68.3 | |||
| +Negation | -Connector | 31.1 | 32.6 | 34.2 | 37.5 | 37.2 | 37.4 | 38.2 | ||
| +Connector | 28.3 | 33.5 | 33.5 | 35.2 | 38.4 | 35.8 | 38.2 | |||
| -Negation | -Connector | 34.6 | 38.7 | 35.4 | 35.4 | 36.1 | 34.8 | 37.5 | ||
| +Connector | 34.1 | 38.2 | 36.7 | 37.2 | 35.4 | 36.4 | 39.5 | |||
| Total | 33.9 | 36.6 | 34.2 | 36.0 | 36.8 | 36.7 | 38.4 | |||
F1-measure results for the subtree kernel with combination of different feature sets: Negation scope and cue (N), Clause dependency (C), and neuTral candidate (T).
| Category | SubTree (%) | |||||||||
|---|---|---|---|---|---|---|---|---|---|---|
| - | +N | +C | +T | +NC | +CT | +NT | +NCT | |||
| +Negation | -Connector | 60.9 | 59.2 | 59.9 | 66.9 | 68.9 | 59.9 | 68.5 | ||
| +Connector | 63.2 | 63.1 | 62.6 | 63.2 | 62.7 | 63.2 | 63.1 | |||
| -Negation | -Connector | 58.6 | 62.9 | 59.7 | 68.5 | 59.5 | 68.4 | |||
| +Connector | 36.3 | 36.3 | 36.3 | 38.7 | 36.3 | 38.6 | 36.3 | |||
| Total | 47.1 | 47.6 | 47.1 | 51.4 | 48.7 | 50.1 | 51.6 | |||
| +Negation | -Connector | 18.7 | 19.9 | 20.2 | 20.8 | 19.8 | 22.6 | 22.7 | ||
| +Connector | 17.9 | 19.4 | 19.6 | 19.6 | 17.3 | 18.7 | 20.7 | |||
| -Negation | -Connector | 18.9 | 18.8 | 19.8 | 20.9 | 19.7 | 19.7 | 19.8 | ||
| +Connector | 18.2 | 19.6 | 19.1 | 19.8 | 15.8 | 18.6 | 18.4 | |||
| Total | 18.4 | 19.8 | 19.6 | 19.9 | 19.6 | 20.3 | 21.4 | |||
F1-measure results for the local context kernel with combination of different feature sets: Negation scope and cue (N), Clause dependency (C), and neuTral candidate (T).
| Category | Local Context (%) | |||||||||
|---|---|---|---|---|---|---|---|---|---|---|
| - | +N | +C | +T | +NC | +CT | +NT | +NCT | |||
| +Negation | -Connector | 62.6 | 63.4 | 62.8 | 61.5 | 65.7 | 65.6 | 65.2 | ||
| +Connector | 58.0 | 52.2 | 60.9 | 67.2 | 50.9 | 64.8 | 63.1 | |||
| -Negation | -Connector | 64.8 | 65.9 | 64.9 | 66.2 | 65.7 | 66.9 | 68.9 | ||
| +Connector | 63.9 | 65.3 | 63.9 | 69.6 | 64.2 | 69.9 | 69.9 | |||
| Total | 63.4 | 64.1 | 63.7 | 68.1 | 63.0 | 68.4 | ||||
| +Negation | -Connector | 39.4 | 43.4 | 49.2 | 51.2 | 48.5 | 46.8 | 48.6 | ||
| +Connector | 40.1 | 44.2 | 50.2 | 48.4 | 48.9 | 50.2 | 50.1 | |||
| -Negation | -Connector | 44.2 | 37.2 | 42.5 | 42.6 | 45.9 | 44.7 | 48.7 | ||
| +Connector | 41.7 | 48.4 | 43.2 | 49.8 | 46.1 | 47.3 | 44.8 | |||
| Total | 42.3 | 43.5 | 45.7 | 46.5 | 47.7 | 47.9 | 48.2 | |||
F-scores and sign-test p-values for the original and improved versions of the three methods on the test parts of the two datasets.
| Dataset | Method | - (%) | +NCT (%) | M+ | M- | p-value |
|---|---|---|---|---|---|---|
| DrugBank | GC | 61.7 | | 425 | 62 | 9.0e-53 |
| DrugBank | ST | 47.1 | | 395 | 65 | 3.6e-71 |
| DrugBank | LC | 63.4 | | 480 | 73 | 3.3e-64 |
| MEDLINE | GC | 33.9 | | 143 | 35 | 2.0e-23 |
| MEDLINE | ST | 18.4 | | 129 | 38 | 2.3e-32 |
| MEDLINE | LC | 42.3 | | 153 | 34 | 3.4e-43 |
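A sign test compares two systems by counting the candidates on which only one of them is correct. The following is a minimal sketch of an exact two-sided sign test in Python, assuming M+ counts the candidates where the improved method wins, M- counts those where the original wins, and ties are discarded. The exact variant used for the table is not stated here, so this sketch should not be expected to reproduce the tabulated p-values.

```python
from math import comb


def sign_test_p_value(m_plus: int, m_minus: int) -> float:
    """Exact two-sided sign test p-value under H0: wins are equally likely.

    Ties are assumed to have been dropped before counting.
    """
    n = m_plus + m_minus
    if n == 0:
        return 1.0  # no informative pairs, so no evidence either way
    k = min(m_plus, m_minus)
    # P(X <= k) for X ~ Binomial(n, 0.5), accumulated with exact integers.
    tail = sum(comb(n, i) for i in range(k + 1))
    # Double the one-sided tail and cap at 1.
    return min(1.0, 2 * tail / 2**n)
```

With five wins and no losses, `sign_test_p_value(5, 0)` gives 2 × (1/32) = 0.0625, which shows why a handful of disagreements is not enough for significance at the 0.05 level, whereas the hundreds of disagreements in the table yield extremely small p-values.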