Quoc-Chinh Bui, Breanndán O Nualláin, Charles A Boucher, Peter M A Sloot.
Abstract
BACKGROUND: In HIV treatment it is critical to have up-to-date resistance data for applicable drugs, since HIV has a very high rate of mutation. These data are made available through scientific publications and must be extracted manually by experts in order to be used by virologists and medical doctors. Therefore, there is an urgent need for a tool that partially automates this process and is able to retrieve relations between drugs and virus mutations from the literature.
Year: 2010 PMID: 20178611 PMCID: PMC2841207 DOI: 10.1186/1471-2105-11-101
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Workflow of the system.
Examples of relation words and their categories
| Resistant | Susceptible | Associated | Responsive |
|---|---|---|---|
| Resistance, resistant, antagonize | Susceptibility, susceptible, sensitivity | Associate, association, bind, incorporation | Response, responsible |
Examples of manner words and their corresponding groups.
| High | Increase | Medium | Decrease | Low | No manner |
|---|---|---|---|---|---|
| High, full, strong, significant | Increase, higher | Intermediate, medium, moderate | Decrease, reduce, lower, diminished | Low, weak, loss | |
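The two word tables above can be read as small lookup lexicons. The following is a minimal sketch of such a lookup, not the system's actual matching logic: the function names `_lookup` and `classify` are hypothetical, the word lists are only the examples shown in the tables, and the longest-prefix matching rule is an assumption made for illustration.

```python
# Word lists copied from the two example tables above; they are examples,
# not the system's full lexicon.
RELATION_WORDS = {
    "Resistant": ["resistance", "resistant", "antagonize"],
    "Susceptible": ["susceptibility", "susceptible", "sensitivity"],
    "Associated": ["associate", "association", "bind", "incorporation"],
    "Responsive": ["response", "responsible"],
}
MANNER_WORDS = {
    "High": ["high", "full", "strong", "significant"],
    "Increase": ["increase", "higher"],
    "Medium": ["intermediate", "medium", "moderate"],
    "Decrease": ["decrease", "reduce", "lower", "diminished"],
    "Low": ["low", "weak", "loss"],
}


def _lookup(phrase, lexicon):
    """Return the group whose longest listed word prefixes some token.

    Assumption: crude prefix matching stands in for proper stemming, so
    inflections such as "reducing" are not recognized.
    """
    best_group, best_len = None, 0
    for token in phrase.lower().split():
        for group, words in lexicon.items():
            for word in words:
                if token.startswith(word) and len(word) > best_len:
                    best_group, best_len = group, len(word)
    return best_group


def classify(phrase):
    """Map a relation phrase to (relation category, manner group)."""
    manner = _lookup(phrase, MANNER_WORDS) or "No manner"
    return _lookup(phrase, RELATION_WORDS), manner


print(classify("increased resistance to"))
print(classify("fully susceptible to"))
```

Preferring the longest matching word keeps "higher" in the Increase group instead of letting the shorter entry "high" claim it.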
Figure 2. Penn Treebank output of the Stanford parser.
Main grammatical relations and some of their values generated from the parse tree in Figure 2.
| Component | Explanation and example |
|---|---|
| nsubj | Nominal subject: MUTATION0 |
| nsubjpass | Nominal passive subject |
| pre | Predicate of a clause: increased resistance to DRUG0, but not DRUG1 |
| dobj | Direct object: resistance |
| iobj | Indirect object |
| pobj | Prepositional object: DRUG0, but not DRUG1 |
| prep | Prepositional modifier: to DRUG0, but not DRUG1 |
| cc | Coordination: but not |
| conj | Conjunction: DRUG1 |
| neg | Negation: not |
| acomp | Adjectival complement |
Figure 3. Extracting relations from grammatical relations of a simplified sentence.
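As a rough illustration of the idea behind Figure 3, the sketch below turns a hand-written set of grammatical relations into mutation-drug relations. It does not call the Stanford parser and does not reproduce the authors' rules. The dictionary layout, the `verb` key, the function name `extract_relations`, and the rule that a negation applies to the conjoined drug are all assumptions, and the values are reduced to head words of the examples in the table above.

```python
# Hand-coded grammatical relations for a simplified sentence of the form
# "MUTATION0 increased resistance to DRUG0, but not DRUG1".
# Assumption: this dictionary layout is hypothetical, not parser output.
EXAMPLE = {
    "nsubj": "MUTATION0",
    "verb": "increased",   # hypothetical key for the head of the predicate
    "dobj": "resistance",
    "prep": "to",
    "pobj": "DRUG0",
    "cc": "but not",
    "conj": ["DRUG1"],
    "neg": "not",
}


def extract_relations(gr):
    """Build (mutation, relation phrase, drug, negated) tuples."""
    relation = f"{gr['verb']} {gr['dobj']} {gr['prep']}"
    # The prepositional object is related to the subject directly.
    triples = [(gr["nsubj"], relation, gr["pobj"], False)]
    # Assumption: a negation in the coordination scopes over conjoined drugs.
    negated = "neg" in gr
    for drug in gr.get("conj", []):
        triples.append((gr["nsubj"], relation, drug, negated))
    return triples


for triple in extract_relations(EXAMPLE):
    print(triple)
```

Under these assumptions the sentence yields one positive relation for DRUG0 and one negated relation for DRUG1.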
Relations between the K65R mutation and D4T extracted by running the system over all candidate sentences.
| Mutation | Relation | Drug |
|---|---|---|
| K65R | resistance to | D4T |
| K65R | reducing resistance to | D4T |
| K65R | resistance to | D4T |
| K65R | resistance to | D4T |
| K65R | resistance to | D4T |
| K65R | resistance to | D4T |
| K65R | result to | D4T |
| K65R | increased susceptibility to | D4T |
| K65R | fully susceptible to | D4T |
| K65R | fully susceptible to | D4T |
| K65R | resulted in reduced susceptibilities to | D4T |
Figure 4. Example of categorizing extracted relations of the K65R mutation and D4T.
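To make the categorization step concrete, the sketch below tallies the eleven K65R-D4T relation phrases listed above by relation category. It is an illustration only: the stems `resist` and `suscept`, the label `Uncategorized`, and the function name `category` are assumptions, and the sketch does not reproduce how the system derives a final resistance level from these counts.

```python
from collections import Counter

# Relation phrases copied from the K65R / D4T table above.
PHRASES = [
    "resistance to",
    "reducing resistance to",
    "resistance to",
    "resistance to",
    "resistance to",
    "resistance to",
    "result to",
    "increased susceptibility to",
    "fully susceptible to",
    "fully susceptible to",
    "resulted in reduced susceptibilities to",
]

# Assumption: two hand-picked stems stand in for the relation lexicon.
STEMS = {"resist": "Resistant", "suscept": "Susceptible"}


def category(phrase):
    """Return the relation category of a phrase, or 'Uncategorized'."""
    lowered = phrase.lower()
    for stem, label in STEMS.items():
        if stem in lowered:
            return label
    return "Uncategorized"


counts = Counter(category(p) for p in PHRASES)
print(counts)
```

With these stems, six phrases fall under Resistant, four under Susceptible, and "result to" matches neither category.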
Datasets statistics
| Dataset | Positive instances | Negative instances |
|---|---|---|
| 500 sentences from PubMed abstracts | 1095 | 921 |
| 130 sentences from Stanford HIVDB comments | 307 | 261 |
Performance of the system compared with the baseline method over 2 datasets.
| Datasets | Base_C P | Base_C R | Base_C F | Rule based P | Rule based R | Rule based F |
|---|---|---|---|---|---|---|
| 500 sentences from PubMed abstracts | 53.6 | 100 | 69.8 | | | |
| 130 sentences from Stanford HIVDB comments | 54.4 | 100 | 70 | | | |
Evaluation of the performance (%): precision (P), recall (R), and F-score (F) of the proposed method (rule based) and the co-occurrence baseline method (Base_C)
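For reference, the F-score reported here is conventionally the harmonic mean of precision and recall. The sketch below assumes that standard definition (the record itself does not spell out the formula) and checks it against the PubMed baseline row; the function name `f_score` is illustrative.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (both in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Base_C on the 500 PubMed sentences: P = 53.6, R = 100.
print(round(f_score(53.6, 100), 1))  # 69.8
```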
Prediction results of mutation K65R and its related drugs.
| Mutation | Drug | Resistance type | HIVDB | REGADB |
|---|---|---|---|---|
| K65R | 3TC | I | I | I |
| K65R | ABC | I | I | S |
| K65R | AZT | S | S | S |
| K65R | D4T | I | I | S |
| K65R | DDI | I | I | I |
| K65R | FTC | I | I | I |
| K65R | TDF | I | I | R |
| K65R | DDC | I | N/A | N/A |
Drug resistance values for the K65R mutation calculated by the system, compared with the results of the Stanford HIVDB on three levels of resistance: susceptible (S), intermediate resistant (I), and resistant (R). The output of the RegaDB is also provided to show the differences between the expert systems.
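The comparison in this table can be expressed as a simple agreement count. The sketch below assumes that the "Resistance type" column holds the system's prediction and that rows marked N/A in the reference are skipped; the names `ROWS` and `agreement` are illustrative.

```python
# Rows copied from the K65R table: (drug, system, HIVDB, RegaDB).
# Assumption: the "Resistance type" column is the system's prediction.
ROWS = [
    ("3TC", "I", "I", "I"),
    ("ABC", "I", "I", "S"),
    ("AZT", "S", "S", "S"),
    ("D4T", "I", "I", "S"),
    ("DDI", "I", "I", "I"),
    ("FTC", "I", "I", "I"),
    ("TDF", "I", "I", "R"),
    ("DDC", "I", "N/A", "N/A"),
]


def agreement(rows, ref_index):
    """Return (matches, comparable rows) against one reference column.

    Rows where the reference has no value ("N/A") are left out.
    """
    comparable = [r for r in rows if r[ref_index] != "N/A"]
    matches = sum(1 for r in comparable if r[1] == r[ref_index])
    return matches, len(comparable)


print(agreement(ROWS, 2))  # against HIVDB: (7, 7)
print(agreement(ROWS, 3))  # against RegaDB: (4, 7)
```

The count against HIVDB covers the seven drugs for which HIVDB gives a value; DDC is excluded because no reference value is listed.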
Summary of the prediction results of the 10 most frequent mutations and their related drugs extracted from text, compared with the HIVDB on two levels and three levels of resistance: susceptible (S), intermediate resistant (I), and resistant (R).
| Mutation | Drugs | Agreement with the Stanford HIVDB output, two levels: S, R (matches/total) | Agreement with the Stanford HIVDB output, three levels: S, I, R (matches/total) |
|---|---|---|---|
| I84V | ATV, IDV, LPV, NFV, SQV, TPV | 6/6 | 6/6 |
| K103N | AZT, DLV, EFV, NVP | 3/4 | 6/6 |
| K65R | 3TC, ABC, AZT, D4T, DDI, FTC, TDF | 7/7 | 7/7 |
| L74V | 3TC, ABC, AZT, D4T, DDI | 3/5 | 3/5 |
| L90M | ATV, IDV, LPV, NFV, SQV | 5/5 | 4/5 |
| M184V | 3TC, ABC, AZT, D4T, DDI, EFV, FTC, NVP, TDF | 6/9 | 7/9 |
| M46I | ATV, IDV, NFV, SQV | 3/4 | 2/4 |
| Q151M | 3TC, ABC, AZT, D4T, DDI | 5/5 | 3/5 |
| V82A | IDV, LPV, NFV, SQV | 4/4 | 3/4 |
| Y181C | AZT, D4T, DLV, EFV, NVP | 4/5 | 3/5 |