Artjom Klein, Alexandre Riazanov, Matthew M Hindle, Christopher Jo Baker.
Abstract
BACKGROUND: Experimental research on the automatic extraction of information about mutations from texts is greatly hindered by the lack of consensus evaluation infrastructure for the testing and benchmarking of mutation text mining systems.
Year: 2014 PMID: 24568600 PMCID: PMC3939821 DOI: 10.1186/2041-1480-5-11
Source DB: PubMed Journal: J Biomed Semantics
Figure 1. Modelling example. The document with PMID:18669538 reports a combined mutation (R205A and R231A) that impacts protein P37089, resulting in a negative effect on the protein property GO_0005272.
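To make the modelling example concrete, the annotation described in Figure 1 can be sketched as a plain Python record. This is only an illustration: the `MutationImpact` class, its field names, and the `parse_point_mutation` helper are hypothetical and are not part of the benchmarking infrastructure, which models annotations with the ontologies listed below.

```python
import re
from dataclasses import dataclass
from typing import Tuple

# A point mutation label is <wild-type residue><position><mutant residue>,
# using single-letter amino acid codes (assumption: no other notation is handled).
_POINT_MUTATION = re.compile(
    r"^([ACDEFGHIKLMNPQRSTVWY])(\d+)([ACDEFGHIKLMNPQRSTVWY])$"
)


def parse_point_mutation(label: str) -> Tuple[str, int, str]:
    """Split a label such as 'R205A' into (wild type, position, mutant)."""
    match = _POINT_MUTATION.match(label)
    if match is None:
        raise ValueError(f"not a single-letter point mutation: {label!r}")
    wild, position, mutant = match.groups()
    return wild, int(position), mutant


@dataclass(frozen=True)
class MutationImpact:  # hypothetical record type, for illustration only
    pmid: str
    protein: str                 # UniProt accession
    mutations: Tuple[str, ...]   # more than one entry means a combined mutation
    protein_property: str        # Gene Ontology term identifier
    direction: str               # e.g. "negative"


# The values below are taken from the Figure 1 caption.
figure_1 = MutationImpact(
    pmid="18669538",
    protein="P37089",
    mutations=("R205A", "R231A"),
    protein_property="GO_0005272",
    direction="negative",
)

for label in figure_1.mutations:
    print(label, parse_point_mutation(label))
```

Keeping the combined mutation as a tuple of point mutations preserves the distinction between single and combined mutations that the corpus statistics rely on.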
Main ontologies used in the benchmarking infrastructure for mutation text mining
| Prefix |
|---|
| ao |
| aof |
| aos |
| lsrn |
| mieo |
| sio |
Corpus statistics for the mutation grounding task
| Corpus | Documents | Proteins | Mutations |
|---|---|---|---|
| EnzyMiner | 38 | 49 | 176 |
| KinMutBase | 128 | 26 | 271 |
| DHLA | 13 | 4 | 52 |
| PIK3CA | 30 | 1 | 169 |
| FGFR3 | 26 | 1 | 174 |
| MEN1 | 7 | 1 | 22 |
(∗) - unique per document.
Corpus statistics for mutation impact extraction tasks
| Corpus | Documents | Mutations | | | |
|---|---|---|---|---|---|
| OMM Impact | 40 | 223 | - | 2045 | 1997 |
| EnzyMiner | 38 | 172 | 282 | 440 | 440 |
| DHLA | 13 | 52∗∗ | 73 | - | - |
(∗) - Unique per document.
(∗∗) - The OMM Impact and EnzyMiner corpora contain single point mutations as well as combined mutations. The DHLA corpus contains only single point mutations.
Corpus schemas
| Corpus | Schema |
|---|---|
| EnzyMiner | PMID, UniProt ID, mutation, protein property, impact direction, impact sentence |
| DHLA | PMID, UniProt ID, mutation, protein property, impact direction |
| KinMutBase | PMID, UniProt ID, mutation |
| PIK3CA | PMID, UniProt ID, mutation |
| FGFR3 | PMID, UniProt ID, mutation |
| MEN1 | PMID, UniProt ID, mutation |
| OMM Impact | PMID, EC number, mutation, impact sentence |
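The schemas above lend themselves to a simple completeness check when loading corpus records. The following is a minimal sketch, assuming each record is available as a dictionary keyed by the field names in the table; the `CORPUS_SCHEMAS` mapping and the `missing_fields` helper are hypothetical names introduced here, and the sample record reuses the Figure 1 values purely for illustration, without implying that the document belongs to any particular corpus.

```python
from typing import Dict, List, Mapping

# Field lists copied from the corpus schema table.
_GROUNDING = ["PMID", "UniProt ID", "mutation"]
CORPUS_SCHEMAS: Dict[str, List[str]] = {
    "EnzyMiner": _GROUNDING + ["protein property", "impact direction", "impact sentence"],
    "DHLA": _GROUNDING + ["protein property", "impact direction"],
    "KinMutBase": list(_GROUNDING),
    "PIK3CA": list(_GROUNDING),
    "FGFR3": list(_GROUNDING),
    "MEN1": list(_GROUNDING),
    "OMM Impact": ["PMID", "EC number", "mutation", "impact sentence"],
}


def missing_fields(corpus: str, record: Mapping[str, object]) -> List[str]:
    """Return the schema fields that are absent or empty in the record."""
    return [
        field
        for field in CORPUS_SCHEMAS[corpus]
        if record.get(field) in (None, "")
    ]


# Illustrative record only; values borrowed from the Figure 1 caption.
record = {"PMID": "18669538", "UniProt ID": "P37089", "mutation": "R205A"}

print(missing_fields("KinMutBase", record))  # grounding-only schema
print(missing_fields("DHLA", record))        # schema with impact fields
```

A record that satisfies a grounding-only schema can still be incomplete for an impact extraction corpus, which is why the check is parameterised by corpus name.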
Mutation impact extraction systems
| Task | OMM | MIES |
|---|---|---|
| Mutation recognition | + | + |
| Mutation series recognition | + | - |
| Mutation-protein grounding | - | + |
| Impact sentence recognition | + | + |
| Impact sentence grounding to mutation | + | + |
| Protein property recognition and normalization | + | + |
| Impact direction recognition | + | + |
| Physical quantity recognition | + | - |
| Protein property-Physical quantity grounding | + | - |
Mutation impact extraction systems: evaluation results (micro averaging)
| System | | |
|---|---|---|
| OMM | 0.03/0.02 | 0.34/0.29 |
| MIES | 0.21/0.05 | 0.78/0.44 |
| | | |
| OMM | 0.59/0.32 | 0.76/0.63 |
| MIES | 0.17/0.12 | 0.28/0.05 |
| | | |
| OMM | 0.59 | 0.71 |
| MIES | 0.86 | 0.69 |
P/R – precision/recall. A – accuracy.
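For readers unfamiliar with micro averaging, the sketch below shows one common way to compute micro-averaged precision and recall: true positives, false positives and false negatives are pooled over all documents before the ratios are taken. It assumes that a predicted annotation counts as correct only on exact match with a gold annotation in the same document; the matching criteria actually used for the results above are not stated in this section. The function name, document identifiers and annotation tuples are hypothetical toy data.

```python
from typing import Dict, Hashable, Set, Tuple

# Document identifier -> set of annotations found in that document.
Annotations = Dict[str, Set[Hashable]]


def micro_precision_recall(
    gold: Annotations, predicted: Annotations
) -> Tuple[float, float]:
    """Pool counts over all documents, then compute precision and recall."""
    tp = fp = fn = 0
    for doc_id in set(gold) | set(predicted):
        gold_set = gold.get(doc_id, set())
        pred_set = predicted.get(doc_id, set())
        tp += len(gold_set & pred_set)   # exact matches
        fp += len(pred_set - gold_set)   # predicted but not in gold
        fn += len(gold_set - pred_set)   # in gold but not predicted
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy (mutation, protein) pairs; identifiers are made up for illustration.
gold = {
    "doc1": {("R205A", "PROT_A"), ("R231A", "PROT_A")},
    "doc2": {("A10G", "PROT_B")},
}
predicted = {
    "doc1": {("R205A", "PROT_A")},
    "doc2": {("A10G", "PROT_B"), ("C5T", "PROT_B")},
}

precision, recall = micro_precision_recall(gold, predicted)
print(f"P/R = {precision:.2f}/{recall:.2f}")
```

Because counts are pooled, documents with many annotations weigh more heavily in the final figures than they would under macro averaging, where each document's score contributes equally.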