Assessment of NER solutions against the first and second CALBC Silver Standard Corpus
Dietrich Rebholz-Schuhmann, Antonio Jimeno Yepes, Chen Li, Senay Kafkas, Ian Lewin, Ning Kang, Peter Corbett, David Milward, Ekaterina Buyko, Elena Beisswanger, Kerstin Hornbostel, Alexandre Kouznetsov, René Witte, Jonas B Laurila, Christopher J. O. Baker, Cheng-Ju Kuo, Simone Clematide, Fabio Rinaldi, Richárd Farkas, György Móra, Kazuo Hara, Laura I Furlong, Michael Rautschka, Mariana Lara Neves, Alberto Pascual-Montano, Qi Wei, Nigel Collier, Md Faisal Mahbub Chowdhury, Alberto Lavelli, Rafael Berlanga, Roser Morante, Vincent Van Asch, Walter Daelemans, José Luís Marina, Erik van Mulligen, Jan Kors, Udo Hahn.
Abstract
BACKGROUND: Competitions in text mining have been used to measure the performance of automatic text processing solutions against a manually annotated gold standard corpus (GSC). The preparation of a GSC is time-consuming and costly, and the final corpus typically consists of at most a few thousand documents annotated with a limited set of semantic groups. To overcome these shortcomings, the CALBC project partners (PPs) have produced a large-scale annotated biomedical corpus covering four semantic groups through the harmonisation of annotations from automatic text mining solutions: the first version of the Silver Standard Corpus (SSC-I). The four semantic groups are chemical entities and drugs (CHED), genes and proteins (PRGE), diseases and disorders (DISO), and species (SPE). This corpus has been used for the first CALBC challenge, which asked the participants to annotate the corpus with their text processing solutions.
Year: 2011 PMID: 22166494 PMCID: PMC3239301 DOI: 10.1186/2041-1480-2-S5-S11
Source DB: PubMed Journal: J Biomed Semantics
The table gives an overview of the annotation solutions that were used for the generation of the SSC-I and the SSC-II. For the generation of the SSC-I, only the annotations from the four project partners (P01–P04) have been integrated, whereas the SSC-II combines the annotations from the challenge participants (P06–P10, P13 and P15), excluding P11, P12 and P14, since these had used the training data. Please refer to the proceedings of the first CALBC workshop for further details [8].
| Solution | PPs | CPs | Use of Training Data | PRGE | CHED | DISO | SPE |
|---|---|---|---|---|---|---|---|
| Dictionary-based concept recognition | P01 | | | UniProtKb | Jochem | UMLS | NCBI taxonomy |
| | P02 | | | Different resources incl. UniProtKb, EntrezGene | Jochem | UMLS | NCBI taxonomy |
| | P04 | | | UniProtKb, EntrezGene | Jochem | MeSH, MedDRA, NCI, SNOMED-CT, UMLS | NCI, MeSH, SNOMED-CT |
| | | P06 | | | | | |
| | | P10 | | UniProtKb, EntrezGene | | | NCBI taxonomy |
| | | P13 | | | | | |
| Indexing of tokens and terms | | P15 | | UMLS | UMLS | UMLS | UMLS |
| Both trained & rule-based solutions | P03 | | | UniProtKb, EntrezGene | Jochem | UMLS | NCBI taxonomy |
| Case-based reasoning | | P09 | | UMLS | | | |
| CRF-based, trained NER solution | | P07 | | | | | |
| | | P16 | | Genia | UMLS | | |
| | | P11 | yes | ✓ | ✓ | ✓ | ✓ |
| | | P12 | yes | ✓ | ✓ | ✓ | ✓ |
| | | P14 | yes | ✓ | ✓ | ✓ | ✓ |
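The vote-based harmonisation behind the SSC (combining several systems' annotations and keeping those that enough systems agree on) can be sketched as follows. This is a minimal illustration only: the span representation and the exact-match voting are simplifying assumptions, since CALBC also accepted approximate, similarity-based matches.

```python
from collections import Counter

def harmonise(annotation_sets, min_votes=2):
    """Keep the entity annotations that at least `min_votes` of the
    contributing systems agree on. Each set holds (start, end, group)
    tuples; exact-span voting is a simplification of the CALBC
    harmonisation, which also accepted approximate matches."""
    votes = Counter()
    for annotations in annotation_sets:
        for span in set(annotations):   # one vote per system and span
            votes[span] += 1
    return {span for span, n in votes.items() if n >= min_votes}

# Three toy systems annotating the same sentence:
a = {(0, 5, "PRGE"), (10, 17, "DISO")}
b = {(0, 5, "PRGE"), (20, 24, "SPE")}
c = {(0, 5, "PRGE"), (10, 17, "DISO")}

harmonise([a, b, c], min_votes=2)
# → {(0, 5, "PRGE"), (10, 17, "DISO")}; the SPE span has only one vote.
```

Raising `min_votes` from 2 to 3 corresponds to the stricter agreement level used for the SSC-II, and shrinks the harmonised set accordingly.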
The table shows the number of annotations contained in the SSC-I. This corpus has been generated from the contributions of the PPs. Not all challenge participants (CPs) have participated in all parts of the challenge; a smaller number of CPs submitted annotations for chemical entities. The average number of annotations in the submitted corpora was above the number of annotations in the SSC-I for CHED and PRGE, and below it for DISO and SPE.
| | Nr. of annotations in SSC-I | Nr. of CPs | Nr. of submissions from CPs | Average nr. of annotations from all CPs | Nr. of annotations in SSC-II |
|---|---|---|---|---|---|
| CHED | 228,622 | 6 | 11 | 233,398 | 238,431 |
| PRGE | 275,235 | 9 | 15 | 343,681 | 435,797 |
| DISO | 300,637 | 8 | 11 | 255,599 | 245,524 |
| SPE | 317,211 | 7 | 9 | 277,071 | 304,503 |
The table shows the F-measure performance of the PPs and the CPs against the SSC-I (cos-98 harmonisation, 2-vote agreement). The project partners (P01–P04) are included in the comparison. P11, P12, and P14 used the training data for their annotations. Only the best performing submission of each CP was included in the analysis. P09 contributed only a small number of annotations in its submitted corpus.
| Cos-98 | P03 | P04 | P01 | P02 | P10 | P08 | P15 | P09 | P06 | P07 | P09 | P13 | P16 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SPE | 93% | 93% | 79% | 83% | 71% | 69% | 84% | 69% | 56% | 42% | 2% | | |
| DISO | 87% | 89% | 71% | 69% | 82% | 76% | 78% | 62% | 51% | 32% | 3% | 73% | |
| CHED | 75% | 82% | 49% | 68% | 51% | 20% | 17% | 3% | 23% | | | | |
| PRGE | 77% | 66% | 66% | 59% | 40% | 52% | 12% | 18% | 2% | 50% | 11% | 28% | |
| Avg. | 86% | 85% | 76% | 75% | 67% | 68% | 68% | 58% | 35% | 27% | 2% | | |
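The scores in the table are F-measures of each submission against the silver standard. Under the simplifying assumption of exact span matching (the challenge itself used relaxed, similarity-based matching), precision, recall and F1 over annotation sets can be computed as:

```python
def prf(candidate, reference):
    """Precision, recall and F1 of one annotation set against another,
    e.g. a participant's submission scored against the SSC-I.
    Annotations are (start, end, group) tuples; exact span matching is
    a simplification of the challenge's similarity-based matching."""
    tp = len(candidate & reference)  # annotations both sets share
    precision = tp / len(candidate) if candidate else 0.0
    recall = tp / len(reference) if reference else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# A toy submission that finds two of the three silver annotations
# plus one spurious span: precision = recall = F1 = 2/3.
submission = {(0, 5, "PRGE"), (7, 9, "DISO"), (12, 15, "SPE")}
silver = {(0, 5, "PRGE"), (7, 9, "DISO"), (20, 27, "CHED")}
p, r, f = prf(submission, silver)
```

Since F1 is the harmonic mean of precision and recall, a system such as P09 that submits very few annotations can reach high precision but still scores in the low single digits overall.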
Figure 1(Proteins/Genes in SSC-I and SSC-II): The figure compares the CPs' performances against the SSC-I (left side) and the SSC-II (right side) for the annotation of proteins and genes (PRGE). Only a restricted number of annotated corpora from CPs can be measured against the SSC-II, since a few submissions are based on solutions that have been trained on the SSC-I. The two diagrams display scatter plots of the precision and recall values of the different annotation solutions. Red circles denote systems that have used the training data, and yellow circles denote the annotations delivered by the PPs' annotation solutions.
Figure 2(Chemical Entities in SSC-I and SSC-II): The figure displays the performances of the PPs' and CPs' annotated datasets for chemical entities (CHED) measured against the SSC-I (left side) and the SSC-II (right side). For further details please refer to Figure 1.
Figure 3(Diseases in SSC-I and SSC-II): Distribution of the CPs’ contributions in a Prec/Rec scatter plot for the category of disease annotations (DISO). The best performing solutions were again trained on the training data and achieved performances of almost 90% recall at 90% precision.
Figure 4(Species in SSC-I and SSC-II): Distribution of the CPs’ contributions in a Prec/Rec scatter plot for species (SPE). The two machine-learning approaches showed almost identical performances.
F-measure performance of the contributions from the PPs (P01–P04) and the challenge participants against the SSC-II (harmonisation: 98% cosine similarity, 3-vote agreement, 1,030 documents; see Material & Methods).
| | P01 | P02 | P03 | P04 | P08 | P09 | P15 | P06 | P10 | P16 | avg |
|---|---|---|---|---|---|---|---|---|---|---|---|
| SPE | 69.9% | 66.6% | 72.6% | 79.1% | 60.2% | 2.3% | 44.2% | 87.8% | | | 60.3% |
| DISO | 77.2% | 67.4% | 68.9% | 65.6% | 53.5% | 2.5% | 31.5% | 75.7% | 80.6% | | 58.1% |
| CHED | 40.3% | 76.8% | 70.8% | 58.6% | 26.0% | 4.4% | 16.0% | | | | 41.8% |
| PRGE | 62.6% | 47.1% | 58.9% | 58.6% | 33.1% | 3.2% | 34.6% | 54.0% | 47.1% | | 44.4% |
Figure 5(F-measure performances for PRGE and CHED): The left and the right diagram show the performances for the different annotation solutions against the SSC-I (blue diamond) and against the SSC-II (red box). Each pair-wise entry represents a single annotation solution. The first four solutions have been provided by the PPs, the other solutions are taggers from the CPs. The left diagram shows the results for the PRGE annotations and the right diagram shows the results for the CHED annotations.
Figure 6(F-measure Performances for SPE and DISO): The left diagram shows the results for the species annotations (SPE) and the right diagram shows the results for the disease annotations (DISO); for details please refer to Figure 5.
The table shows the direct measurement of the SSC-II against the SSC-I as reference. The SSC-II was generated with a 98% cosine similarity score and a 3-vote agreement between the participants.
| SSC-II vs. reference SSC-I (cos 0.98) | DISO | SPE | PRGE | CHED |
|---|---|---|---|---|
| Rec | 89.0% | 94.5% | 59.7% | 49.6% |
| Prec | 71.6% | 90.0% | 96.8% | 49.4% |
| F-meas | 79.3% | 92.2% | 73.8% | 49.5% |
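The "cos 0.98" criterion accepts two annotations as matching when their similarity reaches 0.98, rather than requiring exact string identity. As an illustrative sketch only (the exact vector representation used by CALBC is not specified here), cosine similarity over character-bigram counts captures the idea:

```python
import math
from collections import Counter

def cosine(s, t, n=2):
    """Cosine similarity of two strings over character n-gram counts.
    This is only one way to realise a 'cos 0.98' matching criterion;
    the representation actually used by CALBC may differ."""
    def grams(x):
        return Counter(x[i:i + n] for i in range(len(x) - n + 1))
    u, v = grams(s), grams(t)
    dot = sum(u[g] * v[g] for g in u)
    norm = (math.sqrt(sum(c * c for c in u.values()))
            * math.sqrt(sum(c * c for c in v.values())))
    return dot / norm if norm else 0.0

# Near-identical mentions score high; unrelated mentions score low.
print(cosine("type 2 diabetes mellitus", "type-2 diabetes mellitus") > 0.9)  # True
print(cosine("diabetes", "aspirin") < 0.2)  # True
```

A threshold as high as 0.98 therefore tolerates only very small boundary or punctuation differences between two annotations, which is why it is described as a near-exact harmonisation setting.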