Darshini Mahendran, Gabrielle Gurdin, Nastassja Lewinski, Christina Tang, Bridget T McInnes.
Abstract
Chemical patents are an essential source of information about novel chemicals and chemical reactions. However, with the increasing volume of such patents, mining information about these chemicals and chemical reactions has become a time-intensive and laborious endeavor. In this study, we present a system to extract chemical reaction events from patents automatically. Our approach consists of two steps: 1) named entity recognition (NER), the automatic identification of chemical reaction parameters from the corresponding text, and 2) event extraction (EE), the automatic classification and linking of entities based on their relationships to each other. For our NER system, we evaluate bidirectional long short-term memory (BiLSTM)-based and bidirectional encoder representations from transformers (BERT)-based methods. For our EE system, we evaluate BERT-based, convolutional neural network (CNN)-based, and rule-based methods. We evaluate our NER and EE components independently and as an end-to-end system, reporting the precision, recall, and F1 score. Our results show that the BiLSTM-based method performed best at identifying the entities, and the CNN-based method performed best at extracting events.
Keywords: chemical natural language processing; event extraction; information extraction; named entity recognition; relation extraction
Year: 2021 PMID: 34322654 PMCID: PMC8312343 DOI: 10.3389/frma.2021.688353
Source DB: PubMed Journal: Front Res Metr Anal ISSN: 2504-0537
Definitions of entities and trigger words of the dataset.
| Entity labels | Definition |
|---|---|
| REACTION_PRODUCT (R.P.) | A product is a substance that is formed during a chemical reaction |
| STARTING_MATERIAL (S.M.) | A substance that is consumed in the course of a chemical reaction providing atoms to products is considered as starting material |
| REAGENT_CATALYST (R.C.) | A reagent is a compound added to a system to cause or help with a chemical reaction. Compounds like catalysts, bases to remove protons, or acids to add protons must also be annotated with this tag |
| SOLVENT (S) | A solvent is a chemical entity that dissolves a solute, resulting in a solution |
| OTHER_COMPOUND (O.C.) | Other chemical compounds that are not the products, starting materials, reagents, catalysts, and solvents |
| TIME | The reaction time of the reaction |
| TEMPERATURE (Temp) | The temperature of the reaction |
| YIELD_PERCENT (Y.P.) | Yields given in percent values |
| YIELD_OTHER (Y.O.) | Yields provided in other units than % |
| EXAMPLE_LABEL | A label associated with a reaction specification |
| REACTION_STEP | Event within which starting materials are converted into product |
| WORKUP | Event step required to isolate and purify the product of a chemical reaction |
Number of entity types and trigger words in the training data and their event relations.
| Events | Entities | Instances | REACTION_STEP | WORKUP |
|---|---|---|---|---|
| ARG1 | EXAMPLE_LABEL | 886 | - | - |
| | REACTION_PRODUCT | 2,052 | 1,101 | 11 |
| | STARTING_MATERIAL | 1,754 | 1,747 | 4 |
| | REAGENT_CATALYST | 1,281 | 1,272 | - |
| | SOLVENT | 1,140 | 1,134 | 4 |
| | OTHER_COMPOUND | 4,640 | 161 | 4,097 |
| ARGM | YIELD_PERCENT | 955 | 937 | 1 |
| | YIELD_OTHER | 1,061 | 1,043 | 2 |
| | TIME | 1,059 | 839 | 81 |
| | TEMPERATURE | 1,515 | 813 | 242 |
| Triggers | REACTION_STEP | 3,815 | | |
| | WORKUP | 3,053 | | |
FIGURE 1. An example from the CLEF ChEMU-2020 dataset that shows the entities, trigger words, and events.
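The entity labels, trigger words, and ARG1/ARGM relations listed above can be pictured as a small data structure. The following is a minimal sketch, not the dataset's own schema: the `Span` and `Event` classes, the `find_span` helper, and the example sentence are all hypothetical and exist only for illustration. The grouping of labels into ARG1 and ARGM follows the event-relation table.

```python
from dataclasses import dataclass

# Labels copied from the definition table. The role grouping follows the
# event-relation table; EXAMPLE_LABEL is left out because that table lists
# no event relations for it.
ARG1_LABELS = {"REACTION_PRODUCT", "STARTING_MATERIAL", "REAGENT_CATALYST",
               "SOLVENT", "OTHER_COMPOUND"}
ARGM_LABELS = {"YIELD_PERCENT", "YIELD_OTHER", "TIME", "TEMPERATURE"}
TRIGGER_LABELS = {"REACTION_STEP", "WORKUP"}


@dataclass(frozen=True)
class Span:
    """A labeled stretch of text (hypothetical schema, not the dataset's)."""
    label: str
    start: int  # character offset, inclusive
    end: int    # character offset, exclusive
    text: str


@dataclass(frozen=True)
class Event:
    """Links one trigger word to one argument entity."""
    trigger: Span
    argument: Span

    @property
    def role(self) -> str:
        # The role is derived from the argument's entity label.
        if self.argument.label in ARG1_LABELS:
            return "ARG1"
        if self.argument.label in ARGM_LABELS:
            return "ARGM"
        raise ValueError(f"no role defined for {self.argument.label}")


def find_span(sentence: str, text: str, label: str) -> Span:
    """Locate the first occurrence of `text` and wrap it as a Span."""
    start = sentence.index(text)
    return Span(label, start, start + len(text), text)


# Invented sentence for illustration only; it is not taken from the dataset.
sentence = "The mixture was stirred in THF at 80 C for 2 h."
trigger = find_span(sentence, "stirred", "REACTION_STEP")
events = [
    Event(trigger, find_span(sentence, "THF", "SOLVENT")),
    Event(trigger, find_span(sentence, "80 C", "TEMPERATURE")),
    Event(trigger, find_span(sentence, "2 h", "TIME")),
]
for event in events:
    print(event.role, event.trigger.text, "->", event.argument.text)
```

In this picture, NER produces the `Span` objects and EE decides which trigger and argument spans should be linked into an `Event`.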
Precision (P), recall (R), and F1 (F) results for our NER system.
| Method | Entity | Exact P | Exact R | Exact F | Relax P | Relax R | Relax F |
|---|---|---|---|---|---|---|---|
| BiLSTM + CRF | EXAMPLE_LABEL | 0.94 | 0.95 | 0.94 | 0.94 | 0.98 | 0.96 |
| | OTHER_COMPOUND | 0.90 | 0.82 | 0.86 | 0.97 | 0.99 | 0.98 |
| | REACTION_PRODUCT | 0.84 | 0.83 | 0.83 | 0.90 | 0.97 | 0.94 |
| | REAGENT_CATALYST | 0.85 | 0.90 | 0.87 | 0.88 | 0.99 | 0.93 |
| | SOLVENT | 0.91 | 0.94 | 0.93 | 0.92 | 1.00 | 0.96 |
| | STARTING_MATERIAL | 0.85 | 0.84 | 0.85 | 0.91 | 1.00 | 0.95 |
| | TEMPERATURE | 0.63 | 0.63 | 0.63 | 0.99 | 0.99 | 0.99 |
| | TIME | 0.88 | 0.88 | 0.88 | 1.00 | 1.00 | 1.00 |
| | YIELD_OTHER | 0.95 | 0.98 | 0.97 | 0.96 | 1.00 | 0.98 |
| | YIELD_PERCENT | 0.99 | 0.99 | 0.99 | 1.00 | 1.00 | 1.00 |
| | System | | | | | | |
| BioBERT + CRF | EXAMPLE_LABEL | 0.91 | 0.94 | 0.92 | 0.92 | 0.95 | 0.94 |
| | OTHER_COMPOUND | 0.88 | 0.83 | 0.85 | 0.95 | 0.94 | 0.95 |
| | REACTION_PRODUCT | 0.44 | 0.65 | 0.52 | 0.73 | 0.95 | 0.82 |
| | REAGENT_CATALYST | 0.78 | 0.81 | 0.79 | 0.86 | 0.87 | 0.87 |
| | SOLVENT | 0.89 | 0.92 | 0.90 | 0.90 | 0.92 | 0.91 |
| | STARTING_MATERIAL | 0.39 | 0.60 | 0.48 | 0.69 | 0.92 | 0.79 |
| | TEMPERATURE | 0.95 | 0.96 | 0.96 | 0.98 | 0.99 | 0.99 |
| | TIME | 0.88 | 0.88 | 0.88 | 0.99 | 0.99 | 0.99 |
| | YIELD_OTHER | 0.78 | 0.85 | 0.81 | 0.89 | 0.95 | 0.92 |
| | YIELD_PERCENT | 0.95 | 0.99 | 0.97 | 0.97 | 1.00 | 0.98 |
| | System | | | | | | |
Bold indicates system performance of both models.
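The NER results are reported under both exact and relaxed matching, but the record does not define the two criteria. A minimal sketch follows, under the assumption that an exact match requires identical label and boundaries while a relaxed match only requires the same label and overlapping spans. The function names, the `(label, start, end)` tuple layout, and the offsets are illustrative, not the authors' evaluation code.

```python
def exact_match(gold, pred):
    """Same label and identical start/end offsets."""
    return gold == pred


def relaxed_match(gold, pred):
    """Same label and at least one shared character (assumed definition)."""
    g_label, g_start, g_end = gold
    p_label, p_start, p_end = pred
    return g_label == p_label and p_start < g_end and g_start < p_end


def score(gold_spans, pred_spans, match):
    """Return (precision, recall, F1) for one matching criterion."""
    matched_pred = sum(any(match(g, p) for g in gold_spans) for p in pred_spans)
    matched_gold = sum(any(match(g, p) for p in pred_spans) for g in gold_spans)
    precision = matched_pred / len(pred_spans) if pred_spans else 0.0
    recall = matched_gold / len(gold_spans) if gold_spans else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical spans as (label, start, end) with end exclusive.
gold = [("TEMPERATURE", 27, 32), ("TIME", 37, 40)]
# The predicted TEMPERATURE span starts three characters too early.
pred = [("TEMPERATURE", 24, 32), ("TIME", 37, 40)]

print("exact:  ", score(gold, pred, exact_match))
print("relaxed:", score(gold, pred, relaxed_match))
```

Under this assumed definition, a boundary error counts against the exact score but not the relaxed one. That would be consistent with rows such as BiLSTM + CRF TEMPERATURE, where the exact F is 0.63 and the relaxed F is 0.99, although the record itself does not state the cause.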
FIGURE 2. Confusion matrix using (A) BiLSTM + CRF and (B) BERT + CRF results. Keys for the acronyms are as follows: EXAMPLE_LABEL (E.L.), REACTION_PRODUCT (R.P.), STARTING_MATERIAL (S.M.), REAGENT_CATALYST (R.C.), SOLVENT (S), OTHER_COMPOUND (O.C.), YIELD_PERCENT (Y.P.), YIELD_OTHER (Y.O.), TIME (Time), and TEMPERATURE (Temp).
Our best results in comparison with the top results of the ChEMU-2020 competition for NER. Baseline is provided by the organizers of the ChEMU-2020 challenge.
| | | Exact P | Exact R | Exact F | Relax P | Relax R | Relax F |
|---|---|---|---|---|---|---|---|
| Our methods | BiLSTM-based | 0.87 | 0.85 | 0.86 | 0.95 | | |
| | BioBERT-based | 0.73 | 0.82 | 0.77 | 0.87 | 0.95 | 0.91 |
| ChEMU_2020 teams | Melaxtech | | | | | 0.97 | |
| | VinAI | 0.95 | 0.95 | 0.95 | | 0.97 | |
| | Lasige BioTM | 0.93 | 0.95 | 0.94 | 0.96 | 0.97 | 0.96 |
| | BiTeM | 0.94 | 0.91 | 0.92 | | 0.96 | 0.96 |
| | NextMove/Minesoft | 0.90 | 0.89 | 0.90 | 0.93 | 0.92 | 0.92 |
| | AUKBC | 0.68 | 0.41 | 0.51 | 0.88 | 0.53 | 0.66 |
| Baseline | ChEMU organizers | 0.91 | 0.87 | 0.89 | 0.92 | 0.89 | 0.91 |
Bold indicates best results for P, R, and F for both exact and relax match results.
Precision (P), recall (R), and F1 (F) score of the EE system with trigger words identified using our BiLSTM + CRF trained with ChEMU patent embeddings.
| Method | Argument | Trigger | Entity | # Train | P | R | F |
|---|---|---|---|---|---|---|---|
| Rule-based | ARG1 | REACTION_STEP | OTHER_COMPOUND | 161 | 0.02 | 0.06 | 0.04 |
| | | | REACTION_PRODUCT | 1,101 | 0.82 | 0.78 | 0.80 |
| | | | REAGENT_CATALYST | 1,272 | 0.52 | 0.35 | 0.42 |
| | | | SOLVENT | 1,134 | 0.81 | 0.55 | 0.65 |
| | | | STARTING_MATERIAL | 1,747 | 0.63 | 0.31 | 0.41 |
| | | | Average | | 0.56 | 0.52 | 0.46 |
| | | WORKUP | OTHER_COMPOUND | 4,097 | 0.90 | 0.86 | 0.88 |
| | | | REACTION_PRODUCT | 11 | 0.01 | 1.00 | 0.02 |
| | | | REAGENT_CATALYST | - | 0.00 | 0.00 | 0.00 |
| | | | SOLVENT | 4 | 0.07 | 1.00 | 0.14 |
| | | | STARTING_MATERIAL | 4 | 0.04 | 1.00 | 0.08 |
| | | | Average | | 0.20 | 0.77 | 0.22 |
| | ARGM | REACTION_STEP | TEMPERATURE | 813 | 0.77 | 0.89 | 0.83 |
| | | | TIME | 839 | 0.85 | 0.93 | 0.89 |
| | | | YIELD_OTHER | 1,043 | 0.83 | 0.80 | 0.81 |
| | | | YIELD_PERCENT | 937 | 0.86 | 0.85 | 0.85 |
| | | | Average | | 0.83 | 0.87 | 0.85 |
| | | WORKUP | TEMPERATURE | 242 | 0.66 | 0.81 | 0.73 |
| | | | TIME | 81 | 0.36 | 0.53 | 0.43 |
| | | | Average | | 0.51 | 0.67 | 0.58 |
| | | | System | | | | |
| CNN-based | ARG1 | REACTION_STEP | OTHER_COMPOUND | 161 | 0.00 | 0.00 | 0.00 |
| | | | REACTION_PRODUCT | 1,101 | 0.92 | 0.96 | 0.94 |
| | | | REAGENT_CATALYST | 1,272 | 0.78 | 0.69 | 0.74 |
| | | | SOLVENT | 1,134 | 0.64 | 0.74 | 0.69 |
| | | | STARTING_MATERIAL | 1,747 | 0.82 | 0.43 | 0.56 |
| | | | Average | | 0.63 | 0.56 | 0.59 |
| | | WORKUP | OTHER_COMPOUND | 4,097 | 0.73 | 0.29 | 0.42 |
| | | | REACTION_PRODUCT | 11 | 0.00 | 0.00 | 0.00 |
| | | | SOLVENT | 4 | 0.00 | 0.00 | 0.00 |
| | | | STARTING_MATERIAL | 4 | 0.00 | 0.00 | 0.00 |
| | | | Average | | 0.18 | 0.07 | 0.11 |
| | ARGM | REACTION_STEP | TEMPERATURE | 813 | 0.83 | 0.30 | 0.44 |
| | | | TIME | 839 | 0.78 | 0.73 | 0.75 |
| | | | YIELD_OTHER | 1,043 | 0.93 | 0.96 | 0.95 |
| | | | YIELD_PERCENT | 937 | 0.91 | 0.94 | 0.92 |
| | | | Average | | 0.86 | 0.73 | 0.77 |
| | | WORKUP | TEMPERATURE | 242 | 0.56 | 0.08 | 0.14 |
| | | | TIME | 81 | 0.00 | 0.00 | 0.00 |
| | | | Average | | 0.28 | 0.04 | 0.07 |
| | | | System | | | | |
| BERT-based | ARG1 | REACTION_STEP | OTHER_COMPOUND | 161 | 0.03 | 0.06 | 0.04 |
| | | | REACTION_PRODUCT | 1,101 | 0.84 | 0.82 | 0.83 |
| | | | REAGENT_CATALYST | 1,272 | 0.51 | 0.20 | 0.29 |
| | | | SOLVENT | 1,134 | 0.49 | 0.62 | 0.55 |
| | | | STARTING_MATERIAL | 1,747 | 0.55 | 0.92 | 0.69 |
| | | | Average | | 0.48 | 0.52 | 0.59 |
| | | WORKUP | OTHER_COMPOUND | 4,097 | 0.54 | 0.48 | 0.51 |
| | | | REACTION_PRODUCT | 11 | 0.00 | 0.00 | 0.00 |
| | | | SOLVENT | 4 | 0.00 | 0.00 | 0.00 |
| | | | STARTING_MATERIAL | 4 | 0.00 | 0.00 | 0.00 |
| | | | Average | | 0.14 | 0.12 | 0.13 |
| | ARGM | REACTION_STEP | TEMPERATURE | 813 | 0.44 | 0.20 | 0.27 |
| | | | TIME | 839 | 0.51 | 0.82 | 0.63 |
| | | | YIELD_OTHER | 1,043 | 0.83 | 0.83 | 0.83 |
| | | | YIELD_PERCENT | 937 | 0.84 | 0.92 | 0.88 |
| | | | Average | | 0.66 | 0.69 | 0.65 |
| | | WORKUP | TEMPERATURE | 242 | 0.26 | 0.17 | 0.21 |
| | | | TIME | 81 | 0.23 | 0.26 | 0.24 |
| | | | Average | | 0.25 | 0.22 | 0.23 |
| | | | System | | | | |
| BioBERT-based | ARG1 | REACTION_STEP | OTHER_COMPOUND | 161 | 0.04 | 0.02 | 0.02 |
| | | | REACTION_PRODUCT | 1,101 | 0.84 | 0.82 | 0.83 |
| | | | REAGENT_CATALYST | 1,272 | 0.53 | 0.45 | 0.49 |
| | | | SOLVENT | 1,134 | 0.51 | 0.39 | 0.44 |
| | | | STARTING_MATERIAL | 1,747 | 0.59 | 0.27 | 0.37 |
| | | | Average | | 0.50 | 0.39 | 0.43 |
| | | WORKUP | OTHER_COMPOUND | 4,097 | 0.52 | 0.53 | 0.54 |
| | | | REACTION_PRODUCT | 11 | 0.00 | 0.00 | 0.00 |
| | | | SOLVENT | 4 | 0.00 | 0.00 | 0.00 |
| | | | STARTING_MATERIAL | 4 | 0.00 | 0.00 | 0.00 |
| | | | Average | | 0.13 | 0.13 | 0.14 |
| | ARGM | REACTION_STEP | TEMPERATURE | 813 | 0.43 | 0.08 | 0.13 |
| | | | TIME | 839 | 0.57 | 0.30 | 0.40 |
| | | | YIELD_OTHER | 1,043 | 0.84 | 0.81 | 0.82 |
| | | | YIELD_PERCENT | 937 | 0.84 | 0.88 | 0.86 |
| | | | Average | | 0.67 | 0.52 | 0.56 |
| | | WORKUP | TEMPERATURE | 242 | 0.27 | 0.20 | 0.23 |
| | | | TIME | 81 | 0.17 | 0.02 | 0.04 |
| | | | Average | | 0.22 | 0.11 | 0.14 |
| | | | System | | | | |
Bold indicates system performance of three methods.
Error analysis for the event extraction (EE) system where trigger words are trained with ChemPatent embeddings.
| Method | Argument | Trigger | Entity | tp | fp | fn | fpm | fnm |
|---|---|---|---|---|---|---|---|---|
| Rule-based | ARG1 | REACTION_STEP | OTHER_COMPOUND | 40 | 1,798 | 23 | 18 | 11 |
| | | | REACTION_PRODUCT | 351 | 75 | 101 | 10 | 3 |
| | | | REAGENT_CATALYST | 177 | 162 | 328 | 8 | 8 |
| | | | SOLVENT | 234 | 54 | 193 | 4 | 7 |
| | | | STARTING_MATERIAL | 217 | 128 | 494 | 15 | 9 |
| | | WORKUP | OTHER_COMPOUND | 1,501 | 171 | 249 | 54 | 73 |
| | | | REACTION_PRODUCT | 4 | 375 | 0 | 9 | 0 |
| | | | REAGENT_CATALYST | 0 | 40 | 0 | 9 | 0 |
| | | | SOLVENT | 2 | 25 | 0 | 5 | 0 |
| | | | STARTING_MATERIAL | 1 | 24 | 0 | 2 | 0 |
| | ARGM | REACTION_STEP | TEMPERATURE | 450 | 131 | 53 | 29 | 15 |
| | | | TIME | 386 | 66 | 27 | 21 | 10 |
| | | | YIELD_OTHER | 350 | 74 | 85 | 11 | 3 |
| | | | YIELD_PERCENT | 326 | 55 | 58 | 11 | 3 |
| | | WORKUP | TEMPERATURE | 89 | 45 | 21 | 13 | 20 |
| | | | TIME | 23 | 41 | 20 | 16 | 13 |
| | | | System | 4,151 | 3,957 | 1,652 | 421 | 175 |
| CNN-based | ARG1 | REACTION_STEP | OTHER_COMPOUND | 0 | 0 | 63 | 0 | 11 |
| | | | REACTION_PRODUCT | 436 | 36 | 16 | 11 | 3 |
| | | | REAGENT_CATALYST | 350 | 97 | 155 | 17 | 8 |
| | | | SOLVENT | 316 | 179 | 111 | 16 | 7 |
| | | | STARTING_MATERIAL | 305 | 68 | 406 | 12 | 9 |
| | | WORKUP | OTHER_COMPOUND | 516 | 192 | 1,234 | 23 | 73 |
| | | | REACTION_PRODUCT | 0 | 0 | 4 | 0 | 0 |
| | | | REAGENT_CATALYST | - | - | - | - | - |
| | | | SOLVENT | 0 | 0 | 2 | 0 | 0 |
| | | | STARTING_MATERIAL | 0 | 0 | 1 | 0 | 0 |
| | ARGM | REACTION_STEP | TEMPERATURE | 151 | 30 | 352 | 15 | 15 |
| | | | TIME | 300 | 87 | 113 | 16 | 10 |
| | | | YIELD_OTHER | 418 | 31 | 17 | 11 | 3 |
| | | | YIELD_PERCENT | 361 | 36 | 23 | 13 | 3 |
| | | WORKUP | TEMPERATURE | 9 | 7 | 101 | 0 | 20 |
| | | | TIME | 0 | 0 | 43 | 0 | 13 |
| | | | System | 3,162 | 763 | 2,641 | 134 | 175 |
| BERT-based | ARG1 | REACTION_STEP | OTHER_COMPOUND | 4 | 120 | 59 | 15 | 11 |
| | | | REACTION_PRODUCT | 369 | 72 | 83 | 17 | 3 |
| | | | REAGENT_CATALYST | 101 | 97 | 404 | 9 | 8 |
| | | | SOLVENT | 266 | 273 | 161 | 22 | 7 |
| | | | STARTING_MATERIAL | 654 | 531 | 57 | 54 | 9 |
| | | WORKUP | OTHER_COMPOUND | 845 | 708 | 905 | 77 | 73 |
| | | | REACTION_PRODUCT | 0 | 0 | 4 | 0 | 0 |
| | | | REAGENT_CATALYST | - | - | - | - | - |
| | | | SOLVENT | 0 | 0 | 2 | 0 | 0 |
| | | | STARTING_MATERIAL | 0 | 0 | 1 | 0 | 0 |
| | ARGM | REACTION_STEP | TEMPERATURE | 101 | 131 | 402 | 15 | 15 |
| | | | TIME | 338 | 319 | 75 | 18 | 3 |
| | | | YIELD_OTHER | 360 | 73 | 75 | 18 | 3 |
| | | | YIELD_PERCENT | 353 | 65 | 31 | 17 | 3 |
| | | WORKUP | TEMPERATURE | 19 | 54 | 91 | 6 | 20 |
| | | | TIME | 11 | 37 | 32 | 0 | 13 |
| | | | System | 3,421 | 2,480 | 2,382 | 284 | 175 |
| BioBERT-based | ARG1 | REACTION_STEP | OTHER_COMPOUND | 0 | 10 | 63 | 2 | 11 |
| | | | REACTION_PRODUCT | 440 | 88 | 12 | 20 | 3 |
| | | | REAGENT_CATALYST | 156 | 146 | 349 | 13 | 8 |
| | | | SOLVENT | 256 | 235 | 171 | 20 | 7 |
| | | | STARTING_MATERIAL | 236 | 169 | 475 | 23 | 9 |
| | | WORKUP | OTHER_COMPOUND | 928 | 790 | 822 | 73 | 68 |
| | | | REACTION_PRODUCT | 0 | 0 | 4 | 0 | 0 |
| | | | REAGENT_CATALYST | - | - | - | - | - |
| | | | SOLVENT | 0 | 0 | 2 | 0 | 0 |
| | | | STARTING_MATERIAL | 0 | 0 | 1 | 0 | 0 |
| | ARGM | REACTION_STEP | TEMPERATURE | 40 | 53 | 463 | 8 | 15 |
| | | | TIME | 95 | 95 | 288 | 7 | 10 |
| | | | YIELD_OTHER | 352 | 67 | 83 | 17 | 3 |
| | | | YIELD_PERCENT | 338 | 65 | 46 | 17 | 3 |
| | | WORKUP | TEMPERATURE | 22 | 59 | 88 | 4 | 19 |
| | | | TIME | 1 | 5 | 42 | 0 | 13 |
| | | | System | 2,984 | 1,782 | 2,909 | 204 | 169 |
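The tp, fp, and fn counts in the error analysis relate to precision, recall, and F1 through the standard definitions. The sketch below applies them to the System rows of the rule-based and CNN-based methods; the `prf` function and the `SYSTEM_COUNTS` dictionary are illustrative names, and the fpm and fnm columns are not used because the record does not define them.

```python
def prf(tp, fp, fn):
    """Precision, recall, and F1 from raw counts (0.0 when undefined)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# (tp, fp, fn) copied from the System rows of the error-analysis table.
SYSTEM_COUNTS = {
    "Rule-based": (4151, 3957, 1652),
    "CNN-based": (3162, 763, 2641),
}

for method, counts in SYSTEM_COUNTS.items():
    p, r, f = prf(*counts)
    print(f"{method}: P={p:.2f} R={r:.2f} F={f:.2f}")
```

For these two methods the rounded values agree with the P, R, and F listed for "Our methods" in the EE comparison table (0.51, 0.72, 0.60 for rule-based and 0.81, 0.54, 0.65 for CNN-based), which suggests the system scores there are computed over pooled counts rather than averaged per entity type.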
Our best results in comparison with the top results of the ChEMU-2020 competition for event extraction (EE). Baseline is provided by the organizers of the ChEMU-2020 challenge.
| | | P | R | F |
|---|---|---|---|---|
| Our methods | Rule-based | 0.51 | 0.72 | 0.60 |
| | CNN-based | 0.81 | 0.54 | 0.65 |
| | BERT-based | 0.58 | 0.59 | 0.58 |
| | BioBERT-based | 0.62 | 0.50 | 0.55 |
| ChEMU_2020 teams | Melaxtech | | | |
| | NextMove/Minesoft | 0.94 | 0.86 | 0.90 |
| | BOUN_REX | 0.76 | 0.69 | 0.72 |
| Baseline | ChEMU organizers | 0.24 | 0.89 | 0.38 |
Bold values are the results (P, R, and F) of the best model from the competition.
Precision (P), recall (R), and F1 (F) results for our end-to-end system using our BiLSTM + CRF NER and CNN-based EE methods.
| Method | Argument | Trigger | Entity | Train | Exact P | Exact R | Exact F | Relax P | Relax R | Relax F |
|---|---|---|---|---|---|---|---|---|---|---|
| CNN-based | ARG1 | REACTION_STEP | OTHER_COMPOUND | 161 | 0.33 | 0.02 | 0.03 | 0.33 | 0.02 | 0.03 |
| | | | REACTION_PRODUCT | 1,101 | 0.72 | 0.79 | 0.76 | 0.81 | 0.92 | 0.86 |
| | | | REAGENT_CATALYST | 1,272 | 0.66 | 0.31 | 0.42 | 0.68 | 0.32 | 0.43 |
| | | | SOLVENT | 1,134 | 0.64 | 0.48 | 0.54 | 0.65 | 0.49 | 0.56 |
| | | | STARTING_MATERIAL | 1,747 | 0.63 | 0.45 | 0.53 | 0.66 | 0.47 | 0.55 |
| | | WORKUP | OTHER_COMPOUND | 4,097 | 0.67 | 0.61 | 0.64 | 0.72 | 0.69 | 0.70 |
| | | | REACTION_PRODUCT | 11 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| | | | SOLVENT | 4 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| | | | STARTING_MATERIAL | 4 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| | ARGM | REACTION_STEP | TEMPERATURE | 813 | 0.43 | 0.15 | 0.22 | 0.62 | 0.21 | 0.31 |
| | | | TIME | 839 | 0.68 | 0.73 | 0.70 | 0.76 | 0.82 | 0.79 |
| | | | YIELD_OTHER | 1,043 | 0.81 | 0.92 | 0.86 | 0.86 | 0.96 | 0.91 |
| | | | YIELD_PERCENT | 937 | 0.85 | 0.95 | 0.90 | 0.86 | 0.96 | 0.91 |
| | | WORKUP | TEMPERATURE | 242 | 0.25 | 0.12 | 0.16 | 0.41 | 0.19 | 0.26 |
| | | | TIME | 81 | 0.67 | 0.05 | 0.09 | 0.67 | 0.05 | 0.09 |
| | | | System | | | | | | | |
| BERT-based | ARG1 | REACTION_STEP | OTHER_COMPOUND | 161 | 0.08 | 0.05 | 0.06 | 0.08 | 0.05 | 0.06 |
| | | | REACTION_PRODUCT | 1,101 | 0.68 | 0.60 | 0.64 | 0.75 | 0.67 | 0.71 |
| | | | REAGENT_CATALYST | 1,272 | 0.45 | 0.48 | 0.46 | 0.48 | 0.50 | 0.49 |
| | | | SOLVENT | 1,134 | 0.45 | 0.18 | 0.25 | 0.46 | 0.18 | 0.26 |
| | | | STARTING_MATERIAL | 1,747 | 0.48 | 0.77 | 0.59 | 0.51 | 0.84 | 0.63 |
| | | WORKUP | OTHER_COMPOUND | 4,097 | 0.49 | 0.80 | 0.61 | 0.53 | 0.92 | 0.67 |
| | | | REACTION_PRODUCT | 11 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| | | | SOLVENT | 4 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| | | | STARTING_MATERIAL | 4 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| | ARGM | REACTION_STEP | TEMPERATURE | 813 | 0.28 | 0.15 | 0.20 | 0.50 | 0.27 | 0.35 |
| | | | TIME | 839 | 0.43 | 0.38 | 0.40 | 0.49 | 0.44 | 0.46 |
| | | | YIELD_OTHER | 1,043 | 0.80 | 0.73 | 0.76 | 0.81 | 0.74 | 0.78 |
| | | | YIELD_PERCENT | 937 | 0.84 | 0.85 | 0.85 | 0.85 | 0.85 | 0.85 |
| | | WORKUP | TEMPERATURE | 242 | 0.20 | 0.16 | 0.18 | 0.23 | 0.19 | 0.21 |
| | | | TIME | 81 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| | | | System | | | | | | | |
| BioBERT-based | ARG1 | REACTION_STEP | OTHER_COMPOUND | 161 | 0.08 | 0.05 | 0.06 | 0.08 | 0.05 | 0.06 |
| | | | REACTION_PRODUCT | 1,101 | 0.68 | 0.56 | 0.62 | 0.74 | 0.63 | 0.69 |
| | | | REAGENT_CATALYST | 1,272 | 0.47 | 0.38 | 0.42 | 0.49 | 0.40 | 0.44 |
| | | | SOLVENT | 1,134 | 0.49 | 0.40 | 0.44 | 0.50 | 0.41 | 0.45 |
| | | | STARTING_MATERIAL | 1,747 | 0.48 | 0.73 | 0.58 | 0.51 | 0.80 | 0.62 |
| | | WORKUP | OTHER_COMPOUND | 4,097 | 0.48 | 0.62 | 0.54 | 0.52 | 0.71 | 0.60 |
| | | | REACTION_PRODUCT | 11 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| | | | SOLVENT | 4 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| | | | STARTING_MATERIAL | 4 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| | ARGM | REACTION_STEP | TEMPERATURE | 813 | 0.28 | 0.08 | 0.12 | 0.50 | 0.14 | 0.22 |
| | | | TIME | 839 | 0.45 | 0.44 | 0.44 | 0.51 | 0.50 | 0.50 |
| | | | YIELD_OTHER | 1,043 | 0.80 | 0.87 | 0.83 | 0.81 | 0.89 | 0.85 |
| | | | YIELD_PERCENT | 937 | 0.85 | 0.84 | 0.84 | 0.85 | 0.85 | 0.85 |
| | | WORKUP | TEMPERATURE | 242 | 0.28 | 0.15 | 0.20 | 0.30 | 0.16 | 0.21 |
| | | | TIME | 81 | 0.17 | 0.02 | 0.04 | 0.17 | 0.02 | 0.04 |
| | | | System | | | | | | | |
| ChEMU_2020 teams | | | Melaxtech | | | | | | | |
| | | | NextMove/Minesoft | | 0.85 | 0.76 | 0.80 | 0.87 | 0.78 | 0.82 |
| | | | OntoChem | | 0.80 | 0.38 | 0.51 | 0.84 | 0.40 | 0.54 |
| Baseline | | | ChEMU organizers | | 0.21 | 0.73 | 0.33 | 0.21 | 0.75 | 0.33 |
System performance of the three methods (top three bold lines). The last bold line shows the best performance among the ChEMU teams.