Nicholas J Dobbins, Tony Mullen, Özlem Uzuner, Meliha Yetisgen.
Abstract
Identifying cohorts of patients based on eligibility criteria such as medical conditions, procedures, and medication use is critical to recruitment for clinical trials. Such criteria are often most naturally described in free text, using language familiar to clinicians and researchers. To identify potential participants at scale, these criteria must first be translated into queries on clinical databases, a process that can be labor-intensive and error-prone. Natural language processing (NLP) methods offer a potential means of performing this translation automatically. However, they must first be trained and evaluated using corpora that capture clinical trial criteria in sufficient detail. In this paper, we introduce the Leaf Clinical Trials (LCT) corpus, a human-annotated corpus of over 1,000 clinical trial eligibility criteria descriptions using highly granular structured labels capturing a range of biomedical phenomena. We provide details of our schema, annotation process, corpus quality, and statistics. Additionally, we present baseline information extraction results on this corpus as benchmarks for future work.
Year: 2022 PMID: 35953524 PMCID: PMC9372145 DOI: 10.1038/s41597-022-01521-0
Source DB: PubMed Journal: Sci Data ISSN: 2052-4463 Impact factor: 8.501
Annotation statistics for EliIE, Chia, and LCT corpora.
| Measure | EliIE | Chia | LCT Corpus |
|---|---|---|---|
| Disease domain | Alzheimer's Disease | All | |
| No. of Eligibility Descriptions | 230 | 1,000 | 1,006 |
| No. of Annotations | 15,596 | 68,174 | |
| No. of Entity types | 8 | 15 | |
| No. of Relation types | 3 | 12 | |
| Mean Entities per doc. | — | 46 | |
| Mean Relations per doc. | — | 19 | |
Fig. 1 Example eligibility criteria annotated using the LCT corpus annotation schema (left) and corresponding example SQL queries (right) using a hypothetical database table and columns. Annotations were done using the Brat annotation tool [19]. The ICD-10 codes shown are examples and not intended to be exhaustive.
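The annotation-to-query idea illustrated in Fig. 1 can be sketched as follows. The `patient` and `diagnosis` tables, their column names, and the ICD-10 codes are hypothetical, mirroring the figure's hypothetical schema; the translation logic is an illustration of the general approach, not the paper's actual system.

```python
# Sketch: turn structured criterion annotations into a cohort SQL query.
# Entity type names ("Condition", "Eq-Comparison") follow the LCT schema;
# table/column names are hypothetical, as in Fig. 1.

def criterion_to_sql(entities):
    """Map (entity_type, value) annotations to WHERE clauses."""
    clauses = []
    for etype, value in entities:
        if etype == "Condition":
            # A real pipeline would first map the condition text to codes.
            codes = ", ".join("'{}'".format(c) for c in value)
            clauses.append("dx.icd10 IN ({})".format(codes))
        elif etype == "Eq-Comparison":
            field, op, num = value
            clauses.append("p.{} {} {}".format(field, op, num))
    return ("SELECT p.patient_id FROM patient p "
            "JOIN diagnosis dx ON p.patient_id = dx.patient_id "
            "WHERE " + " AND ".join(clauses))

sql = criterion_to_sql([
    ("Condition", ["E11.0", "E11.9"]),     # example type 2 diabetes codes
    ("Eq-Comparison", ("age", ">=", 18)),  # "18 years of age or older"
])
```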
Examples of representative LCT annotation schema entities.
| Category | Entity | Values | Example Text |
|---|---|---|---|
| Clinical | Condition | — | Diagnosed with |
| | Contraindication | — | any |
| | Drug | — | on |
| | Encounter | emergency, outpatient, inpatient | recently |
| | Immunization | — | received |
| | Observation | lab, vital, clinical-score, survey, social-habit | |
| | Procedure | — | Undergoing or scheduled for a |
| Demographic | Age | — | 43 years |
| | Birth | — | |
| | Family-Member | mother, father, sibling, etc. | history of |
| | Language | — | Speaks |
| Logical | Negation | — | with |
| Qualifier | Assertion | intention, hypothetical, possible | which |
| | Modifier | — | |
| | Polarity | low, high, positive, negative | showing |
| | Risk | — | at heightened |
| | Severity | mild, moderate, severe | with |
| | Stability | stable, change | conditions known to |
| Temporal and Comparative | Criteria-Count | — | at least 3 of |
| | Eq-Comparison | — | |
| | Eq-Temporal-Period | past, present, future | |
| | Eq-Temporal-Recency | first-time, most-recent | |
| Other | Location | residence, clinic, hospital, unit, emergency-department | Seen at |
A full listing of all entities can be found in the LCT annotation guidelines at https://github.com/uw-bionlp/clinical-trials-gov-annotation/wiki.
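The entity and relation annotations above lend themselves to a simple object model. This sketch's field names and structure are illustrative assumptions, not the corpus's official serialization; only the entity/relation type names (e.g. `Severity`, `Condition`) come from the schema.

```python
from dataclasses import dataclass

# Illustrative object model for LCT-style annotations (field names are
# assumptions for this sketch, not the corpus's file format).

@dataclass
class Entity:
    id: str
    etype: str        # e.g. "Condition", "Severity"
    span: tuple       # (start, end) character offsets in the criterion
    text: str
    value: str = ""   # fine-grained value, e.g. "severe"

@dataclass
class Relation:
    rtype: str        # e.g. "Severity"
    subject: Entity   # arrow direction: subject -> target entity
    target: Entity

# "... with severe asthma ..."
sev = Entity("T1", "Severity", (9, 15), "severe", value="severe")
cond = Entity("T2", "Condition", (16, 22), "asthma")
link = Relation("Severity", subject=sev, target=cond)
```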
Examples of representative relations.
| Category | Relation | Example Annotation |
|---|---|---|
| Alternatives and Examples | Abbrev-Of | |
| | Equivalent-To | |
| | Example-Of | |
| Clinical | Contraindicates | conditions |
| Dependent | Caused-By | |
| | Found-By | |
| | Treatment-For | |
| | Using | |
| Logical | If-Then | BMI |
| Qualifier | Risk-For | |
| | Severity | |
| | Stability | |
| Temporal and Comparative | After | |
| | Before | |
| | Duration | |
| | During | |
| | Numeric-Filter | |
| | Minimum-Count | |
| | Temporality | |
| Other | Location | |
Direction of arrows indicates role, i.e., subject → target entity.
Fig. 2 Examples of clinical trials eligibility criteria annotated with the Chia and LCT annotation schemas. Each example shows a criterion from a Chia annotation (above) and an LCT annotation of the same text for comparison (below).
Hyperparameters and pre-trained embeddings used for named entity recognition and relation extraction baseline results.
| Task | Architecture | Hyperparameter/Embeddings | Training Value |
|---|---|---|---|
| Named Entity Recognition | biLSTM + CRF | Character Dimensions | 25 |
| | | Token Embedding Dimensions | 100 |
| | | Learning Rate | 0.005 |
| | | Dropout | 0.5 |
| | | Pretrained Embeddings | GloVe |
| Relation Extraction | BERT & R-BERT | Pretrained Model | SciBERT |
| | | Learning Rate | 0.00003 |
For the NER task, the same architecture and hyperparameters were used for both general and fine-grained entity models. For the relation extraction task, the same hyperparameters were used with both the BERT and R-BERT architectures.
Baseline entity prediction scores (%, Precision/Recall/F1).
| Category | Entity | Count | biLSTM + CRF | PubMedBERT | SciBERT |
|---|---|---|---|---|---|
| Clinical | Condition | 7,087 | 78.6/78.1/78.3 | 76.1/79.4/77.7 | 78.4/83.3/80.8 |
| | Contraindication | 142 | 93.7/78.9/85.7 | 77.4/80.0/78.6 | 100.0/96.6/98.3 |
| | Drug | 1,404 | 76.8/81.3/79.0 | 74.1/80.9/77.4 | 73.4/80.9/77.0 |
| | Encounter | 302 | 64.1/58.1/60.9 | 51.7/61.7/56.3 | 58.3/74.4/65.4 |
| | Observation | 2,558 | 74.3/66.1/69.9 | 67.9/73.5/70.6 | 72.1/77.6/74.7 |
| | Procedure | 3,016 | 68.4/75.5/71.9 | 67.0/75.9/71.2 | 71.3/79.4/75.1 |
| Demographic | Age | 708 | 91.3/95.4/93.3 | 82.4/88.5/85.3 | 99.1/98.3/98.7 |
| | Birth | 27 | 100.0/80.0/88.8 | 100.0/62.5/76.9 | 100.0/62.5/76.9 |
| | Death | 35 | 33.3/33.3/33.3 | 0.0/0.0/0.0 | 100.0/20.0/33.3 |
| | Family-Member | 147 | 40.0/19.0/25.8 | 33.3/55.5/41.6 | 44.9/61.1/51.7 |
| | Language | 194 | 92.5/96.1/94.3 | 73.8/100.0/84.9 | 96.6/93.5/95.0 |
| Logical | Negation | 952 | 74.3/82.7/78.2 | 60.9/73.1/66.4 | 73.5/82.9/77.9 |
| Qualifier | Assertion | 1,157 | 66.6/62.8/64.7 | 56.1/58.9/57.5 | 62.1/65.8/63.9 |
| | Modifier | 3,464 | 65.0/58.3/61.5 | 59.2/64.0/61.5 | 58.5/65.4/61.8 |
| | Polarity | 360 | 82.5/88.0/85.1 | 74.6/67.4/70.8 | 81.4/79.5/80.4 |
| | Risk | 117 | 93.1/96.4/94.7 | 91.3/91.3/91.3 | 95.4/91.3/93.3 |
| | Severity | 569 | 86.8/90.8/88.7 | 76.7/79.5/78.1 | 86.5/94.1/90.2 |
| | Stability | 397 | 84.2/67.6/75.0 | 79.4/75.0/77.1 | 75.3/84.7/79.7 |
| Temporal and Comparative | Criteria-Count | 33 | 50.0/66.6/57.1 | 28.5/40.0/33.3 | 12.5/20.0/15.5 |
| | Eq-Comparison | 5,298 | 83.1/83.8/83.4 | 81.4/85.0/83.2 | 85.3/89.3/87.3 |
| | Eq-Temporal-Period | 2,057 | 88.7/89.2/88.9 | 70.0/73.9/71.9 | 82.6/86.3/84.4 |
| | Eq-Temporal-Recency | 131 | 68.7/84.6/75.8 | 43.4/55.5/48.7 | 50.0/66.6/57.1 |
| | Eq-Temporal-Unit | 1,808 | 95.1/97.6/96.4 | 97.4/98.1/97.8 | 98.2/99.4/98.8 |
| | Eq-Value | 3,835 | 91.8/95.3/93.5 | 95.5/96.2/95.9 | 96.4/97.1/96.7 |
| Other | Location | 371 | 68.5/58.7/63.2 | 65.4/71.6/68.3 | 73.4/78.3/75.8 |
| — | Total | 56,146 | 80.2/79.6/79.9 | 75.3/78.7/77.0 | 79.0/83.7/81.3 |
Corpus-level micro-averaged scores are shown in the bottom row. For brevity a representative sample of entities is shown. Count refers to the total count of unique spans annotated in the entire corpus. Entities included in the total count and scores but omitted for brevity are Acuteness, Allergy, Condition-Type, Code, Coreference, Ethnicity, Eq-Operator, Eq-Unit, Indication, Immunization, Insurance, Life-Stage-And-Gender, Organism, Other, Specimen, Study and Provider.
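The corpus-level micro-averaged scores in the "Total" rows pool true positives, false positives, and false negatives across all entity types before computing precision, recall, and F1. A minimal sketch of that computation, using made-up toy counts rather than values from the corpus:

```python
# Micro-averaged precision/recall/F1: pool counts across labels first,
# then compute the metrics once. The per-label counts here are toy numbers.

def micro_prf(counts):
    """counts: {label: (tp, fp, fn)} -> (precision, recall, f1)."""
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

p, r, f1 = micro_prf({"Condition": (80, 20, 15), "Drug": (45, 10, 12)})
```

Unlike macro-averaging, this weights each annotation equally, so frequent types such as Condition and Eq-Comparison dominate the corpus-level score.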
Baseline relation prediction scores (%, Precision/Recall/F1).
| Category | Relation | Count | SciBERT | R-BERT + SciBERT |
|---|---|---|---|---|
| Alternatives and Examples | Abbrev-Of | 462 | 95.2/90.9/93.0 | 92.3/93.1/94.2 |
| | Equivalent-To | 516 | 61.5/69.5/65.3 | 59.6/67.3/63.2 |
| | Example-Of | 1,497 | 94.8/92.9/93.8 | 90.5/91.7/91.1 |
| Clinical | Contraindicates | 153 | 90.9/90.9/90.9 | 90.9/90.9/90.9 |
| | Caused-By | 726 | 63.0/86.4/72.9 | 78.6/86.4/82.3 |
| | Found-By | 293 | 90.4/59.3/71.7 | 79.3/71.8/75.4 |
| | Treatment-For | 457 | 69.2/69.2/69.2 | 61.7/74.3/67.4 |
| | Using | 405 | 73.8/83.7/78.4 | 66.6/64.8/65.7 |
| Logical | And | 821 | 54.1/60.0/56.9 | 53.8/53.8/53.8 |
| | If-Then | 261 | 57.6/65.2/61.2 | 55.5/65.2/60.0 |
| | Negates | 984 | 74.3/91.0/81.8 | 74.5/88.7/81.0 |
| | Or | 4,156 | 85.1/93.2/89.0 | 88.4/92.2/90.2 |
| Qualifier | Asserted | 1,184 | 83.7/89.0/86.3 | 85.9/89.0/87.5 |
| | Modifies | 3,400 | 90.9/94.2/92.5 | 92.2/95.4/93.8 |
| | Risk-For | 90 | 92.3/85.7/88.8 | 92.8/92.8/92.8 |
| | Severity | 529 | 80.2/96.6/87.6 | 86.3/96.6/91.2 |
| | Stability | 395 | 76.0/92.6/83.5 | 76.4/95.1/84.7 |
| Temporal and Comparative | After | 166 | 75.0/70.5/72.7 | 72.2/76.4/74.2 |
| | Before | 320 | 70.2/86.6/77.6 | 78.1/83.3/80.6 |
| | Duration | 243 | 59.3/79.1/67.8 | 64.5/83.3/72.7 |
| | During | 350 | 66.6/68.7/67.6 | 63.6/65.6/64.6 |
| | Numeric-Filter | 1,957 | 84.6/93.3/88.7 | 85.7/92.3/88.8 |
| | Minimum-Count | 173 | 64.2/69.2/66.7 | 71.4/76.9/74.0 |
| | Temporality | 2,645 | 80.7/90.7/85.4 | 81.8/92.2/86.7 |
| Other | Location | 207 | 64.2/94.7/76.6 | 69.2/94.7/80.0 |
| — | Total | 24,379 | 80.2/88.2/84.0 | 82.5/88.0/85.2 |
Corpus-level micro-averaged scores are shown in the bottom row. For brevity a representative sample of relations is shown. Count refers to the total count annotated in the entire corpus, including relations not shown. The count total excludes general to fine-grained entity relations, which as overlapping spans are not used for relation prediction. Relations included in the total count and scores but omitted for brevity are Acuteness, Code, Criteria, Except, From, Indication-For, Is-Other, Max-Value, Min-Value, Polarity, Provider, Refers-To, Specimen, Stage, Study-Of and Type.
Results of NER experiments using the manually annotated and semi-automated portions of the corpus.
| Training Set | Test Set | Precision | Recall | F1 |
|---|---|---|---|---|
| Manual | Semi-automated | 75.4 | 82.1 | 78.6 |
| Semi-automated | Manual | 80.1 | 79.9 | 80.0 |
The manually annotated portion comprises 513 documents, while the semi-automatically annotated portion comprises 493 documents.
Fig. 3 Screenshot of a prototype web application for real-time entity and relation prediction on custom user input text.
Fig. 4 Example of an LCT-annotated document (top) transformed into a directed acyclic graph (bottom). LCT entities and relations are readily transformable into tree, graph, or object-oriented representations used for query generation.
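The graph transformation shown in Fig. 4 can be sketched by treating entities as nodes and relations as directed edges (subject → target). The adjacency-list representation and node naming below are assumptions of this sketch, not the paper's exact transformation.

```python
# Sketch: build a directed graph from LCT-style entities and relations,
# of the kind a downstream query generator could walk. Node ids and the
# adjacency-list structure are illustrative.

def build_graph(entities, relations):
    """entities: iterable of node ids; relations: (subject, rtype, target)."""
    graph = {e: [] for e in entities}
    for subj, rtype, targ in relations:
        graph[subj].append((rtype, targ))  # edge direction: subject -> target
    return graph

g = build_graph(
    ["Negation", "Condition:asthma", "Severity:severe"],
    [("Negation", "Negates", "Condition:asthma"),
     ("Severity:severe", "Severity", "Condition:asthma")],
)
```

Because relation arrows always point from subject to target entity, cycles do not arise in practice, which is what makes the tree and DAG views interchangeable for query generation.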
| Measurement(s) | Clinical Trial Eligibility Criteria |
| Technology Type(s) | natural language processing |
| Sample Characteristic - Organism | Homo sapiens |