Yuqi Si, Chunhua Weng.
Abstract
Eligibility criteria in clinical research protocols and clinical practice guidelines determine who qualifies for studies and to whom clinical evidence applies, but their free-text format is not amenable to computational processing. In this paper, we describe a practical method for transforming the free-text eligibility criteria of Alzheimer's clinical trials into a structured relational database compliant with standards for medical terminologies and clinical data models. We used a hybrid natural language processing system and a concept normalization tool to extract medical terms from eligibility criteria and represent them using the OMOP Common Data Model (CDM) v5. We designed a database schema that stores syntactic relations to facilitate efficient cohort queries. We further discuss the potential of applying this method to trials on other diseases and the promise of using it to accelerate clinical research with electronic health records.
Keywords: Clinical Research Informatics; Electronic Health Record; Relational Data Management
Year: 2017 PMID: 29295240 PMCID: PMC5893219
Source DB: PubMed Journal: Stud Health Technol Inform ISSN: 0926-9630
Structured Output of Entity and Attribute in EC
| Condition | present | T0 |
| Qualifier | present | T1 |
| Temporal constraints | present | T2 |
Structured Output of Relation in EC
| T4 | T3 | Modified by |
| T16 | T15 | Has_temp |
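The entity and relation outputs above map naturally onto two relational tables, one holding extracted terms and one holding the links between them. The following is a minimal sketch using Python's built-in `sqlite3`, not the schema used in the paper: the table names (`ec_entity`, `ec_relation`) and column names (`term_id`, `domain`, `status`, `arg1`, `arg2`, `relation`) are hypothetical, the meaning of the `present` column is assumed to be a status flag, and the argument order of each relation is copied as shown without assuming a direction. Only identifiers that appear in the tables above are loaded.

```python
import sqlite3

# In-memory database; all table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ec_entity (
    term_id TEXT PRIMARY KEY,   -- e.g. T0
    domain  TEXT NOT NULL,      -- e.g. Condition, Qualifier
    status  TEXT NOT NULL       -- e.g. present (assumed to be a status flag)
);
CREATE TABLE ec_relation (
    arg1     TEXT NOT NULL,     -- first term identifier as listed
    arg2     TEXT NOT NULL,     -- second term identifier as listed
    relation TEXT NOT NULL      -- e.g. Modified by, Has_temp
);
""")

# Rows copied from the two structured-output tables.
entities = [
    ("T0", "Condition", "present"),
    ("T1", "Qualifier", "present"),
    ("T2", "Temporal constraints", "present"),
]
relations = [
    ("T4", "T3", "Modified by"),
    ("T16", "T15", "Has_temp"),
]
conn.executemany("INSERT INTO ec_entity VALUES (?, ?, ?)", entities)
conn.executemany("INSERT INTO ec_relation VALUES (?, ?, ?)", relations)


def relations_of_type(connection, relation):
    """Return (arg1, arg2) pairs for one relation type."""
    cursor = connection.execute(
        "SELECT arg1, arg2 FROM ec_relation WHERE relation = ? ORDER BY arg1",
        (relation,),
    )
    return cursor.fetchall()


has_temp = relations_of_type(conn, "Has_temp")
entity_count = conn.execute("SELECT COUNT(*) FROM ec_entity").fetchone()[0]
```

Keeping relations in their own table means a cohort query can filter by relation type first and join to the entity table only for the terms it needs.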
Figure 2. Learning curve for NER tasks
Evaluation of Named Entity Recognition
| Domain | Precision | Recall | F1-score |
|---|---|---|---|
| Condition | 0.835 | 0.836 | 0.831 |
| Observation | 0.748 | 0.745 | 0.793 |
| Drug | 0.852 | 0.790 | 0.820 |
| Procedure | 0.721 | 0.583 | 0.645 |
| Qualifier | 0.820 | 0.756 | 0.786 |
| Measurement | 0.820 | 0.770 | 0.794 |
| Temporal_constraints | 0.826 | 0.788 | 0.807 |
Figure 3. Mapping evaluation statistical analysis result
Statistical Matching Result of Extracted Terms
| Domain | Extracted terms | Num. of matches | Perc. of match (%) | Unique terms (compression ratio) | Unique CONCEPT_IDs (compression ratio) |
|---|---|---|---|---|---|
| Condition | 23336 | 17474 | 74.88 | 4453 (0.19) | 1336 (0.08) |
| Observation | 8824 | 3919 | 44.41 | 2360 (0.27) | 391 (0.10) |
| Drug | 6775 | 3694 | 54.52 | 1930 (0.28) | 624 (0.17) |
| Procedure | 3195 | 2136 | 66.85 | 626 (0.20) | 193 (0.09) |
| Qualifier | 9354 | 8094 | 86.53 | 449 (0.05) | 188 (0.02) |
| Total | 51484 | 35317 | 68.60 | 9819 (0.19) | 2660 (0.08) |
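The derived columns in this table can be reproduced from the raw counts. The sketch below assumes, based on the reported numbers rather than on a definition stated in the source, that the match percentage is matches divided by extracted terms, that the unique-term compression ratio is unique terms divided by extracted terms, and that the unique-CONCEPT_ID compression ratio is unique concepts divided by matches. The function name `match_stats` and its dictionary keys are hypothetical.

```python
def match_stats(extracted, matched, unique_terms, unique_concepts):
    """Recompute the derived columns of the matching table from raw counts.

    The denominators are assumptions inferred from the reported values.
    """
    return {
        "perc_match": round(100 * matched / extracted, 2),
        "term_ratio": round(unique_terms / extracted, 2),
        "concept_ratio": round(unique_concepts / matched, 2),
    }


# Raw counts from the Condition row of the table.
condition = match_stats(
    extracted=23336, matched=17474, unique_terms=4453, unique_concepts=1336
)
```

Under these assumptions the Condition row yields 74.88, 0.19, and 0.08, in line with the values reported in the table.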
Statistical Matching Result of Extracted Relations
| Relation | Number | Unique number | Attribute number | Perc. of extraction (%) |
|---|---|---|---|---|
| Has_value | 3005 | 2224 | 5816 | 38.24 |
| Has_temp | 4632 | 3051 | 4507 | 67.69 |
| Modified by | 10400 | 7477 | 9354 | 79.93 |
| Total | 18037 | 12752 | 19677 | 54.81 |
Figure 4. Evaluation of Relation Extraction
Definition of TP, FP, FN, TN of an NER System
| True positive | System extracts a concept that matches the label |
| False positive | System extracts a concept, but there is no label or the concept does not match the correct label |
| False negative | System doesn’t extract a concept but there is a label |
| True negative | System doesn’t extract a concept and there is no label |
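These definitions can be turned into counts and then into the precision, recall, and F1-score reported for the NER evaluation. The sketch below is an illustration under stated assumptions, not the evaluation code used in the paper: annotations are represented as hypothetical `(start, end)` character spans mapped to a domain label, spans must match exactly, and an extraction with the wrong label is counted as a false positive only, following the table above. True negatives are not counted because the set of non-entity spans is not enumerated, and the standard precision, recall, and F1 formulas do not need them.

```python
def count_outcomes(gold, predicted):
    """Count TP, FP, FN from {span: label} dictionaries."""
    tp = fp = fn = 0
    for span, label in predicted.items():
        if gold.get(span) == label:
            tp += 1   # extracted concept matches the label
        else:
            fp += 1   # no label for this span, or the label differs
    for span in gold:
        if span not in predicted:
            fn += 1   # labelled concept that the system did not extract
    return tp, fp, fn


def precision_recall_f1(tp, fp, fn):
    """Standard metrics; each falls back to 0.0 when its denominator is zero."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical annotations for one criterion sentence.
gold = {(0, 9): "Condition", (14, 22): "Drug", (30, 41): "Temporal_constraints"}
predicted = {(0, 9): "Condition", (14, 22): "Procedure", (50, 58): "Qualifier"}

tp, fp, fn = count_outcomes(gold, predicted)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
```

In this toy example one extraction is correct, one has the wrong label, one has no label at all, and one labelled concept is missed, giving one true positive, two false positives, and one false negative.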