Ruth Reátegui, Sylvie Ratté.
Abstract
BACKGROUND: Clinical notes such as discharge summaries have a semi-structured or unstructured format. These documents contain information about diseases, treatments, drugs, and so on. Their narrative format makes extracting meaningful information challenging. In this context, we aimed to compare how well two tools, MetaMap and cTAKES, automatically extract medical entities.
Keywords: Clinical documents; MetaMap; UMLS; cTAKES
Year: 2018 PMID: 30255810 PMCID: PMC6157281 DOI: 10.1186/s12911-018-0654-2
Source DB: PubMed Journal: BMC Med Inform Decis Mak ISSN: 1472-6947 Impact factor: 2.796
List of entities or concepts
| Entities annotated by experts (name of disease) | Entities in the first experiment (preferred name, CUI, semantic type) | Entities or groups in the second experiment (preferred name, CUI, semantic type) |
|---|---|---|
| Hypertension | Hypertensive disease, C0020538, dsyn | Hypertensive disease, C0020538, dsyn |
| Diabetes | Diabetes mellitus, C0011849, dsyn | Diabetes mellitus, C0011849, dsyn |
| Atherosclerotic Cardiovascular Disease (CAD) | Coronary artery disease, C1956346, dsyn | Coronary artery disease, C1956346, dsyn |
| Congestive Heart Failure (CHF) | Congestive heart failure, C0018802, dsyn | Congestive heart failure, C0018802, dsyn |
| Hypercholesterolemia | Hypercholesterolemia, C0020443, dsyn | Hypercholesterolemia, C0020443, dsyn |
| Obstructive Sleep Apnea (OSA) | Sleep apnea obstructive, C0520679, dsyn | Sleep apnea obstructive, C0520679, dsyn |
| Osteoarthritis (OA) | Degenerative polyarthritis, C0029408, dsyn | Degenerative polyarthritis, C0029408, dsyn |
| Depression | Mental depression, C0011570, mobd | Mental depression, C0011570, mobd |
| Asthma | Asthma, C0004096, dsyn | Asthma, C0004096, dsyn |
| Gastroesophageal Reflux Disease (GERD) | Gastroesophageal reflux disease, C0017168, dsyn | Gastroesophageal reflux disease, C0017168, dsyn |
| Gallstones/Cholecystectomy | Cholecystectomy procedure, C0008320, topp | Cholecystectomy procedure, C0008320, topp |
| Gout | Gout, C0018099, dsyn | Gout, C0018099, dsyn |
| Peripheral Vascular Disease (PVD) | Peripheral vascular diseases, C0085096, dsyn | Peripheral vascular diseases, C0085096, dsyn |
| Venous Insufficiency | Venous insufficiency, C0042485, dsyn | Venous insufficiency, C0042485, dsyn |
The second experiment grouped together some entities related to the disease annotated by the experts.
CUI: Concept Unique Identifier; dsyn: Disease or Syndrome; mobd: Mental or Behavioral Dysfunction; topp: Therapeutic or Preventive Procedure; patf: Pathologic Function
Summary of first experiment
| Diseases | Patients (annotations) | Patients (MetaMap) | Patients (cTAKES) | MetaMap recall | MetaMap precision | MetaMap F-score | cTAKES recall | cTAKES precision | cTAKES F-score |
|---|---|---|---|---|---|---|---|---|---|
| Hypertension | 325 | 336 | 340 | 0.99 | 0.96 | 0.98 | 0.99 | 0.95 | 0.97 |
| Diabetes^a | 259 | 186 | 235 | | 0.91 | 0.76 | 0.83 | 0.91 | 0.87 |
| Atherosclerotic Cardiovascular Disease (CAD)^a | 181 | 95 | 199 | | 0.86 | 0.59 | 0.92 | 0.84 | 0.88 |
| Congestive Heart Failure (CHF) | 172 | 175 | 183 | 0.89 | 0.87 | 0.88 | 0.92 | 0.86 | 0.89 |
| Hypercholesterolemia^a | 172 | 108 | 92 | | 0.94 | 0.73 | | 0.95 | 0.66 |
| Obstructive Sleep Apnea (OSA) | 127 | 105 | 102 | 0.78 | 0.94 | 0.85 | 0.76 | 0.94 | 0.84 |
| Osteoarthritis (OA)^a | 87 | 76 | 61 | 0.76 | 0.87 | 0.81 | | 0.95 | 0.78 |
| Depression^a | 83 | 105 | 116 | 0.89 | | 0.79 | 0.99 | | 0.82 |
| Asthma | 81 | 83 | 92 | 0.93 | 0.90 | 0.91 | 1.00 | 0.88 | 0.94 |
| Gastroesophageal Reflux Disease (GERD) | 76 | 83 | 85 | 0.97 | 0.89 | 0.93 | 0.99 | 0.88 | 0.93 |
| Gallstones/Cholecystectomy^a | 74 | 54 | 58 | | 1.00 | 0.84 | 0.78 | 1.00 | 0.88 |
| Gout | 56 | 58 | 58 | 0.98 | 0.95 | 0.96 | 0.98 | 0.95 | 0.96 |
| Peripheral Vascular Disease (PVD) | 37 | 37 | 32 | 0.97 | 0.97 | 0.97 | 0.84 | 0.97 | 0.90 |
| Venous Insufficiency^a | 21 | 6 | 6 | | 1.00 | 0.44 | | 1.00 | 0.44 |
| AVERAGE | | | | 0.78 | 0.91 | 0.82 | 0.82 | 0.91 | 0.84 |

The lowest values for recall and precision are in bold.
^a Disease with low evaluation
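
The recall, precision, and F-score columns in the summary table can be illustrated with a minimal Python sketch. It assumes the evaluation is done per disease at the patient level, comparing the set of patients annotated by experts with the set of patients a tool flagged; the source does not spell out the exact procedure, and the names `evaluate`, `gold`, and `predicted` are hypothetical.

```python
from __future__ import annotations


def evaluate(gold: set[str], predicted: set[str]) -> tuple[float, float, float]:
    """Return (recall, precision, f_score) for one disease.

    gold:      patient ids annotated with the disease by experts (assumed input)
    predicted: patient ids for which a tool extracted the disease (assumed input)
    """
    true_positives = len(gold & predicted)
    recall = true_positives / len(gold) if gold else 0.0
    precision = true_positives / len(predicted) if predicted else 0.0
    denominator = precision + recall
    # F-score is the harmonic mean of precision and recall.
    f_score = 2 * precision * recall / denominator if denominator else 0.0
    return recall, precision, f_score


# Toy data, not taken from the study.
gold = {"p1", "p2", "p3", "p4"}
predicted = {"p2", "p3", "p4", "p5", "p6"}
recall, precision, f_score = evaluate(gold, predicted)
```

With the toy sets above, three of the four annotated patients are found and three of the five flagged patients are correct, so recall is 0.75 and precision is 0.6.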
Fig. 1. Process for the second experiment. Discharge summaries were analyzed with MetaMap or cTAKES to extract CUIs. Then some CUIs were aggregated to obtain the 14 comorbidities related to obesity.
Fig. 2. Aggregation process
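
The aggregation step described in the figure captions can be sketched roughly as follows, assuming each patient is represented by the set of CUIs a tool extracted from the discharge summary. The primary CUIs come from the entity table; the additional CUIs that the second experiment grouped under each disease are not listed in this section, so the `C-HYPOTHETICAL-*` entries are placeholders, and `DISEASE_CUIS` and `aggregate` are hypothetical names.

```python
from __future__ import annotations

# Disease -> CUIs counted as evidence for that disease.
# Real CUIs are from the entity table; C-HYPOTHETICAL-* are placeholders
# standing in for the related concepts grouped in the second experiment.
DISEASE_CUIS: dict[str, set[str]] = {
    "Hypertension": {"C0020538"},
    "Diabetes": {"C0011849", "C-HYPOTHETICAL-1"},
    "Venous Insufficiency": {"C0042485", "C-HYPOTHETICAL-2"},
}


def aggregate(patient_cuis: dict[str, set[str]]) -> dict[str, set[str]]:
    """Map each disease to the patients having at least one CUI of its group."""
    patients_by_disease: dict[str, set[str]] = {d: set() for d in DISEASE_CUIS}
    for patient_id, cuis in patient_cuis.items():
        for disease, group in DISEASE_CUIS.items():
            if cuis & group:  # any overlap counts as a match
                patients_by_disease[disease].add(patient_id)
    return patients_by_disease


# Toy input, not taken from the study.
extracted = {
    "p1": {"C0011849"},
    "p2": {"C-HYPOTHETICAL-1"},
    "p3": {"C0020538", "C0042485"},
    "p4": set(),
}
grouped = aggregate(extracted)
```

In this toy input, `p2` is counted under Diabetes only because of the grouped placeholder concept, which mirrors how aggregation can raise recall for diseases formed by two or more UMLS concepts.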
Summary of second experiment
| Diseases | Patients (annotations) | Patients (MetaMap) | Patients (cTAKES) | MetaMap recall | MetaMap precision | MetaMap F-score | cTAKES recall | cTAKES precision | cTAKES F-score |
|---|---|---|---|---|---|---|---|---|---|
| Hypertension | 325 | 336 | 340 | 0.99 | 0.96 | 0.98 | 0.99 | 0.95 | 0.97 |
| Diabetes^a | 259 | 254 | 266 | | | | | 0.89 | |
| Atherosclerotic Cardiovascular Disease (CAD)^a | 181 | 130 | 205 | | 0.83 | | 0.92 | 0.81 | 0.87 |
| Congestive Heart Failure (CHF) | 172 | 175 | 183 | 0.89 | 0.87 | 0.88 | 0.92 | 0.86 | 0.89 |
| Hypercholesterolemia^a | 172 | 159 | 146 | | | | | | |
| Obstructive Sleep Apnea (OSA) | 127 | 105 | 102 | 0.78 | 0.94 | 0.85 | 0.76 | 0.94 | 0.84 |
| Osteoarthritis (OA) | 87 | 76 | 61 | 0.76 | 0.87 | 0.81 | 0.67 | 0.95 | 0.78 |
| Depression^a | 83 | 109 | 116 | | | | 0.99 | 0.71 | 0.82 |
| Asthma | 81 | 83 | 92 | 0.93 | 0.90 | 0.91 | 1.00 | 0.88 | 0.94 |
| Gastroesophageal Reflux Disease (GERD) | 76 | 83 | 85 | 0.97 | 0.89 | 0.93 | 0.99 | 0.88 | 0.93 |
| Gallstones/Cholecystectomy^a | 74 | 65 | 68 | | 0.99 | | | 0.97 | |
| Gout | 56 | 58 | 58 | 0.98 | 0.95 | 0.96 | 0.98 | 0.95 | 0.96 |
| Peripheral Vascular Disease (PVD) | 37 | 37 | 32 | 0.97 | 0.97 | 0.97 | 0.84 | 0.97 | 0.90 |
| Venous Insufficiency^a | 21 | 27 | 30 | | 0.704 | | | 0.7 | |
| AVERAGE | | | | 0.88 | 0.89 | 0.88 | 0.91 | 0.89 | 0.89 |

Improved values are in bold.
^a Diseases formed by two or more UMLS concepts