Brian Connolly, Pawel Matykiewicz, K Bretonnel Cohen, Shannon M Standridge, Tracy A Glauser, Dennis J Dlugos, Susan Koh, Eric Tham, John Pestian.
Abstract
OBJECTIVE: The constant progress in computational linguistic methods provides amazing opportunities for discovering information in clinical text and enables the clinical scientist to explore novel approaches to care. However, these new approaches need evaluation. We describe an automated system to compare descriptions of epilepsy patients at three different organizations: Cincinnati Children's Hospital, the Children's Hospital Colorado, and the Children's Hospital of Philadelphia. To our knowledge, there have been no similar previous studies.
Keywords: Epilepsy; Linguistics; Multicenter; Support vector machines; Text classification
Year: 2014 PMID: 24692393 PMCID: PMC4147613 DOI: 10.1136/amiajnl-2013-002601
Source DB: PubMed Journal: J Am Med Inform Assoc ISSN: 1067-5027 Impact factor: 4.497
The ICD-9-CM codes associated with each type of epilepsy diagnosis, and the corresponding number of clinical notes from each hospital
| Epilepsy classification | ICD-9-CM codes | CCHMC | CHCO | CHOP |
|---|---|---|---|---|
| Partial epilepsy | 345.40, 345.41, 345.50, 345.51, 345.70, 345.71 | 303 | 128 | 269 |
| Generalized epilepsy | 345.00, 345.01, 345.10, 345.11, 345.2 | 99 | 163 | 129 |
| Unclassified epilepsy | 345.80, 345.81, 345.90, 345.91 | 200 | 117 | 121 |
| Data missing | 345.3, 345.60, 345.61 | 12 | 25 | 32 |
CCHMC, Cincinnati Children's Hospital Medical Center; CHCO, Children's Hospital Colorado; CHOP, Children's Hospital of Philadelphia.
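The grouping in the table above can be expressed as a simple lookup. The following is a minimal sketch, not the authors' implementation; the names `ICD9_TO_CLASS` and `classify_icd9` are hypothetical and introduced only for illustration, while the codes and labels are copied from the table.

```python
# Illustrative lookup built from the ICD-9-CM table above.
# Names are hypothetical; codes and labels come from the table rows.
_GROUPS = {
    "Partial epilepsy": ["345.40", "345.41", "345.50", "345.51", "345.70", "345.71"],
    "Generalized epilepsy": ["345.00", "345.01", "345.10", "345.11", "345.2"],
    "Unclassified epilepsy": ["345.80", "345.81", "345.90", "345.91"],
    "Data missing": ["345.3", "345.60", "345.61"],
}

# Invert the grouping so each code maps directly to its row label.
ICD9_TO_CLASS = {code: label for label, codes in _GROUPS.items() for code in codes}


def classify_icd9(code):
    """Return the table's classification for a code, or None if it is not listed."""
    return ICD9_TO_CLASS.get(code.strip())
```

Codes are kept as strings so that trailing zeros (for example `345.40`) are not lost, which would happen if they were parsed as numbers.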
Results from the classification of partial epilepsy and generalized epilepsy in epilepsy progress notes
| Hospital used for training | Average F1 (training) | F1 SD (training) | Average F1 (remaining hospitals) | F1 SD (remaining hospitals) | p Value from baseline (remaining hospitals) |
|---|---|---|---|---|---|
| CCHMC | 0.865 | 0.213 | 0.691 | 0.095 | 0.043 |
| CHOP | 0.926 | 0.149 | 0.729 | 0.014 | <0.001 |
| CHCO | 0.823 | 0.224 | 0.754 | 0.062 | <0.001 |
| One-hospital average | 0.871 | 0.195 | | 0.070 | 0.001 |
| CCHMC and CHOP | 0.913 | 0.100 | 0.817 | 0.047 | <0.001 |
| CCHMC and CHCO | 0.904 | 0.097 | 0.807 | 0.031 | <0.001 |
| CHOP and CHCO | 0.904 | 0.097 | 0.807 | 0.031 | <0.001 |
| Two-hospital average | 0.899 | 0.105 | | 0.047 | <0.001 |
The first column lists the hospital(s) used to optimize the support vector machine. The second and third columns list the 20-fold cross-validated average F1 and corresponding SDs of the training samples, respectively. The fourth and fifth columns list the average F1 and corresponding SDs for the remaining hospital(s). The last column shows the p value significance of the result compared to the largest class baseline F1=0.5. Systematic improvement when two hospitals are used is highlighted in bold, and the sample size is the same when one and two hospitals are used.
CCHMC, Cincinnati Children's Hospital Medical Center; CHCO, Children's Hospital Colorado; CHOP, Children's Hospital of Philadelphia.
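The rows of the table correspond to training on one or two hospitals and evaluating on whichever hospital(s) remain. A minimal sketch of how those configurations could be enumerated is given below; the function name `train_test_configurations` is hypothetical, and the sketch covers only the bookkeeping of the splits, not the support vector machine training or the 20-fold cross-validation.

```python
from itertools import combinations

# Hospital abbreviations as used in the table footnote.
HOSPITALS = ["CCHMC", "CHOP", "CHCO"]


def train_test_configurations(hospitals, n_train):
    """Yield (training hospitals, remaining hospitals) pairs.

    Hypothetical helper: every choice of n_train hospitals is used for
    training, and the hospitals left over form the evaluation set.
    """
    for train in combinations(hospitals, n_train):
        remaining = tuple(h for h in hospitals if h not in train)
        yield train, remaining


one_hospital = list(train_test_configurations(HOSPITALS, 1))
two_hospital = list(train_test_configurations(HOSPITALS, 2))

for train, remaining in one_hospital + two_hospital:
    print(" and ".join(train), "->", ", ".join(remaining))
```

With three hospitals this gives three one-hospital and three two-hospital configurations, matching the six non-average rows of the table.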
Results from the classification of PE, GE, and UE in epilepsy progress notes
| Hospital used for training | Average F1 (training) | F1 SD (training) | Average F1 (remaining hospitals) | F1 SD (remaining hospitals) | p Value from baseline (remaining hospitals) |
|---|---|---|---|---|---|
| CCHMC | 0.647 | 0.311 | 0.417 | 0.147 | 0.567 |
| CHOP | 0.759 | 0.261 | 0.372 | 0.142 | 0.788 |
| CHCO | 0.625 | 0.327 | 0.376 | 0.143 | 0.763 |
| One hospital | 0.677 | 0.300 | | 0.145 | 0.704 |
| CCHMC and CHOP | 0.730 | 0.169 | 0.478 | 0.097 | 0.136 |
| CCHMC and CHCO | 0.670 | 0.185 | 0.574 | 0.191 | 0.207 |
| CHOP and CHCO | 0.724 | 0.172 | 0.424 | 0.113 | 0.421 |
| Two hospitals | 0.708 | 0.175 | | 0.153 | 0.298 |
The first column lists the hospital(s) used to optimize the support vector machine. The second and third columns list the 20-fold cross-validated average F1 and corresponding SDs of the training samples, respectively. The fourth and fifth columns list the average F1 and corresponding SDs for the remaining hospital(s). The last column shows the p value significance of the result compared to the largest class baseline F1 ≈ 0.333. Systematic improvement when two hospitals are used is highlighted in bold, and the sample size is the same when one and two hospitals are used.
CCHMC, Cincinnati Children's Hospital Medical Center; CHCO, Children's Hospital Colorado; CHOP, Children's Hospital of Philadelphia; GE, generalized epilepsy; PE, partial epilepsy; UE, unclassified epilepsy.
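The largest-class baselines quoted in the captions (F1=0.5 for two classes, F1 ≈ 0.333 for three) are what one would obtain from always predicting a single class when the classes are equally sized and F1 is micro-averaged. Both conditions are assumptions made for this sketch; the record above does not state the averaging scheme or how classes were balanced. The function names are hypothetical.

```python
from collections import Counter


def micro_f1(y_true, y_pred):
    """Micro-averaged F1 for single-label classification.

    Each misclassified note counts as one false positive (for the predicted
    class) and one false negative (for the true class).
    """
    if not y_true:
        return 0.0
    tp = sum(t == p for t, p in zip(y_true, y_pred))
    fp = fn = len(y_true) - tp
    return 2 * tp / (2 * tp + fp + fn)


def majority_baseline_f1(y_true):
    """F1 of a classifier that always predicts the most frequent label."""
    majority = Counter(y_true).most_common(1)[0][0]
    return micro_f1(y_true, [majority] * len(y_true))


# Assumed balanced label sets (100 notes per class is an arbitrary choice).
two_class = ["PE"] * 100 + ["GE"] * 100
three_class = ["PE"] * 100 + ["GE"] * 100 + ["UE"] * 100

print(majority_baseline_f1(two_class))    # 0.5
print(majority_baseline_f1(three_class))  # 1/3
```

Under these assumptions the baseline is simply 1/k for k balanced classes, which is why adding the unclassified epilepsy class lowers it from 0.5 to about 0.333.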