Wei Liu, Bo Chuen Chung, Rui Wang, Jonathon Ng, Nigel Morlet.
Abstract
Despite the rapid global movement towards electronic health records, clinical letters written in unstructured natural language are still the preferred form of inter-practitioner communication about patients. These letters, when archived over a long period of time, provide invaluable longitudinal clinical details on individual patients and on patient populations. In this paper we present three unsupervised approaches to automatically extract single and multi-word medical terms without domain-specific knowledge: sequential pattern mining (PrefixSpan), the frequency- and linguistics-based C-Value, and keyphrase extraction from co-occurrence graphs (TextRank). Because each of the three approaches focuses on different aspects of the language feature space, we propose a genetic algorithm to learn the best parameters for linearly integrating the three extractors for optimal performance against domain expert annotations. Around 30,000 clinical letters sent over the past decade from ophthalmology specialists to general practitioners at an eye clinic are anonymised and used as the corpus to evaluate the effectiveness of the ensemble against the individual extractors. With minimal annotation, the ensemble achieves an average F-measure of 65.65 % when considering only complex medical terms, and an F-measure of 72.47 % if we take single word terms (i.e. unigrams) into consideration, markedly better than the three term extraction techniques when used alone.
Keywords: Clinical term extraction; Genetic algorithm; Sequence mining algorithms
Year: 2015 PMID: 26664724 PMCID: PMC4674942 DOI: 10.1186/s13755-015-0013-y
Source DB: PubMed Journal: Health Inf Sci Syst ISSN: 2047-2501
Contingency table for word pair ab
Fig. 1 TextRank example graph. The graph is created by concatenating two clinical documents. Two terms are connected if they appear in the same sentence
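The graph construction described in the Fig. 1 caption can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the candidate terms are hypothetical, and the damping factor, iteration count and function names are assumptions (0.85 is the value commonly used with PageRank-style ranking).

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence_graph(sentences):
    """Connect two terms with an undirected edge if they share a sentence."""
    graph = defaultdict(set)
    for terms in sentences:
        for a, b in combinations(sorted(set(terms)), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def textrank(graph, damping=0.85, iterations=50):
    """Iterate the PageRank-style update on an undirected, unweighted graph."""
    scores = {node: 1.0 for node in graph}
    for _ in range(iterations):
        scores = {
            node: (1 - damping)
            + damping * sum(scores[nb] / len(graph[nb]) for nb in graph[node])
            for node in graph
        }
    return scores


# Hypothetical candidate terms per sentence, not taken from the corpus.
sentences = [
    ["right eye", "cataract", "visual acuity"],
    ["left eye", "cataract"],
    ["cataract", "intraocular pressure"],
]
graph = build_cooccurrence_graph(sentences)
scores = textrank(graph)
ranked = sorted(scores, key=scores.get, reverse=True)
```

In this toy graph the term that co-occurs with every other term ends up ranked first, which is the behaviour the co-occurrence approach relies on.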
The number of terms extracted before and after filtering
| | PrefixSpan | C-Value | TextRank |
|---|---|---|---|
| Before | 383,397 | 56,264 | 55,055 |
| After | 564 | 3138 | 3025 |
| Percentage | 0.14 % | 5.58 % | 5.49 % |
Before filtering top 10
| PrefixSpan | C-Value | TextRank |
|---|---|---|
| Possibly | Right eye | Eye |
| Possibly a | Left eye | Right eye |
| F | Month time | Left eye |
| F She | Intraocular pressure | Left |
| F FR | Visual acuity | Right |
| F FR FL | Cataract surgery | Vision |
| F FR FL N | Optic disc | Review |
| F FR FL N FR | Current glass | Time |
| F FR FL N FR FL | Eye | Diagnosis |
| F Left | Ocular examination | Month |
Before filtering bottom 10
| PrefixSpan | C-Value | TextRank |
|---|---|---|
| sharp in the right | Recent left cataract | Migraine prophylaxis medication |
| Sharp in the left | Eye institute | Monthly fundus check |
| sharp in the left eye | Disease for | Upper thorax |
| Sharp in each | Bar fusion range | Persistent low grade inflammation |
| Sharp in each eye | Single vision distance | Think mrtaylor |
| Sharp in each eye with | Sclerotic lens change | Occasional mobic medication |
| Sharp in each eye with a | Nuclear sclerotic lens change | 22 mhg |
| Sharp pain | Pressure today | Another ct head |
| Tumour | Titmus stereo | Continued annual pressure check |
| Stressed | Early nuclear sclerotic lens | Good condtion |
After filtering top 10
| PrefixSpan | C-Value | TextRank |
|---|---|---|
| Eye | Intraocular pressure | Eye |
| Examination | Visual acuity | Vision |
| Glasses | Cataract surgery | Diagnosis |
| Cataract | Optic disc | Examination |
| Surgery | Eye | Visual acuity |
| Intraocular | Ocular examination | Intraocular pressure |
| Acuity | Diabetic retinopathy | Symptom |
| Cataract surgery | Fundus examination | History |
| Glaucoma | Vision | Treatment |
| Lens | Visual field examination | Cornea |
After filtering bottom 10
| PrefixSpan | C-Value | TextRank |
|---|---|---|
| Retinal photocoagulation | Bilateral yag | Pigmentary sign |
| Homonymous | Intraocular len | Ocular fundus |
| Detached retina | Simple convergence | Odd microaneurysm |
| Band | Bilateral normal pressure | Non ischemic retinal vein occlusion |
| Sphere | Eye surgery | Lens measurement |
| Refract | External eye | Topical medical treatment |
| Proliferative | Maddox rod | Blood nose symptom |
| On examination on visual | External eye disease | Primary diagnosis |
| Keratic precipitates | Ocular history | Posterior retina |
| Atypical | Dilated fundus | Posterior chamber lens implant |
Performance measures with unigram
| | PrefixSpan (%) | C-Value (%) | TextRank (%) | GA (avg) |
|---|---|---|---|---|
| Precision | 59.30 | 61.25 | | |
| Recall | 5.36 | | 75.36 | |
| F-Measure | 9.82 | | 69.12 | 72.47 |
Performance measures without unigram
| | PrefixSpan (%) | C-Value (%) | TextRank (%) | GA (avg) |
|---|---|---|---|---|
| Precision | 18.09 | 49.93 | | |
| Recall | 5.36 | | 81.98 | |
| F-Measure | 8.27 | 62.16 | | 65.65 |
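As a quick sanity check on the performance tables, the F-measure can be recomputed from precision and recall. The sketch below assumes the balanced F-measure (the harmonic mean of precision and recall); the record does not state the exact variant, but the PrefixSpan figures agree with this assumption to within rounding.

```python
def f_measure(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# PrefixSpan precision and recall as reported in the tables (in percent).
with_unigram = f_measure(59.30, 5.36)      # table reports 9.82
without_unigram = f_measure(18.09, 5.36)   # table reports 8.27
```

The very low recall dominates the harmonic mean, which is why PrefixSpan scores poorly on its own despite a reasonable precision with unigrams.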
Weights of GA-enabled ensemble (averaged over 100 runs)
| | PrefixSpan (%) | C-Value (%) | TextRank (%) |
|---|---|---|---|
| With unigrams | 92.84 | 0.28 | 6.88 |
| Without unigrams | 82.96 | 0.1 | 16.94 |
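To make the linear integration concrete, the following sketch combines per-extractor scores using the averaged "with unigrams" weights from the table. It is illustrative only: the raw scores are invented, and both the min-max normalisation and the choice to treat a term missing from an extractor as contributing zero are assumptions, since the record does not say how scores are put on a common scale.

```python
def normalise(scores):
    """Min-max scale one extractor's scores to [0, 1] (an assumed choice)."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo
    return {t: (s - lo) / span if span else 1.0 for t, s in scores.items()}


def ensemble_scores(extractor_scores, weights):
    """Weighted linear sum of normalised scores; absent terms contribute 0."""
    combined = {}
    for name, scores in extractor_scores.items():
        for term, s in normalise(scores).items():
            combined[term] = combined.get(term, 0.0) + weights[name] * s
    return combined


# Averaged weights from the table (with unigrams), expressed as fractions.
weights = {"prefixspan": 0.9284, "cvalue": 0.0028, "textrank": 0.0688}

# Hypothetical raw scores, invented for illustration only.
extractor_scores = {
    "prefixspan": {"eye": 900, "glaucoma": 300},
    "cvalue": {"intraocular pressure": 12.0, "visual acuity": 9.0, "eye": 7.0},
    "textrank": {"eye": 2.1, "vision": 1.4, "intraocular pressure": 0.9},
}
combined = ensemble_scores(extractor_scores, weights)
top_terms = sorted(combined, key=combined.get, reverse=True)
```

With weights this skewed, the combined ranking follows PrefixSpan closely, and the other two extractors mainly contribute terms that PrefixSpan misses.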
Fig. 2 GA optimisation process with unigram
Fig. 3 GA optimisation process without unigram
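The optimisation behind Figs. 2 and 3 can be sketched as a small genetic algorithm that searches for extractor weights maximising the F-measure against annotated terms. Everything specific here is an assumption made for illustration: the toy scores and gold set are invented, and the truncation selection, uniform crossover, Gaussian mutation, population size and top-k cut-off are generic choices, not the operators used in the paper.

```python
import random


def f_measure(predicted, gold):
    """Balanced F-measure between a predicted term set and a gold term set."""
    tp = len(predicted & gold)
    if tp == 0:
        return 0.0
    p, r = tp / len(predicted), tp / len(gold)
    return 2 * p * r / (p + r)


def fitness(weights, scores, gold, k):
    """F-measure of the k top-ranked terms under a linear weighting."""
    combined = {t: sum(w * s for w, s in zip(weights, vec))
                for t, vec in scores.items()}
    top_k = set(sorted(combined, key=combined.get, reverse=True)[:k])
    return f_measure(top_k, gold)


def normalise(weights):
    """Rescale non-negative weights so that they sum to 1."""
    total = sum(weights)
    if not total:
        return [1 / len(weights)] * len(weights)
    return [w / total for w in weights]


def evolve(scores, gold, k, pop_size=30, generations=40, mutation=0.2, seed=0):
    rng = random.Random(seed)
    population = [normalise([rng.random() for _ in range(3)])
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda w: fitness(w, scores, gold, k), reverse=True)
        parents = population[: pop_size // 2]  # truncation selection, elitist
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            child = [max(0.0, w + rng.gauss(0, mutation)) for w in child]
            children.append(normalise(child))
        population = parents + children
    return max(population, key=lambda w: fitness(w, scores, gold, k))


# Hypothetical normalised scores per term: (PrefixSpan, C-Value, TextRank).
scores = {
    "cataract surgery": (0.9, 0.2, 0.6),
    "visual acuity": (0.8, 0.3, 0.7),
    "optic disc": (0.7, 0.1, 0.5),
    "month time": (0.1, 0.9, 0.4),
    "current glass": (0.2, 0.8, 0.3),
    "pressure today": (0.3, 0.7, 0.2),
}
gold = {"cataract surgery", "visual acuity", "optic disc"}  # invented annotations
best = evolve(scores, gold, k=3)
best_fitness = fitness(best, scores, gold, k=3)
```

Because the fittest half of each generation is carried over unchanged, the best fitness never decreases from one generation to the next, which matches the kind of monotone improvement curve a GA optimisation plot typically shows.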
GA top 20
| With unigrams | Without unigrams |
|---|---|
| Eye | Cataract surgery |
| Examination | Intraocular pressure |
| Cataract | Visual acuity |
| Surgery | Macular degeneration |
| Cataract surgery | Contact lens |
| Glaucoma | Optic disc |
| History | Intraocular lens |
| Lens | Posterior vitreous detachment |
| Diagnosis | Anterior chamber |
| Acuity | Cataract extraction |
| Pressure | Dry eye |
| Macula | Retinal detachment |
| Cornea | Visual field |
| Pterygium | Double vision |
| Diplopia | Optic nerve |
| Contact | Posterior capsule |
| Tear | Retinal vein occlusion |
| Macular degeneration | Colour vision |
| Intraocular pressure | Fluorescein angiography |
| Angle | Meibomian gland dysfunction |
GA bottom 20
| With unigrams | Without unigrams |
|---|---|
| Bilateral defect | Conjunctival naevus |
| de | Chronic simple glaucoma |
| cl | Central visual field test |
| os | Blepharo spasm |
| Visual test | Bilateral posterior uveitis |
| Vision bilateral | Bilateral macular pattern dystrophy |
| Pupillary conjunctivitis | Bilateral iritis |
| Normal migraine | Atypical migraine |
| Macula i | Atropine occlusion |
| Jaw wink | Acute iritis |
| Inferior retinal break | Active epithelial disease |
| i o p | Macular microaneurysms |
| Diplopia n | Haptic lens |
| i | Choroidal naevi |
| uv | Senile ptosis |
| al | Lacrimal pressure |
| od | Arteritic ischaemic optic neuropathy |
| Arteritic ischaemic optic neuropathy | Choroidal neovascular |
| Inferior hemi retinal vein occlusion | Inferior hemi retinal vein occlusion |
| Eye i | Eye i |