Olga V Patterson, Matthew S Freiberg, Melissa Skanderson, Samah J Fodeh, Cynthia A Brandt, Scott L DuVall.
Abstract
BACKGROUND: To investigate the mechanisms of cardiovascular disease in HIV-infected and uninfected patients, an analysis of echocardiogram reports is required for a large longitudinal multi-center study.
IMPLEMENTATION: A natural language processing system using dictionary lookup, rules, and patterns was developed to extract heart function measurements that are typically recorded in echocardiogram reports as measurement-value pairs. Curated semantic bootstrapping was used to create a custom dictionary that extends existing terminologies with terms that actually appear in the medical record. A novel disambiguation method based on semantic constraints was created to identify and discard erroneous alternative definitions of the measurement terms. The system was built on a scalable framework, so it can be used to process large datasets.
Keywords: Echocardiography; Heart function; Information extraction; Left ventricular ejection fraction; Natural language processing; Text mining
Year: 2017 PMID: 28606104 PMCID: PMC5469017 DOI: 10.1186/s12872-017-0580-8
Source DB: PubMed Journal: BMC Cardiovasc Disord ISSN: 1471-2261 Impact factor: 2.298
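The dictionary-plus-pattern approach described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' system: the dictionary entries, units, plausible-value ranges, and the `extract` function are all hypothetical, and the range check is only a simple stand-in for the idea of a semantic constraint that discards implausible readings of a measurement term.

```python
import re

# Illustrative dictionary only. Terms, units, and value ranges are assumptions,
# not the curated dictionary built by the authors.
LVEF = ("Left ventricular ejection fraction", "%", (1.0, 100.0))
DICTIONARY = {
    "left ventricular ejection fraction": LVEF,
    "lvef": LVEF,
    "ef": LVEF,
    "ivsd": ("Interventricular septum dimension at end diastole", "cm", (0.3, 3.0)),
}

# Try longer terms first so that a long form is preferred over its abbreviation.
_TERMS = sorted(DICTIONARY, key=len, reverse=True)
PATTERN = re.compile(
    r"\b(?P<term>" + "|".join(re.escape(t) for t in _TERMS) + r")\b"
    r"\s*(?:is|of|=|:)?\s*"                       # optional connector
    r"(?P<value>\d+(?:\.\d+)?)\s*(?P<unit>%|cm)?",  # number and optional unit
    re.IGNORECASE,
)


def extract(text):
    """Return (concept, value, unit) tuples for measurement-value pairs in text."""
    results = []
    for match in PATTERN.finditer(text):
        concept, unit, (low, high) = DICTIONARY[match.group("term").lower()]
        value = float(match.group("value"))
        found_unit = match.group("unit")
        # Constraint 1: an explicit unit must agree with the concept's unit.
        if found_unit and found_unit.lower() != unit:
            continue
        # Constraint 2: the value must fall in the concept's plausible range.
        if not low <= value <= high:
            continue
        results.append((concept, value, unit))
    return results


if __name__ == "__main__":
    # "EF 2010" is dropped because 2010 is not a plausible ejection fraction.
    print(extract("LVEF: 55%. IVSd 1.1 cm. EF 2010"))
```

A production system would also need negation handling, range values such as "55-60%", and unit conversion, none of which this sketch attempts.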
Fig. 1 Overall system design
Fig. 2 Examples of semi-structured text in echocardiogram reports
System performance by concept
| Mapping | TIU Mentions | TIU Precision | TIU Recall | Echo Mentions | Echo Precision | Echo Recall | Radiology Mentions | Radiology Precision | Radiology Recall |
|---|---|---|---|---|---|---|---|---|---|
| Aortic valve max pressure gradient* | 11 | 1.000 | 0.611 | 2 | 1.000 | 1.000 | 0 | . | . |
| Aortic valve mean pressure gradient* | 8 | 1.000 | 0.471 | 0 | . | . | 0 | . | . |
| Aortic valve orifice area | 18 | 0.722 | 0.867 | 2 | 0.500 | 0.500 | 2 | 0.000 | 0.000 |
| Aortic valve regurgitation | 62 | 0.984 | 0.859 | 24 | 1.000 | 0.750 | 4 | 0.750 | 0.500 |
| Aortic valve stenosis | 24 | 1.000 | 0.558 | 8 | 1.000 | 0.364 | 2 | 1.000 | 0.667 |
| E/e prime ratio* | 7 | 0.571 | 1.000 | 0 | . | . | 1 | 1.000 | 1.000 |
| Interventricular septum dimension at end diastole | 75 | 0.853 | 1.000 | 27 | 0.926 | 1.000 | 10 | 0.900 | 1.000 |
| Left atrium size at end systole | 142 | 0.923 | 0.929 | 63 | 1.000 | 0.875 | 6 | 1.000 | 1.000 |
| Left ventricular contractility* | 4 | 0.250 | 0.063 | 0 | . | . | 5 | 1.000 | 1.000 |
| Left ventricular dimension at end diastole | 59 | 0.949 | 0.862 | 12 | 1.000 | 0.706 | 7 | 0.857 | 0.857 |
| Left ventricular dimension at end systole | 43 | 0.977 | 0.894 | 11 | 1.000 | 1.000 | 0 | . | . |
| Left ventricular ejection fraction | 377 | 0.968 | 0.899 | 137 | 1.000 | 0.801 | 176 | 0.989 | 0.883 |
| Left ventricular hypertrophy | 65 | 0.908 | 1.000 | 23 | 0.913 | 1.000 | 2 | 1.000 | 0.500 |
| Left ventricular posterior wall thickness at end diastole | 78 | 0.974 | 0.844 | 33 | 0.970 | 0.842 | 7 | 0.857 | 0.750 |
| Left ventricular size | 67 | 0.985 | 0.629 | 27 | 1.000 | 0.443 | 27 | 1.000 | 0.563 |
| Mitral valve mean pressure gradient* | 9 | 1.000 | 0.750 | 0 | . | . | 0 | . | . |
| Mitral valve orifice area* | 9 | 1.000 | 1.000 | 0 | . | . | 0 | . | . |
| Mitral valve regurgitation peak velocity* | 2 | 1.000 | 0.667 | 0 | . | . | 0 | . | . |
| Mitral valve regurgitation | 118 | 0.822 | 0.764 | 47 | 0.979 | 0.754 | 5 | 1.000 | 0.500 |
| Mitral valve stenosis* | 6 | 1.000 | 0.500 | 2 | 1.000 | 0.667 | 0 | . | . |
| Pulmonary artery pressure | 42 | 0.952 | 0.588 | 15 | 1.000 | 0.517 | 0 | . | . |
| Right atrial pressure | 33 | 0.939 | 0.689 | 7 | 1.000 | 0.389 | 0 | . | . |
| Tricuspid valve mean pressure gradient* | 4 | 1.000 | 1.000 | 1 | 1.000 | 1.000 | 0 | . | . |
| Tricuspid valve regurgitation peak velocity | 18 | 0.889 | 0.696 | 5 | 0.600 | 0.500 | 0 | . | . |
| Tricuspid valve regurgitation | 85 | 0.976 | 0.830 | 54 | 1.000 | 0.857 | 1 | 1.000 | 0.333 |
| Total | 1,366 | 0.936 | 0.817 | 500 | 0.982 | 0.741 | 255 | 0.969 | 0.802 |
Concepts with total counts below 20 mentions are marked with an asterisk (*). The number of mentions detected by the system is listed in the “Mentions” column. A period (.) indicates that no value is available. TIU refers to the narrative text records in the VistA Text Integration Utilities file. Echo refers to the narrative portion of the structured record in the VistA Echocardiogram file. Radiology refers to the narrative text content in the VistA Radiology/Nuclear Medicine file.
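To compare the three document sources with a single number, precision and recall can be combined into their harmonic mean, the F1 score. The table does not report F1, so the values produced below are derived from the "Total" row only and are not results stated by the authors; the function name `f1_score` is chosen here for illustration.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs copied from the "Total" row of the table.
TOTALS = {
    "TIU": (0.936, 0.817),
    "Echo": (0.982, 0.741),
    "Radiology": (0.969, 0.802),
}

if __name__ == "__main__":
    for source, (precision, recall) in TOTALS.items():
        print(f"{source}: F1 = {f1_score(precision, recall):.3f}")
```

Because the harmonic mean is pulled toward the lower of its two inputs, a source with high precision but weaker recall, such as Echo, scores lower than its precision alone would suggest.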