BACKGROUND: Although the penetration of electronic health records is increasing rapidly, much of the historical medical record is only available in handwritten notes and forms, which require labor-intensive, human chart abstraction for some clinical research. The few previous studies on automated extraction of data from these handwritten notes have focused on monolithic, custom-developed recognition systems or third-party systems that require proprietary forms.

METHODS: We present an optical character recognition processing pipeline, which leverages the capabilities of existing third-party optical character recognition engines, and provides the flexibility offered by a modular custom-developed system. The system was configured and run on a selected set of form fields extracted from a corpus of handwritten ophthalmology forms.

OBSERVATIONS: The processing pipeline allowed multiple configurations to be run, with the optimal configuration consisting of the Nuance and LEADTOOLS engines running in parallel with a positive predictive value of 94.6% and a sensitivity of 13.5%.

DISCUSSION: While limitations exist, preliminary experience from this project yielded insights on the generalizability and applicability of integrating multiple, inexpensive general-purpose third-party optical character recognition engines in a modular pipeline.
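To make the reported metrics concrete, the following is a minimal sketch of how a parallel configuration of two engines might be scored. It is not the authors' implementation. The abstract does not say how the outputs of parallel engines are combined, so the agreement rule used here (accept a value only when every engine returns the same text) is an assumption, as are the names `run_parallel` and `ppv_and_sensitivity` and the stand-in engines. Positive predictive value is taken as correct outputs divided by fields for which any value was returned, and sensitivity as correct outputs divided by all fields; these definitions are also assumptions.

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical engine interface: maps a field identifier (standing in for a
# cropped field image) to recognized text, or None if nothing was recognized.
Engine = Callable[[str], Optional[str]]


def run_parallel(engines: Sequence[Engine], field: str) -> Optional[str]:
    """Assumed combination rule: return a value only if all engines agree."""
    outputs = [engine(field) for engine in engines]
    first = outputs[0]
    if first and all(out == first for out in outputs):
        return first
    return None  # disagreement or no output: abstain


def ppv_and_sensitivity(
    predictions: Sequence[Optional[str]], truths: Sequence[str]
) -> Tuple[float, float]:
    """PPV = correct / answered; sensitivity = correct / all fields."""
    answered = sum(1 for p in predictions if p is not None)
    correct = sum(1 for p, t in zip(predictions, truths) if p is not None and p == t)
    ppv = correct / answered if answered else 0.0
    sensitivity = correct / len(truths) if truths else 0.0
    return ppv, sensitivity


# Stand-in engines backed by lookup tables (illustrative data, not study data).
engine_a_out = {"f1": "20/20", "f2": "20/40", "f3": None, "f4": "OD"}
engine_b_out = {"f1": "20/20", "f2": "20/70", "f3": "OS", "f4": "OD"}
truth = {"f1": "20/20", "f2": "20/40", "f3": "OS", "f4": "OS"}

fields = sorted(truth)
preds = [run_parallel([engine_a_out.get, engine_b_out.get], f) for f in fields]
ppv, sens = ppv_and_sensitivity(preds, [truth[f] for f in fields])
print(f"PPV={ppv:.2f} sensitivity={sens:.2f}")
```

Under an agreement rule of this kind, the pipeline abstains whenever the engines disagree, which would be consistent with a high positive predictive value paired with a low sensitivity, though the source does not confirm that this is the mechanism behind the reported figures.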