Toward better public health reporting using existing off the shelf approaches: A comparison of alternative cancer detection approaches using plaintext medical data and non-dictionary based feature selection.

Suranga N Kasthurirathne¹, Brian E Dixon², Judy Gichoya³, Huiping Xu⁴, Yuni Xia³, Burke Mamlin⁵, Shaun J Grannis⁵.

Abstract

OBJECTIVES: Increased adoption of electronic health records has increased the availability of free text clinical data for secondary use. A variety of approaches exist for obtaining actionable information from unstructured free text data. These approaches are resource-intensive and inherently complex, and they rely on structured clinical data and dictionary-based approaches. We sought to evaluate the potential to obtain actionable information from free text pathology reports using routinely available tools and approaches that do not depend on dictionary-based approaches.
MATERIALS AND METHODS: We obtained pathology reports from a large health information exchange and evaluated the capacity to detect cancer cases from these reports using 3 non-dictionary feature selection approaches, 4 feature subset sizes, and 5 clinical decision models: simple logistic regression, naïve Bayes, k-nearest neighbor, random forest, and J48 decision tree. The performance of each decision model was evaluated using sensitivity, specificity, accuracy, positive predictive value, and area under the receiver operating characteristic (ROC) curve.
RESULTS: Decision models parameterized using automated, informed, and manual feature selection approaches yielded similar results. Furthermore, non-dictionary classification approaches identified cancer cases present in free text reports with evaluation measures approaching or exceeding 80-90% for most metrics.
CONCLUSION: Our methods are feasible and practical approaches for extracting substantial information value from free text medical data, and the results suggest that these methods can perform on par with, if not better than, existing dictionary-based approaches. Given that public health agencies are often under-resourced and lack the technical capacity for more complex methodologies, these results represent potentially significant value to the public health field.
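
The evaluation measures named in the methods (sensitivity, specificity, accuracy, positive predictive value, and area under the ROC curve) can be illustrated with a minimal sketch. This is not the authors' code or data: the function names, the toy labels, and the toy scores below are all hypothetical, and the sketch assumes binary labels (1 = cancer case, 0 = non-case) with at least one example of each class and at least one positive prediction.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels where 1 = cancer case."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def evaluation_measures(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute the four threshold-based measures from the confusion counts."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "sensitivity": tp / (tp + fn),          # true cases that were detected
        "specificity": tn / (tn + fp),          # non-cases correctly ruled out
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "ppv": tp / (tp + fp),                  # flagged reports that are true cases
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive scores higher than a randomly chosen negative
    (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 4 cancer reports and 6 non-cancer reports.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.1, 0.2, 0.3, 0.35, 0.6, 0.05]
# Hard predictions obtained by thresholding the scores at 0.5 (an arbitrary choice).
toy_preds = [1 if s >= 0.5 else 0 for s in toy_scores]

if __name__ == "__main__":
    print(evaluation_measures(toy_labels, toy_preds))
    print(roc_auc(toy_labels, toy_scores))
```

The pairwise AUC computation is quadratic in the number of reports, which is fine for illustration; a sorted, rank-based implementation would be preferable for large report sets.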

Entities: Disease

Keywords: Cancer; Data preprocessing; Decision models; Feature selection; Pathology; Public health reporting

Year: 2016; PMID: 26826453; DOI: 10.1016/j.jbi.2016.01.008

Source DB: PubMed; Journal: J Biomed Inform; ISSN: 1532-0464; Impact factor: 6.317

Cited

3 in total

1. Automatic ICD-10 multi-class classification of cause of death from plaintext autopsy reports through expert-driven feature selection.

Authors: Ghulam Mujtaba; Liyana Shuib; Ram Gopal Raj; Retnagowri Rajandram; Khairunisa Shaikh; Mohammed Ali Al-Garadi
Journal: PLoS One; Date: 2017-02-06; Impact factor: 3.240

2. Open Agile text mining for bioinformatics: the PubAnnotation ecosystem.

Authors: Jin-Dong Kim; Yue Wang; Toyofumi Fujiwara; Shujiro Okuda; Tiffany J Callahan; K Bretonnel Cohen
Journal: Bioinformatics; Date: 2019-11-01; Impact factor: 6.937

3. Separation of Different Blogs from Skin Disease Data using Artificial Intelligence.

Authors: Mohammed J Abdulaal; Ibrahim M Mehedi; Abdulah Jeza Aljohani; Ahmad H Milyani; Mohamed Mahmoud; Abdullah M Abusorrah; Rahtul Jannat
Journal: Comput Intell Neurosci; Date: 2022-08-23
