Practical implementation of an existing smoking detection pipeline and reduced support vector machine training corpus requirements.

Richard Khor, Wai-Kuan Yip, Mathias Bressel, William Rose, Gillian Duchesne, Farshad Foroudi.

Abstract

This study aimed to reduce reliance on large training datasets in support vector machine (SVM)-based clinical text analysis by categorizing keyword features. An enhanced Mayo smoking status detection pipeline was deployed. We used a corpus of 709 annotated patient narratives. The pipeline was optimized for local data entry practice and lexicon. SVM classifier retraining used a grouped keyword approach for better efficiency. Accuracy, precision, and F-measure of the unaltered and optimized pipelines were evaluated using k-fold cross-validation. Initial accuracy of the clinical Text Analysis and Knowledge Extraction System (cTAKES) package was 0.69. Localization and keyword grouping improved system accuracy to 0.9 and 0.92, respectively. F-measures for current and past smoker classes improved from 0.43 to 0.81 and 0.71 to 0.91, respectively. Non-smoker and unknown-class F-measures were 0.96 and 0.98, respectively. Keyword grouping had no negative effect on performance, and decreased training time. Grouping keywords is a practical method to reduce training corpus size.
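The keyword-grouping idea can be illustrated with a minimal sketch. The abstract does not give the study's lexicon or feature encoding, so the group names, the keyword lists, and the binary presence encoding below are all assumptions made for illustration; this is not the cTAKES implementation. The sketch shows that two differently worded notes share one grouped feature vector while their per-keyword vectors differ, which is one reason a classifier may need fewer training examples.

```python
import re

# Hypothetical keyword groups (assumption: not the study's actual lexicon).
KEYWORD_GROUPS = {
    "SMOKE_TERM": ["smoker", "smokes", "smoking", "cigarettes", "tobacco"],
    "QUIT_TERM": ["quit", "ceased", "stopped", "ex"],
    "NEGATION_TERM": ["non", "never", "denies"],
}

# Flat keyword list, in group order, for the ungrouped representation.
ALL_KEYWORDS = [kw for kws in KEYWORD_GROUPS.values() for kw in kws]


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def ungrouped_vector(text):
    """One binary feature per individual keyword."""
    tokens = set(tokenize(text))
    return [int(kw in tokens) for kw in ALL_KEYWORDS]


def grouped_vector(text):
    """One binary feature per keyword group (set if any member appears)."""
    tokens = set(tokenize(text))
    return [
        int(any(kw in tokens for kw in kws))
        for kws in KEYWORD_GROUPS.values()
    ]


note_a = "Patient quit smoking in 2005."
note_b = "Ex-smoker, stopped cigarettes"

print(len(ungrouped_vector(note_a)), len(grouped_vector(note_a)))
print(grouped_vector(note_a) == grouped_vector(note_b))
print(ungrouped_vector(note_a) == ungrouped_vector(note_b))
```

With these hypothetical groups the feature space shrinks from 12 keyword features to 3 group features. Either vector could then be passed to an SVM; that step is omitted here because the abstract does not describe the classifier configuration.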

Keywords:  Classification/methods; Medical Records Systems, Computerized; Natural Language Processing; Smoking

Year:  2013        PMID: 23921192      PMCID: PMC3912731          DOI: 10.1136/amiajnl-2013-002090

Source DB:  PubMed          Journal:  J Am Med Inform Assoc        ISSN: 1067-5027            Impact factor:   4.497


