Yuelin Li1,2,3, Bruce Rapkin4, Thomas M Atkinson5, Elizabeth Schofield5, Bernard H Bochner6.
1. Department of Psychiatry & Behavioral Sciences, Memorial Sloan Kettering Cancer Center, New York, NY, USA. liy12@mskcc.org.
2. Department of Epidemiology & Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA. liy12@mskcc.org.
3. 641 Lexington Avenue, 7th Floor, New York, NY 10022, USA. liy12@mskcc.org.
4. Albert Einstein College of Medicine, Montefiore Health System, Bronx, NY, USA.
5. Department of Psychiatry & Behavioral Sciences, Memorial Sloan Kettering Cancer Center, New York, NY, USA.
6. Urology Service, Department of Surgery, Memorial Sloan Kettering Cancer Center, New York, NY, USA.
Abstract
PURPOSE: As Big Data methods are adopted in health care settings, particularly for assessing patient-reported outcomes, novel analytics are needed to address the challenges they raise. One such challenge is coding transcribed interview data, typically free-text entries of statements made during a face-to-face interview. Latent Dirichlet Allocation (LDA) offers statistical rigor and consistency in automating the interpretation of patients' expressed concerns and coping strategies.

METHODS: LDA was applied to interview data collected as part of a prospective, longitudinal study of quality of life (QOL) in N = 211 patients undergoing radical cystectomy and urinary diversion for bladder cancer. LDA analyzed personal goal statements to extract latent topics and themes, stratified by time point and by whether the statements concerned things patients wanted to accomplish or things they wanted to prevent. Model comparison metrics determined the number of topics to extract.

RESULTS: LDA extracted seven latent topics. Prior to surgery, patients' priorities centered on cancer surgery and recovery. Six months after surgery, these priorities were replaced by goals of regaining a sense of normalcy, resuming work, enjoying life more fully, and appreciating friends and family more. LDA model parameters showed changing priorities; e.g., immediate concerns about surgery and resuming employment decreased post-surgery and were replaced by concerns over cancer recurrence and a desire to remain healthy and strong.

CONCLUSIONS: Novel Big Data analytics such as LDA offer the possibility of summarizing personal goals without the need for conventional fixed-length measures and resource-intensive qualitative data coding.
Keywords:
Big Data analysis; Bladder cancer; Latent Dirichlet Allocation; Qualitative data; Text analysis
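The topic-extraction step described under METHODS can be illustrated with a minimal sketch. The abstract does not state which software, sampler, priors, or model comparison metric the authors used, so everything below is an assumption made for illustration: a collapsed Gibbs sampler written in plain Python, symmetric priors `alpha` and `beta`, in-sample perplexity as one common metric for comparing topic counts, and a handful of invented goal statements that are not study data. The function names `fit_lda` and `perplexity` are hypothetical.

```python
import math
import random


def fit_lda(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling. `docs` is a list of token lists."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)

    # Count tables maintained by the sampler.
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [[0] * n_words for _ in range(n_topics)]
    topic_total = [0] * n_topics

    # Start from a random topic assignment for every token.
    assign = []
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assign.append(z_d)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assign[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Conditional probability of each topic for this token.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    # Smoothed estimates: theta = topic mix per document, phi = word mix per topic.
    theta = [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in row]
        for row, doc in zip(doc_topic, docs)
    ]
    phi = [
        [(c + beta) / (topic_total[k] + n_words * beta) for c in topic_word[k]]
        for k in range(n_topics)
    ]
    return vocab, theta, phi


def perplexity(docs, vocab, theta, phi):
    """In-sample perplexity; lower values indicate a better fit to these docs."""
    word_id = {w: i for i, w in enumerate(vocab)}
    log_lik, n_tokens = 0.0, 0
    for d, doc in enumerate(docs):
        for w in doc:
            v = word_id[w]
            p = sum(theta[d][k] * phi[k][v] for k in range(len(phi)))
            log_lik += math.log(p)
            n_tokens += 1
    return math.exp(-log_lik / n_tokens)


# Invented goal statements for illustration only (NOT data from the study).
docs = [
    "recover from surgery".split(),
    "get through surgery and recover".split(),
    "return to work".split(),
    "go back to work full time".split(),
    "spend time with family".split(),
    "enjoy life with family and friends".split(),
]

vocab, theta, phi = fit_lda(docs, n_topics=2)
for k, row in enumerate(phi):
    top = sorted(range(len(vocab)), key=lambda v: -row[v])[:3]
    print(f"topic {k}:", [vocab[v] for v in top])
print("perplexity:", perplexity(docs, vocab, theta, phi))
```

To compare candidate topic counts in this sketch, one could call `fit_lda` for several values of `n_topics` and compare the resulting `perplexity` values, ideally on held-out statements rather than the fitting data. A real analysis would also need preprocessing (stop-word removal, stemming) that is omitted here.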