Raymond H Mak (1), Michael G Endres (2,3), Jin H Paik (2,4), Rinat A Sergeev (2,4), Hugo Aerts (1,5), Christopher L Williams (1), Karim R Lakhani (2,4,6), Eva C Guinan (1,2).
1. Department of Radiation Oncology, Brigham and Women's Hospital/Dana-Farber Cancer Institute/Harvard Medical School, Boston, Massachusetts.
2. Laboratory for Innovation Science at Harvard, Harvard University, Boston, Massachusetts.
3. Institute for Quantitative Social Science, Harvard University, Cambridge, Massachusetts.
4. Harvard Business School, Boston, Massachusetts.
5. Department of Radiology, Brigham and Women's Hospital, Boston, Massachusetts.
6. The National Bureau of Economic Research, Cambridge, Massachusetts.
Abstract
IMPORTANCE: Radiation therapy (RT) is a critical cancer treatment, but the existing radiation oncologist workforce does not meet growing global demand. One key physician task in RT planning is tumor segmentation for targeting, which requires substantial training and is subject to significant interobserver variation.

OBJECTIVE: To determine whether crowd innovation could be used to rapidly produce artificial intelligence (AI) solutions that replicate the accuracy of an expert radiation oncologist in segmenting lung tumors for RT targeting.

DESIGN, SETTING, AND PARTICIPANTS: We conducted a 10-week, prize-based, online, 3-phase challenge (prizes totaled $55 000). A well-curated data set, including computed tomographic (CT) scans and lung tumor segmentations generated by an expert for clinical care, was used for the contest (CT scans from 461 patients; median 157 images per scan; 77 942 images in total; 8144 images with tumor present). Contestants were provided a training set of 229 CT scans with accompanying expert contours to develop their algorithms and were given feedback on their performance throughout the contest, including from the expert clinician.

MAIN OUTCOMES AND MEASURES: The AI algorithms generated by contestants were automatically scored on an independent data set that was withheld from contestants, and performance was ranked using quantitative metrics that evaluated the overlap of each algorithm's automated segmentations with the expert's segmentations. Performance was further benchmarked against human expert interobserver and intraobserver variation.

RESULTS: A total of 564 contestants from 62 countries registered for this challenge, and 34 (6%) submitted algorithms. The automated segmentations produced by the top 5 AI algorithms, when combined using an ensemble model, had an accuracy (Dice coefficient = 0.79) that was within the benchmark of mean interobserver variation measured among 6 human experts. For phase 1, the top 7 algorithms had average custom segmentation scores (S scores) on the holdout data set ranging from 0.15 to 0.38 and suboptimal performance using relative measures of error. The average S scores for phase 2 increased to between 0.53 and 0.57, with a similar improvement in other performance metrics. In phase 3, performance of the top algorithm increased by an additional 9%. Combining the top 5 algorithms from phase 2 and phase 3 using an ensemble model yielded an additional 9% to 12% improvement in performance, with a final S score reaching 0.68.

CONCLUSIONS AND RELEVANCE: A combined crowd innovation and AI approach rapidly produced automated algorithms that replicated the skills of a highly trained physician for a critical task in radiation therapy. These AI algorithms could improve cancer care globally by transferring the skills of expert clinicians to under-resourced health care settings.
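The abstract reports a Dice coefficient and an ensemble of the top algorithms but does not spell out either computation. The following is a minimal sketch of how an overlap score of this kind is commonly computed on binary masks. The majority-vote rule is an assumption used only to illustrate ensembling; it is not stated to be the authors' ensemble model, and the custom S score is not reproduced because the abstract does not define it. The function names and the toy 3 x 4 masks are hypothetical.

```python
from typing import List, Sequence

Mask = Sequence[Sequence[int]]  # 2D binary mask: 1 = tumor, 0 = background


def dice_coefficient(pred: Mask, truth: Mask) -> float:
    """Dice = 2|A and B| / (|A| + |B|) for two binary masks of equal shape."""
    intersection = 0
    total = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            intersection += 1 if (p and t) else 0
            total += (1 if p else 0) + (1 if t else 0)
    # Convention chosen here (an assumption): two empty masks agree perfectly.
    return 1.0 if total == 0 else 2.0 * intersection / total


def majority_vote(masks: Sequence[Mask]) -> List[List[int]]:
    """Mark a pixel as tumor when more than half of the input masks do.

    Illustrative ensembling rule only; the source does not describe its model.
    """
    n = len(masks)
    rows, cols = len(masks[0]), len(masks[0][0])
    return [
        [1 if 2 * sum(m[r][c] for m in masks) > n else 0 for c in range(cols)]
        for r in range(rows)
    ]


# Toy data (invented for illustration, not taken from the study).
expert = [[0, 1, 1, 0],
          [0, 1, 1, 0],
          [0, 0, 0, 0]]
algo_a = [[0, 1, 1, 0],
          [0, 1, 0, 0],
          [0, 0, 0, 0]]
algo_b = [[0, 1, 1, 1],
          [0, 1, 1, 0],
          [0, 0, 0, 0]]
algo_c = [[0, 0, 1, 0],
          [0, 1, 1, 0],
          [0, 1, 0, 0]]

ensemble = majority_vote([algo_a, algo_b, algo_c])
for name, mask in [("a", algo_a), ("b", algo_b), ("c", algo_c),
                   ("ensemble", ensemble)]:
    print(name, round(dice_coefficient(mask, expert), 3))
```

In this toy case each algorithm makes a different error, so the vote cancels them and the ensemble scores higher than any single input. That is the general intuition behind combining segmentations, though real gains depend on how correlated the individual errors are.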