Thomas M Ward (1,2), Daniel A Hashimoto (3,4), Yutong Ban (3,4,5), David W Rattner (4), Haruhiro Inoue (6), Keith D Lillemoe (4), Daniela L Rus (5), Guy Rosman (3,5), Ozanan R Meireles (3,4).
1. Surgical AI and Innovation Laboratory, Massachusetts General Hospital, 15 Parkman St., WAC 460, Boston, MA, 02114, USA. tmward@mgh.harvard.edu. 2. Department of Surgery, Massachusetts General Hospital, Boston, MA, USA. tmward@mgh.harvard.edu. 3. Surgical AI and Innovation Laboratory, Massachusetts General Hospital, 15 Parkman St., WAC 460, Boston, MA, 02114, USA. 4. Department of Surgery, Massachusetts General Hospital, Boston, MA, USA. 5. Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA. 6. Digestive Disease Center, Showa University Koto Toyosu Hospital, Tokyo, Japan.
Abstract
BACKGROUND: Artificial intelligence (AI) and computer vision (CV) have revolutionized image analysis. In surgery, CV applications have focused on surgical phase identification in laparoscopic videos. We proposed to apply CV techniques to identify phases in an endoscopic procedure, peroral endoscopic myotomy (POEM). METHODS: POEM videos were collected from Massachusetts General Hospital and Showa University Koto Toyosu Hospital. Videos were labeled by surgeons with the following ground truth phases: (1) Submucosal injection, (2) Mucosotomy, (3) Submucosal tunnel, (4) Myotomy, and (5) Mucosotomy closure. The deep-learning CV model, a Convolutional Neural Network (CNN) combined with a Long Short-Term Memory (LSTM) network, was trained on 30 videos to create POEMNet. We then used POEMNet to identify operative phases in the remaining 20 videos. The model's performance was compared to surgeon-annotated ground truth. RESULTS: POEMNet's overall phase identification accuracy was 87.6% (95% CI 87.4-87.9%). When evaluated on a per-phase basis, the model performed well, with mean unweighted and prevalence-weighted F1 scores of 0.766 and 0.875, respectively. The model performed best with longer phases, with 70.6% accuracy for phases that had a duration under 5 min and 88.3% accuracy for longer phases. DISCUSSION: A deep-learning-based approach to CV, previously successful in laparoscopic video phase identification, translates well to endoscopic procedures. With continued refinements, AI could contribute to intra-operative decision-support systems and post-operative risk prediction.
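The abstract names the model class (a CNN for per-frame features feeding an LSTM for temporal context) but not its exact architecture. A minimal sketch of that CNN-plus-LSTM pattern is shown below; the class name `PhaseNet` and all layer sizes are illustrative placeholders, not POEMNet's actual design.

```python
import torch
import torch.nn as nn

class PhaseNet(nn.Module):
    """Per-frame CNN features passed through an LSTM for phase logits."""

    def __init__(self, n_phases=5, feat_dim=512, hidden=128):
        super().__init__()
        # Tiny stand-in for the per-frame feature extractor.
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(16, feat_dim),
            nn.ReLU(),
        )
        # LSTM integrates features across time; head scores the 5 phases.
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_phases)

    def forward(self, clips):
        # clips: (batch, time, channels, height, width)
        b, t = clips.shape[:2]
        feats = self.cnn(clips.flatten(0, 1)).view(b, t, -1)
        out, _ = self.lstm(feats)
        return self.head(out)  # per-frame phase logits: (batch, time, n_phases)
```

The key design point the abstract implies is that classification is sequential: each frame's prediction can depend on what came before it, which helps disambiguate visually similar moments in different phases.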
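The abstract reports two summary metrics, an unweighted (macro) mean F1 of 0.766 and a prevalence-weighted mean F1 of 0.875. The gap between them reflects that short phases, which contribute few frames, are harder to classify. A small self-contained sketch of both averages, using synthetic labels (the phase names and counts below are made up for illustration):

```python
from collections import Counter

def per_phase_f1(truth, pred, labels):
    """Per-phase F1 from paired frame-level phase annotations."""
    scores = {}
    for lab in labels:
        tp = sum(t == lab and p == lab for t, p in zip(truth, pred))
        fp = sum(t != lab and p == lab for t, p in zip(truth, pred))
        fn = sum(t == lab and p != lab for t, p in zip(truth, pred))
        scores[lab] = 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 0.0
    return scores

def mean_f1(truth, pred, labels, weighted=False):
    """Unweighted (macro) or prevalence-weighted mean of per-phase F1."""
    scores = per_phase_f1(truth, pred, labels)
    if not weighted:
        return sum(scores.values()) / len(labels)
    counts = Counter(truth)
    return sum(scores[lab] * counts[lab] / len(truth) for lab in labels)

# Toy frame labels: a long phase recognized well, a short one poorly.
truth = ["injection"] * 2 + ["tunnel"] * 20
pred = ["injection", "tunnel"] + ["tunnel"] * 20
labels = ["injection", "tunnel"]
print(round(mean_f1(truth, pred, labels), 3))                 # macro mean
print(round(mean_f1(truth, pred, labels, weighted=True), 3))  # weighted mean
```

Because the long phase dominates the frame count, the prevalence-weighted mean exceeds the macro mean, the same pattern the paper reports (0.875 vs. 0.766).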