Daichi Kitaguchi (1,2,3), Nobuyoshi Takeshita (4,5), Hiroki Matsuzaki (2), Hiroaki Takano (2), Yohei Owada (3), Tsuyoshi Enomoto (3), Tatsuya Oda (3), Hirohisa Miura (6), Takahiro Yamanashi (6), Masahiko Watanabe (6), Daisuke Sato (7), Yusuke Sugomori (7), Seigo Hara (7), Masaaki Ito (1,2).
1. Department of Colorectal Surgery, National Cancer Center Hospital East, 6-5-1, Kashiwanoha, Kashiwa-City, Chiba, 277-8577, Japan.
2. Surgical Device Innovation Office, National Cancer Center Hospital East, 6-5-1, Kashiwanoha, Kashiwa-City, Chiba, 277-8577, Japan.
3. Department of Gastrointestinal and Hepato-Biliary-Pancreatic Surgery, Faculty of Medicine, University of Tsukuba, Tsukuba, Ibaraki, 305-8575, Japan.
4. Department of Colorectal Surgery, National Cancer Center Hospital East, 6-5-1, Kashiwanoha, Kashiwa-City, Chiba, 277-8577, Japan. ntakeshi@east.ncc.go.jp.
5. Surgical Device Innovation Office, National Cancer Center Hospital East, 6-5-1, Kashiwanoha, Kashiwa-City, Chiba, 277-8577, Japan. ntakeshi@east.ncc.go.jp.
6. Department of Surgery, Kitasato University School of Medicine, Kitasato 1-15-1, Minami-ku, Sagamihara, 252-0374, Japan.
7. MICIN, Inc, Nihon Bldg 13F, Otemachi, Chiyoda-ku, Tokyo, 100-0004, Japan.
Abstract
BACKGROUND: Automatic surgical workflow recognition is a key component in developing context-aware computer-assisted surgery (CA-CAS) systems. However, automatic surgical phase recognition focused on colorectal surgery has not been reported. We aimed to develop a deep learning model for automatic surgical phase recognition based on laparoscopic sigmoidectomy (Lap-S) videos, which could be used for real-time phase recognition, and to clarify the accuracy of automatic surgical phase and action recognition using visual information.
METHODS: The dataset contained 71 cases of Lap-S. The video data were divided into static-image frames at 1/30 s intervals. Every Lap-S video was manually divided into 11 surgical phases (Phases 0-10), and every frame was manually annotated for each surgical action. Convolutional neural network (CNN)-based deep learning was used. The model was trained on the training data and validated on a set of unseen test data.
RESULTS: The average surgical time was 175 min (± 43 min SD), and the duration of individual surgical phases also varied widely between cases. Each surgery started in the first phase (Phase 0) and ended in the last phase (Phase 10), and phase transitions occurred 14 (± 2 SD) times per procedure on average. Accuracy was 91.9% for automatic surgical phase recognition, and 89.4% and 82.5% for automatic recognition of the surgical actions of extracorporeal action and irrigation, respectively. Moreover, the system could perform real-time automatic surgical phase recognition at 32 fps.
CONCLUSIONS: The CNN-based deep learning approach enabled the recognition of surgical phases and actions in 71 Lap-S cases based on manually annotated data. This system could perform automatic surgical phase recognition and automatic target surgical action recognition with high accuracy. Moreover, this study showed the feasibility of real-time automatic surgical phase recognition at a high frame rate.
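The abstract does not state how the reported figures were computed. As a minimal sketch, assuming accuracy is measured per frame (the fraction of frames whose predicted phase matches the manual annotation) and that a phase transition is any change of label between consecutive frames, the quantities in the RESULTS could be derived from per-frame labels as follows. The function names and the toy label sequences are hypothetical and are not taken from the study.

```python
from typing import List, Sequence

NUM_PHASES = 11  # Phases 0-10, as described in the abstract
FPS = 30         # one frame every 1/30 s, as described in the abstract


def frame_accuracy(predicted: Sequence[int], annotated: Sequence[int]) -> float:
    """Fraction of frames whose predicted phase equals the annotated phase.

    Assumption: per-frame accuracy; the abstract does not define the metric.
    """
    if len(predicted) != len(annotated) or len(annotated) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")
    matches = sum(p == a for p, a in zip(predicted, annotated))
    return matches / len(annotated)


def count_transitions(labels: Sequence[int]) -> int:
    """Number of times the phase label changes between consecutive frames."""
    return sum(a != b for a, b in zip(labels, labels[1:]))


def phase_durations_s(labels: Sequence[int], fps: int = FPS) -> List[float]:
    """Total time in seconds spent in each phase, indexed by phase number."""
    counts = [0] * NUM_PHASES
    for label in labels:
        counts[label] += 1
    return [c / fps for c in counts]


# Toy example (hypothetical data): a 5-second clip that revisits Phase 0.
annotated = [0] * 30 + [1] * 60 + [0] * 30 + [10] * 30
predicted = list(annotated)
predicted[30:45] = [0] * 15  # the model lags half a second behind the annotation

accuracy = frame_accuracy(predicted, annotated)
transitions = count_transitions(annotated)
durations = phase_durations_s(annotated)
```

Because a phase can be revisited, the number of transitions can exceed the number of phases minus one, which is consistent with the reported average of 14 transitions across 11 phases.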