Daichi Kitaguchi1, Nobuyoshi Takeshita2, Hiroki Matsuzaki3, Tatsuya Oda4, Masahiko Watanabe5, Kensaku Mori6, Etsuko Kobayashi7, Masaaki Ito8. 1. Surgical Device Innovation Office, National Cancer Center Hospital East, 6-5-1 Kashiwanoha, Kashiwa, Chiba, 277-8577, Japan; Department of Colorectal Surgery, National Cancer Center Hospital East, 6-5-1 Kashiwanoha, Kashiwa, Chiba, 277-8577, Japan; Department of Gastrointestinal and Hepato-Biliary-Pancreatic Surgery, Faculty of Medicine, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki, 305-8575, Japan. 2. Surgical Device Innovation Office, National Cancer Center Hospital East, 6-5-1 Kashiwanoha, Kashiwa, Chiba, 277-8577, Japan; Department of Colorectal Surgery, National Cancer Center Hospital East, 6-5-1 Kashiwanoha, Kashiwa, Chiba, 277-8577, Japan. Electronic address: ntakeshi@east.ncc.go.jp. 3. Surgical Device Innovation Office, National Cancer Center Hospital East, 6-5-1 Kashiwanoha, Kashiwa, Chiba, 277-8577, Japan. 4. Department of Gastrointestinal and Hepato-Biliary-Pancreatic Surgery, Faculty of Medicine, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki, 305-8575, Japan. 5. Department of Surgery, Kitasato University School of Medicine, 1-15-1 Kitasato, Minami-ku, Sagamihara, Kanagawa, 252-0374, Japan. 6. Graduate School of Informatics, Nagoya University, Furo-cho, Chikusa-ku, Nagoya, Aichi, 464-8601, Japan. 7. Institute of Advanced Biomedical Engineering and Science, Tokyo Women's Medical University, 8-1 Kawada-cho, Shinjuku-ku, Tokyo, 162-8666, Japan. 8. Surgical Device Innovation Office, National Cancer Center Hospital East, 6-5-1 Kashiwanoha, Kashiwa, Chiba, 277-8577, Japan; Department of Colorectal Surgery, National Cancer Center Hospital East, 6-5-1 Kashiwanoha, Kashiwa, Chiba, 277-8577, Japan. Electronic address: maito@east.ncc.go.jp.
Abstract
BACKGROUND: Identification of laparoscopic surgical videos using artificial intelligence (AI) facilitates the automation of several currently time-consuming manual processes, including video analysis, indexing, and video-based skill assessment. This study aimed to construct a large annotated dataset of laparoscopic colorectal surgery (LCRS) videos from multiple institutions and to evaluate the accuracy of automatic recognition of surgical phases, actions, and tools by combining this dataset with AI.

MATERIALS AND METHODS: A total of 300 intraoperative videos were collected from 19 high-volume centers. The surgical workflow was classified into 9 phases and 3 actions, and the areas of 5 tools were annotated by painting. More than 82 million frames were annotated for the phase and action classification tasks, and 4000 frames were annotated for the tool segmentation task. Of these frames, 80% were used for the training dataset and 20% for the test dataset. A convolutional neural network (CNN) was used to analyze the videos, and intersection over union (IoU) was used as the evaluation metric for tool recognition.

RESULTS: The overall accuracies for automatic surgical phase and action classification were 81.0% and 83.2%, respectively. The mean IoU for automatic segmentation of the 5 tools was 51.2%.

CONCLUSIONS: A large annotated dataset of LCRS videos was constructed, and phases, actions, and tools were recognized with high accuracy using AI. The dataset has potential uses in medical applications such as automatic video indexing and surgical skill assessment. Making the dataset openly available to the computer vision community will support open research and further improvement of CNN models.
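The abstract does not include the authors' evaluation code; as an illustration only, a minimal sketch of how per-frame IoU for a binary tool mask and its mean over a test set could be computed is shown below. The function names, the use of NumPy, and the handling of frames where the tool is absent are assumptions for this sketch, not the authors' implementation.

    import numpy as np

    def intersection_over_union(pred_mask: np.ndarray, gt_mask: np.ndarray) -> float:
        """IoU between a predicted and a ground-truth binary tool mask for one frame."""
        pred = pred_mask.astype(bool)
        gt = gt_mask.astype(bool)
        intersection = np.logical_and(pred, gt).sum()
        union = np.logical_or(pred, gt).sum()
        if union == 0:
            # Tool absent in both prediction and annotation; counted as a perfect match here (assumption).
            return 1.0
        return float(intersection / union)

    def mean_iou(pred_masks, gt_masks) -> float:
        """Mean IoU over a set of annotated test frames for one tool class."""
        return float(np.mean([intersection_over_union(p, g) for p, g in zip(pred_masks, gt_masks)]))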