Nobuyoshi Takeshita, Masaaki Ito. National Cancer Center Hospital East
INTRODUCTION: Recently, the spread of laparoscopic surgery as a standard treatment and the development of information and communication technology have yielded abundant video data of laparoscopic procedures. These data have been accumulated and can be accessed anytime, anywhere. However, how best to use these abundant video data is still unclear. Conventionally, surgical procedures have been performed based on the surgeon's subjective decisions and skills, so-called "tacit knowledge". For the objective analysis of laparoscopic procedures in video data, automatic recognition of surgical tools and understanding of surgical workflow are the first critical steps. We used a convolutional neural network (CNN), the current trend in machine learning and computer vision tasks.
METHODS: Using the video database of laparoscopic sigmoid colectomy at our institute, we annotated tools and phases in every frame of the operative videos. For tool detection, we annotated bounding boxes for both the left and right tools in the videos. Phase annotation was performed by watching the videos in consultation with laparoscopic surgeons. Laparoscopic sigmoid colectomy passes through 10 phases: 1-Placement of ports and preparation, 2-Dissection of retrorectal space, 3-Medial approach to IMA, 4-Isolation and division of IMA, 5-Medial-to-lateral retromesenteric dissection, 6-Lateral mobilization of left colon, 7-Rectosigmoid mobilization, 8-Division of mesorectum, 9-Rectosigmoid resection and anastomosis, 10-Finishing. We used a CNN architecture to perform surgical tool detection and workflow recognition.
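The abstract does not describe the authors' annotation file format; as a minimal sketch, a frame-level record for this kind of dataset might pair a phase label with optional left/right tool bounding boxes. The field names and box convention below are assumptions for illustration, not the authors' actual schema.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical frame-level annotation record: each frame carries one
# surgical-phase label (1 = port placement ... 10 = finishing) and
# optional bounding boxes for the tools in the left and right hands.
@dataclass
class FrameAnnotation:
    frame_index: int
    phase: int
    left_tool: Optional[str] = None                          # e.g. "grasper" (assumed label)
    left_box: Optional[Tuple[int, int, int, int]] = None     # (x, y, w, h) in pixels (assumed)
    right_tool: Optional[str] = None
    right_box: Optional[Tuple[int, int, int, int]] = None

ann = FrameAnnotation(frame_index=1500, phase=3,
                      left_tool="grasper", left_box=(120, 80, 60, 40),
                      right_tool="dissector", right_box=(300, 150, 70, 45))
print(ann.phase)
```

Per-frame records like this support both tasks at once: the boxes train the tool detector, and the phase labels train the workflow classifier.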
RESULTS: In total, we labeled 8 tools used in laparoscopic sigmoid colectomy and successfully developed a tool detection system with the CNN. As for surgical workflow, the average durations of phases 1-10 were 11.3, 9.9, 8.7, 5.9, 11.5, 10.2, 8.7, 11.6, 17.8, and 2.7 minutes, respectively. A workflow recognition system using the CNN was also successfully developed, although we needed to extract pure operative scenes in advance to achieve efficient recognition.
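From the reported per-phase averages, the implied average total operative time and the longest phase follow by simple arithmetic. The snippet below only restates the numbers given above; the summary statistics are not claims from the abstract itself.

```python
# Average per-phase durations (minutes) for phases 1-10, as reported.
phase_minutes = [11.3, 9.9, 8.7, 5.9, 11.5, 10.2, 8.7, 11.6, 17.8, 2.7]

total = sum(phase_minutes)  # implied average total operative time
longest_phase = max(range(len(phase_minutes)),
                    key=lambda i: phase_minutes[i]) + 1  # 1-indexed phase number

print(f"total: {total:.1f} min")        # 98.3 min
print(f"longest phase: {longest_phase}")  # phase 9 (resection and anastomosis)
```

Phase 9 (rectosigmoid resection and anastomosis) dominates at 17.8 minutes, while phase 10 (finishing) is the shortest.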
CONCLUSION: We have developed tool detection and phase recognition systems using a CNN. Larger datasets are needed to improve detection performance for future clinical use.
Presented at the SAGES 2017 Annual Meeting in Houston, TX.
Abstract ID: 86764
Program Number: P487
Presentation Session: iPoster Session (Non CME)
Presentation Type: Poster