Maria Paola Forte, Katherine J Kuchenbecker, PhD. Max Planck Institute for Intelligent Systems
Objective: Most robotic surgery applications of augmented reality (AR) focus on bringing pre-operative images (e.g., MRI) into the surgeon’s intra-operative workspace. We are exploring alternative uses of AR and alternative ways for the surgeon to interact with it.
In collaboration with laparoscopic surgeons, we identified five possible AR tool categories: virtual markers (that can be attached to intra-operative visual features), computational tools (e.g., measuring distances), rehearsal of procedure segments recorded from an expert, visual alarms (e.g., indicating out-of-view instruments), and viewing patient data.
We created prototypes of these tools by applying real-time computer vision and augmented reality techniques to the surgical field captured by a da Vinci stereo-endoscope.
Methods: To interface with the robot’s vision system, we used a Blackmagic Design DeckLink Quad 2 video capture and playback card, which supports keying (i.e., overlaying virtual content on the source video). We developed drivers for the card so that our system is endoscope independent.
The surgeon interacts with our AR system through voice commands and by using the robotic instruments as cursors. Voice commands are recognized by a dedicated language model. For example, the command “da Vinci mark right” places a virtual marker at the 3D location of the right instrument tip.
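The command handling can be sketched as a small dispatcher over recognized phrases. This is a minimal illustration, not the system's actual implementation: the wake phrase "da Vinci" and the "mark right" command come from the text, while the parsing logic and any other command names are assumptions.

```python
# Minimal sketch of voice-command dispatch, assuming a recognizer that
# returns the transcribed phrase as a string. Only "da Vinci mark
# right" is quoted in the text; the rest of the grammar is hypothetical.

def parse_command(transcript):
    """Map a recognized phrase to an (action, side) pair, or None.

    Commands are prefixed with the wake phrase "da vinci" so that
    ordinary operating-room speech is ignored.
    """
    words = transcript.lower().split()
    if words[:2] != ["da", "vinci"] or len(words) < 4:
        return None  # not addressed to the AR system
    action, side = words[2], words[3]
    if action == "mark" and side in ("left", "right"):
        return (action, side)
    return None

# The command quoted in the text places a virtual marker at the 3D
# location of the right instrument tip.
print(parse_command("da Vinci mark right"))  # ('mark', 'right')
print(parse_command("pass the scissors"))    # None
```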
The instrument tips are detected in rectified left-channel images using the TernausNet-16 neural network for multi-class segmentation of the instruments, followed by Harris corner detection to localize each tip. The corresponding tip position in the right channel is found with a brute-force matcher. The resulting binocular disparity, combined with the stereo camera calibration parameters, yields the 3D position of the cursor, and our software draws the marker at this position in both the left and right views.
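The disparity-to-3D step above can be sketched with a standard pinhole stereo model. This assumes rectified images, as stated in the text; the focal length, baseline, and pixel coordinates below are illustrative values, not the da Vinci endoscope’s calibration.

```python
# Sketch of recovering a 3D cursor position from binocular disparity,
# assuming rectified stereo images and a pinhole camera model.
# All numeric parameters here are illustrative placeholders.

def triangulate_tip(u_left, v, u_right, focal_px, baseline_m, cx, cy):
    """Return the (X, Y, Z) position (meters) of a point matched in
    both views.

    disparity d = u_left - u_right; depth Z = f * B / d; the lateral
    coordinates follow from back-projecting the left-image pixel.
    """
    d = u_left - u_right
    if d <= 0:
        raise ValueError("non-positive disparity: match is invalid")
    Z = focal_px * baseline_m / d
    X = (u_left - cx) * Z / focal_px
    Y = (v - cy) * Z / focal_px
    return (X, Y, Z)

# Illustrative numbers only: a 40-pixel disparity with a 5 mm baseline
# and 1000 px focal length places the tip 12.5 cm from the camera.
X, Y, Z = triangulate_tip(u_left=700, v=400, u_right=660,
                          focal_px=1000.0, baseline_m=0.005,
                          cx=640.0, cy=360.0)
print(round(Z, 4))  # 0.125
```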
To better integrate the virtual markers with the real world, we use the semi-global block matching algorithm to detect whether AR pixels are covered (e.g., by a tool); if so, they disappear from the surgeon’s view. Additional AR information is shown in 2D in front of the projection plane.
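The per-pixel occlusion test can be sketched as a depth comparison against the disparity-derived scene depth. This is a simplified illustration, assuming a dense depth map (e.g., from semi-global block matching) aligned with the left view; the depth values, marker, and small tolerance margin below are all assumptions.

```python
# Sketch of the occlusion test for AR pixels: a marker pixel is hidden
# when the real scene (e.g., a tool) lies in front of the marker's
# depth. The depth map, marker, and margin here are illustrative.

def visible_pixels(marker_pixels, marker_depth, depth_map, margin=0.002):
    """Keep only marker pixels whose scene depth is farther away than
    the marker itself (within a small tolerance); occluded pixels are
    dropped and thus disappear from the surgeon's view."""
    return [
        (u, v) for (u, v) in marker_pixels
        if depth_map.get((u, v), float("inf")) > marker_depth - margin
    ]

# A marker at 0.120 m depth; at one of its pixels the scene depth is
# 0.080 m (an instrument passing in front), so that pixel is hidden.
depth = {(10, 10): 0.121, (11, 10): 0.080}
print(visible_pixels([(10, 10), (11, 10)], 0.120, depth))  # [(10, 10)]
```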
Preliminary results: The developed system is user friendly and performs well with few constraints on the setup. The surgeon’s intra-operative experience should not be compromised because the virtual content is minimal and clearly distinguishable from the real environment. Additionally, the system’s rendering latency averages only 56 ms, and because we use keying, this latency affects only the AR elements, not the endoscope video.
The rendered AR items appear in 3D (without diplopia) when displayed by the da Vinci stereo viewer. The accuracy in measuring depths and distances is submillimeter. As shown in the sample stereo-pair image below, the computed distance between the virtual markers (blue dots at the tips of the robotic instruments) is 2.96 cm, while the ground-truth distance is 2.87 cm. The yellow ellipses were added in post-processing to highlight the AR content.
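The reported distance measurement reduces to the Euclidean distance between the two triangulated marker positions. A minimal sketch, assuming each marker stores its 3D position in meters; the coordinates below are hypothetical, not those of the figure.

```python
# Sketch of the distance computation between two virtual markers,
# assuming each marker holds its triangulated 3D position in meters.
# The two positions below are hypothetical examples.

import math

def marker_distance(p, q):
    """Euclidean distance between two 3D marker positions."""
    return math.dist(p, q)

a = (0.010, 0.000, 0.120)  # hypothetical left-tip marker
b = (0.038, 0.004, 0.118)  # hypothetical right-tip marker
print(round(marker_distance(a, b) * 100, 2), "cm")
```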
Conclusions: The developed system has the potential to improve safety and task efficiency in robot-assisted surgery. Two remaining challenges are to speed up task response times and to keep the virtual markers fixed in non-static environments. Additionally, a human-subject study with potential users is required to gather quantitative and qualitative feedback.
Presented at the SAGES 2017 Annual Meeting in Houston, TX.
Abstract ID: 98691
Program Number: ETP715
Presentation Session: Emerging Technology Poster Session (Non CME)
Presentation Type: Poster