Surgical Phase Recognition in Inguinal Hernia Repair—AI-Based Confirmatory Baseline and Exploration of Competitive Models
Abstract
1. Introduction
2. Materials and Methods
2.1. Video Acquisition, Annotation, and Processing
2.2. Competitive Model Creation
2.3. Confirmatory Baseline Study
2.4. Explorations of Advanced Models
3. Results
3.1. Dataset and Annotations
3.2. Competition Results
3.3. Pre-Processing and Evaluation Pipelines
3.3.1. Phase Merging
3.3.2. Data Diversity
3.3.3. Edge Cropping
3.4. Other Advanced Models
4. Discussion
4.1. Ground-Truth Annotations
4.2. Classroom Competition
4.3. CV/DL Methodologies
4.4. Clinical Applications
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A
ID | Phase Name | Start | End |
---|---|---|---|
1 | Adhesiolysis | First dissection of adhesion | Last cauterization/dissection of adhesion |
2 | Peritoneal scoring | First cautery against peritoneum | End of last cautery against peritoneum |
3 | Preperitoneal dissection | First dissection movement after peritoneum opened | Grasping hernia sac |
4 | Reduction of hernia | First grasp of hernia contents | Hernia is released |
5 | Mesh positioning | Mesh first grasped | Mesh placed and operator moves away |
6 | Mesh placement | First grasping mesh prior to placement over hernia site | Last movement of mesh |
7 | Positioning suture | Suture is grasped | First suture placed in peritoneum/primary repair or returned to resting site |
8 | Primary hernia repair | Stitch placed at hernia defect | Suture is cut or operator moves away, stitch after finishing the last knot |
9 | Catheter insertion | Needle penetrates peritoneum | Last movement of catheter |
10 | Peritoneal closure | Initial stitch to close peritoneum | Cutting suture |
11 | Transitory idle | End of preceding defined phase, with instrument movement | Start of subsequent defined phase |
12 | Stationary idle | End of preceding defined phase, without instrument movement | Start of subsequent defined phase |
13 | Out of body | Intracavitary space is no longer visible or when static begins | Intracavitary space is again visible |
14 | Blurry | Abnormal resolution > 50% | Resolution normalizes or camera removed |
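The interval-based phase definitions above lend themselves to a simple per-frame labeling step for supervised training. A minimal illustrative sketch (the class and function names are ours; the paper does not publish its annotation format):

```python
from dataclasses import dataclass

# Hypothetical representation of one annotated phase interval.
# Phase IDs follow the table above (1-14); times are seconds from video start.
@dataclass
class PhaseInterval:
    phase_id: int   # 1-14
    start: float    # e.g., "first dissection of adhesion"
    end: float      # e.g., "last cauterization/dissection of adhesion"

def frame_labels(intervals, fps, num_frames):
    """Expand interval annotations into one phase label per frame.
    Frames not covered by any interval are labeled 0 (unannotated)."""
    labels = [0] * num_frames
    for iv in intervals:
        first = int(iv.start * fps)
        last = min(int(iv.end * fps), num_frames - 1)
        for f in range(first, last + 1):
            labels[f] = iv.phase_id
    return labels
```

Later intervals overwrite earlier ones where they overlap, which matches the convention that each frame carries exactly one phase label.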
ID | Phase Name | Merged to |
---|---|---|
1 | Adhesiolysis | Preperitoneal dissection |
2 | Peritoneal scoring | - |
3 | Preperitoneal dissection | - |
4 | Reduction of hernia | - |
5 | Mesh positioning | Mesh placement |
6 | Mesh placement | - |
7 | Positioning suture | Mesh placement |
8 | Primary hernia repair | Reduction of hernia |
9 | Catheter insertion | Mesh placement |
10 | Peritoneal closure | - |
11 | Transitory idle | - |
12 | Stationary idle | Transitory idle |
13 | Out of body | - |
14 | Blurry | Previous * |
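The phase merging above reduces to a lookup table plus a forward-fill for the "Blurry" class, which inherits the preceding phase. A sketch under our own naming (not the authors' code):

```python
# Merges from the table above; phases absent from the dict are kept as-is.
MERGE = {
    "Adhesiolysis": "Preperitoneal dissection",
    "Mesh positioning": "Mesh placement",
    "Positioning suture": "Mesh placement",
    "Primary hernia repair": "Reduction of hernia",
    "Catheter insertion": "Mesh placement",
    "Stationary idle": "Transitory idle",
}

def merge_phases(sequence):
    """Apply the merge table to a per-frame phase sequence.
    'Blurry' frames inherit the previous (already merged) frame's label."""
    merged = []
    for phase in sequence:
        if phase == "Blurry":
            # Carry the preceding label forward; keep 'Blurry' at position 0.
            merged.append(merged[-1] if merged else phase)
        else:
            merged.append(MERGE.get(phase, phase))
    return merged
```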
References
- Meskó, B.; Görög, M. A Short Guide for Medical Professionals in the Era of Artificial Intelligence. Npj Digit. Med. 2020, 3, 126. [Google Scholar] [CrossRef] [PubMed]
- Hashimoto, D.A.; Rosman, G.; Rus, D.; Meireles, O.R. Artificial Intelligence in Surgery: Promises and Perils. Ann. Surg. 2018, 268, 70–76. [Google Scholar] [CrossRef] [PubMed]
- Ward, T.M.; Mascagni, P.; Ban, Y.; Rosman, G.; Padoy, N.; Meireles, O.; Hashimoto, D.A. Computer Vision in Surgery. Surgery 2021, 169, 1253–1256. [Google Scholar] [CrossRef] [PubMed]
- Anteby, R.; Horesh, N.; Soffer, S.; Zager, Y.; Barash, Y.; Amiel, I.; Rosin, D.; Gutman, M.; Klang, E. Deep Learning Visual Analysis in Laparoscopic Surgery: A Systematic Review and Diagnostic Test Accuracy Meta-Analysis. Surg. Endosc. 2021, 35, 1521–1533. [Google Scholar] [CrossRef]
- Hashimoto, D.A.; Rosman, G.; Witkowski, E.R.; Stafford, C.; Navarette-Welton, A.J.; Rattner, D.W.; Lillemoe, K.D.; Rus, D.L.; Meireles, O.R. Computer Vision Analysis of Intraoperative Video: Automated Recognition of Operative Steps in Laparoscopic Sleeve Gastrectomy. Ann. Surg. 2019, 270, 414–421. [Google Scholar] [CrossRef]
- Zhang, B.; Ghanem, A.; Simes, A.; Choi, H.; Yoo, A. Surgical Workflow Recognition with 3DCNN for Sleeve Gastrectomy. Int. J. Comput. Assist. Radiol. Surg. 2021, 16, 2029–2036. [Google Scholar] [CrossRef]
- Kitaguchi, D.; Takeshita, N.; Matsuzaki, H.; Takano, H.; Owada, Y.; Enomoto, T.; Oda, T.; Miura, H.; Yamanashi, T.; Watanabe, M.; et al. Real-Time Automatic Surgical Phase Recognition in Laparoscopic Sigmoidectomy Using the Convolutional Neural Network-Based Deep Learning Approach. Surg. Endosc. 2020, 34, 4924–4931. [Google Scholar] [CrossRef]
- Ward, T.M.; Hashimoto, D.A.; Ban, Y.; Rattner, D.W.; Inoue, H.; Lillemoe, K.D.; Rus, D.L.; Rosman, G.; Meireles, O.R. Automated Operative Phase Identification in Peroral Endoscopic Myotomy. Surg. Endosc. 2021, 35, 4008–4015. [Google Scholar] [CrossRef]
- Twinanda, A.P.; Shehata, S.; Mutter, D.; Marescaux, J.; de Mathelin, M.; Padoy, N. EndoNet: A Deep Architecture for Recognition Tasks on Laparoscopic Videos. IEEE Trans. Med. Imaging 2016, 36, 86–97. [Google Scholar] [CrossRef]
- Jin, Y.; Dou, Q.; Chen, H.; Yu, L.; Qin, J.; Fu, C.-W.; Heng, P.-A. SV-RCNet: Workflow Recognition from Surgical Videos Using Recurrent Convolutional Network. IEEE Trans. Med. Imaging 2018, 37, 1114–1126. [Google Scholar] [CrossRef]
- Czempiel, T.; Paschali, M.; Keicher, M.; Simson, W.; Feussner, H.; Kim, S.T.; Navab, N. TeCNO: Surgical Phase Recognition with Multi-Stage Temporal Convolutional Networks. In Proceedings of the Medical Image Computing and Computer Assisted Intervention–MICCAI 2020: 23rd International Conference, Lima, Peru, 4–8 October 2020; Volume 12263, pp. 343–352. [Google Scholar]
- Jin, Y.; Long, Y.; Chen, C.; Zhao, Z.; Dou, Q.; Heng, P.-A. Temporal Memory Relation Network for Workflow Recognition from Surgical Video. IEEE Trans. Med. Imaging 2021, 40, 1911–1923. [Google Scholar] [CrossRef] [PubMed]
- Gao, X.; Jin, Y.; Long, Y.; Dou, Q.; Heng, P.-A. Trans-SVNet: Accurate Phase Recognition from Surgical Videos via Hybrid Embedding Aggregation Transformer. In Proceedings of the Medical Image Computing and Computer Assisted Intervention–MICCAI 2021: 24th International Conference, Strasbourg, France, 27 September–1 October 2021. [Google Scholar]
- Park, M.; Oh, S.; Jeong, T.; Yu, S. Multi-Stage Temporal Convolutional Network with Moment Loss and Positional Encoding for Surgical Phase Recognition. Diagnostics 2022, 13, 107. [Google Scholar] [CrossRef] [PubMed]
- Cholectriplet 2021—Grand Challenge. Available online: https://cholectriplet2021.grand-challenge.org (accessed on 1 May 2022).
- Nwoye, C.I.; Yu, T.; Gonzalez, C.; Seeliger, B.; Mascagni, P.; Mutter, D.; Marescaux, J.; Padoy, N. Rendezvous: Attention Mechanisms for the Recognition of Surgical Action Triplets in Endoscopic Videos. Med. Image Anal. 2022, 78, 102433. [Google Scholar] [CrossRef] [PubMed]
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar]
- Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet Classification with Deep Convolutional Neural Networks. Commun. ACM 2017, 60, 84–90. [Google Scholar] [CrossRef]
- Jaegle, A.; Borgeaud, S.; Alayrac, J.-B.; Doersch, C.; Ionescu, C.; Ding, D.; Koppula, S.; Zoran, D.; Brock, A.; Shelhamer, E.; et al. Perceiver IO: A General Architecture for Structured Inputs & Outputs. arXiv 2021, arXiv:2107.14795. [Google Scholar]
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, L.; Polosukhin, I. Attention Is All You Need. Adv. Neural Inf. Process. Syst. 2017, 30. [Google Scholar]
- Liu, Z.; Ning, J.; Cao, Y.; Wei, Y.; Zhang, Z.; Lin, S.; Hu, H. Video Swin Transformer. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 19–20 June 2022; pp. 3202–3211. [Google Scholar]
- Liu, Z.; Lin, Y.; Cao, Y.; Hu, H.; Wei, Y.; Zhang, Z.; Lin, S.; Guo, B. Swin Transformer: Hierarchical Vision Transformer Using Shifted Windows. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 10–17 October 2021; pp. 10012–10022. [Google Scholar]
- Kay, W.; Carreira, J.; Simonyan, K.; Zhang, B.; Hillier, C.; Vijayanarasimhan, S.; Viola, F.; Green, T.; Back, T.; Natsev, P.; et al. The Kinetics Human Action Video Dataset. arXiv 2017, arXiv:1705.06950. [Google Scholar]
- Feichtenhofer, C. X3D: Expanding Architectures for Efficient Video Recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020. [Google Scholar]
- Selvaraju, R.R.; Cogswell, M.; Das, A.; Vedantam, R.; Parikh, D.; Batra, D. Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization. Int. J. Comput. Vis. 2020, 128, 336–359. [Google Scholar] [CrossRef]
- Chattopadhyay, A.; Sarkar, A.; Howlader, P.; Balasubramanian, V.N. Grad-CAM++: Improved Visual Explanations for Deep Convolutional Networks. In Proceedings of the 2018 IEEE Winter Conference on Applications of Computer Vision (WACV), Lake Tahoe, NV, USA, 12–15 March 2018; pp. 839–847. [Google Scholar]
- Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An Image Is Worth 16x16 Words: Transformers for Image Recognition at Scale. arXiv 2020, arXiv:2010.11929. [Google Scholar]
- Ward, T.M.; Fer, D.M.; Ban, Y.; Rosman, G.; Meireles, O.R.; Hashimoto, D.A. Challenges in Surgical Video Annotation. Comput. Assist. Surg. 2021, 26, 58–68. [Google Scholar] [CrossRef]
- Meireles, O.R.; Rosman, G.; Altieri, M.S.; Carin, L.; Hager, G.; Madani, A.; Padoy, N.; Pugh, C.M.; Sylla, P.; Ward, T.M.; et al. SAGES Consensus Recommendations on an Annotation Framework for Surgical Video. Surg. Endosc. 2021, 35, 4918–4929. [Google Scholar] [CrossRef] [PubMed]
- Kondratyuk, D.; Yuan, L.; Li, Y.; Zhang, L.; Tan, M.; Brown, M.; Gong, B. MoViNets: Mobile Video Networks for Efficient Video Recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 16020–16030. [Google Scholar]
- Bar, O.; Neimark, D.; Zohar, M.; Hager, G.D.; Girshick, R.; Fried, G.M.; Wolf, T.; Asselmann, D. Impact of Data on Generalization of AI for Surgical Intelligence Applications. Sci. Rep. 2020, 10, 22208. [Google Scholar] [CrossRef] [PubMed]
Case | Total #Videos | Train (Surgeon 01) | Train (Others) | Test (Surgeon 01) | Test (Others) | ResNet Accuracy
---|---|---|---|---|---|---
1 | 120 | 70 | - | 47 | 3 | 0.7870
2 | 186 | 136 | - | 50 | - | 0.8015
3 | 40 * | 17 | 23 | - | - | 0.6916
4 | 209 | 186 | - | - | 23 | 0.4808
5 | 209 | 173 | 15 | 15 | 6 | 0.7704
ID | Accuracy | Model | Architecture |
---|---|---|---|
1 | 0.8199 | TMRNet | CNN + LSTM + Attention |
2 | 0.8111 | TMRNet | CNN + LSTM + Attention |
3 | 0.7955 | MobileNet | CNN + Output Smoothing |
4 | 0.7951 | TeCNO | CNN 3D |
5 | 0.7948 | TMRNet | CNN + LSTM + Attention |
6 | 0.7930 | - | CNN + LSTM |
7 | 0.7917 | EfficientNet | CNN + Output Smoothing |
8 | 0.7816 | SV-RCNet | CNN + LSTM |
9 | 0.7809 | - | CNN + LSTM |
10 | 0.7659 | ConvNeXt | CNN |
11 | 0.7619 | - | CNN + LSTM |
12 | 0.1006 | X3D | CNN 3D |
Source | Accuracy | Videos |
---|---|---|
Surgeon 01 | 0.8096 | 15 |
Surgeons 02–08 | 0.7025 | 6 |
All surgeons | 0.7704 | 21 |
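Per-surgeon accuracies like those above follow from grouping frame-level predictions by the operating surgeon. An illustrative helper (hypothetical, not from the paper):

```python
def grouped_accuracy(preds, labels, groups):
    """Frame-level accuracy per group.
    preds/labels: per-frame phase IDs; groups: surgeon ID for each frame."""
    correct, total = {}, {}
    for p, y, g in zip(preds, labels, groups):
        total[g] = total.get(g, 0) + 1
        correct[g] = correct.get(g, 0) + (p == y)
    return {g: correct[g] / total[g] for g in total}
```

Note that the "All surgeons" row is frame-weighted, so it need not equal the video-count-weighted average of the subgroup rows.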
Model | Accuracy | Clip Length | Parameters (M) | Inference Time * (ms) |
---|---|---|---|---|
ResNet-50 | 0.7704 | 1 | 25.6 | 8.7 ± 0.7 |
Perceiver IO | 0.8414 | 16 | 36.3 | 47.4 ± 0.5 |
Swin-T | 0.8491 | 10 | 28.0 | 13.14 ± 5.2 |
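Inference times reported as mean ± std per clip, as in the table above, are typically obtained by timing repeated forward passes after a warmup. A generic, framework-agnostic sketch (our benchmarking harness, not the authors' code; on GPU each timed run would also need a device synchronization):

```python
import statistics
import time

def benchmark(model_fn, dummy_input, warmup=10, runs=100):
    """Estimate per-clip inference latency in milliseconds (mean, std).
    Warmup passes amortize one-time costs (allocation, JIT, cache fill)
    before any measurement is taken."""
    for _ in range(warmup):
        model_fn(dummy_input)
    times_ms = []
    for _ in range(runs):
        t0 = time.perf_counter()
        model_fn(dummy_input)
        times_ms.append((time.perf_counter() - t0) * 1000.0)
    return statistics.mean(times_ms), statistics.stdev(times_ms)
```

For example, `benchmark(model, clip)` with a 16-frame clip tensor would yield numbers comparable to the table's Inference Time column.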
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Zang, C.; Turkcan, M.K.; Narasimhan, S.; Cao, Y.; Yarali, K.; Xiang, Z.; Szot, S.; Ahmad, F.; Choksi, S.; Bitner, D.P.; et al. Surgical Phase Recognition in Inguinal Hernia Repair—AI-Based Confirmatory Baseline and Exploration of Competitive Models. Bioengineering 2023, 10, 654. https://doi.org/10.3390/bioengineering10060654