Various techniques using artificial intelligence (AI) have resulted in a significant contribution to field of medical image and video-based diagnoses, such as radiology, pathology, and endoscopy, including the classification of gastrointestinal (GI) diseases. Most previous studies on the classification of GI diseases use only spatial features, which demonstrate low performance in the classification of multiple GI diseases. Although there are a few previous studies using temporal features based on a three-dimensional convolutional neural network, only a specific part of the GI tract was involved with the limited number of classes. To overcome these problems, we propose a comprehensive AI-based framework for the classification of multiple GI diseases by using endoscopic videos, which can simultaneously extract both spatial and temporal features to achieve better classification performance. Two different residual networks and a long short-term memory model are integrated in a cascaded mode to extract spatial and temporal features, respectively. Experiments were conducted on a combined dataset consisting of one of the largest endoscopic videos with 52,471 frames. The results demonstrate the effectiveness of the proposed classification framework for multi-GI diseases. The experimental results of the proposed model (97.057% area under the curve) demonstrate superior performance over the state-of-the-art methods and indicate its potential for clinical applications.
This is an open access article distributed under the Creative Commons Attribution License
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited