Review

The Diagnostic Classification of the Pathological Image Using Computer Vision

by Yasunari Matsuzaka 1,2,3,* and Ryu Yashiro 3,4
1 Department of Microbiology and Immunology, Showa University School of Medicine, Tokyo 142-8555, Japan
2 Division of Molecular and Medical Genetics, Center for Gene and Cell Therapy, The Institute of Medical Science, The University of Tokyo, Tokyo 108-8639, Japan
3 Administrative Section of Radiation Protection, National Institute of Neuroscience, National Center of Neurology and Psychiatry, Tokyo 187-8551, Japan
4 Department of Mycobacteriology, Leprosy Research Center, National Institute of Infectious Diseases, Tokyo 162-8640, Japan
* Author to whom correspondence should be addressed.
Algorithms 2025, 18(2), 96; https://doi.org/10.3390/a18020096
Submission received: 25 December 2024 / Revised: 24 January 2025 / Accepted: 24 January 2025 / Published: 8 February 2025

Abstract: Computer vision and artificial intelligence have revolutionized the field of pathological image analysis, enabling faster and more accurate diagnostic classification. Deep learning architectures such as convolutional neural networks (CNNs) have shown superior performance in tasks such as image classification, segmentation, and object detection in pathology. Computer vision has significantly improved the accuracy of disease diagnosis in healthcare. By leveraging advanced algorithms and machine learning techniques, computer vision systems can analyze medical images with high precision, often matching or even surpassing human expert performance. In pathology, deep learning models have been trained on large datasets of annotated pathology images to perform tasks such as cancer diagnosis, grading, and prognostication. While deep learning approaches show great promise in diagnostic classification, challenges remain, including issues related to model interpretability, reliability, and generalization across diverse patient populations and imaging settings.

1. Introduction

Deep learning models, particularly convolutional neural networks (CNNs), have shown remarkable performance in classifying pathological images, and deep learning-based approaches hold significant promise for the diagnostic classification of pathological images, particularly whole-slide images (WSIs) [1,2,3,4]. Pre-trained CNN architectures such as DenseNet-161 and ResNet-50 have achieved classification accuracies of over 97% on histopathology image datasets. These models can automatically extract relevant features from WSIs to classify them into different disease categories. CNNs have emerged as the primary deep learning architecture for pathological image classification tasks, and several CNN models have demonstrated impressive performance. The Inception-v3 architecture has been used to classify epithelial tumors of the stomach and colon into adenocarcinoma, adenoma, and non-neoplastic categories with high accuracy. Models such as ResNet50, ResNeXt50, EfficientNet, and DenseNet121 have been applied to diagnose different types of thyroid tumors [5]. GoogleNet and AlexNet have also shown good classification accuracy for pathological images. Deep learning models have been successfully applied to various cancer types. In prostate cancer, artificial intelligence (AI)-based models have been developed to classify prostate Magnetic Resonance Imaging (MRI) images, potentially reducing the need for invasive biopsies [6]. In melanoma, a CNN outperformed 11 histopathologists in classifying histopathological melanoma images, showing promise in assisting human melanoma diagnosis [7]. In thyroid tumors, deep neural networks have shown value in pathological classification [5].
WSIs present unique challenges due to their extremely large size. To address this, researchers have developed novel approaches. (I) HipoMap: a framework that creates a WSI representation map applicable to various slide-based problems; it outperformed existing methods in lung cancer classification experiments, achieving an Area Under the Curve (AUC) of 0.96 ± 0.026 [8]. (II) Tile-based approaches: CNNs are trained on millions of tiles extracted from WSIs, with tile-level predictions aggregated into slide-level predictions using strategies like max pooling or recurrent neural networks (RNNs) [9], as sketched below.
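To make the tile-based strategy concrete, below is a minimal PyTorch sketch (an assumed framework; the cited work does not prescribe one) in which a toy tile-level CNN scores each tile and slide-level scores are obtained by max pooling over tile probabilities. All layer sizes, names, and data are illustrative.

```python
import torch
import torch.nn as nn

class TileClassifier(nn.Module):
    """Toy tile-level CNN; a real pipeline would use a deep backbone
    such as ResNet-50."""
    def __init__(self, num_classes: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(32, num_classes)

    def forward(self, tiles: torch.Tensor) -> torch.Tensor:
        return self.head(self.features(tiles).flatten(1))

def classify_slide(model: nn.Module, tiles: torch.Tensor) -> torch.Tensor:
    """Aggregate tile-level class probabilities into one slide-level
    score vector via max pooling, as described in the text."""
    with torch.no_grad():
        probs = torch.softmax(model(tiles), dim=1)  # (n_tiles, n_classes)
    return probs.max(dim=0).values                  # max over tiles

# Usage with dummy data: 64 RGB tiles of 224 x 224 pixels from one slide.
model = TileClassifier()
tiles = torch.randn(64, 3, 224, 224)
print(classify_slide(model, tiles))
```

An RNN-based aggregator would replace the max operation when the ordering or context of tiles carries information.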
Deep learning models have demonstrated the ability to simultaneously detect and classify multiple pathological findings. For example, models have been developed to classify seven different histopathological findings in rat liver WSIs, including vacuolation, bile duct hyperplasia, and single-cell necrosis [10]. In gastric and colonic epithelial tumors, models have been trained to differentiate between adenocarcinoma, adenoma, and non-neoplastic lesions [10]. Transfer learning techniques have been successfully applied to adapt pre-trained CNN models for pathology image classification tasks [11]. This approach leverages knowledge gained from large natural image datasets to improve performance on smaller pathology datasets (a minimal sketch follows this paragraph). Some advanced methods combine multiple imaging modalities or fuse different types of data. For example, integrating hematoxylin and eosin (H&E) staining results with multiphoton microscopy images has led to more accurate diagnoses of certain conditions like microinvasion in ductal carcinoma [12]. As deep learning technology continues to advance, it has the potential to become an invaluable tool in supporting pathologists and improving the accuracy and efficiency of pathological diagnoses.
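As a minimal sketch of the transfer learning approach above, assuming PyTorch/torchvision (the cited studies do not prescribe a framework), one can reuse an ImageNet-pretrained ResNet-50 and retrain only a new classification head on a small pathology dataset; the class count and learning rate are illustrative.

```python
import torch
import torch.nn as nn
from torchvision import models

num_classes = 3  # e.g., adenocarcinoma / adenoma / non-neoplastic (illustrative)

# Start from ImageNet-pretrained weights (downloaded by torchvision).
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)

# Freeze the pretrained feature extractor.
for param in model.parameters():
    param.requires_grad = False

# Replace the final fully connected layer with a new, trainable head.
model.fc = nn.Linear(model.fc.in_features, num_classes)

# Only the new head's parameters are updated during fine-tuning.
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-4)
```

Unfreezing deeper layers with a smaller learning rate is a common refinement once the new head has converged.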
There are some relevant studies on the diagnostic classification of pathological images using computer vision techniques. CNNs have shown remarkable success in pathological image analysis. A study trained CNNs on WSIs of gastric biopsy histopathology, achieving state-of-the-art results in tumor classification and segmentation [9]. Another study used CNNs to classify medical images and detect cancerous tissue in histopathological images from tissue microarrays (TMAs) [13]. VGG19, a CNN architecture, proved effective in achieving high classification accuracy by directly evaluating tumors from histopathological images [13]. Researchers utilized transfer learning to train a neural network on a small subset of retinal optical coherence tomography (OCT) images, achieving 93.4% accuracy in classifying multiple eye conditions [13]. A weakly supervised multi-label image classification framework was developed to detect multiple pathologies in X-ray images using a deep CNN (DCNN) model [13]. An interpretable deep learning model was developed for predicting microsatellite instability (MSI) status in cancer patients using H&E-stained WSIs [14]. A random forest model was also created, providing feature-level interpretability for MSI prediction. Both models highlighted the importance of color features and texture characteristics in classification [14]. PathCLIP, a model designed specifically for pathology image analysis, has shown impressive performance in zero-shot classification tasks [15]. The PathTree model introduced hierarchical pathological image classification, proposing a novel representation learning approach [16]. These studies demonstrate the significant progress in applying computer vision and machine learning techniques to pathological image classification, potentially improving diagnostic accuracy and efficiency in clinical settings. In addition, cerviLearnNet is a novel approach to cervical cancer diagnosis that combines reinforcement learning with convolutional neural networks [17]. This model, also known as RL-CancerNet, integrates EfficientNetV2 and Vision Transformers (ViTs) within a reinforcement learning (RL) framework to enhance the accuracy of cervical cancer detection. Further, a novel teacher–student network model leverages staggered distillation methods to optimize both speed and model size [18]. This design enhances the student network's learning potential and offers a reshaped, concise form upon convergence, leading to more effective knowledge transfer. In this review, we summarize the current advances, architectures, applications, and future directions of deep learning approaches for the diagnostic classification of various diseases.

2. Applications of Deep Learning Approaches for Diagnostic Classification

Deep learning approaches have shown significant promise for the diagnostic classification of a wide range of diseases, including heart disease, diabetes, cancer, Alzheimer's disease, liver disease, pneumonia, skin diseases, genetic disorders, and neurological conditions [17,18,19,20,21,22,23,24]. Some of the most frequently used deep learning algorithms for disease diagnosis include CNNs, RNNs, artificial neural networks (ANNs), and Support Vector Machines (SVMs) [25]. Among them, CNNs are particularly effective for analyzing medical imaging data like X-rays, Computed Tomography (CT) scans, and MRIs [26,27,28]. They can automatically extract relevant features from images to classify diseases [29]. RNNs are useful for analyzing sequential data like time series of vital signs or electronic health records [30], as they can capture temporal dependencies in the data. As for ANNs, standard feedforward neural networks have been applied to structured clinical data for disease prediction and classification [31]. While not strictly deep learning, SVMs are still commonly used and compared against deep learning approaches.
Deep learning algorithms have demonstrated impressive accuracy in diagnosing various diseases, often matching or exceeding the performance of human healthcare professionals [32]. For example, a study proposed using DenseNet-169 and ResNet-50 CNN architectures for diagnosing and classifying Alzheimer's disease. This approach achieved high accuracy, with some models reaching up to 98% accuracy in classification tasks [33]. Another study investigated the use of VUNO Med-DeepBrain AD, a deep learning algorithm, as a decision support tool for Alzheimer's diagnosis. This model showed an accuracy of 87.1%, slightly higher than the 84.3% accuracy achieved by medical experts using traditional diagnostic methods [34]. A systematic review highlighted that deep learning methods, such as CNNs and RNNs, have been used extensively for Alzheimer's disease classification. These methods have achieved accuracies of up to 96% for Alzheimer's classification and 84.2% for predicting the conversion from mild cognitive impairment (MCI) to Alzheimer's [35]. Despite the promising results, there are challenges associated with using deep learning for Alzheimer's diagnosis. The performance of deep learning models can be affected by the quality and variability of the neuroimaging data used for training [36]. Understanding how these models make decisions is crucial for clinical adoption [36]. Ensuring that models generalize well across different populations and imaging protocols is essential for widespread clinical use [37,38]. While deep learning models show significant promise in improving the accuracy and speed of Alzheimer's disease diagnosis, ongoing research is needed to address these challenges and enhance their clinical applicability.
In addition, deep learning models have shown high accuracy in detecting cancers such as gastrointestinal and liver cancers at early stages. A study developed a deep learning system using CNNs to classify liver tumors based on MRI. This system achieved performance comparable to experienced radiologists in classifying liver tumors into seven categories, with an AUC of 0.985 for hepatocellular carcinoma and 0.998 for metastatic tumors [39]. The use of unenhanced images combined with clinical data improved the classification accuracy, potentially reducing the need for contrast agents. Another approach used an improved U-Net model for liver tumor segmentation from CT scan images, achieving a Dice score of 0.96 for liver segmentation and 0.74 for tumor segmentation [36]. This method enhanced efficiency in diagnosing liver tumors by leveraging 3D spatial features and temporal domain information. AutoML models have been applied to predict liver metastasis in patients with gastrointestinal stromal tumors. These models utilize algorithms like the Gradient Boosting Machine (GBM) to provide clinicians with tools for individualized treatment planning, demonstrating high accuracy in predicting liver metastasis [40]. Deep learning models using CNNs have been proposed for the automatic detection of liver cancer, transferring knowledge from pre-trained global models to improve diagnostic accuracy [41]. These models are designed to handle various imaging modalities and clinical data, enhancing their applicability in clinical settings.
Deep learning models are proving to be valuable tools in the early detection and diagnosis of liver and gastrointestinal cancers, offering high accuracy and efficiency that can complement traditional diagnostic methods. In addition, deep learning models have shown impressive performance in detecting and classifying liver tumors. CNNs applied to B-mode ultrasound images achieved higher accuracy than expert radiologists in distinguishing benign from malignant focal liver lesions [42]. A CNN model for liver tumor detection and 6-class discrimination (including hepatocellular carcinoma, hemangiomas, and cysts) reached an 87% detection rate, 83.9% sensitivity, and 97.1% specificity in internal evaluation [42]. Deep learning systems integrating unenhanced MRI images and clinical data demonstrated excellent performance in classifying liver malignancies, with AUCs of 0.985 for hepatocellular carcinoma, 0.998 for metastatic tumors, and 0.963 for other primary malignancies [43]. Deep learning has also shown promise for other gastrointestinal cancers. For gastric cancer, a combined AI model incorporating handcrafted radiomics and deep learning radiomics achieved an AUC of 0.786 and an accuracy of 71.6% for diagnosing signet ring cell carcinoma [44]. In colorectal cancer, deep learning models have been developed for tasks like polyp detection and classification during colonoscopy, enhancing early detection capabilities. The application of deep learning offers several benefits. Many deep learning models achieve performance on par with or exceeding that of experienced radiologists [39]. AI models can rapidly analyze large volumes of imaging data, potentially reducing diagnostic time [40]. Deep learning models have indeed revolutionized the analysis of medical imaging data, offering significant potential to reduce diagnostic time and improve accuracy [45]. Deep learning models, particularly CNNs, excel at automatically extracting relevant features from medical images [46]. This capability eliminates the need for time-consuming manual feature engineering, allowing for rapid analysis of large datasets. Once trained, deep learning models can process and analyze images at high speed, handling large volumes of imaging data much faster than human radiologists. This high-throughput capability is especially valuable in screening applications or in situations where quick turnaround times are critical. Advanced deep learning architectures can integrate and analyze data from multiple imaging modalities simultaneously, providing a more comprehensive view of a patient's condition [47]. This ability to synthesize information from various sources can lead to more accurate and timely diagnoses. Deep learning models can rapidly screen large numbers of images to identify potential abnormalities, allowing radiologists to focus on cases requiring closer examination. This approach can significantly reduce the time needed to process large screening datasets, such as in mammography or lung cancer screening [48,49]. Deep learning models can help prioritize urgent cases by quickly analyzing incoming imaging studies, ensuring that critical conditions receive prompt attention. This triage capability can lead to faster diagnoses for time-sensitive conditions. Deep learning methods can perform comprehensive quantifications of tissue characteristics much faster than manual methods. This rapid quantitative analysis can provide clinicians with objective measurements to support their diagnostic decisions.
While deep learning models offer tremendous potential for rapid image analysis, it is important to note some challenges. Data quality and standardization are crucial for model performance [50]. Proper validation and testing are necessary to ensure reliability and generalizability. Integration with existing clinical workflows and systems can be complex. Despite these challenges, the ability of deep learning models to rapidly analyze large volumes of imaging data holds great promise for reducing diagnostic time and improving patient care in radiology and other medical imaging fields.
Deep learning approaches can indeed provide consistent and quantitative assessments of medical imaging data that reduce inter-observer variability compared to traditional visual analysis methods [51]. Deep learning models, once trained, apply the same criteria consistently across all images, eliminating the subjectivity and variability inherent in human visual assessments [52]. Deep learning models can provide precise numerical measurements and probabilities, rather than broad qualitative categories [52]. Studies have shown that deep learning approaches can achieve similar or better performance compared to expert human raters while eliminating inter-observer differences [52]. Researchers developed deep learning models to quantitatively assess histopathological markers of Alzheimer's disease, including amyloid plaques and vascular deposits. The models showed very good to excellent performance compared to human experts, with strong correlations to visual semiquantitative scores [52]. A deep neural network model was able to provide quantitative assessments of swimming training effects with over 60% accuracy, offering a more objective evaluation compared to traditional methods [53]. This represents a significant improvement in objectivity compared to conventional assessment approaches. By leveraging machine learning algorithms, the model can analyze complex patterns and relationships in training data that may not be apparent through human observation alone. The model incorporated several important elements. A deep neural network was used as the feature extractor [54]. A gradient-boosting decision tree served as the classifier [55]. Combining deep learning and gradient boosting proved especially effective [56]. In experimental comparisons, the model demonstrated more than 60% accuracy, no more than a 1.00% decrease in recognition rate with the deep backpropagation neural network (DBPNN) + gradient-boosting decision tree (GBDT) approach, a 78.5% reduction in parameters, a 54.5% reduction in floating-point operations, and a 32.1% reduction in video memory usage [57]. The experiments concluded that deep neural network models are more effective at handling high-dimensional sparse features than shallow learning approaches. They can obtain relatively accurate results more easily when dealing with complex training data. A key innovation of this model was accounting for uncertainties in the virtual training environment. By incorporating factors like power delays in the actual training operation, the model aimed to improve the objectivity and usability of the evaluation results. The researchers developed a training evaluation software module to validate the model, used simulated case data to test the model, and compared the results to an unimproved evaluation method. This systematic validation process helped verify the correctness and objectivity of the new evaluation approach. The deep neural network model represents a promising step forward in the quantitative assessment of swimming training effects [57]. By leveraging advanced machine learning techniques, it offers a more data-driven, objective evaluation compared to traditional methods, while also accounting for real-world uncertainties in the training environment. However, further research may be needed to increase accuracy and validate the model across diverse swimming disciplines and athlete populations.
Deep learning models applied to cryo-electron microscopy (cryo-EM) data were able to quantitatively predict protein dynamics properties with higher accuracy than conventional methods, demonstrating the ability to extract subtle patterns from imaging data [58]. An automated deep learning method for segmenting and quantifying coronary atherosclerosis from CT angiography showed promise for providing consistent, quantitative measurements to guide clinical decision-making [55,59,60,61,62]. While deep learning approaches offer significant advantages in terms of consistency and quantification, it is important to note that the performance depends on the quality of the training data and model design. Ongoing validation against expert assessments remains important. However, the evidence suggests deep learning can reduce inter-observer variability while providing detailed quantitative insights from medical imaging data [63,64].
AI in radiology is indeed revolutionizing diagnostic capabilities by identifying subtle imaging features that may be overlooked in traditional radiological practice [65]. This enhancement in diagnostic precision is transforming the field of medical imaging and improving patient care. AI algorithms, particularly those utilizing deep learning, excel at detecting minute patterns and anomalies in medical images that might escape human detection [64]. These systems are trained on vast datasets of radiological images, allowing them to recognize subtle indicators of various conditions with remarkable accuracy, such as early-stage cancers, brain aneurysms, and cardiac abnormalities. By identifying these subtle features, AI enables earlier and more accurate diagnoses, which can significantly impact patient outcomes and treatment success rates [66]. The integration of AI into radiology workflows has led to significant improvements in both efficiency and diagnostic accuracy [48,65,67]. AI systems can rapidly analyze large volumes of imaging data, prioritizing cases that require urgent attention [65,68]. They provide consistent and precise image analysis, reducing variability among radiologists and ensuring more uniform interpretations. AI algorithms can detect abnormalities in imaging scans with a level of detail and speed that often surpasses human capability [69,70]. This enhanced efficiency allows radiologists to focus their expertise on complex cases, ultimately improving the overall quality of patient care. The synergy between AI and advanced imaging technologies is opening new frontiers in radiological assessment [65]. AI combined with 3D imaging technology has significantly improved image clarity and detail, allowing for more nuanced interpretations of complex anatomical structures [69]. Integration with molecular imaging facilitates detailed analyses of biological processes at the cellular and molecular levels, contributing to earlier and more accurate diagnoses [46,70,71]. One of the most powerful aspects of AI in radiology is its ability to continuously learn and improve [72,73]. AI algorithms engage in iterative learning processes, constantly refining their capabilities as they are exposed to new data [74]. This ongoing improvement ensures that diagnostic tools become increasingly reliable and robust over time. As AI systems evolve, they have the potential to uncover new insights from imaging scans that might have been previously overlooked, further advancing early disease detection and treatment. The enhanced diagnostic capabilities of AI are making significant impacts across various medical specialties [75]. In hepatology and pancreatology, AI has improved the diagnosis of liver and pancreatic diseases using various imaging techniques, including ultrasound, CT, MRI, and Positron Emission Tomography (PET)/CT [76,77]. For abdominal and pelvic imaging, AI facilitates automated segmentation and registration of organs and lesions, increasing diagnostic accuracy and treatment efficacy [78,79,80,81,82,83]. In nephrology, AI applications show potential in predicting acute kidney injury before significant biochemical changes are evident, allowing for earlier interventions [82,84,85]. AI is enhancing diagnostic capabilities across the board by identifying subtle imaging features not recognized in current radiological practice [65,86]. 
This technological advancement is not only improving the accuracy and efficiency of diagnoses but also paving the way for more personalized and effective patient care. As AI continues to evolve, its integration with radiology promises to revolutionize medical imaging and diagnostic practices further.
While promising, deep learning models are best viewed as complementary tools to enhance, rather than replace, traditional diagnostic methods [87,88,89]. They can serve as a “second opinion” to support radiologists’ assessments. AI can help prioritize cases for expert review, improving workflow efficiency. Combining AI and human expertise may lead to more accurate and timely diagnoses. Thus, deep learning models are proving to be valuable assets in the early detection and diagnosis of liver and gastrointestinal cancers [47,90]. Their high accuracy, efficiency, and ability to complement traditional methods make them powerful tools for improving patient outcomes through earlier and more precise diagnoses. Further research and validation in larger patient populations are needed to integrate these technologies into routine clinical practice fully.
Deep learning can rapidly analyze large volumes of medical data and images to generate diagnostic hypotheses [91,92]. This allows for the following:
  • Faster screening and triage of patients.
  • Reduced workload for physicians and clinicians.
  • More efficient use of healthcare resources.
Deep learning excels at extracting subtle patterns from complex medical data that may not be apparent to human observers [48,91]. This enables the detection of nuanced disease indicators in medical imaging and the identification of novel biomarkers or risk factors. Deep learning models can integrate data from various sources like imaging, genomics, and electronic health records to provide more comprehensive diagnostic insights [46,93,94,95,96]. The pattern recognition capabilities of deep learning allow for earlier detection of diseases, which can significantly improve treatment outcomes and survival rates [97,98,99,100,101]. Deep learning algorithms can provide consistent diagnostic performance without fatigue, potentially reducing human errors and diagnosis variability [62,102,103]. While deep learning shows great promise, challenges remain around data quality, model interpretability, and clinical integration [90]. Nonetheless, it represents a powerful tool for enhancing disease diagnosis when used appropriately in conjunction with clinical expertise.
AI models have demonstrated high accuracy in detecting and classifying various diseases from pathology slides, including different types of cancer [104,105,106,107,108,109]. Meta-analyses have shown mean sensitivities and specificities above 90% across multiple studies and disease types. Deep learning models like U-Net have been used to segment and extract important pathological features, such as cell nuclei, elastic fibers, and tumor regions [110]. This automated feature extraction can provide quantitative data to support diagnoses. Beyond diagnosis, some AI models can predict patient prognosis based on pathological image analysis, potentially aiding in treatment planning [111,112,113].

3. How Does Computer Vision Improve the Accuracy of Disease Diagnosis

Computer vision algorithms excel at detecting subtle patterns and anomalies in medical images that may be invisible to the human eye [89]. By analyzing vast datasets of medical images, these systems can identify minute details crucial for accurate diagnosis, particularly in complex cases like cancer or neurological disorders [114,115]. One of the major advantages of computer vision is its ability to maintain consistent accuracy, regardless of external factors like fatigue or distractions that can affect human performance [92]. This leads to more reliable interpretations of medical images and helps minimize diagnostic errors. Computer vision enables earlier and more accurate detection of diseases, which is critical for improving treatment outcomes [100]. For example, algorithms can detect early-stage conditions like diabetic retinopathy, glaucoma, and age-related macular degeneration in ophthalmology [116,117,118,119,120]. This early intervention capability can significantly impact patient prognosis and treatment effectiveness.
Studies have shown that computer vision can achieve remarkable accuracy in disease detection [117,121,122,123]. For instance, a system for detecting pneumonia in chest X-rays achieved 85% accuracy [124,125,126,127,128]. In lung cancer detection, a computer vision system demonstrated 95% accuracy compared to 65% for trained physicians [129]. Japanese researchers found that computer vision could increase cerebral aneurysm detection by up to 13% in MRI scans [130,131,132]. Computer vision systems can simultaneously analyze multiple aspects of medical images, providing a more holistic view of a patient’s condition [133]. This comprehensive approach helps identify complex patterns and relationships that might be missed in manual analysis.
By using consistent criteria and algorithms, computer vision helps standardize the diagnostic process across different healthcare settings. This scalability ensures that high-quality diagnostic capabilities can be extended to underserved or remote areas, improving overall healthcare accessibility. Computer vision significantly improves diagnostic accuracy by enhancing detection capabilities, reducing human error, enabling earlier interventions, and providing a more comprehensive and standardized analysis of medical images [134,135]. As these technologies evolve, they promise to revolutionize disease diagnosis and improve patient outcomes across various medical specialties.

4. Architectures, Features, and Advantages of CNNs

CNNs are powerful deep learning models that have revolutionized computer vision and image processing tasks. CNNs comprise multiple layers that process and extract features from input data, typically images. The main components include convolutional, pooling, and fully connected layers. The convolutional layers apply filters (kernels) to detect features like edges, textures, and shapes [136]. The filters slide across the input, performing convolution operations to create feature maps. The pooling layers reduce the spatial dimensions of feature maps, retaining the most important information. Common types include max pooling and average pooling. Fully connected layers are usually found near the end of the network and connect every neuron to all neurons in the previous layer for final classification or regression tasks. CNNs also have several distinguishing characteristics: (i) Local Receptive Fields: neurons connect to small regions of the input, allowing for efficient processing of spatial data; (ii) Weight Sharing: filter parameters are shared across the entire input, reducing the number of parameters and improving efficiency; and (iii) Hierarchical Feature Learning: CNNs automatically learn features from low-level (e.g., edges) to high-level (e.g., complex shapes) without manual feature engineering [137].
CNNs are powerful artificial neural networks specifically designed for processing and analyzing visual data. While no single formula fully encapsulates CNNs, we can break down their key components and operations to understand how they work. CNNs typically consist of three main types of layers: convolutional layers, pooling layers, and fully connected layers. Each layer plays a crucial role in the network’s ability to learn and extract features from visual data. The convolutional layer is the fundamental building block of a CNN. Its operation can be described by the following formula:
F(i, j) = Σ_{p,q} x(p, q) · w(i − p, j − q)
where
  • F(i,j) is the feature map;
  • x is the input image;
  • w is the filter;
  • i, j, p, q are the indices of the pixels.
This formula represents a 2D convolution operation, which is a fundamental operation in image processing and convolutional neural networks. This operation slides the kernel over the input image, computing the dot product at each position.
Here, F(i, j) is the output of the convolution at position (i, j).
This operation is used in various applications, such as edge detection, blurring, sharpening in image processing, and feature extraction in convolutional neural networks.
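To make the operation concrete, here is a small NumPy sketch of the sliding-window computation (note that deep learning libraries actually compute cross-correlation, i.e., without flipping the kernel, and call it convolution):

```python
import numpy as np

def conv2d(x: np.ndarray, w: np.ndarray) -> np.ndarray:
    """'Valid' 2D cross-correlation: slide the kernel w over the image x
    and take the elementwise dot product at each position."""
    H, W = x.shape
    kh, kw = w.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * w)
    return out

# Example: a Sobel-like kernel that responds to vertical edges.
image = np.random.rand(8, 8)
kernel = np.array([[1, 0, -1],
                   [2, 0, -2],
                   [1, 0, -1]], dtype=float)
print(conv2d(image, kernel).shape)  # (6, 6)
```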
Pooling layers reduce the spatial dimensions of the feature maps. The most common pooling operation is max pooling, which can be represented as:
Z_k^{(l+1)}(i, j) = max_{(x,y) ∈ R(i,j)} Z_k^{(l)}(x, y)
This formula describes the max pooling operation, where
  • Z_k^{(l+1)}(i, j) is the output of the pooling layer;
  • R(i,j) represents the receptive field (kernel window) at position (i, j);
  • Z_k^{(l)}(x, y) is the input feature map.
Max pooling selects the maximum value within each kernel window, effectively downsampling the input while preserving the most prominent features. This operation is typically performed with a stride of 2 and a kernel size of 2 × 2, which reduces the spatial dimensions of the input by half.
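A short NumPy sketch of 2 × 2 max pooling with stride 2, matching the description above (purely illustrative):

```python
import numpy as np

def max_pool2d(x: np.ndarray, k: int = 2, stride: int = 2) -> np.ndarray:
    """Keep the largest activation in each k x k window; with k = stride = 2
    the spatial dimensions are halved."""
    out_h = (x.shape[0] - k) // stride + 1
    out_w = (x.shape[1] - k) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = x[i * stride:i * stride + k,
                          j * stride:j * stride + k].max()
    return out

fmap = np.arange(16, dtype=float).reshape(4, 4)
print(max_pool2d(fmap))  # [[ 5.  7.] [13. 15.]]
```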
After each convolutional layer, an activation function is applied. The most common is the Rectified Linear Unit (ReLU):
ReLU(x) = max(0,x)
The fully connected layer at the end of the network performs the final classification [138]. Its operation can be described as
y = σ (Wx + b)
where
  • y is the output vector;
  • x is the input vector;
  • b is the bias vector;
  • σ is a nonlinear activation function;
  • W is the weight matrix.
This equation represents an affine transformation (Wx + b) followed by a nonlinear activation function. The fully connected layer connects every input to every output, combining all features learned by previous layers to produce the final output of the network.
The network is trained using backpropagation and gradient descent to minimize a loss function. For a classification task, the cross-entropy loss is commonly used:
L = −Σ_i y_i log(ŷ_i)
where y_i is the true (one-hot) label and ŷ_i is the predicted probability for class i.
These formulas represent the core operations in a CNN. The power of CNNs comes from stacking multiple convolutional and pooling layers, allowing the network to learn hierarchical features from the input data. This architecture enables CNNs to excel at tasks like image classification, object detection, and other computer vision applications.
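Putting these pieces together, the following hedged PyTorch sketch stacks convolutional, ReLU, pooling, and fully connected layers and performs one backpropagation/gradient descent step on the cross-entropy loss; the layer sizes and data are illustrative, not a published architecture.

```python
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2),   # 28 -> 14
            nn.Conv2d(8, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2),   # 14 -> 7
            nn.Flatten(),
            nn.Linear(16 * 7 * 7, num_classes),      # y = Wx + b (logits)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

model = SmallCNN()
loss_fn = nn.CrossEntropyLoss()  # cross-entropy loss, as in the text
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# One training step on a dummy batch of 28 x 28 grayscale images.
images = torch.randn(32, 1, 28, 28)
labels = torch.randint(0, 10, (32,))
loss = loss_fn(model(images), labels)
optimizer.zero_grad()
loss.backward()    # backpropagation
optimizer.step()   # gradient descent update
```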
In addition, CNNs excel in various domains, particularly those involving visual data as follows:
  • Image classification and recognition;
  • Object detection;
  • Medical image analysis;
  • Facial recognition;
  • Natural language processing (adapted versions);
  • Video analysis.
Several influential CNN architectures have advanced the field, as follows:
  • AlexNet: As the winner of the 2012 ImageNet Large Scale Visual Recognition Challenge (ILSVRC), AlexNet marked a significant breakthrough in computer vision and deep learning. AlexNet is a pioneering CNN architecture that significantly impacted the field of computer vision. This CNN architecture, developed by Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton, achieved a top-5 error rate of 15.3%, outperforming the nearest competitor by roughly 10.8 percentage points. Key features of AlexNet include the following: (i) an eight-layer architecture with five convolutional layers and three fully connected layers; (ii) the use of Rectified Linear Unit (ReLU) activation functions for faster convergence; (iii) the implementation of dropout to control overfitting; and (iv) the utilization of GPUs for efficient training. AlexNet's success demonstrated the power of deep learning in handling large-scale visual recognition tasks, paving the way for further advancements in CNN architectures and applications across various fields, including medical imaging, agriculture, and autonomous driving. The output size of a convolutional layer in AlexNet is calculated using the following formula:
O = (W − K + 2P)/S + 1
where
  • O is the output size;
  • W is the input size;
  • K is the kernel (filter) size;
  • P is the padding;
  • S is the stride.
For example, the first convolutional layer of AlexNet has the following parameters:
  • Input size: 224 × 224;
  • Kernel size: 11 × 11;
  • Stride: 4;
  • Padding: 2.
Applying the formula, (224 − 11 + 2 × 2)/4 + 1 = 55, results in an output size of 55 × 55 for the first convolutional layer.
AlexNet uses the ReLU as its activation function:
f(x) = max(0, x)
This non-saturating activation function showed improved training performance compared to traditional sigmoid or tanh functions.
The max pooling layers in AlexNet use a formula similar to that of the convolutional layers for calculating output size. For instance, the first max pooling layer has the following parameters:
  • Input size: 55 × 55;
  • Kernel size: 3 × 3;
  • Stride: 2;
Resulting in an output size of 27 × 27.
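Both output sizes can be verified with a small helper implementing the formula above (illustrative):

```python
def conv_output_size(W: int, K: int, P: int, S: int) -> int:
    """O = (W - K + 2P)/S + 1, using integer (floor) division."""
    return (W - K + 2 * P) // S + 1

# First convolutional layer of AlexNet, as described above:
print(conv_output_size(W=224, K=11, P=2, S=4))  # 55
# First max pooling layer (no padding):
print(conv_output_size(W=55, K=3, P=0, S=2))    # 27
```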
The fully connected layers perform matrix multiplication. For example, the first fully connected layer computes
Y = WX + b
where
  • X is the input vector (13 × 13 × 256 = 43,264 neurons);
  • W is the weight matrix (43,264 × 4096);
  • b is the bias vector (4096);
  • Y is the output vector (4096 neurons).
The softmax function in the final layer of a neural network converts raw output scores (logits) into class probabilities. The mathematical expression for the softmax function is
softmax(x_i) = e^{x_i} / Σ_{j=1}^{K} e^{x_j}
where
  • xi is the input value for class i;
  • K is the total number of classes;
  • e is the mathematical constant (approximately 2.718).
This function ensures that the output values are in the range (0,1) and sum up to 1, making them interpretable as probabilities. The softmax function exponentiates each input value and then normalizes these values by dividing by the sum of all the exponentials, effectively creating a probability distribution over the classes.
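A small NumPy sketch of the softmax function; subtracting the maximum input first is a standard numerical-stability trick that leaves the result unchanged:

```python
import numpy as np

def softmax(x: np.ndarray) -> np.ndarray:
    """softmax(x_i) = e^{x_i} / sum_j e^{x_j}, computed stably."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.1])  # raw class scores
probs = softmax(logits)
print(probs, probs.sum())           # values in (0, 1) that sum to 1
```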
These formulas form the mathematical backbone of AlexNet, enabling it to process and classify images with remarkable accuracy.
  • VGG-16: This architecture is known for its simplicity and depth, with 16 weight layers. VGG-16 has proven effective in pathological image classification tasks. It achieves high accuracy in cataract detection, with a classification accuracy of 98.24%, sensitivity of 99.77%, and specificity of 97.83% when used as part of a larger architecture. Its deep structure allows for robust feature extraction, making it suitable for complex pathological image analysis. However, VGG-16 has limitations: (i) a large number of parameters, leading to high computational costs, (ii) potential for overfitting on smaller datasets, and (iii) limited ability to capture multi-scale features compared to more modern architectures.
  • ResNet: This architecture introduced residual connections to train very deep networks, with some versions exceeding 100 layers. ResNet's key advantage is its ability to train very deep networks effectively, mitigating the vanishing gradient problem through skip connections [139]. This allows for improved performance in complex image recognition tasks, including pathological image analysis. Its advantages include facilitating the training of extremely deep networks, effectiveness in handling complex image recognition tasks, and computational efficiency compared to some other architectures. On the other hand, its limitations are that it may not be as efficient in feature reuse as DenseNet and can require more parameters to achieve similar performance.
  • Inception (GoogLeNet): This architecture features inception modules for efficient feature extraction. Inception architectures, like Inception-v3 and Inception-v4, are designed to be computationally efficient while maintaining high accuracy [140]. Its advantages include computational efficiency, flexibility in handling features at different scales, and good performance even with limited training data. As for limitations, the complex architecture can be challenging to modify or interpret and may suffer from overfitting if not properly tuned.
  • DenseNet: This architecture features dense connections between layers, allowing for improved information flow and gradient propagation [141]. It has shown effectiveness in various medical imaging tasks, including classification and segmentation [142]. DenseNet excels in feature reuse and parameter efficiency, making it particularly effective for tasks requiring fine-grained feature analysis [140]. Its advantages include high parameter efficiency (achieving high accuracy with fewer parameters), excellent feature reuse across the network, and resilience to the vanishing gradient problem. On the other hand, its limitations are higher memory usage due to feature map concatenation, and it can be computationally intensive, especially for very deep versions.
  • U-Net: Specifically designed for biomedical image segmentation, U-Net features a contracting path to capture context and a symmetric expanding path for precise localization [142]. U-Net excels in medical image segmentation tasks, particularly in identifying structures in pathological images. When combined with attention mechanisms, such as in the Deep Attention U-Net for Cataract Diagnosis (DAUCD) model, it achieves impressive results in blood vessel segmentation for cataract detection. Advantages of U-Net include efficient use of context information, the ability to work with limited training data, and good performance in biomedical image segmentation. On the other hand, its limitations are that it may struggle with very small or highly imbalanced datasets and can be computationally intensive for large images.
  • SegNet: SegNet is a deep convolutional encoder–decoder architecture designed for semantic pixel-wise segmentation that has been applied successfully to medical image analysis tasks [142]. It has some advantages: improved boundary delineation [143], memory efficiency, end-to-end training, efficient upsampling, and constant feature map size. SegNet's architecture enhances the accuracy of object boundaries in segmented images. The network uses a smaller number of parameters compared to other architectures, making it more memory-efficient during inference. SegNet can be trained end-to-end using stochastic gradient descent, which simplifies the training process. The use of max pooling indices for upsampling in the decoder network reduces the number of parameters and improves efficiency. SegNet maintains a constant number of features per layer, which decreases the computational cost for deeper encoder–decoder pairs. On the other hand, there are some limitations, such as performance trade-offs, limited contextual information, training complexity, and dataset dependency. While SegNet is efficient, some other architectures like DeepLab-LargeFOV may achieve higher accuracy at the cost of increased computational resources. The flat architecture of SegNet may capture less contextual information compared to networks with expanding deep encoder structures. Although end-to-end training is possible, SegNet may still require careful tuning of hyperparameters and training strategies to achieve optimal performance. The performance of SegNet can vary depending on the specific dataset and segmentation task, as demonstrated by comparisons across different benchmarks. SegNet's design prioritizes efficiency and practicality, making it suitable for applications with limited computational resources while still providing competitive segmentation performance.
  • EfficientNet: This architecture builds on ideas from ResNet and MobileNet, achieving excellent image classification results with relatively few parameters [144,145]. EfficientNet has shown promise in skin disease classification: an ensemble network including EfficientNet, along with ResNet50 and MobileNetV3, has been proposed for classifying skin diseases. EfficientNet's main advantages are improved accuracy and efficiency through compound scaling and better performance with fewer parameters compared to other models. On the other hand, its limitations are that it may require careful tuning of hyperparameters and can be complex to implement and optimize.
  • Xception: This architecture is an extension of the Inception architecture that replaces Inception modules with depthwise separable convolutions, showing promise in medical image classification tasks. Xception offers several advantages: efficiency, improved accuracy, generalization, and parameter efficiency. It uses depthwise separable convolutions, reducing computational complexity and parameter count, leading to faster training and inference times. In addition, Xception's architecture enables better capture of spatial and cross-channel dependencies, resulting in state-of-the-art performance on various image classification benchmarks. Also, the model's design allows for better performance on unseen data, crucial for real-world applications. Further, Xception significantly reduces the number of parameters compared to traditional CNNs, making it more lightweight. However, Xception also has some limitations: greater memory requirements, larger training data needs, and complexity. The depthwise separable convolutions consume more memory compared to traditional convolutions, which can be challenging in resource-constrained environments. Due to its model capacity, Xception generally requires larger amounts of training data to achieve optimal performance. While more efficient than some earlier models, Xception's architecture is still complex, which can make it challenging to implement and optimize in certain scenarios [146].
  • Custom CNN architectures: Researchers have developed task-specific CNN models for medical imaging. For example, a study proposed two simplified CNN models for Alzheimer's disease classification using MRI data, achieving high accuracy with a straightforward structure [147]. Custom CNNs have been successfully applied to various pathological tasks. For instance, the multi-class deep network (MCNet) achieves high accuracy in blood cell detection and classification, with an mAP@IoU=0.50 of 95.70 on the PBC dataset and 96.76 on the Blood Cell Count and Detection (BCCD) dataset. The advantages of custom CNNs are that they can be tailored to specific pathological tasks, with flexibility in architecture design and the potential for high performance when optimized. On the other hand, their limitations are that they require significant expertise to design and optimize and may not generalize well to other tasks without modification.
These architectures offer various advantages, such as improved efficiency, better feature extraction, and specialized capabilities for specific medical imaging tasks. The choice of architecture often depends on the specific requirements of the medical image processing task at hand [148].
When comparing these architectures, it is important to consider the specific pathological task. For classification tasks (e.g., tumor pathology prediction), VGG-16 and EfficientNet have shown strong performance. For segmentation tasks (e.g., blood vessel segmentation in retinal images), U-Net and its variants like DAUCD have demonstrated excellent results. For object detection and instance segmentation (e.g., blood cell detection), custom architectures like MCNet have proven effective. For tasks with limited labeled data, semi-supervised learning (SSL) approaches have shown promise, achieving comparable performance to fully supervised models while significantly reducing the need for annotations [148]. Thus, the choice of architecture depends on the specific pathological task, available data, and computational resources. Hybrid approaches, such as combining different architectures or using ensemble methods, often yield the best results in complex pathological image analysis tasks.
CNNs offer several significant advantages that have made them a powerful and widely used tool in machine learning, particularly for image and video processing tasks. CNNs excel at automatically learning and extracting relevant features from input data, especially images. Their convolutional layers apply filters that capture spatial dependencies and patterns, eliminating the need for manual feature engineering. This allows CNNs to discover salient characteristics like edges, shapes, colors, and textures directly from raw pixel data. The architecture of CNNs provides spatial invariance, allowing them to recognize objects and patterns regardless of their position or orientation within an image [149]. This makes CNNs robust to variations in input data. CNNs build increasingly abstract and high-level representations as they process data through deeper layers [150]. This hierarchical approach enables them to capture complex features and relationships in visual data. CNNs use parameter sharing, which significantly reduces the number of parameters compared to fully connected networks. This makes CNNs more computationally efficient and helps prevent overfitting. Pre-trained CNN models can be fine-tuned for specific tasks, enabling effective transfer learning [151]. This allows CNNs to leverage knowledge gained from large datasets to perform well on smaller, specialized datasets. While CNNs are renowned for their performance in computer vision tasks, their benefits extend to other domains. CNNs have shown promise in text analysis tasks like sentiment analysis, topic categorization, and language translation [141,152]. They can capture hierarchical patterns in text data, mirroring human language comprehension. In speech recognition and sound classification, CNNs excel at processing time-series audio data and extracting relevant features. This makes them valuable for applications like voice assistants and environmental sound monitoring. CNNs are increasingly used in analyzing genetic data and predicting protein structures [153,154,155]. Their ability to process large, complex datasets is transforming personalized medicine and genomics research. By leveraging these advantages, CNNs have become a cornerstone of modern deep learning, driving advancements in various fields and enabling more sophisticated AI applications [41]. However, CNNs typically require large amounts of labeled data for training and can be computationally intensive, often necessitating specialized hardware like Graphics Processing Units (GPUs) for optimal performance [156]. As deep learning continues to evolve, CNNs remain a cornerstone in computer vision and image processing, with ongoing research focused on improving their efficiency, interpretability, and applicability to new domains [46].

5. The Datasets Referenced in Studies on the Diagnostic Classification of Pathological Images Using Computer Vision, Including Both Publicly Available and Proprietary Datasets

As for publicly available datasets, the CPIA dataset combines 103 open-source datasets, including standardized WSIs and regions of interest (ROIs), covering over 48 organs/tissues and about 100 diseases [157]. Datasets, such as The Cancer Genome Atlas (TCGA), Camelyon, and others, are commonly used in pathology research. TCGA, for instance, provides over 1.2 petabytes of cancer-related pathology slides [158]. Synthetic datasets like SNOW (Synthetic Nuclei and Annotation Wizard) are also publicly available, providing annotated synthetic pathological images for tasks like nuclei segmentation [159]. Platforms like Kaggle and HuggingFace host medical imaging datasets, which have become increasingly accessible to the public [160]. In addition, as for proprietary datasets, some foundation models are trained on proprietary datasets collected from specific institutions. For example, the Virchow model utilizes a dataset of over 1.4 million WSIs from Memorial Sloan Kettering Cancer Center. Many studies still rely on proprietary datasets due to the limited availability of public WSIs with detailed annotations [158,161]. Thus, while there is a growing trend toward using publicly available datasets, proprietary datasets remain significant in computational pathology research due to their scale and specificity.
Studies on diagnostic classification of pathological images using computer vision have employed various strategies to curate datasets and ensure their quality and representativeness. There are some dataset curation approaches, such as synthetic data generation, dataset distillation, and automated curation. Synthetic datasets, such as the SNOW dataset, have been created to address the challenges of privacy concerns and annotation burdens in pathology. These datasets are generated using standardized workflows and automated tools for image generation and annotation, ensuring data diversity and cost-effectiveness. For example, SNOW includes 20,000 image tiles with over 1.4 million annotated nuclei, enabling both supervised and semi-supervised training scenarios [159]. Techniques like Histo-DD compress large-scale datasets into condensed synthetic samples optimized for downstream tasks. These methods integrate stain normalization and model augmentation to enhance compatibility with histopathology images, which often exhibit high variability in color and texture. The synthetic samples generated are architecture-agnostic, reducing training efforts while preserving discriminative information [162]. Deep learning-based methods have been used to automatically curate large-scale datasets, sorting and labeling WSIs efficiently. For instance, a study demonstrated that low-resolution thumbnail images could classify slide types with high accuracy (AUROC > 0.99), making vast pathology archives more accessible for AI applications [163].
Also, there are some quality assurance measures, such as image quality control, representation of the target population, data augmentation, and diversity and bias mitigation. Automated quality assessment tools, such as PathProfiler, evaluate WSIs for usability at a diagnostic level by detecting common image artifacts (e.g., focus issues or staining quality). These tools ensure that only high-quality images are included in datasets, minimizing biases introduced by poor-quality slides [164]. Test datasets are curated to represent the entire target population of images encountered in real-world scenarios. This involves covering all dimensions of biological and technical variability (e.g., tissue features or staining protocols) to ensure unbiased model evaluation and prevent shortcut learning [165]. Generative adversarial networks (GANs) have been utilized for data augmentation to enhance dataset diversity, improving model performance in histopathological image classification by simulating realistic variations in pathological features [166]. Efforts are made to include diverse cases across multiple dimensions (e.g., tissue types or disease states) while addressing low-prevalence subsets to minimize spurious correlations between confounding variables and target outcomes [165]. Thus, by employing these strategies, researchers aim to create robust datasets that facilitate reliable training and evaluation of AI models for diagnostic applications in pathology.
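As a simple illustration of augmentation for histopathology patches, the following torchvision pipeline applies classical transforms; this is a hedged baseline sketch, not the GAN-based augmentation of [166], and the jitter strengths are illustrative.

```python
from torchvision import transforms

# Classical augmentation for H&E patches: tissue has no canonical
# orientation, and mild color jitter loosely mimics stain variability
# between labs and scanners.
augment = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomVerticalFlip(p=0.5),
    transforms.RandomRotation(degrees=90),
    transforms.ColorJitter(brightness=0.1, contrast=0.1,
                           saturation=0.1, hue=0.02),
    transforms.ToTensor(),
])
```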

6. A Comprehensive Comparison of CNN Models and State-of-the-Art Methods for Images in Different Modalities for Different Diseases

CNNs have become a powerful tool for analyzing medical images across various modalities and diseases. The quantitative evaluation of these models is crucial for assessing their performance and reliability in clinical settings. Common metrics used for the quantitative assessment of CNN models in medical image analysis include the Structural Similarity Index (SSIM), which measures the similarity between the original and processed images; the Peak Signal-to-Noise Ratio (PSNR), which assesses the quality of denoised or reconstructed images; Precision and Recall, which evaluate the model's detection performance; the Coefficient of Determination (R²), which measures the goodness of fit between predicted and ground truth values; and the Mean Absolute Deviation (MAD), which quantifies the error between estimated and true values while being less sensitive to outliers than the root-mean-square deviation (RMSD) [167,168]. Regarding applications in different modalities, CNNs have shown significant improvements in denoising low-dose CT images, especially at ultra-low-dose levels. These models can enhance image quality while maintaining sharpness, with SSIM improvements of approximately 10% compared to the original methods [167]. Additionally, a quantitative disease-focusing approach has been developed for Alzheimer's disease classification using MRI data. This method combines saliency maps and brain segmentations to evaluate how well CNN models focus on Alzheimer's disease-relevant brain regions [169]. Also, a deep learning model has been proposed for quantitatively assessing chest X-ray findings with varying annotations. This model provides a quantitative evaluation of the probability of findings, proportional to their obviousness [170]. Further, CNN-based pipelines have been developed to analyze cell culture images acquired by lens-free microscopy. These models can perform multi-parameter analysis of ~10,000 cells in less than 5 s, demonstrating efficiency in bio-imaging analysis [168]. When comparing different CNN architectures, cross-validation is used to assess the reproducibility of classification outputs [170]. The model outputs are compared with the consistency of findings from multiple physicians [170]. Performance is evaluated across various pre-trained and randomly initialized networks to demonstrate the methodology's independence from specific model characteristics [170].
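The metrics listed above can be computed with standard tooling; the following hedged sketch uses scikit-image and scikit-learn on dummy data (the library choice and all values are assumptions of this illustration).

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity
from sklearn.metrics import precision_score, r2_score, recall_score

reference = np.random.rand(64, 64)                      # ground-truth image
denoised = reference + 0.05 * np.random.randn(64, 64)   # noisy reconstruction

ssim = structural_similarity(reference, denoised, data_range=1.0)
psnr = peak_signal_noise_ratio(reference, denoised, data_range=1.0)

y_true = np.array([0, 1, 1, 0, 1])                      # detection labels
y_pred = np.array([0, 1, 0, 0, 1])
precision = precision_score(y_true, y_pred)
recall = recall_score(y_true, y_pred)

measured = np.array([1.0, 2.1, 2.9, 4.2])               # ground-truth values
predicted = np.array([1.1, 2.0, 3.1, 4.0])
r2 = r2_score(measured, predicted)
mad = np.mean(np.abs(predicted - measured))             # mean absolute deviation

print(f"SSIM={ssim:.3f}, PSNR={psnr:.1f} dB, precision={precision:.2f}, "
      f"recall={recall:.2f}, R2={r2:.3f}, MAD={mad:.3f}")
```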
CNN models may, however, oversmooth images under small dose-reduction conditions in CT scans [167]. Techniques like saliency maps enhance the interpretability of CNN models, especially in complex tasks such as Alzheimer’s disease classification [169]. While many approaches are developed for a specific imaging modality or disease, efforts are being made to create adaptable frameworks that can be applied to various samples, microscope modalities, and cell measurements [168]. The quantitative evaluation of CNN models for medical image analysis is thus a rapidly evolving field, with researchers continuously developing new metrics and methodologies to assess and improve model performance across imaging modalities and diseases.
State-of-the-art methods for the diagnostic classification of pathological images using computer vision have advanced significantly in recent years. These methods primarily leverage deep learning techniques, particularly CNNs and vision transformers (ViTs), to achieve high accuracy in various diagnostic tasks [106,171,172]. CNNs remain a cornerstone of pathological image analysis. Pre-trained models like DenseNet-161 and ResNet-50 have shown excellent performance in classifying digital histopathology patches extracted from WSIs, outperforming previous state-of-the-art methods across all performance metrics on a 24-category classification task [172]. Recent studies have also demonstrated the effectiveness of ViTs in pathological image analysis. Hierarchical ViT architectures have been successfully employed for various diagnostic tasks, including the identification of malignant cells, prediction of histological grades, glioma-type classification according to the 2021 WHO classification system, and molecular marker prediction [106,172].
Several advanced techniques build on these architectures, including transfer learning, weakly supervised learning, and few-shot learning. Transfer learning has proven to be a powerful technique in pathological image classification: by utilizing models pre-trained on large datasets and fine-tuning them for specific pathological tasks, researchers have achieved high accuracy even with limited labeled data [173]. Weakly supervised CNNs have shown promise in diagnosing specific conditions such as intracranial germinomas, oligodendrogliomas, and low-grade astrocytomas from H&E-stained and intraoperative frozen sections, improving the diagnostic accuracy of pathologists by up to 40% [106,172]. Further, in scenarios with limited labeled data, few-shot learning techniques have made significant progress: the best methods have achieved accuracies exceeding 70%, 80%, and 85% in 5-way 1-shot, 5-way 5-shot, and 5-way 10-shot settings, respectively [174].
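A minimal PyTorch sketch of the transfer learning strategy described above, fine-tuning an ImageNet-pre-trained ResNet-50 in feature-extraction mode for a hypothetical 24-category patch classification task, might look as follows; the class count, learning rate, and choice of optimizer are illustrative assumptions.

```python
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 24  # hypothetical number of histopathology categories

# Start from ImageNet weights, then replace the classifier head.
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)

# Feature-extraction variant: freeze the backbone, train only the new head.
for name, param in model.named_parameters():
    if not name.startswith("fc"):
        param.requires_grad = False

optimizer = torch.optim.AdamW(
    [p for p in model.parameters() if p.requires_grad], lr=1e-3
)
criterion = nn.CrossEntropyLoss()

def training_step(images, labels):
    """One supervised fine-tuning step on a batch of patches."""
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```

Unfreezing deeper layers progressively (the "progressive fine-tuning" variant) would simply relax the `requires_grad = False` condition layer by layer as training proceeds.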
Emerging approaches include hierarchical classification and multi-modal learning. Novel approaches like PathTree introduce hierarchical pathological image classification, representing multi-class disease classification as a binary tree structure and using professional pathological text descriptions to guide the aggregation of hierarchical representations [175]. Integrating textual information with image data has also shown promise in improving classification accuracy: methods like PathTree use slide-text similarity and tree-specific losses to strengthen the association between textual descriptions and slide images [175]. These state-of-the-art methods are significantly improving the accuracy and efficiency of pathological image analysis, potentially transforming clinical diagnosis and treatment decision-making in digital pathology.
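As a generic illustration of tree-structured classification (a deliberate simplification, not the actual PathTree method of [175]), the following sketch combines a coarse-level loss with a fine-grained loss computed under the ground-truth coarse branch; the two-level hierarchy, class names, and feature dimension are hypothetical.

```python
import torch
import torch.nn as nn

# Hypothetical two-level hierarchy: coarse classes, each with fine subclasses.
COARSE = ["benign", "malignant"]
FINE = {"benign": ["normal", "inflammation"],
        "malignant": ["adenoma", "adenocarcinoma"]}

class HierarchicalHead(nn.Module):
    def __init__(self, feat_dim=512):
        super().__init__()
        self.coarse = nn.Linear(feat_dim, len(COARSE))
        self.fine = nn.ModuleDict(
            {c: nn.Linear(feat_dim, len(subs)) for c, subs in FINE.items()}
        )

    def forward(self, feats, coarse_target, fine_target):
        # fine_target holds the index of the subclass *within* its coarse branch.
        ce = nn.functional.cross_entropy
        loss = ce(self.coarse(feats), coarse_target)
        for i, c in enumerate(COARSE):
            mask = coarse_target == i
            if mask.any():
                loss = loss + ce(self.fine[c](feats[mask]), fine_target[mask])
        return loss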

7. Integration into Real-World Clinical Workflows

AI tools for the diagnostic classification of pathological images using computer vision have been successfully integrated into real-world clinical workflows through advances in digital pathology, deep learning, and computational pathology. Their application in clinical workflows includes (i) digital pathology and whole-slide imaging (WSI), (ii) automated classification and prediction, (iii) enhanced pathologist support, and (iv) innovative architectures. AI-powered systems now utilize WSIs, high-resolution digitized versions of traditional glass slides, for tasks such as tumor grading, histopathological subtyping, and immunohistochemical scoring. These tools enhance diagnostic accuracy and efficiency by automating image analysis and providing objective assessments [16,176,177]. FDA-approved AI algorithms, such as those for prostate cancer diagnosis, have facilitated adoption for primary diagnosis in clinical settings [176]. AI models have demonstrated high diagnostic accuracy in classifying various pathological conditions; for example, studies have reported over 97% accuracy in classifying colon lesions into six categories using CNNs [178]. These tools also predict clinical endpoints, patient prognosis, and therapy responses by analyzing tissue morphology and biomarker expression [177]. AI systems assist pathologists by identifying subtle features that are difficult to detect manually, such as isolated tumor cells or rare histopathological patterns. They also provide content-based image retrieval (CBIR), enabling pathologists to compare cases against large databases for better decision-making [16,176]. Novel AI architectures like DeepTree mimic the diagnostic process of pathologists by incorporating prior knowledge of tissue morphology; this approach improves accuracy and transparency, fostering trust among clinicians [179]. Clinical integration also brings benefits in efficiency, standardization, and accessibility: AI reduces the time required for routine tasks like tumor grading or biomarker quantification, standardization ensures consistent scoring criteria for diseases like prostate or breast cancer, and digital pathology enables remote consultations and secondary reviews, expanding access to expert diagnoses [16,180].
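As a sketch of the CBIR component mentioned above, the following function retrieves the most similar archived cases by cosine similarity over embeddings, which are assumed to come from a pre-trained pathology encoder (e.g., the penultimate layer of a fine-tuned CNN); the encoder choice and the value of k are assumptions.

```python
import numpy as np

def cosine_retrieve(query_embedding, database_embeddings, k=5):
    """Return indices of the k database cases most similar to the query.

    query_embedding : (D,) vector; database_embeddings : (N, D) matrix."""
    q = query_embedding / np.linalg.norm(query_embedding)
    db = database_embeddings / np.linalg.norm(
        database_embeddings, axis=1, keepdims=True
    )
    similarity = db @ q                      # cosine similarity per case
    return np.argsort(-similarity)[:k]       # top-k most similar cases
```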
Despite these successes, challenges remain in scaling these technologies: (i) robust validation is needed to ensure diagnostic accuracy across diverse datasets, (ii) technical barriers such as computational costs and integration with existing hospital infrastructure must be addressed, and (iii) clinician trust must be increased by improving the interpretability and transparency of AI models [177,181]. AI tools are thus transforming pathology by automating workflows, improving diagnostic precision, and supporting personalized medicine, and their continued development is expected to further integrate them into routine clinical practice.

8. Interpretability, Regulatory Concerns, and Cost

Diagnostic classification of pathological images using computer vision and artificial intelligence (AI) faces several challenges related to interpretability, regulatory concerns, and cost. These issues are critical to address for the successful implementation and widespread adoption of AI in pathology.

8.1. Interpretability Is a Significant Challenge in AI-Based Pathological Image Analysis

Many deep learning models, particularly those using complex neural networks, are often perceived as “black boxes” that lack clear explanations for their decisions [177]. This lack of interpretability can make it difficult for pathologists to trust and validate AI-generated results, potentially limiting clinical adoption. Efforts are being made to develop more interpretable models; for example, the PathTree approach uses professional pathological text descriptions to guide the aggregation of hierarchical representations, potentially improving interpretability [175].

8.2. AI Algorithms for Pathological Image Analysis Face Stringent Regulatory Requirements

Regulatory bodies such as the FDA and EMA require a clear description of how AI software works, which can be challenging for complex deep learning models [176]. AI-based models are typically classified as Class II or III medical devices, requiring rigorous premarket approval processes. The lack of dedicated procedure codes for AI approaches in digital pathology complicates billing and reimbursement, although efforts are underway to establish new codes.

8.3. The Implementation of AI in Pathological Image Analysis Can Be Expensive

High-performance computing resources and the large-scale data storage required for processing and analyzing gigapixel-sized WSIs can be costly [176]. The initial investment in digital scanners, staff training, and technical support can be prohibitive, especially for small laboratories. While some studies project long-term cost savings in large academic centers, the cost–benefit ratio for small pathology laboratories and low-resource settings remains uncertain. The annotation cost of creating supervised learning models is also substantial, estimated at approximately $12 per pathology slide based on average pathologist salaries; at that rate, annotating a 10,000-slide training corpus would cost on the order of $120,000. Addressing these challenges is therefore crucial for the successful integration of AI in pathological image analysis: efforts to improve interpretability, streamline regulatory processes, and reduce costs will be essential for widespread clinical adoption.

9. Challenges and Future Directions

Transfer learning and domain adaptation are crucial techniques for improving the generalizability of AI models across different tasks and domains. These methods enable models to leverage knowledge gained from one task or domain to enhance performance on related but distinct tasks or domains [182]. Transfer learning uses pre-trained models as a starting point for new tasks and offers several benefits: (i) improved efficiency, by reducing the need for large labeled datasets in the target domain, (ii) enhanced generalization, by leveraging knowledge from related tasks, and (iii) faster convergence during training, since the model starts from a good initialization. Common transfer learning strategies include (i) feature extraction, using the pre-trained model as a fixed feature extractor, (ii) fine-tuning, adjusting some or all layers of the pre-trained model for the new task, and (iii) progressive fine-tuning, gradually unfreezing and fine-tuning layers from top to bottom. Domain adaptation focuses on adapting models trained in one domain (source) to perform well in a different but related domain (target), which is particularly useful when labeled data in the target domain is scarce. Key approaches include (i) feature alignment, mapping features from source and target domains into a common space, (ii) adversarial training, using adversarial techniques to learn domain-invariant features, and (iii) self-training, leveraging unlabeled data in the target domain to improve adaptation [182]. The generalizability of AI models using transfer learning and domain adaptation can be further enhanced by applying L1/L2 regularization to prevent overfitting, employing data augmentation to expose the model to a wider range of examples, utilizing ensemble methods to combine predictions from multiple adapted models, implementing cross-validation to ensure robust performance estimation across domains, and considering meta-learning approaches for faster adaptation to new tasks. By combining these techniques, AI models can generalize better across domains and tasks, leading to more robust and versatile applications in real-world scenarios [182,183].
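As an illustration of the adversarial route to domain-invariant features, the following minimal PyTorch sketch implements a gradient reversal layer in the style of domain-adversarial training; the layer sizes, task, and λ weighting are illustrative assumptions rather than a reference implementation.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; negated, scaled gradient in the backward pass."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

def grad_reverse(x, lam=1.0):
    return GradReverse.apply(x, lam)

# A domain classifier attached through the reversal layer pushes the feature
# extractor toward representations that cannot distinguish source from target.
feature_extractor = nn.Sequential(nn.Linear(128, 64), nn.ReLU())
label_head = nn.Linear(64, 2)   # source-domain task (e.g., tumor vs. normal)
domain_head = nn.Linear(64, 2)  # source vs. target domain discriminator

def dann_loss(x_src, y_src, x_tgt, lam=0.1):
    """Combined task loss (source) and adversarial domain loss (both domains)."""
    ce = nn.functional.cross_entropy
    f_src, f_tgt = feature_extractor(x_src), feature_extractor(x_tgt)
    task = ce(label_head(f_src), y_src)
    feats = torch.cat([f_src, f_tgt])
    dom_labels = torch.cat(
        [torch.zeros(len(f_src)), torch.ones(len(f_tgt))]
    ).long()
    domain = ce(domain_head(grad_reverse(feats, lam)), dom_labels)
    return task + domain
```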
Several approaches are used to enhance the interpretability of deep learning models for pathological image classification, including localization heat maps, feature importance analysis, pathological feature interaction analysis, in-context learning, regression concept vectors, attention mechanisms, and transparent machine learning pipelines. Localization heat maps visually highlight the regions of an image that contribute to the model’s decision; techniques like Grad-CAM (Gradient-weighted Class Activation Mapping) generate heat maps showing which areas of a pathology image are most influential for classification [14]. Feature importance analysis quantifies the contribution of different image features (e.g., color, texture, and cell characteristics) to the predictions, with machine learning techniques used to rank features by their contribution to the classification [14]. Pathological feature interaction analysis examines how different pathological features interact and jointly influence the model’s predictions, providing insights into complex relationships within the image data [14]. In-context learning improves the model’s ability to analyze and interpret test images by providing example images during inference; it can enhance the separation of embeddings corresponding to different classes and improve classification accuracy [184]. Regression concept vectors map image features to higher-order concepts, which is useful for assessing global features in images [185]. Attention-based models, such as context-aware graph convolutional neural networks, can provide interpretable results by highlighting which parts of the input contribute most to the output [186]. Transparent machine learning pipelines combine multiple interpretability techniques to provide comprehensive insights at both the image and feature levels, helping pathologists understand the model’s decision-making process [14]. These approaches aim to make the decision-making of deep learning models more transparent and understandable to clinicians, potentially increasing trust and facilitating integration into clinical practice [186].
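To make the feature importance analysis concrete, the following Python sketch ranks hypothetical handcrafted pathology features by permutation importance, i.e., shuffling one feature at a time and measuring the resulting drop in a classifier’s score; the feature names and the placeholder random data are illustrative assumptions, not from any cited study.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

# Hypothetical handcrafted features per image: color, texture, nuclear stats.
feature_names = ["mean_hue", "glcm_contrast", "nuclei_count", "mean_nucleus_area"]
X, y = np.random.rand(200, 4), np.random.randint(0, 2, 200)  # placeholder data

clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
result = permutation_importance(clf, X, y, n_repeats=20, random_state=0)

# Higher mean importance = larger accuracy drop when that feature is shuffled.
ranking = sorted(zip(feature_names, result.importances_mean),
                 key=lambda t: -t[1])
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```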
The future of diagnostic classification of pathological images using computer vision is poised for significant advancements, driven by innovative deep learning techniques and emerging technologies.

9.1. Advanced Neural Network Architectures

CNNs will continue to play a crucial role in medical image analysis, with ongoing improvements in their architecture. These networks excel at extracting hierarchical features from images, making them particularly suitable for pathological image classification. Future CNNs are likely to incorporate more sophisticated structures, potentially integrating complementary algorithms such as transformers and GANs to enhance segmentation and classification capabilities.
Capsule networks (CapsNets) represent a novel approach to medical image classification, particularly beneficial for small datasets and complex anatomical structures. Their ability to maintain spatial relationships between features makes them especially promising for tasks like brain tumor detection.
Graph neural networks (GNNs) are emerging as a powerful tool for histopathological analysis, capable of capturing intricate spatial dependencies in WSIs [182]. Future developments in GNNs for pathological image classification include (i) hierarchical GNNs, (ii) adaptive graph structure learning, (iii) multimodal GNNs, and (iv) higher-order GNNs. These advancements will enable more nuanced modeling of tissue and cellular structures, potentially leading to improved diagnostic accuracy [179].
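To make the graph-based formulation concrete, the following NumPy sketch applies one graph-convolution step to a toy cell graph, implementing the standard normalized propagation rule H' = ReLU(D^(-1/2)(A + I)D^(-1/2) H W); the adjacency matrix, features, and weights are random placeholders, and in practice the graph would be built, for example, by k-nearest-neighbor linking of nuclei centroids detected in a WSI.

```python
import numpy as np

def gcn_layer(adjacency, features, weights):
    """One graph-convolution step: normalized neighborhood averaging + projection.

    adjacency : (N, N) binary cell-graph matrix; features : (N, F) per-cell
    descriptors; weights : (F, H) learnable projection."""
    a_hat = adjacency + np.eye(adjacency.shape[0])        # add self-loops
    deg_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    norm = deg_inv_sqrt[:, None] * a_hat * deg_inv_sqrt[None, :]
    return np.maximum(norm @ features @ weights, 0.0)     # ReLU activation

# Toy example: 4 cells, 3 input features, projected to 2 hidden dimensions.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
X = np.random.rand(4, 3)
W = np.random.rand(3, 2)
hidden = gcn_layer(A, X, W)  # (4, 2) cell embeddings after one message-passing round
```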

9.2. Multimodal Integration

Future diagnostic systems will likely integrate multiple data sources, combining pathological images with radiological, genomic, and proteomic measurements [180]. This holistic approach aims to improve diagnosis and prognosis by leveraging diverse data types.
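A minimal late-fusion sketch of such multimodal integration, concatenating modality-specific embeddings before a shared classification head, is given below; the embedding dimensions, class count, and hidden size are illustrative assumptions.

```python
import torch
import torch.nn as nn

class LateFusionClassifier(nn.Module):
    """Concatenate modality-specific embeddings, then classify jointly.

    Dimensions are hypothetical: a pathology-image embedding, a radiology
    embedding, and a genomic profile vector."""
    def __init__(self, dims=(512, 256, 100), num_classes=4):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(sum(dims), 128), nn.ReLU(), nn.Linear(128, num_classes)
        )

    def forward(self, path_emb, radio_emb, genomic_vec):
        # Each argument is a (batch, dim) tensor from its modality encoder.
        return self.head(torch.cat([path_emb, radio_emb, genomic_vec], dim=1))
```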

9.3. Automated Feature Detection

Research will continue to focus on automating the detection and classification of morphological features, such as cells, nuclei, and mitoses [180]. These advancements will enhance the efficiency and accuracy of routine tasks in pathological image analysis.

9.4. Self-Supervised Learning

Self-supervised learning (SSL) techniques are expected to play a larger role in the future, potentially reducing the need for extensive manual annotations and enabling models to learn from larger, unlabeled datasets.
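One widely used self-supervised objective is a contrastive loss over two augmented views of the same patches; the following PyTorch sketch implements an NT-Xent (SimCLR-style) loss, with the temperature value as an illustrative assumption.

```python
import torch
import torch.nn.functional as F

def nt_xent_loss(z1, z2, temperature=0.5):
    """SimCLR-style contrastive loss over two augmented views of a batch.

    z1, z2 : (B, D) projection-head outputs for the two views."""
    batch = z1.size(0)
    z = F.normalize(torch.cat([z1, z2]), dim=1)     # (2B, D), unit norm
    sim = z @ z.t() / temperature                   # pairwise cosine similarities
    # Mask out self-similarity so a sample cannot be its own positive.
    sim = sim.masked_fill(torch.eye(2 * batch, dtype=torch.bool), float("-inf"))
    # The positive for view i is the other view of the same patch.
    targets = torch.cat([torch.arange(batch) + batch, torch.arange(batch)])
    return F.cross_entropy(sim, targets)
```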

9.5. Explainable AI

As these systems become more complex, there will be an increased emphasis on developing explainable AI models. This will be crucial for clinical adoption, allowing pathologists to understand and trust the decision-making process of AI systems.
Grad-CAM, SHAP, and LIME are three popular explainable AI (XAI) techniques used to interpret the decisions made by machine learning models, particularly in image classification tasks. Grad-CAM is a visualization technique designed specifically for CNNs: it uses the gradients of a target concept flowing into the final convolutional layer to generate a coarse localization map of important regions in the input image. Key features of Grad-CAM are that (i) it is applicable to a wide range of CNN architectures without requiring architectural changes, (ii) it provides spatial information about important regions in the input image [187], (iii) it is faster to compute than some other XAI methods [187], and (iv) it is useful for visualizing the contribution of each convolutional layer to the final prediction. SHAP (SHapley Additive exPlanations) builds on game theory, particularly Shapley values, to explain the output of any machine learning model. It offers several advantages: it can be applied to various types of machine learning models, not just CNNs; it provides detailed feature attribution, showing how each input feature contributes to the model’s output [187]; it offers both local (per-sample) and global (model-wide) explanations; and it has desirable mathematical properties, including consistency and efficiency. LIME (Local Interpretable Model-agnostic Explanations) is another model-agnostic technique that aims to explain individual predictions. Its key characteristics are that it creates locally faithful explanations by approximating the model around a specific prediction, it can be applied to various types of data and models, and it uses an interpretable surrogate model (e.g., ridge regression) to approximate the complex model’s behavior locally. Comparing these methods, Grad-CAM is particularly useful for CNN-based image classification tasks, providing spatially oriented explanations [187]; SHAP offers more detailed feature attribution and can be applied to a wider range of models but may be computationally more intensive [187]; and LIME provides local explanations that are easy to interpret but may be less consistent across similar inputs than SHAP. The choice between these methods depends on the specific application, model type, and level of detail required in the explanations; in some cases, using multiple methods can provide complementary insights into model behavior.
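A minimal PyTorch sketch of Grad-CAM for a ResNet-50 backbone is shown below; the choice of hooked layer and the interpolation settings follow common practice but are assumptions, not a reference implementation.

```python
import torch
import torch.nn.functional as F
from torchvision import models

model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2).eval()
activations, gradients = {}, {}

def fwd_hook(module, inputs, output):
    activations["value"] = output

def bwd_hook(module, grad_input, grad_output):
    gradients["value"] = grad_output[0]

# Hook the last convolutional block of the backbone.
layer = model.layer4[-1]
layer.register_forward_hook(fwd_hook)
layer.register_full_backward_hook(bwd_hook)

def grad_cam(image_batch, class_idx):
    """Coarse localization map of the evidence for class_idx (Grad-CAM)."""
    scores = model(image_batch)
    model.zero_grad()
    scores[:, class_idx].sum().backward()
    acts, grads = activations["value"], gradients["value"]
    weights = grads.mean(dim=(2, 3), keepdim=True)   # global-average-pooled gradients
    cam = F.relu((weights * acts).sum(dim=1))        # weighted sum of feature maps
    # Upsample the (B, h, w) map to the input resolution for overlay.
    return F.interpolate(cam.unsqueeze(1), size=image_batch.shape[2:],
                         mode="bilinear", align_corners=False)
```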
Future research is likely to focus on developing more robust and explainable AI models, as well as validating their performance in real-world clinical settings. The integration of AI with other emerging technologies like multiphoton microscopy may lead to more powerful multi-dimensional diagnostic approaches, provided the following requirements are met:
  • Ensuring model generalizability across different laboratories and staining protocols.
  • Addressing potential biases in training data.
  • Integrating AI tools into existing clinical workflows.
  • Maintaining interpretability of AI-assisted diagnoses.
Computer vision and AI techniques have shown great potential for enhancing pathological image classification and analysis [175,178,180,181,182,183,184,185,186,187,188,189,190,191]. As these technologies continue to advance, they are poised to become valuable tools supporting pathologists in making more accurate and efficient diagnoses.
Future research will need to address several key challenges, including (i) improving model generalization across different imaging modalities and patient populations, (ii) developing more efficient methods for handling large-scale WSIs, (iii) addressing issues of data privacy and security in medical imaging, and (iv) enhancing the interpretability of deep learning models for clinical use. By tackling these challenges, the field of pathological image classification using computer vision is set to revolutionize disease detection, enable more precise diagnoses, and ultimately improve patient care through tailored treatment plans [182,188].
The integration of AI with advanced microscopy techniques like multiphoton microscopy (MPM) is poised to transform biomedical research and diagnostic pathology. AI algorithms will be crucial in optimizing image acquisition and processing in MPM, including real-time adaptive imaging, image restoration and super-resolution, and noise reduction. AI could dynamically adjust imaging parameters based on the sample’s characteristics, optimizing resolution and reducing phototoxicity [192]. Deep learning models will continue to improve image quality, enabling the extraction of more detailed information from MPM data [12], and advanced AI techniques will further enhance the signal-to-noise ratio in MPM images, allowing clearer visualization of cellular structures.
Future research will likely focus on integrating MPM with other imaging modalities, including AI-driven image fusion and multimodal feature extraction. Algorithms will be developed to seamlessly combine data from MPM with techniques such as optical coherence tomography (OCT) or quantitative phase imaging (QPI) [12], and AI models will be trained to identify and correlate features across different imaging modalities, providing a more comprehensive view of biological systems. AI will also enhance the diagnostic potential of MPM through automated disease classification, predictive diagnostics, and personalized medicine: deep learning models will be refined to accurately classify diseases based on MPM images, potentially surpassing human expert performance [12]; AI algorithms will be developed to predict disease progression and treatment outcomes based on subtle changes in MPM images [12]; and AI-powered MPM analysis could help tailor treatments to individual patients based on their unique cellular characteristics [12].
In addition, some research will focus on developing AI systems specifically tailored for MPM, such as self-enhancing AI, meta-learning approaches, and explainable AI. Systems like SEMPAI (Self-Enhancing Multi-Photon Artificial Intelligence) will be further developed to optimize prior knowledge integration and data representation for MPM analysis [193]. Meta-learning approaches will enable AI models to learn from smaller datasets, which is crucial for rare diseases or specialized research areas [193]. Developing interpretable AI models will be a priority to provide insights into the decision-making process and increase trust in AI-assisted diagnoses [12].
Further research will explore synergies between AI-enhanced MPM and other cutting-edge technologies, including organ-on-a-chip systems, CRISPR (clustered regularly interspaced short palindromic repeats)-based techniques, and single-cell omics. AI-powered MPM could be used to analyze complex 3D cellular structures in organ-on-a-chip models, advancing drug discovery and toxicology studies; AI could help analyze MPM images of CRISPR-edited cells, providing insights into gene function and disease mechanisms; and integrating AI-enhanced MPM with single-cell sequencing data could provide unprecedented insights into cellular heterogeneity and function. By pursuing these research directions, the integration of AI with MPM and other advanced microscopy techniques has the potential to significantly advance our understanding of biological systems and improve diagnostic capabilities in pathology.
Synthetic data generation and SSL play crucial roles in overcoming current limitations in AI and machine learning, while also shaping future advancements in these fields. Synthetic data generation addresses the challenges of data scarcity and privacy concerns: it allows organizations to create large, diverse datasets without compromising sensitive information, which is particularly valuable in fields like healthcare, where patient data privacy is paramount. Synthetic data has been reported to reduce biases in AI models by up to 15% [194]; by generating balanced and unbiased datasets, developers can create more equitable AI systems that treat all users fairly, regardless of demographic characteristics. Reported estimates further suggest that synthetic data generation can reduce data collection costs by around 40% while improving model accuracy by around 10%, making it a cost-effective alternative to traditional data collection, especially for large-scale or complex datasets. SSL techniques, combined with synthetic data, can significantly improve model performance, especially in scenarios with limited labeled data [195]; this approach allows models to learn from both labeled and unlabeled (or synthetic) data, potentially outperforming traditional supervised learning. Synthetic data also enables the creation of diverse datasets that cover outliers and edge cases, improving the robustness of AI models across a broader range of scenarios; this is particularly valuable in medical imaging, where synthetic data has reportedly improved diagnostic AI accuracy by up to 20% for rare diseases. In addition, synthetic data can help reduce regulatory barriers that prevent the widespread sharing and integration of data across multiple sources [196], which could enable more collaborative research and development in AI and machine learning. Despite their potential, both synthetic data generation and SSL face challenges, including data fidelity and diversity, computational costs, distribution mismatch, theoretical limitations, and complexity of assumptions. Ensuring that synthetic data accurately represents real-world data properties remains complex [197], and generating high-quality synthetic data often requires significant computational resources. Synthetic data may not perfectly match the distribution of real data, potentially degrading the performance of SSL methods [196,197,198,199,200]. SSL may not always provide significant advantages over supervised learning in terms of sample complexity, and SSL methods often rely on strong assumptions about the data distribution that may not always hold. While synthetic data generation and SSL offer promising solutions to current limitations, their effective implementation therefore requires careful consideration of these inherent challenges; as these technologies continue to evolve, they are likely to play an increasingly important role in shaping the future of AI and machine learning.
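As a simple illustration of mixing real and synthetic training data, the following sketch builds a training set with a fixed synthetic share; the mixing ratio, and the assumption that labels accompany the synthetic samples (e.g., from a conditional GAN), are hypothetical choices for illustration.

```python
import numpy as np

def mix_real_and_synthetic(real_X, real_y, synth_X, synth_y, synth_ratio=0.3):
    """Build a training set in which synth_ratio of the samples are synthetic.

    real_X, real_y : real images/labels; synth_X, synth_y : generated
    images with labels assumed to be produced alongside them."""
    n_synth = int(synth_ratio * len(real_X) / (1.0 - synth_ratio))
    idx = np.random.choice(len(synth_X),
                           size=min(n_synth, len(synth_X)), replace=False)
    X = np.concatenate([real_X, synth_X[idx]])
    y = np.concatenate([real_y, synth_y[idx]])
    perm = np.random.permutation(len(X))   # shuffle real and synthetic together
    return X[perm], y[perm]
```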

10. Conclusions

Computer vision has transformed pathological image analysis, enabling faster, more accurate diagnoses through advanced deep learning techniques. While challenges remain, ongoing innovations in AI architectures, synthetic data generation, and SSL promise to further enhance diagnostic workflows and patient outcomes globally. Computer vision and deep learning have enabled the automated diagnosis and classification of pathological images, with significant implications for cancer detection, diagnosis, and treatment planning through CNNs, GANs, interpretable models, and related techniques. Although significant progress has been made, challenges remain in computational pathology: (i) ensuring the reproducibility and reusability of deep learning models across different institutions and datasets, (ii) developing models that generalize across various cancer types and tissue preservation methods, and (iii) integrating pathological knowledge into deep learning models to improve accuracy and interpretability. As the field continues to evolve, the integration of computer vision techniques in pathology is expected to enhance diagnostic accuracy, streamline workflows, and ultimately improve patient care.

Author Contributions

Conceptualization, Y.M.; writing—original draft preparation, Y.M.; writing—review and editing, Y.M.; supervision, R.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Chatzipanagiotou, O.P.; Loukas, C.; Vailas, M.; Machairas, N.; Kykalos, S.; Charalampopoulos, G.; Filippiadis, D.; Felekouras, E.; Schizas, D. Artificial intelligence in hepatocellular carcinoma diagnosis: A comprehensive review of current literature. J. Gastroenterol. Hepatol. 2024, 39, 1994–2005. [Google Scholar] [CrossRef] [PubMed]
  2. Priya, C.V.L.; Biju, V.G.; Vinod, B.R.; Ramachandran, S. Deep learning approaches for breast cancer detection in histopathology images: A review. Cancer Biomark. 2024, 40, 1–25. [Google Scholar] [CrossRef] [PubMed]
  3. Luo, L.; Wang, X.; Lin, Y.; Ma, X.; Tan, A.; Chan, R.; Vardhanabhuti, V.; Chu, W.C.; Cheng, K.T.; Chen, H. Deep Learning in Breast Cancer Imaging: A Decade of Progress and Future Directions. IEEE Rev. Biomed. Eng. 2024, in press. [CrossRef] [PubMed]
  4. Dong, C.; Hayashi, S. Deep learning applications in vascular dementia using neuroimaging. Curr. Opin. Psychiatry 2024, 37, 101–106. [Google Scholar] [CrossRef] [PubMed]
  5. Deng, C.; Li, D.; Feng, M.; Han, D.; Huang, Q. The value of deep neural networks in the pathological classification of thyroid tumors. Diagn. Pathol. 2023, 18, 95. [Google Scholar] [CrossRef] [PubMed]
  6. Khosravi, P.; Lysandrou, M.; Eljalby, M.; Li, Q.; Kazemi, E.; Zisimopoulos, P.; Sigaras, A.; Brendel, M.; Barnes, J.; Ricketts, C.; et al. A Deep Learning Approach to Diagnostic Classification of Prostate Cancer Using Pathology-Radiology Fusion. J. Magn. Reson. Imaging 2021, 54, 462–471. [Google Scholar] [CrossRef] [PubMed]
  7. Hekler, A.; Utikal, J.S.; Enk, A.H.; Solass, W.; Schmitt, M.; Klode, J.; Schadendorf, D.; Sondermann, W.; Franklin, C.; Bestvater, F.; et al. Deep learning outperformed 11 pathologists in the classification of histopathological melanoma images. Eur. J. Cancer 2019, 118, 91–96. [Google Scholar] [CrossRef] [PubMed]
  8. Kosaraju, S.; Park, J.; Lee, H.; Yang, J.W.; Kang, M. Deep learning-based framework for slide-based histopathological image analysis. Sci. Rep. 2022, 12, 19075. [Google Scholar] [CrossRef] [PubMed]
  9. Iizuka, O.; Kanavati, F.; Kato, K.; Rambeau, M.; Arihiro, K.; Tsuneki, M. Deep Learning Models for Histopathological Classification of Gastric and Colonic Epithelial Tumours. Sci. Rep. 2020, 10, 1504. [Google Scholar] [CrossRef] [PubMed]
  10. Shimazaki, T.; Deshpande, A.; Hajra, A.; Thomas, T.; Muta, K.; Yamada, N.; Yasui, Y.; Shoda, T. Deep learning-based image-analysis algorithm for classification and quantification of multiple histopathological lesions in rat liver. J. Toxicol. Pathol. 2022, 35, 135–147. [Google Scholar] [CrossRef] [PubMed]
  11. Kim, H.E.; Cosa-Linan, A.; Santhanam, N.; Jannesari, M.; Maros, M.E.; Ganslandt, T. Transfer learning for medical image classification: A literature review. BMC Med. Imaging 2022, 22, 69. [Google Scholar] [CrossRef] [PubMed]
  12. Wang, S.; Pan, J.; Zhang, X.; Li, Y.; Liu, W.; Lin, R.; Wang, X.; Kang, D.; Li, Z.; Huang, F.; et al. Towards next-generation diagnostic pathology: AI-empowered label-free multiphoton microscopy. Light Sci. Appl. 2024, 13, 254. [Google Scholar] [CrossRef] [PubMed]
  13. Tsai, M.J.; Tao, Y.H. Deep Learning Technology Applied to Medical Image Tissue Classification. Diagnostics 2022, 12, 2430. [Google Scholar] [CrossRef] [PubMed]
  14. Zhu, J.; Wu, W.; Zhang, Y.; Lin, S.; Jiang, Y.; Liu, R.; Wang, X. Computational analysis of pathological image enables interpretable prediction for microsatellite instability. arXiv 2020, arXiv:2010.03130. Available online: https://arxiv.org/abs/2010.03130 (accessed on 7 October 2020). [CrossRef]
  15. Zheng, S.; Cui, X.; Sun, Y.; Li, J.; Li, H.; Zhang, Y.; Chen, P.; Jing, X.; Ye, Z.; Yang, L. Benchmarking PathCLIP for Pathology Image Analysis. arXiv 2024, arXiv:2401.02651. Available online: https://arxiv.org/abs/2401.02651 (accessed on 5 January 2024). [CrossRef] [PubMed]
  16. Li, J.; Sun, Q.; Yan, R.; Wang, Y.; Fu, Y.; Wei, Y.; Guan, T.; Shi, H.; He, Y.; Han, A. Diagnostic Text-guided Representation Learning in Hierarchical Classification for Pathological Whole Slide Image. arXiv 2024, arXiv:2411.10709. Available online: https://arxiv.org/abs/2411.10709 (accessed on 16 November 2024).
  17. Muksimova, S.; Umirzakova, S.; Kang, S.; Cho, Y.I. CerviLearnNet: Advancing cervical cancer diagnosis with reinforcement learning-enhanced convolutional networks. Heliyon 2024, 10, e29913. [Google Scholar] [CrossRef] [PubMed]
  18. Muksimova, S.; Umirzakova, S.; Mardieva, S.; Cho, Y.I. Enhancing Medical Image Denoising with Innovative Teacher-Student Model-Based Approaches for Precision Diagnostics. Sensors 2023, 23, 9502. [Google Scholar] [CrossRef] [PubMed]
  19. Jo, T.; Nho, K.; Bice, P.; Saykin, A.J. Alzheimer’s Disease Neuroimaging Initiative. Deep learning-based identification of genetic variants: Application to Alzheimer’s disease classification. Brief. Bioinform. 2022, 23, bbac022. [Google Scholar] [CrossRef] [PubMed]
  20. Alsubai, S.; Khan, H.U.; Alqahtani, A.; Sha, M.; Abbas, S.; Mohammad, U.G. Ensemble deep learning for brain tumor detection. Front. Comput. Neurosci. 2022, 16, 1005617. [Google Scholar] [CrossRef] [PubMed]
  21. Ragab, M.; Albukhari, A.; Alyami, J.; Mansour, R.F. Ensemble Deep-Learning-Enabled Clinical Decision Support System for Breast Cancer Diagnosis and Classification on Ultrasound Images. Biology 2022, 11, 439. [Google Scholar] [CrossRef] [PubMed]
  22. Tahmid, M.T.; Kader, M.E.; Mahmud, T.; Fattah, S.A. MD-CardioNet: A Multi-Dimensional Deep Neural Network for Cardiovascular Disease Diagnosis from Electrocardiogram. IEEE J. Biomed. Health Inform. 2023, 28, 2005–2013. [Google Scholar] [CrossRef] [PubMed]
  23. García-Jaramillo, M.; Luque, C.; León-Vargas, F. Machine Learning and Deep Learning Techniques Applied to Diabetes Research: A Bibliometric Analysis. J. Diabetes Sci. Technol. 2024, 18, 287–301. [Google Scholar] [CrossRef] [PubMed]
  24. Lakshmipriya, B.; Pottakkat, B.; Ramkumar, G. Deep learning techniques in liver tumour diagnosis using CT and MR imaging—A systematic review. Artif. Intell. Med. 2023, 141, 102557. [Google Scholar] [CrossRef] [PubMed]
  25. Anai, S.; Hisasue, J.; Takaki, Y.; Hara, N. Deep Learning Models to Predict Fatal Pneumonia Using Chest X-Ray Images. Can. Respir. J. 2022, 2022, 8026580. [Google Scholar] [CrossRef] [PubMed]
  26. Jaradat, A.S.; Al Mamlook, R.E.; Almakayeel, N.; Alharbe, N.; Almuflih, A.S.; Nasayreh, A.; Gharaibeh, H.; Gharaibeh, M.; Gharaibeh, A.; Bzizi, H. Automated Monkeypox Skin Lesion Detection Using Deep Learning and Transfer Learning Techniques. Int. J. Environ. Res. Public Health 2023, 20, 4422. [Google Scholar] [CrossRef] [PubMed]
  27. Ahsan, M.M.; Luna, S.A.; Siddique, Z. Machine-Learning-Based Disease Diagnosis: A Comprehensive Review. Healthcare 2022, 10, 541. [Google Scholar] [CrossRef] [PubMed]
  28. Albahli, S.; Ahmad Hassan Yar, G.N. AI-driven deep convolutional neural networks for chest X-ray pathology identification. J. Xray Sci. Technol. 2022, 30, 365–376. [Google Scholar] [CrossRef] [PubMed]
  29. Liang, H.; Wang, M.; Wen, Y.; Du, F.; Jiang, L.; Geng, X.; Tang, L.; Yan, H. Predicting acute pancreatitis severity with enhanced computed tomography scans using convolutional neural networks. Sci. Rep. 2023, 13, 17514. [Google Scholar] [CrossRef] [PubMed]
  30. de Oliveira, M.; Piacenti-Silva, M.; da Rocha, F.C.G.; Santos, J.M.; Cardoso, J.D.S.; Lisboa-Filho, P.N. Lesion Volume Quantification Using Two Convolutional Neural Networks in MRIs of Multiple Sclerosis Patients. Diagnostics 2022, 12, 230. [Google Scholar] [CrossRef] [PubMed]
  31. Jung, H.; Lodhi, B.; Kang, J. An automatic nuclei segmentation method based on deep convolutional neural networks for histopathology images. BMC Biomed. Eng. 2019, 1, 24. [Google Scholar] [CrossRef] [PubMed]
  32. Li, R.; Ma, F.; Gao, J. Integrating Multimodal Electronic Health Records for Diagnosis Prediction. AMIA Annu. Symp. Proc. 2022, 2021, 726–735. [Google Scholar] [PubMed]
  33. Martins, T.D.; Annichino-Bizzacchi, J.M.; Romano, A.V.C.; Maciel Filho, R. Artificial neural networks for prediction of recurrent venous thromboembolism. Int. J. Med. Inform. 2020, 141, 104221. [Google Scholar] [CrossRef] [PubMed]
  34. Huang, F.; Qiu, A. Ensemble Vision Transformer for Dementia Diagnosis. IEEE J. Biomed. Health Inform. 2024, 28, 5551–5561. [Google Scholar] [CrossRef] [PubMed]
  35. Al Shehri, W. Alzheimer’s disease diagnosis and classification using deep learning techniques. PeerJ Comput. Sci. 2022, 8, e1177. [Google Scholar] [CrossRef] [PubMed]
  36. Kim, J.S.; Han, J.W.; Bae, J.B.; Moon, D.G.; Shin, J.; Kong, J.E.; Lee, H.; Yang, H.W.; Lim, E.; Kim, J.Y.; et al. Deep learning-based diagnosis of Alzheimer’s disease using brain magnetic resonance images: An empirical study. Sci. Rep. 2022, 12, 18007. [Google Scholar] [CrossRef] [PubMed]
  37. Jo, T.; Nho, K.; Saykin, A.J. Deep Learning in Alzheimer’s Disease: Diagnostic Classification and Prognostic Prediction Using Neuroimaging Data. Front. Aging Neurosci. 2019, 11, 220. [Google Scholar] [CrossRef] [PubMed]
  38. Alsubaie, M.G.; Luo, S.; Shaukat, K. Alzheimer’s Disease Detection Using Deep Learning on Neuroimaging: A Systematic Review. Mach. Learn. Knowl. Extr. 2024, 6, 464–505. [Google Scholar] [CrossRef]
  39. Liu, S.; Masurkar, A.V.; Rusinek, H.; Chen, J.; Zhang, B.; Zhu, W.; Fernandez-Granda, C.; Razavian, N. Generalizable deep learning model for early Alzheimer’s disease detection from structural MRIs. Sci Rep. 2022, 12, 17106. [Google Scholar] [CrossRef] [PubMed]
  40. Zhen, S.H.; Cheng, M.; Tao, Y.B.; Wang, Y.F.; Juengpanich, S.; Jiang, Z.Y.; Jiang, Y.K.; Yan, Y.Y.; Lu, W.; Lue, J.M.; et al. Deep Learning for Accurate Diagnosis of Liver Tumor Based on Magnetic Resonance Imaging and Clinical Data. Front. Oncol. 2020, 10, 680. [Google Scholar] [CrossRef] [PubMed]
  41. Sridhar, K.C.K.; Lai, W.C.; Kavin, B.P. Detection of Liver Tumour Using Deep Learning Based Segmentation with Coot Extreme Learning Model. Biomedicines 2023, 11, 800. [Google Scholar] [CrossRef] [PubMed]
  42. Liu, L.; Zhang, R.; Shi, Y.; Sun, J.; Xu, X. Automated machine learning for predicting liver metastasis in patients with gastrointestinal stromal tumor: A SEER-based analysis. Sci. Rep. 2024, 14, 12415. [Google Scholar] [CrossRef] [PubMed]
  43. Othman, E.; Mahmoud, M.; Dhahri, H.; Abdulkader, H.; Mahmood, A.; Ibrahim, M. Automatic Detection of Liver Cancer Using Hybrid Pre-Trained Models. Sensors 2022, 22, 5429. [Google Scholar] [CrossRef] [PubMed]
  44. Liu, J.Q.; Ren, J.Y.; Xu, X.L.; Xiong, L.Y.; Peng, Y.X.; Pan, X.F.; Dietrich, C.F.; Cui, X.W. Ultrasound-based artificial intelligence in gastroenterology and hepatology. World J. Gastroenterol. 2022, 28, 5530–5546. [Google Scholar] [CrossRef] [PubMed]
  45. Wong, P.K.; Chan, I.N.; Yan, H.M.; Gao, S.; Wong, C.H.; Yan, T.; Yao, L.; Hu, Y.; Wang, Z.R.; Yu, H.H. Deep learning based radiomics for gastrointestinal cancer diagnosis and treatment: A minireview. World J. Gastroenterol. 2022, 28, 6363–6379. [Google Scholar] [CrossRef] [PubMed]
  46. Pinto-Coelho, L. How Artificial Intelligence Is Shaping Medical Imaging Technology: A Survey of Innovations and Applications. Bioengineering 2023, 10, 1435. [Google Scholar] [CrossRef] [PubMed]
  47. Thakur, G.K.; Thakur, A.; Kulkarni, S.; Khan, N.; Khan, S. Deep Learning Approaches for Medical Image Analysis and Diagnosis. Cureus 2024, 16, e59507. [Google Scholar] [CrossRef] [PubMed]
  48. Li, M.; Jiang, Y.; Zhang, Y.; Zhu, H. Medical image analysis using deep learning algorithms. Front. Public Health. 2023, 11, 1273253. [Google Scholar] [CrossRef]
  49. Li, Y.; El Habib Daho, M.; Conze, P.H.; Zeghlache, R.; Le Boité, H.; Tadayoni, R.; Cochener, B.; Lamard, M.; Quellec, G. A review of deep learning-based information fusion techniques for multimodal medical image classification. Comput. Biol. Med. 2024, 177, 108635. [Google Scholar] [CrossRef] [PubMed]
  50. Nguyen, H.T.; Nguyen, H.Q.; Pham, H.H.; Lam, K.; Le, L.T.; Dao, M.; Vu, V. VinDr-Mammo: A large-scale benchmark dataset for computer-aided diagnosis in full-field digital mammography. Sci. Data 2023, 10, 277. [Google Scholar] [CrossRef] [PubMed]
  51. Cellina, M.; Cacioppa, L.M.; Cè, M.; Chiarpenello, V.; Costa, M.; Vincenzo, Z.; Pais, D.; Bausano, M.V.; Rossini, N.; Bruno, A.; et al. Artificial Intelligence in Lung Cancer Screening: The Future Is Now. Cancers 2023, 15, 4344. [Google Scholar] [CrossRef] [PubMed]
  52. Zhou, N.; Chen, H.; Liu, B.; Xu, C.Y. Enhanced river suspended sediment concentration identification via multimodal video image, optical flow, and water temperature data fusion. J. Environ. Manag. 2024, 367, 122048. [Google Scholar] [CrossRef] [PubMed]
  53. Carriero, A.; Groenhoff, L.; Vologina, E.; Basile, P.; Albera, M. Deep Learning in Breast Cancer Imaging: State of the Art and Recent Advancements in Early 2024. Diagnostics 2024, 14, 848. [Google Scholar] [CrossRef] [PubMed]
  54. Perosa, V.; Scherlek, A.A.; Kozberg, M.G.; Smith, L.; Westerling-Bui, T.; Auger, C.A.; Vasylechko, S.; Greenberg, S.M.; van Veluw, S.J. Deep learning assisted quantitative assessment of histopathological markers of Alzheimer’s disease and cerebral amyloid angiopathy. Acta Neuropathol. Commun. 2021, 9, 141. [Google Scholar] [CrossRef] [PubMed]
  55. Hou, J.J.; Tian, H.L.; Lu, B. A Deep Neural Network-Based Model for Quantitative Evaluation of the Effects of Swimming Training. Comput. Intell. Neurosci. 2022, 2022, 5508365. [Google Scholar] [CrossRef] [PubMed]
  56. Liu, H.; Li, I.; Liang, Y.; Sun, D.; Yang, Y.; Yang, H. Research on Deep Learning Model of Feature Extraction Based on Convolutional Neural Network. arXiv 2024, arXiv:2406.08837v1. Available online: https://doi.org/10.48550/arXiv.2406.08837 (accessed on 13 June 2024). [CrossRef]
  57. Tripathi, N.; Bhardwaj, N.; Kumar, S.; Jain, S.K. A machine learning-based KNIME workflow to predict VEGFR-2 inhibitors. Chem. Biol. Drug Des. 2023, 102, 38–50. [Google Scholar] [CrossRef] [PubMed]
  58. Bui, Q.-T.; Chou, T.-Y.; Hoang, T.-V.; Fang, Y.-M.; Mu, C.-Y.; Huang, P.-H.; Pham, V.-D.; Nguyen, Q.-H.; Anh, D.T.N.; Pham, V.-M.; et al. Gradient Boosting Machine and Object-Based CNN for Land Cover Classification. Remote Sens. 2021, 13, 2709. [Google Scholar] [CrossRef]
  59. Matsumoto, S.; Ishida, S.; Araki, M.; Kato, T.; Terayama, K.; Okuno, Y. Extraction of protein dynamics information from cryo-EM maps using deep learning. Nat. Mach. Intell. 2021, 3, 153–160. [Google Scholar] [CrossRef]
  60. Narula, J.; Stuckey, T.D.; Nakazawa, G.; Ahmadi, A.; Matsumura, M.; Petersen, K.; Mirza, S.; Ng, N.; Mullen, S.; Schaap, M.; et al. Prospective deep learning-based quantitative assessment of coronary plaque by computed tomography angiography compared with intravascular ultrasound: The REVEALPLAQUE study. Eur. Heart J. Cardiovasc. Imaging 2024, 25, 1287–1295. [Google Scholar] [CrossRef] [PubMed]
  61. Lee, S.N.; Lin, A.; Dey, D.; Berman, D.S.; Han, D. Application of Quantitative Assessment of Coronary Atherosclerosis by Coronary Computed Tomographic Angiography. Korean J. Radiol. 2024, 25, 518–539. [Google Scholar] [CrossRef] [PubMed]
  62. Griffin, W.F.; Choi, A.D.; Riess, J.S.; Marques, H.; Chang, H.J.; Choi, J.H.; Doh, J.H.; Her, A.Y.; Koo, B.K.; Nam, C.W.; et al. AI Evaluation of Stenosis on Coronary CTA, Comparison with Quantitative Coronary Angiography and Fractional Flow Reserve: A CREDENCE Trial Substudy. JACC Cardiovasc. Imaging 2023, 16, 193–205. [Google Scholar] [CrossRef] [PubMed]
  63. Covas, P.; De Guzman, E.; Barrows, I.; Bradley, A.J.; Choi, B.G.; Krepp, J.M.; Lewis, J.F.; Katz, R.; Tracy, C.M.; Zeman, R.K.; et al. Artificial Intelligence Advancements in the Cardiovascular Imaging of Coronary Atherosclerosis. Front. Cardiovasc. Med. 2022, 9, 839400. [Google Scholar] [CrossRef] [PubMed]
  64. Voros, S.; Rinehart, S.; Qian, Z.; Joshi, P.; Vazquez, G.; Fischer, C.; Belur, P.; Hulten, E.; Villines, T.C. Coronary atherosclerosis imaging by coronary CT angiography: Current status, correlation with intravascular interrogation and meta-analysis. JACC Cardiovasc. Imaging 2011, 4, 537–548. [Google Scholar] [CrossRef] [PubMed]
  65. Arjmandi, N.; Mosleh-Shirazi, M.A.; Mohebbi, S.; Nasseri, S.; Mehdizadeh, A.; Pishevar, Z.; Hosseini, S.; Tehranizadeh, A.A.; Momennezhad, M. Evaluating the dosimetric impact of deep-learning-based auto-segmentation in prostate cancer radiotherapy: Insights into real-world clinical implementation and inter-observer variability. J. Appl. Clin. Med. Phys. 2024, 1, e14569. [Google Scholar] [CrossRef] [PubMed]
  66. Bhandari, A. Revolutionizing Radiology with Artificial Intelligence. Cureus 2024, 16, e72646. [Google Scholar] [CrossRef] [PubMed]
  67. Gala, D.; Behl, H.; Shah, M.; Makaryus, A.N. The Role of Artificial Intelligence in Improving Patient Outcomes and Future of Healthcare Delivery in Cardiology: A Narrative Review of the Literature. Healthcare 2024, 12, 481. [Google Scholar] [CrossRef] [PubMed]
  68. Bennani, S.; Regnard, N.E.; Ventre, J.; Lassalle, L.; Nguyen, T.; Ducarouge, A.; Dargent, L.; Guillo, E.; Gouhier, E.; Zaimi, S.H.; et al. Using AI to Improve Radiologist Performance in Detection of Abnormalities on Chest Radiographs. Radiology 2023, 309, e230860. [Google Scholar] [CrossRef] [PubMed]
  69. Wiggins, W.F.; Magudia, K.; Schmidt, T.M.S.; O’Connor, S.D.; Carr, C.D.; Kohli, M.D.; Andriole, K.P. Imaging AI in Practice: A Demonstration of Future Workflow Using Integration Standards. Radiol. Artif. Intell. 2021, 3, e210152. [Google Scholar] [CrossRef] [PubMed]
  70. Baltruschat, I.; Steinmeister, L.; Nickisch, H.; Saalbach, A.; Grass, M.; Adam, G.; Knopp, T.; Ittrich, H. Smart chest X-ray worklist prioritization using artificial intelligence: A clinical workflow simulation. Eur. Radiol. 2021, 31, 3837–3845. [Google Scholar] [CrossRef] [PubMed]
  71. Hosny, A.; Parmar, C.; Quackenbush, J.; Schwartz, L.H.; Aerts, H.J.W.L. Artificial intelligence in radiology. Nat. Rev. Cancer 2018, 18, 500–510. [Google Scholar] [CrossRef] [PubMed]
  72. Salih, S.; Elliyanti, A.; Alkatheeri, A.; AlYafei, F.; Almarri, B.; Khan, H. The Role of Molecular Imaging in Personalized Medicine. J. Pers. Med. 2023, 13, 369. [Google Scholar] [CrossRef] [PubMed]
  73. Massoud, T.F.; Gambhir, S.S. Integrating noninvasive molecular imaging into molecular medicine: An evolving paradigm. Trends Mol. Med. 2007, 13, 183–191. [Google Scholar] [CrossRef] [PubMed]
  74. Pianykh, O.S.; Langs, G.; Dewey, M.; Enzmann, D.R.; Herold, C.J.; Schoenberg, S.O.; Brink, J.A. Continuous Learning AI in Radiology: Implementation Principles and Early Applications. Radiology 2020, 297, 6–14. [Google Scholar] [CrossRef] [PubMed]
  75. Najjar, R. Redefining Radiology: A Review of Artificial Intelligence Integration in Medical Imaging. Diagnostics 2023, 13, 2760. [Google Scholar] [CrossRef] [PubMed]
  76. Izadi, S.; Forouzanfar, M. Error Correction and Adaptation in Conversational AI: A Review of Techniques and Applications in Chatbots. AI 2024, 5, 803–841. [Google Scholar] [CrossRef]
  77. Popa, S.L.; Ismaiel, A.; Brata, V.D.; Turtoi, D.C.; Barsan, M.; Czako, Z.; Pop, C.; Muresan, L.; Stanculete, M.F.; Dumitrascu, D.I. Artificial Intelligence and medical specialties: Support or substitution? Med. Pharm. Rep. 2024, 97, 409–418. [Google Scholar] [CrossRef] [PubMed]
  78. Berbís, M.A.; Paulano Godino, F.; Royuela Del Val, J.; Alcalá Mata, L.; Luna, A. Clinical impact of artificial intelligence-based solutions on imaging of the pancreas and liver. World J. Gastroenterol. 2023, 29, 1427–1445. [Google Scholar] [CrossRef] [PubMed]
  79. Liu, X.; Ren, Y.; Wang, J.; Yang, X.; Lu, L. The Clinical Diagnostic Value of F-FDG PET/CT Combined with MRI in Pancreatic Cancer. Contrast Media Mol. Imaging 2022, 2022, 1479416. [Google Scholar] [CrossRef] [PubMed]
  80. Cai, L.; Pfob, A. Artificial intelligence in abdominal and pelvic ultrasound imaging: Current applications. Abdom. Radiol. 2024, in press. [Google Scholar] [CrossRef] [PubMed]
  81. Lee, J.M.; Park, J.Y.; Kim, Y.J.; Kim, K.G. Deep-learning-based pelvic automatic segmentation in pelvic fractures. Sci. Rep. 2024, 14, 12258. [Google Scholar] [CrossRef] [PubMed]
  82. Mervak, B.M.; Fried, J.G.; Wasnik, A.P. A Review of the Clinical Applications of Artificial Intelligence in Abdominal Imaging. Diagnostics 2023, 13, 2889. [Google Scholar] [CrossRef] [PubMed]
  83. Nowak, E.; Białecki, M.; Białecka, A.; Kazimierczak, N.; Kloska, A. Assessing the diagnostic accuracy of artificial intelligence in post-endovascular aneurysm repair endoleak detection using dual-energy computed tomography angiography. Pol. J. Radiol. 2024, 89, e420–e427. [Google Scholar] [CrossRef] [PubMed]
  84. Fowler, G.E.; Blencowe, N.S.; Hardacre, C.; Callaway, M.P.; Smart, N.J.; Macefield, R. Artificial intelligence as a diagnostic aid in cross-sectional radiological imaging of surgical pathology in the abdominopelvic cavity: A systematic review. BMJ Open 2023, 13, e064739. [Google Scholar] [CrossRef] [PubMed]
  85. Bajaj, T.; Koyner, J.L. Cautious Optimism: Artificial Intelligence and Acute Kidney Injury. Clin. J. Am. Soc. Nephrol. 2023, 18, 668–670. [Google Scholar] [CrossRef] [PubMed]
  86. Loftus, T.J.; Shickel, B.; Ozrazgat-Baslanti, T.; Ren, Y.; Glicksberg, B.S.; Cao, J.; Singh, K.; Chan, L.; Nadkarni, G.N.; Bihorac, A. Artificial intelligence-enabled decision support in nephrology. Nat. Rev. Nephrol. 2022, 18, 452–465. [Google Scholar] [CrossRef] [PubMed]
  87. Raina, R.; Nada, A.; Shah, R.; Aly, H.; Kadatane, S.; Abitbol, C.; Aggarwal, M.; Koyner, J.; Neyra, J.; Sethi, S.K. Artificial intelligence in early detection and prediction of pediatric/neonatal acute kidney injury: Current status and future directions. Pediatr. Nephrol. 2024, 39, 2309–2324. [Google Scholar] [CrossRef] [PubMed]
  88. Bi, W.L.; Hosny, A.; Schabath, M.B.; Giger, M.L.; Birkbak, N.J.; Mehrtash, A.; Allison, T.; Arnaout, O.; Abbosh, C.; Dunn, I.F.; et al. Artificial intelligence in cancer imaging: Clinical challenges and applications. CA Cancer J. Clin. 2019, 69, 127–157. [Google Scholar] [CrossRef] [PubMed]
  89. Aggarwal, R.; Sounderajah, V.; Martin, G.; Ting, D.S.W.; Karthikesalingam, A.; King, D.; Ashrafian, H.; Darzi, A. Diagnostic accuracy of deep learning in medical imaging: A systematic review and meta-analysis. npj Digit. Med. 2021, 4, 65. [Google Scholar] [CrossRef] [PubMed]
  90. Puri, P.; Comfere, N.; Drage, L.A.; Shamim, H.; Bezalel, S.A.; Pittelkow, M.R.; Davis, M.D.P.; Wang, M.; Mangold, A.R.; Tollefson, M.M.; et al. Deep learning for dermatologists: Part II. Current applications. J. Am. Acad. Dermatol. 2022, 87, 1352–1360. [Google Scholar] [CrossRef] [PubMed]
  91. Zhao, G.; Chen, X.; Zhu, M.; Liu, Y.; Wang, Y. Exploring the application and future outlook of Artificial intelligence in pancreatic cancer. Front. Oncol. 2024, 14, 1345810. [Google Scholar] [CrossRef] [PubMed]
  92. Jiang, X.; Hu, Z.; Wang, S.; Zhang, Y. Deep Learning for Medical Image-Based Cancer Diagnosis. Cancers 2023, 15, 3608. [Google Scholar] [CrossRef] [PubMed]
  93. Aamir, A.; Iqbal, A.; Jawed, F.; Ashfaque, F.; Hafsa, H.; Anas, Z.; Oduoye, M.O.; Basit, A.; Ahmed, S.; Abdul Rauf, S.; et al. Exploring the current and prospective role of artificial intelligence in disease diagnosis. Ann. Med. Surg. 2024, 86, 943–949. [Google Scholar] [CrossRef] [PubMed]
  94. Jain, S.; Safo, S.E. A deep learning pipeline for cross-sectional and longitudinal multiview data integration. arXiv 2023, arXiv:2312.01238. Available online: https://arxiv.org/abs/2312.01238 (accessed on 2 December 2023).
  95. Huang, S.C.; Pareek, A.; Seyyedi, S.; Banerjee, I.; Lungren, M.P. Fusion of medical imaging and electronic health records using deep learning: A systematic review and implementation guidelines. npj Digit. Med. 2020, 3, 136. [Google Scholar] [CrossRef] [PubMed]
  96. Rajkomar, A.; Oren, E.; Chen, K.; Dai, A.M.; Hajaj, N.; Hardt, M.; Liu, P.J.; Liu, X.; Marcus, J.; Sun, M.; et al. Scalable and accurate deep learning with electronic health records. npj Digit. Med. 2018, 1, 18. [Google Scholar] [CrossRef] [PubMed]
  97. Kannry, J.L.; Williams, M.S. Integration of genomics into the electronic health record: Mapping terra incognita. Genet. Med. 2013, 15, 757–760. [Google Scholar] [CrossRef] [PubMed]
  98. Chen, X.; Bhadani, R.; Sun, Z.; Head, L. MSMA: Multi-agent Trajectory Prediction in Connected and Autonomous Vehicle Environment with Multi-source Data Integration. arXiv 2024, arXiv:2407.21310. Available online: https://arxiv.org/abs/2407.21310 (accessed on 31 July 2024).
  99. Saeed, M.K.; Al Mazroa, A.; Alghamdi, B.M.; Alallah, F.S.; Alshareef, A.; Mahmud, A. Predictive analytics of complex healthcare systems using deep learning based disease diagnosis model. Sci. Rep. 2024, 14, 27497. [Google Scholar] [CrossRef] [PubMed]
  100. Maleki Varnosfaderani, S.; Forouzanfar, M. The Role of AI in Hospitals and Clinics: Transforming Healthcare in the 21st Century. Bioengineering 2024, 11, 337. [Google Scholar] [CrossRef] [PubMed]