Article

Models to Identify Small Brain White Matter Hyperintensity Lesions

by Darwin Castillo 1,2,3,*, María José Rodríguez-Álvarez 3, René Samaniego 4 and Vasudevan Lakshminarayanan 2,5,*
1 Departamento de Química y Ciencias Exactas, Sección Fisicoquímica y Matemáticas, Universidad Técnica Particular de Loja, San Cayetano Alto s/n, Loja 11-01-608, Ecuador
2 Theoretical and Experimental Epistemology Lab, School of Optometry and Vision Science, University of Waterloo, Waterloo, ON N2L3G1, Canada
3 Instituto de Instrumentación para Imagen Molecular (i3M), Universitat Politècnica de València—Consejo Superior de Investigaciones Científicas (CSIC), E-46022 Valencia, Spain
4 Departamento de Radiología, Hospital UTPL, Loja 11-01-608, Ecuador
5 Departments of Physics, Electrical and Computer Engineering, and Systems Design Engineering, University of Waterloo, Waterloo, ON N2L3G1, Canada
* Authors to whom correspondence should be addressed.
Appl. Sci. 2025, 15(5), 2830; https://doi.org/10.3390/app15052830
Submission received: 22 January 2025 / Revised: 21 February 2025 / Accepted: 23 February 2025 / Published: 6 March 2025
(This article belongs to the Special Issue MR-Based Neuroimaging)

Featured Application

This work explores diverse deep learning models (UNet, SAM, YOLOv8, and Detectron2) for detecting, segmenting, and classifying small brain lesions from MRI images, particularly White Matter Hyperintensities (WMH), which can significantly enhance clinical diagnostics for ischemic and demyelinating diseases.

Abstract

According to the World Health Organization (WHO), peripheral and central neurological disorders affect approximately one billion people worldwide. Ischemic stroke and Alzheimer’s disease and other dementias are the second and fifth leading causes of death, respectively. In this context, detecting and classifying brain lesions constitutes a critical area of research in medical image processing, with a significant impact on clinical practice. Traditional lesion detection, segmentation, and feature extraction methods are time-consuming and observer-dependent. Machine and deep learning methods applied to medical image processing have therefore become crucial tools: they automatically learn hierarchical features and support more accurate and faster diagnosis, treatment, and prognosis of diseases. This project aims to develop and implement deep learning models for detecting and classifying small brain white matter hyperintensity (WMH) lesions in magnetic resonance images (MRI), specifically lesions related to ischemic and demyelinating diseases. UNet and the Segment Anything Model (SAM) were applied for segmentation, while YOLOv8 and Detectron2 (based on Mask R-CNN) were applied to detect and classify the lesions. Experimental results show Dice coefficients (DSC) of 0.94, 0.50, 0.241, and 0.88 for the segmentation of WMH lesions using UNet, SAM, YOLOv8, and Detectron2, respectively. The Detectron2 model achieved an accuracy of 0.94 in detecting and 0.98 in classifying lesions, including small lesions where other models often fail. The methods developed provide a framework for the detection, segmentation, and classification of small brain lesions with irregular morphology and could significantly aid clinical diagnostics, providing reliable support for physicians and improving patient outcomes.

1. Introduction

Neurological disorders affect as many as a billion people worldwide [1], with ischemic stroke and Alzheimer’s disease and other dementias ranking as the second and fifth leading causes of death, respectively.
White Matter Hyperintensities (WMHs) are lesions localized in the periventricular and deep white matter [2,3,4,5]. They are a hallmark of brain lesions associated with vascular and neuroinflammatory conditions, including ischemic damage, demyelination, and axonal loss [5,6].
There are two categories of stroke: ischemic and hemorrhagic [6]. Ischemic stroke accounts for 87% of all strokes and occurs when a small-caliber vessel (10 to 400 μm) supplying the brain is obstructed or blocked. Hemorrhagic strokes are caused by bleeding into brain tissue when a blood vessel bursts [7,8]. According to [9], cerebral ischemia follows a temporal continuum that begins with hyperacute symptoms and progresses through acute, subacute, and chronic stages.
Demyelinating disorders are also associated with WMHs; they involve the destruction or loss of the myelin sheath and of myelin-supporting cells such as oligodendrocytes and Schwann cells in the central and peripheral nervous systems [10]. Myelin is a laminated membrane that concentrically and repeatedly surrounds axons, with a spacing of approximately 12 nm; it insulates peripheral nerves and the nerves of the brain, spinal cord, and eyes [11], allowing them to send signals and conduct electrical impulses normally and efficiently. Damage to myelin compromises neural transmission and can cause neurological deficits such as vision changes, weakness, altered sensation, and cognitive problems [12]; demyelination is also the principal mechanism underlying multiple sclerosis (MS) [13].
Medical imaging plays a fundamental role in identifying, diagnosing, and monitoring brain lesions, helping clinicians distinguish between the lesion types associated with different conditions. In particular, WMHs can be visualized with Magnetic Resonance Imaging (MRI), which allows the identification and quantification of microvascular lesions and silent ischemic strokes [14]. Among MRI modalities, FLAIR (Fluid-Attenuated Inversion Recovery) imaging [5,15,16] is one of the most effective for detecting small WMH lesions related to ischemia and demyelination [17,18,19,20]. However, accurately distinguishing these two types of lesions is a complex task, especially for an inexperienced radiologist, due to overlapping intensities and morphological variations in FLAIR MR scans.
Traditional methods of lesion detection, segmentation, classification, and manual annotation in medical images suffer from high observer variability, time-consuming workflows, subjective interpretation, and limited generalization across patient datasets [20,21]. These techniques often fail to capture small or irregularly shaped lesions, especially in conditions such as ischemia and demyelination [22].
Research in artificial intelligence (AI) and its image processing methods, such as deep learning (DL), has become a crucial tool for faster and more accurate diagnosis, treatment, and prognosis of diseases [23]. For brain disorders in particular, DL reduces the time spent on detection, segmentation, classification, and lesion localization, helps analyze complex brain imaging data, and generalizes better than traditional methods [20]. Moreover, the adaptability of DL models through fine-tuning and transfer learning enables robust performance even on datasets with limited annotations [24].
However, despite the advancements in AI for medical imaging, few studies have focused on distinguishing ischemic stroke-related WMHs from demyelination-related WMHs using DL methods, principally because detecting small and subtle WMH lesions is a significant challenge, especially in automated image analysis. The variability of these lesions’ shape, size, and intensity further complicates detection, increasing the risk of misdiagnosis or delayed intervention [25].
This variability is compounded by the limited availability of labeled medical imaging data and the lack of robust segmentation performance on small lesions, which makes automated segmentation difficult and limits classification accuracy across lesion types.
This project aims to address these challenges by developing deep learning models for detecting, segmenting, and classifying small WMH lesions related to ischemic and demyelination diseases in FLAIR MRI. This task is crucial due to the similarity between these two diseases; a misdiagnosis by an untrained or inexperienced physician could lead to incorrect treatment. Therefore, this project seeks to provide the scientific and clinical community with a tool that assists in diagnosing these diseases, serving as a second opinion and as a training resource for identifying brain lesions.
Specifically, the project employs machine learning and deep learning techniques to learn lesion features and facilitate detection and classification. Given the limited amount of data available for algorithm development, several transfer learning approaches, classical data augmentation methods, and synthetic data augmentation methods were utilized. The methodology for detection and classification primarily involved the following models: (i) U-Net [26] and the Segment Anything Model (SAM) [27] to explore the segmentation task, and (ii) YOLOv8 (You Only Look Once, version 8) [28] and Detectron2 (Mask R-CNN-based) [29] for detection and classification.
The proposed models were trained and tested on the 2D slices of each FLAIR MRI series from a private dataset (80 volumes, 200 images) and the public MICCAI dataset (110 volumes, 2650 images). Because a large dataset was not available, transfer learning and fine-tuning techniques were explored to leverage models pre-trained on COCO (for YOLO and Detectron2) and on 1 billion masks from 11 million images (for SAM), adapting them to small medical datasets.
This paper is structured as follows. The next section summarizes related work on small WMH lesions, in this case those caused by ischemia and demyelination. The Methodology section describes the models and techniques applied to segment, detect, and differentiate the lesions. The Results section presents the evaluation, comparisons, and metric values for each model. Finally, the Discussion and Conclusions are presented.

2. Related Work

Recent AI-driven advances in medical imaging have improved segmentation, computer-aided diagnosis, classification, and lesion detection [30,31,32]. However, although several DL architectures have been proposed to address WMH segmentation and classification, manual segmentation and delineation of abnormal brain tissue remains the gold standard for lesion identification, particularly for neurological disorders such as stroke and demyelinating lesions.

2.1. Deep Learning in Brain Lesion Segmentation

UNet is a fully convolutional network (FCN) architecture consisting of a contracting path (encoder) and an expansive path (decoder), which give it a U-shaped structure. It has been widely used in biomedical image segmentation [33] because its encoder–decoder structure enables precise boundary detection.
In this sense, Kumar et al. [22] proposed an enhanced combination of U-Net with fractal networks, reporting an accuracy of 0.9908 and a DSC of 0.8993 for stroke penumbra estimation (SPEC) and an accuracy of 0.9914 and a DSC of 0.883 for sub-acute ischemic stroke lesions (SISS), using the public ISLES 2015 and ISLES 2017 databases. Similarly, Clèrigues et al. [34], using the same database (ISLES) and a U-Net-based 3D CNN architecture with 32 filters, obtained DSC values of 0.59 for SISS and 0.84 for SPEC.
Other studies have explored alternative architectures, such as the u-shaped residual network (uResNet) for WMH differentiation proposed by Guerrero et al. [35], which yielded a DSC of 69.5% for WMHs and 40.0% for ischemic stroke. Mitra et al. [36] reported a DSC of 0.60 for WMH lesions using a classical Random Forest (RF) classifier, highlighting the need for deep learning-based solutions. Likewise, Ghaforian et al. [37] obtained a sensitivity of 0.73 using a combination of AdaBoost and RF algorithms for WMHs related to cerebral small-vessel disease (SVD).
To emphasize relevant features and suppress irrelevant ones, and thereby improve segmentation accuracy in regions with complex boundaries, [38] proposed a segmentation model for small demyelination lesions that adds Context Information Weighting Fusion (CIWF) and Modified Channel Attention (MCA) components to the classic UNet. The results showed a DSC of 73.81, a precision of 75.28, a recall of 74.84, and an AUC of 87.28, using a private FLAIR dataset of 204 pediatric patients.
Along similar lines, Zhou et al. [39] combined U-Net segmentation for pixel-level structural information with deep convolutional networks to classify demyelinating diseases. The reported precision, recall, accuracy, and AUC values were 96.96%, 96.96%, 99.19%, and 96.66%, respectively, and a DSC of 71.1% was achieved for segmentation.

2.2. Emerging Models for Small Lesion Segmentation

An emerging model that could be useful for small lesion segmentation is the Segment Anything Model (SAM), developed by Meta AI’s FAIR lab and released in April 2023 [27]. It is a vision transformer (ViT)-based segmentation model [27,40,41,42,43] trained on 1 billion masks from 11 million images, primarily natural images with prominent edge details. SAM is a prompt-based, state-of-the-art image segmentation architecture that automatically generates precise object masks: it receives points and boxes indicating the pixel-level semantics and region-level positions of the target objects.
While SAM excels in natural image segmentation, its effectiveness in medical imaging is still under evaluation. As noted by Mazurowski et al. [44], “SAM cannot be used the same way as most segmentation models in medical imaging where the input is simply an image and the output is a segmentation mask or multiple masks for the desired object or object” [44].
Mazurowski et al. [44] proposed three different ways of applying the SAM model to medical images: (i) semi-automated annotation, where SAM can serve as a valuable tool for faster annotation and mask generation; (ii) assisting other segmentation models, where SAM acts as an “inference model” that helps another algorithm automatically segment images and generate masks for a subsequent classification step; and (iii) new medical image foundation segmentation models, i.e., building a specific model from scratch guided by SAM’s development process.
Jun Ma et al. [40] proposed MedSAM, a foundation model trained on a large-scale medical image dataset of 1,570,263 image–mask pairs covering 10 imaging modalities and over 30 cancer types. Huang et al. [45] tested SAM on the large COSMOS 1050K dataset, a collection of public medical datasets. Their findings provide a global view of SAM in medical image segmentation: depending on the segmentation task, the model is more or less effective. More detailed examples can be found in [40,41,44,46,47,48,49,50,51,52,53]. Zhang et al. [54] provide a repository grouping recent research on SAM in medical imaging [55]. Several studies conclude that SAM could improve semi-automatic segmentation in the medical field, yielding more accurate annotations in less time.

2.3. Object Detection and Classification Models in MRI

Object detection techniques also use transfer learning to detect and classify brain lesions or tumors. One model that uses this approach is YOLO, introduced in 2015 by J. Redmon et al. [56]; it detects objects within images through bounding boxes and class probabilities predicted by a neural network that treats object detection as a regression problem. According to [57], YOLO offers high performance in terms of both speed and accuracy.
In medical imaging, YOLO has been employed to detect and localize, e.g., anatomical structures [58,59], tumors [57,60], breast cancer lesions [61,62], fundus lesions [63], skin lesions [64], and cells [65]. In brain MRI, recent studies have focused mainly on brain tumors [57,66]. Ragab et al. [59] summarized the applications of YOLO in medical object detection from 2018 to 2023.
In this sense, a study [66] using YOLOv8 for brain cancer detection and localization reported a precision of 0.943, a recall of 0.932, and an mAP@0.5 of 0.941. Another study [67] that used the YOLOv7 model with transfer learning reported 99.5% accuracy for identifying the presence and precise location of brain tumors in MRI images.
Another model that can achieve high-precision segmentation and object detection is Detectron2, an open-source object detection system from Facebook AI Research (FAIR) [29] implemented in the PyTorch framework (version 1.7 at the time of writing). Detectron2 is a comprehensive reconstruction of Detectron, built on the Mask R-CNN benchmark [68]. It includes models such as Faster R-CNN [69], Mask R-CNN [68], RetinaNet [70], and DensePose [71] and supports semantic segmentation and panoptic segmentation (a combination of instance and semantic segmentation).
Several studies have used Detectron2 for medical imaging; e.g., Chincholi and Koestler [72] reported an accuracy of 99.34% for segmentation and 99.22% for detection of hemorrhagic lesions in diabetic retinopathy. Salh and Ali [73] used the Detectron2 model to detect breast tumors in MRI and reported an accuracy of 98.6%. Regarding brain tumor detection, Dipu et al. [74] found that Detectron2 gives an mAP@0.5 score of 91.51% compared with other algorithms.

3. Materials and Methods

Figure 1 presents a flowchart that illustrates the proposed methodology of this project.

3.1. Dataset

The experiments were conducted using public and private datasets. The public dataset was collected from the WMH segmentation MICCAI challenge, which is publicly available and downloadable after registration on the challenge’s web page [75]. A private hospital in Ecuador provided the private dataset. Table 1 summarizes the details of the datasets.
Table 1 reports the number of volumes and the number of slices collected from all MRI series. In this project, the algorithms were trained on the slices of each image stack; consequently, the total number of images used in the training and validation process is 2850.
Of these, 70% of the images were used for training, 20% for validation, and 10% for testing. The split was performed randomly, ensuring that each subset contained a representative sample of lesion types and sizes. Custom feature engineering was also applied. Figure 2 shows example slices from the datasets used.
One important observation from Table 1 is the variability of the equipment used to acquire the public and private data. It is therefore necessary to preprocess the volumes and all images to obtain better segmentation and classification results.

3.1.1. Analysis of Volumes and Slices Dataset

Due to the limited dataset and computational resources, we used the slice images of each patient. The slices containing lesions were identified in each volume beforehand so that only these images were used.
This approach reduces the bias and noise introduced by slices without lesions and consequently improves the training of the detection, segmentation, and classification models.
A detailed analysis of the private data from patients with ischemia and/or demyelination also showed that the lesions are not continuous between slices: their shapes and locations change from slice to slice. Consequently, reconstructing a lesion volume suitable for 3D processing with good results is difficult.
Figure 3 shows a patient with lesions of ischemia and demyelination in different slices and how these lesions change between slices.

3.1.2. Data Preprocessing

Preprocessing simplifies the learning process in subsequent stages [76]. However, it is important to remember that preprocessing medical images differs from preprocessing natural images, especially in MRI, where several factors must be considered, e.g., (i) image intensity reflects tissue type, (ii) image intensity is relative, and (iii) the intensity range is not bounded. More details about these aspects can be found in [21,76,77,78,79].

Artifacts and Noise Reduction

To minimize the noise and artifacts in the volumes of our dataset, the CurvatureFlow filter was applied to denoise and improve the volumes. CurvatureFlow is a filter based on the contours of equal intensity within the image [80]: it smooths flat, noisy areas through the evolution of the volume while preserving and highlighting the structural contours (see Figure 4).
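As a reference, the snippet below sketches this denoising step with SimpleITK’s curvature flow filter; the file names, time step, and number of iterations are illustrative assumptions rather than the exact settings used in this project.

```python
import SimpleITK as sitk

# Load a FLAIR volume (path is a placeholder).
volume = sitk.ReadImage("flair_volume.nii.gz", sitk.sitkFloat32)

# Curvature flow smoothing: evolves iso-intensity contours so that flat,
# noisy regions are smoothed while structural edges are preserved.
denoised = sitk.CurvatureFlow(image1=volume, timeStep=0.125, numberOfIterations=5)

sitk.WriteImage(denoised, "flair_denoised.nii.gz")
```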

Bias Field Correction

Bias field correction is necessary because field inhomogeneities can introduce variations in the intensity of the MRI signal. The principal reason for applying it here is that our dataset combines acquisitions at different magnetic field strengths (3 T and 1.5 T). This technique helps standardize the images, making them more comparable and improving the reliability of subsequent analyses.
In this case, the N4ITK algorithm was used to estimate the bias field; it applies an iterative process to optimize and converge the correction. Details of how N4ITK bias field correction works can be found in [81,82].
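A minimal SimpleITK sketch of N4 bias field correction is shown below; the Otsu-based head mask and the per-level iteration counts are illustrative assumptions.

```python
import SimpleITK as sitk

image = sitk.ReadImage("flair_denoised.nii.gz", sitk.sitkFloat32)

# A rough head mask restricts the bias estimation to tissue voxels.
mask = sitk.OtsuThreshold(image, 0, 1, 200)

corrector = sitk.N4BiasFieldCorrectionImageFilter()
corrector.SetMaximumNumberOfIterations([50] * 4)  # iterations per fitting level (illustrative)
corrected = corrector.Execute(image, mask)

sitk.WriteImage(corrected, "flair_n4.nii.gz")
```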

Image Normalization

Once the bias field is estimated, the next step is to make the intensities comparable between images. Each image is normalized by dividing the original image by the estimated bias field. This correction standardizes the intensity values across the entire image, making it more uniform and reducing the impact of inhomogeneities [83].
In this work, Z-score normalization was used for intensity normalization instead of Max–Min normalization, because the dataset used for our purposes contains only the FLAIR modality of MRI brain volumes.
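For reference, the Z-score transform applied to each FLAIR image x uses the mean μ and standard deviation σ of its voxel intensities (whether these statistics are computed over the whole image or only over brain voxels is an implementation detail):

\[ x_{\text{norm}} = \frac{x - \mu}{\sigma} \]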

Resampling and Spacing Normalization

After normalization, resampling was performed so that all volumes have the same voxel size and spatial resolution. According to the literature, this step is important to ensure that lesion characteristics are consistent across volumes. However, as discussed in the previous section, we keep in mind that the lesions are not continuous throughout the volumes.

The UNet Preprocessing

The preprocessing strategy used for the UNet approach was based on the MONAI (Medical Open Network for AI) platform [84]. MONAI is a PyTorch-based, open-source framework developed by NVIDIA and King’s College London for deep learning in healthcare imaging [84,85]. The transform functions used were (a minimal composed pipeline is sketched after this list):
  • LoadImaged: loads the *.nii (NIfTI) files.
  • ToTensord: converts the transformed data into torch tensors so they can be used for training.
  • Resized: gives the images of all patients the same dimensions.
  • AddChanneld: adds a channel to the image (volume) and label, in agreement with our classes; in this case, 0, 1, and 2 for background, ischemia, and demyelination, respectively.
  • Spacingd: resamples the voxel dimensions so that all volumes share the same spacing, regardless of whether the medical images were acquired with the same or different scanners and therefore have different voxel dimensions (width, height, and depth).
  • ScaleIntensityRanged: adjusts the contrast and normalizes the voxel values between 0 and 1, which speeds up training.
  • CropForegroundd: crops out the empty regions of the image that are not required, leaving only the region of interest.
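The following is a minimal sketch of how such a dictionary-based MONAI pipeline can be composed; the keys, target spacing, intensity window, and output size are illustrative assumptions, and AddChanneld is written here as EnsureChannelFirstd, its replacement in recent MONAI releases.

```python
from monai.transforms import (
    Compose, LoadImaged, EnsureChannelFirstd, Spacingd,
    ScaleIntensityRanged, CropForegroundd, Resized, ToTensord,
)

train_transforms = Compose([
    LoadImaged(keys=["image", "label"]),                           # read the *.nii / *.nii.gz files
    EnsureChannelFirstd(keys=["image", "label"]),                  # add the channel dimension
    Spacingd(keys=["image", "label"], pixdim=(1.0, 1.0, 1.0),      # unify voxel spacing
             mode=("bilinear", "nearest")),
    ScaleIntensityRanged(keys=["image"], a_min=0, a_max=500,       # window and rescale intensities to [0, 1]
                         b_min=0.0, b_max=1.0, clip=True),
    CropForegroundd(keys=["image", "label"], source_key="image"),  # drop empty background regions
    Resized(keys=["image", "label"], spatial_size=(256, 256, 64),  # illustrative common volume size
            mode=("trilinear", "nearest")),
    ToTensord(keys=["image", "label"]),                            # convert to PyTorch tensors
])
```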

3.1.3. Data Augmentation

Classical Data Augmentation

To obtain a larger dataset, data augmentation techniques were applied through the MONAI platform [84,85,86,87,88]. These techniques include random cropping, rotation, and scaling of image intensities to the range 0 to 1. Figure 5 shows some examples of augmented data.
As seen in Figure 5, the data augmentation produces some artifacts and noise around the images; therefore, additional preprocessing is required before the segmentation process.
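A hedged sketch of comparable MONAI random transforms is given below; the probabilities, rotation range, and crop size are illustrative (not the exact values used in this project) and assume 2D slice inputs.

```python
from monai.transforms import (
    Compose, RandRotated, RandSpatialCropd, RandFlipd, RandScaleIntensityd,
)

augment = Compose([
    RandRotated(keys=["image", "label"], range_x=0.26, prob=0.5,    # up to ~15 degrees (radians)
                mode=("bilinear", "nearest")),
    RandSpatialCropd(keys=["image", "label"], roi_size=(224, 224),  # random crop of the slice
                     random_size=False),
    RandFlipd(keys=["image", "label"], spatial_axis=0, prob=0.5),   # random flip
    RandScaleIntensityd(keys=["image"], factors=0.1, prob=0.5),     # mild intensity scaling
])
```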

GAN Data Augmentation

The idea of using GANs in this project was to enlarge our dataset with synthetic images and thereby build a more robust segmentation and classification model for the lesions. The synthetic images obtained were not good enough to include in our dataset; however, the process is described here to support further research on this topic in future work.
Generative Adversarial Networks (GANs), introduced by Goodfellow et al. [89,90], are well accepted in computer vision for generating synthetic natural images but remain a research challenge for medical images. More details and literature on this topic can be found in [91,92,93].
The spectral normalization GAN (SNGAN) technique was applied to generate additional images of ischemia and demyelination lesions. SNGAN introduces weight normalization (spectral normalization) to enhance the training stability of the discriminator network, which serves as the foundation for synthetic image generation [94,95].
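The sketch below shows how spectral normalization is typically attached to the discriminator layers in PyTorch; the layer sizes and depth are illustrative and do not reproduce the SNGAN configuration of Table 2.

```python
import torch.nn as nn
from torch.nn.utils import spectral_norm

# Spectral normalization constrains the largest singular value of each weight
# matrix, stabilizing the discriminator during adversarial training.
class Discriminator(nn.Module):
    def __init__(self, in_channels=1, features=64):
        super().__init__()
        self.net = nn.Sequential(
            spectral_norm(nn.Conv2d(in_channels, features, 4, stride=2, padding=1)),
            nn.LeakyReLU(0.2, inplace=True),
            spectral_norm(nn.Conv2d(features, features * 2, 4, stride=2, padding=1)),
            nn.LeakyReLU(0.2, inplace=True),
            spectral_norm(nn.Conv2d(features * 2, features * 4, 4, stride=2, padding=1)),
            nn.LeakyReLU(0.2, inplace=True),
            spectral_norm(nn.Conv2d(features * 4, 1, 4, stride=1, padding=0)),  # real/fake score map
        )

    def forward(self, x):
        return self.net(x)
```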
The Fréchet Inception Distance (FID) and Kernel Inception Distance (KID) metrics were used to evaluate the synthetic data generated. These metrics quantify the difference between the feature-vector representations of the synthetic and real images.
FID compares the distributions of the original and synthetic images; better-quality images are indicated by lower FID scores [95]. KID measures the degree of visual similarity between the generated and real images; a lower KID value means higher visual similarity. To compare the skewness, mean, and variance, KID uses a cubic kernel [95]. Details of the KID and FID equations are given in [53,95,96].
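For reference, FID follows the standard formulation over the Inception-feature statistics of the real (r) and generated (g) images, modeled as Gaussians with means μ and covariances Σ:

\[ \mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^{2} + \mathrm{Tr}\!\left( \Sigma_r + \Sigma_g - 2\left( \Sigma_r \Sigma_g \right)^{1/2} \right) \]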
The FID and KID values for our project were 228.09 and 0.3415, respectively. With these values, the synthetic data generated are not useful for our purposes: the SNGAN results are not similar enough to the real data, and since what matters most in our case is obtaining images with lesions, not only the overall shape of the brain, the generated data could not be used to enlarge our dataset. Figure 6 shows the best synthetic data, generated at 275 epochs.
Table 2 summarizes the hyperparameters used for tuning the SNGAN model to generate synthetic images with brain lesions of ischemia and demyelination.

3.2. Models

This section presents the models applied for (i) segmentation of WMH lesions without classification and (ii) identification and classification to distinguish ischemia from demyelination lesions. The hyperparameters were selected based on a combination of values reported in the literature and empirical tuning.

3.2.1. UNet Model

To build this UNet model, the MONAI framework was used as the baseline for data loading, preprocessing, data augmentation, and model training.
In this model, each slice of each MRI volume is treated as a separate 2D image. We use multiple channels (filter sizes), i.e., 16, 32, 64, 128, and 256, allowing the model to learn features at multiple scales and enhancing its ability to distinguish between normal and pathological tissues. Figure 7 shows a schema of the UNet process, and the parameters used with the UNet neural network are shown in Table 3.
The model receives a FLAIR MRI volume as input, which is converted into a stack of 2D slices; each slice is passed through the network to obtain its segmentation mask. In Figure 7, the blue squares represent the encoder layers, which progressively reduce the spatial dimensions while increasing the number of channels. The green square in the middle represents the bottleneck layer, the deepest part of the network with the highest number of channels (256 in this case). The red squares represent the decoder layers, which progressively increase the spatial dimensions while decreasing the number of channels. The arrows between layers indicate the upsampling process, and the numbers give the number of channels (filters) in each layer.
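A minimal MONAI sketch of a 2D UNet with the channel progression described above is shown below; the number of residual units, the Dice loss, and the learning rate are illustrative assumptions (Table 3 lists the parameters actually used), and the three output channels follow the background/ischemia/demyelination labeling mentioned in the preprocessing.

```python
import torch
from monai.networks.nets import UNet
from monai.losses import DiceLoss

# 2D UNet with encoder/decoder channels 16-32-64-128-256.
model = UNet(
    spatial_dims=2,
    in_channels=1,          # single FLAIR channel
    out_channels=3,         # background, ischemia, demyelination
    channels=(16, 32, 64, 128, 256),
    strides=(2, 2, 2, 2),
    num_res_units=2,
)

loss_fn = DiceLoss(to_onehot_y=True, softmax=True)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
```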

3.2.2. Segmenting Anything Model (SAM)

This method uses SAM as a fine-tuned inference model to train and generate the segmentation masks of the brain lesions (ischemia and demyelination). Figure 8 gives an overview of the SAM model and its application in this project, and Table 4 shows the training parameters used with SAM.
The model consists of an image encoder that extracts image embeddings, a prompt encoder, and a mask decoder that predicts segmentation masks from the image and prompt embeddings. The SAM model was trained using the combined dataset (public + private: 2850 images).
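A minimal sketch of one common SAM fine-tuning setup is shown below, using the official segment_anything package; the checkpoint name, optimizer, and loss are assumptions, and the bounding-box prompts passed to the prompt encoder during training are omitted for brevity.

```python
import torch
from segment_anything import sam_model_registry

# Load pre-trained SAM weights (checkpoint file name is a placeholder).
sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")

# Freeze the heavy image encoder and the prompt encoder; fine-tune only the
# lightweight mask decoder on the lesion masks.
for p in sam.image_encoder.parameters():
    p.requires_grad = False
for p in sam.prompt_encoder.parameters():
    p.requires_grad = False

optimizer = torch.optim.Adam(sam.mask_decoder.parameters(), lr=1e-5)
loss_fn = torch.nn.BCEWithLogitsLoss()  # per-pixel mask loss (illustrative choice)
```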

3.2.3. YOLO Model

The architecture of YOLO consists of three main components: (i) the backbone, which extracts features and generates feature maps from the input images; (ii) the neck, which combines the feature maps from various layers of the backbone network and forwards them to the head; and (iii) the head, which processes the combined features and makes predictions for bounding boxes, object scores, and classification scores.
Here, YOLO version 8 (released in 2023) is used to detect lesions; according to [28], this version has better feature aggregation and a Mish activation function that improve detection accuracy and processing speed. Figure 9 shows the YOLO processing pipeline and the components used for this project.
Table 5 summarizes the fine-tuning hyperparameters for YOLOv8 used for this project.
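For reference, fine-tuning and inference with the ultralytics API can be sketched as follows; the dataset YAML, number of epochs, image size, and batch size are illustrative placeholders (see Table 5 for the hyperparameters actually used).

```python
from ultralytics import YOLO

# Start from the pre-trained segmentation weights used in this project.
model = YOLO("yolov8n-seg.pt")

# Fine-tune on a YOLO-format dataset described by a data.yaml file.
model.train(data="wmh_lesions.yaml", epochs=100, imgsz=640, batch=16)

# Inference with a lowered confidence threshold to favor sensitivity.
results = model.predict("test_slice.png", conf=0.2)
```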

3.2.4. Detectron2 Model

This project used Detectron2 with the Faster R-CNN model with a ResNet50+FPN backbone (R50-FPN) [97] and the Mask R-CNN model with a ResNet50+FPN backbone (R50-FPN) [98].
The Faster R-CNN (R50-FPN) [99] structure has three components: (i) the backbone network, which extracts feature maps from the input image; (ii) the Region Proposal Network (RPN), designed for object detection, which generates rectangular object proposals with corresponding scores from the input image; and (iii) the box head, which constitutes the region of interest (ROI) head and employs fully connected layers to refine the box placements and classify the objects.
The Mask R-CNN (R50-FPN) has an extra output branch compared with Faster R-CNN, which generates an object mask for each region of interest (ROI), thereby improving the spatial delineation of the objects through pixel-to-pixel alignment [68,71].
The RoIAlign layer in Mask R-CNN improves the accuracy of the features extracted from the ROIs by aligning them with the input image, ensuring accurate mask prediction and per-pixel spatial correspondence [71].
Figure 10 shows the architecture overview of Detectron2, and Table 6 shows the parameters used for Detectron2.
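A minimal Detectron2 configuration sketch for fine-tuning the Mask R-CNN R50-FPN baseline is shown below; the dataset names, learning rate, and iteration count are illustrative assumptions (see Table 6 for the parameters actually used).

```python
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultTrainer

cfg = get_cfg()
# Mask R-CNN with a ResNet50+FPN backbone, pre-trained on COCO.
cfg.merge_from_file(model_zoo.get_config_file(
    "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(
    "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")

cfg.DATASETS.TRAIN = ("wmh_train",)     # registered dataset names (placeholders)
cfg.DATASETS.TEST = ("wmh_val",)
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 2     # ischemia and demyelination
cfg.SOLVER.BASE_LR = 0.00025            # illustrative values
cfg.SOLVER.MAX_ITER = 3000

trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)
trainer.train()
```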

3.3. Tools and Computational Resources

All experiments in this project used the PyTorch framework, the MONAI platform, Python 3.10 and its libraries, and Jupyter Notebooks, and were run on Google Colab Pro with a Tesla V100-SXM2-16GB GPU. The time, processors, and epochs used with each model are reported in Appendix A, together with the list of open-source codes used as a baseline for developing the code of this project. The code and trained models generated in this project are available upon request on Google Drive: https://goo.su/KJy2hV (accessed on 20 January 2025) and on GitHub: https://github.com/dpcastillo/ISCDEMY (accessed on 20 January 2025).

4. Results

4.1. Label Analysis

This section presents a global view of the sizes, locations, and shapes of the ischemia and demyelination lesions before the segmentation, detection, and classification results. Figure 11 shows a correlogram of the lesions, from which we note:
  • In section (a), the scatter plot of (x, y) shows a correlation between the horizontal and vertical positions of the lesions in the brain, indicating that they are more likely to occur in certain brain areas. The x and y histograms show the frequency of lesion positions; the distributions likewise indicate that lesions are more frequently found in certain brain areas.
  • In section (b), the scatter plots of (x, width) and (y, width) show a more spread-out distribution with no strong pattern, indicating a weak or absent direct correlation between lesion positions and their widths.
  • In section (c), the scatter plot of (width, height) suggests a more dispersed pattern, indicating that the width and height of lesions are not strongly correlated: lesions come in various shapes and sizes.
  • The width and height histograms indicate that most lesions have small normalized dimensions, between 0 and 0.4, with most cases below 0.2; fewer lesions have larger dimensions.
This knowledge could be used to understand and propose a better fine-tuning of segmentation and detection algorithms.

4.2. Segmentation Results

4.2.1. UNet Segmentation

The UNet model achieves a mean Dice score (DSC) of 0.95 on the validation data. The loss curves measure how well the model predictions match the true labels; lower values indicate better performance. Figure 12 shows that the validation and training loss curves decrease consistently without significant fluctuations, indicating stable learning and good generalization.
As seen in Figure 12, after about 100 epochs the training loss stabilizes around 0.35, indicating that the model has learned most of the features and is refining its predictions. The validation loss also stabilizes around 0.35, slightly below the training loss, suggesting good generalization to unseen data. The absence of a significant gap between the two curves also suggests that the model is not overfitting.
Figure 13 gives some examples of the prediction results. We can see the original images with their corresponding masks and predictions.
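For reference, the DSC values reported in this section follow the standard definition over the predicted lesion mask P and the ground-truth mask G:

\[ \mathrm{DSC}(P, G) = \frac{2\,\lvert P \cap G \rvert}{\lvert P \rvert + \lvert G \rvert} \]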

4.2.2. SAM Model

To obtain the best result with SAM, we performed segmentation with the three sets of pre-trained weights from the official SAM repository: vit-base, vit-large, and vit-huge. A model pre-trained on medical images (MedSAM) was also used.
Figure 14 shows the training and validation loss curves for (a) vit-base, (b) vit-large, and (c) vit-huge. The analysis of these curves indicates the following:
  • Effective initial learning: all three graphs show a rapid initial drop in training loss, to ~0.002 for vit-base, from ~0.08 dropping sharply to stabilize around 0.002 for vit-large, and from ~0.012 stabilizing quickly around 0.002 for vit-huge. In general, this indicates effective learning from the start.
  • Low and stable training loss: consistently low training loss across all graphs suggests a good model fit to the training data.
  • Validation loss fluctuations: there is variability in the validation loss across all models, mostly between 0.004 and 0.006, which points to generalization challenges due mainly to the limited training data.
In general, these analyses indicate that the models learn effectively and fit the training data well but must address the validation loss fluctuations to improve their generalization capabilities and performance on unseen data.
Table 7 shows the Dice values obtained by training on our dataset with each of the three pre-trained models provided by SAM. The highest mean Dice value is 0.5.
Figure 15 shows that SAM segments better when the image contains only a few lesions that are not too small; in these cases the Dice values range from 0.60 to 0.80, although they drop to around 0.10 in some cases. When the lesions are very small and abundant in the image, the segmentation prediction is poorer, and the Dice and IoU values fall below 0.5. The best results are obtained with the “vit-large” pre-trained weights.
These results are in agreement with other published results indicating that SAM does not achieve satisfactory results when segmentation is fully automated, due to the lack of training on medical data [45]: although SAM was trained with 11 M natural images, medical images exhibit extensive variability. Some recent studies also state that SAM improves segmentation when used with the prompt-based method [40,44,51,53].

4.2.3. YOLO Model for Detecting WMH Lesions

In this first experiment, we trained the model for lesion identification only, using 2850 images. Two pre-trained models were used, “yolov8n-seg.pt” and “yolov8x-seg.pt”; the best results were obtained with “yolov8n-seg.pt”.
The confusion matrix in Figure 16 indicates that 24% of background regions are detected as lesions and that 76% of the lesions are not detected. This suggests that, although the model confirms lesions with high precision when it does detect them, it needs significant improvement both in reducing false positives and, more critically, in increasing its sensitivity so that fewer lesions are missed. Figure 17 presents graphs of the performance of the proposed model for the detection and localization of the lesions.
Figure 17 illustrates the analysis of the training and validation function loss and metrics of precision and recall of the lesion identification. Here are more details about it:
  • The curves of “train/box_loss” and “val/box_loss” show the training and validation loss related to the bounding box predictions. As the loss function decreases, it signifies that the network is effectively learning and enhancing its capability to precisely predict well-fitted bounding boxes.
  • The “train/seg_loss” and “val/seg_loss” represent the training and validation segmentation loss.
  • The “train/cls_loss” and “val/cls_loss” refer to the training and validation classification loss, which evaluates the classification accuracy of each predicted bounding box.
  • The “train/dfl_loss” and “val/dfl_loss” indicate the training and validation distribution focal loss, reflecting the model’s confidence in its predictions.
  • The graph of “metrics/precision(B)” and “metrics/precision(M)” indicates the precision for bounding box predictions and precision for mask predictions, respectively.
  • The “metrics/recall(B)” and “metrics/recall(M)” are the recall for bounding box predictions and recall for mask predictions, respectively. This indicates the ability to identify true positive masks.
  • The “metrics/mAP50(B)” and “metrics/mAP50(M)” are the mean average precision at 50% IoU (Intersection over Union) for bounding boxes and for masks, respectively.
  • The “metrics/mAP50-95(B)” and “metrics/mAP50-95(M)” are the mean average precision at IoU thresholds from 50% to 95% for bounding boxes and for masks, respectively.
The analysis detailed in Figure 17 shows that the training and validation losses for bounding box prediction, segmentation, and classification all decrease, indicating effective learning and good generalization. Ideally, the loss should decrease toward zero as the number of epochs increases; however, the loss fluctuations indicate that the fine-tuning may need further refinement.
Precision and recall metrics for masks and bounding boxes show steady upward trends, which reflect an increase in accuracy and true positive rates. The mean average precision metrics highlight enhanced detection and segmentation performance across various IoU thresholds.

4.2.4. Detectron2 for Detecting WMH Lesions

Figure 18 shows some examples of lesion detection by the model. Here, the model detects only the lesions (without classification) and uses the full dataset of 2850 images. The reason for not classifying is that the public dataset contains WMH lesions without any specification of disease type.
In Table 8, we can see the mean values of the metrics from function loss and the accuracy of the Detectron2 model for lesion detection. These values are complemented by the corresponding graphs in Figure 19 and Figure 20.
All loss values (‘total_loss’, mean value 0.843; ‘loss_box_reg’, mean value 0.312; ‘loss_cls’, mean value 0.120; ‘loss_mask’, mean value 0.245; ‘loss_rpn_cls’, mean value 0.017; ‘loss_rpn_loc’, mean value 0.132) indicate that the model has learned the essential features needed to identify and detect the lesions. The trends over the iterations indicate stabilization with fluctuations due to the fine-tuning process.
The accuracies (‘fast_rcnn/cls_accuracy’, mean value 0.946, and ‘mask_rcnn/accuracy’, mean value 0.887) indicate that the detection performs excellently in identifying lesions, while the segmentation accuracy is also substantial but slightly lower, indicating good performance in delineating the lesions. This is complemented by the low rates of false positives (mean 0.065) and false negatives (mean 0.227).

4.3. Detection and Classification

4.3.1. YOLOv8 Model for Detection and Classification

This experiment uses only the private dataset (220 images) to perform detection and classification simultaneously; the model was therefore fed with ischemia and demyelination lesions. Figure 21 shows a correlogram of the balanced instances (“ische” and “demy”), along with the sizes and locations of the lesions.
The YOLO model was trained with different parameters, epochs, and pre-trained models, i.e., “yolov8x-cls.pt”, “yolov8n-seg.pt”, and “yolov8x-seg.pt”, for transfer learning. The best result was obtained with the segmentation-pretrained model “yolov8n-seg.pt”.
The curves from Figure 22 show that the “train/box_loss” fluctuates initially but shows a general downward trend, indicating that the model is improving its bounding box predictions over time. The “train/seg_loss” is a noticeable downward trend with some fluctuations, suggesting that the model is learning to segment brain lesions better as training progresses. The “train/cls_loss” starts high and decreases significantly, stabilizing at a lower value, indicating that the model is effectively learning to classify the lesions. The “train/dfl_loss” shows a downward trend with fluctuations, indicating improvement in the model’s confidence in predictions.
For the validation loss, the “val/box_loss” shows initial fluctuations but stabilizes with a slight downward trend, suggesting the model generalizes well to unseen data. The “val/seg_loss” follows a similar pattern to the training segmentation loss, indicating consistency between training and validation performance. The “val/cls_loss” decreases over time, though it remains slightly higher than the training classification loss, indicating some degree of overfitting. The “val/dfl_loss” shows similar trends to the training loss, suggesting consistent learning across both sets.
The “metrics/precision(B)” graph shows significant fluctuations but trends upward, indicating improving precision over time. The “metrics/recall(B)” shows fluctuating values but upward trends, suggesting the model is improving at identifying true positives. The “metrics/precision(M)” shows an upward trend with fluctuations, indicating improving accuracy in segmentation masks. The “metrics/recall(M)” has values that fluctuate but increase, indicating the model’s improved ability to identify accurate positive masks.
The “metrics/mAP50(B)” steadily increases, indicating better overall detection performance. The “metrics/mAP50-95(B)” also trends upward, suggesting improving detection accuracy across varying IoU thresholds. The “metrics/mAP50(M)” shows an increase, indicating better segmentation performance. The “metrics/mAP50-95(M)” shows improvement, suggesting better segmentation accuracy across varying IoU thresholds.
Complementing this with a detailed analysis of Figure 23, the confusion matrix suggests that the model performs well in distinguishing the major lesion types; however, misclassifications still occur, particularly for small lesions and lesions with low contrast or irregular morphology.

4.3.2. DETECTRON2 Model for Detection and Classification

Figure 24 shows the model’s loss and accuracy when trained with the private dataset to detect and classify the lesions.
Table 9 shows the mean values of the metrics from function loss and the accuracy of the Detectron2 model for lesion detection and classification. These values are complemented by the corresponding graphs in Figure 25, Figure 26 and Figure 27. We note that:
  • The total loss (‘total_loss’, mean value 0.300) indicates that the model has learned the essential features needed for classification and stabilizes with minor fluctuations.
  • The box regression loss (‘loss_box_reg’, mean value 0.072) decreases as the model learns to predict better bounding boxes and then stabilizes at a low value, indicating that the model has become proficient in predicting bounding box coordinates.
  • The classification loss (‘loss_cls’, mean value 0.030) decreases and stabilizes over the iterations, suggesting that the model has learned to classify most of the lesions correctly.
  • The mask loss (‘loss_mask’, mean value 0.154) decreases and stabilizes, indicating consistent performance in mask prediction.
  • The RPN classification loss (‘loss_rpn_cls’, mean value 0.007) stabilizes, meaning that the model has learned to propose regions accurately in the region proposal classification task.
  • The RPN localization loss (‘loss_rpn_loc’, mean value 0.029) stabilizes at a low value, indicating the model’s proficiency in localizing regions.
  • The classification accuracies (‘fast_rcnn/cls_accuracy’ and ‘mask_rcnn/accuracy’) improve over the iterations and stabilize at 0.98 and 0.93, respectively, meaning that the model classifies lesions correctly most of the time (fast_rcnn/cls_accuracy) and has high accuracy in mask prediction (mask_rcnn/accuracy).
Figure 26 shows the false positive and false negative rates; analyzing this graph together with Table 8, the mean rates are 0.100 for false negatives and 0.050 for false positives. These values mean that the proposed model usually classifies the lesions with high accuracy, showing effective learning and convergence.
In summary, these graphs suggest the model has good convergence and robustness, showing stable and low losses across all loss metrics and high accuracy metrics.

4.4. Comparison of Proposed Segmentation Models

Table 10 summarizes the average DSC values for lesion segmentation obtained with each model evaluated in this work. The UNet model (0.95) performs best, followed by Detectron2 (0.887), SAM (0.50), and YOLOv8 (0.264).
Table 11 summarizes the results of some recent works concerning the segmentation of WMHs compared to the proposed models in this project. We can note that our proposed UNet model stands out for its DSC value (0.95).

4.5. Brief Comparison Results Between Expert Criteria and YOLO and Detectron2 Models for Classification of Ischemia and Demyelination Lesions

This section compares the detection and classification of ischemia and demyelination lesions by the YOLOv8 and Detectron2 models against the assessments of two expert radiologists, each with more than ten years of experience.
Table 12 shows the visual and statistical comparison of the 21 test images/lesions and their detection and classification using the Detectron2 and YOLOv8 models against the expert radiologists. As seen in Figure 28, Figure 29 and Figure 30, the best results for classifying and detecting the ischemia and demyelination lesions are obtained with the Detectron2 model.
In Table 12, the Kappa value (0.809) and the corresponding p-value (0.0534) indicate substantial agreement between the expert criteria and the Detectron2 model, with marginal significance, while for YOLOv8 the Kappa value (0.0350) and the p-value (0.0001) show poor agreement with the expert criteria and a significant difference. In conclusion, as seen in Figure 26, the Detectron2 model performs better than YOLOv8; its metrics do not match the expert criteria but are relatively close to them. In this sense, this model could be implemented in clinical environments in a CAD system or as a second opinion in the diagnosis.
Figure 31 shows the Receiver Operating Characteristic (ROC) curves comparing the classifiers’ performance. The Area Under the Curve (AUC) for Detectron2 (AUC = 0.929) is higher than for YOLOv8 (AUC = 0.524), which means that the Detectron2 model performs well, close to the experts (AUC = 0.976), in identifying and classifying the lesions.
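As an illustration of how such agreement statistics can be computed, the snippet below uses scikit-learn; the label encoding and the example values are placeholders, not the data behind Table 12.

```python
from sklearn.metrics import cohen_kappa_score, roc_auc_score

# Placeholder labels: 0 = ischemia, 1 = demyelination (encoding is an assumption).
expert_labels = [0, 0, 1, 1, 0, 1, 1, 0]                         # classes assigned by the radiologists
model_labels = [0, 0, 1, 0, 0, 1, 1, 0]                          # classes predicted by the model
model_scores = [0.10, 0.20, 0.90, 0.40, 0.30, 0.80, 0.70, 0.20]  # model confidence for class 1

kappa = cohen_kappa_score(expert_labels, model_labels)           # chance-corrected agreement
auc = roc_auc_score(expert_labels, model_scores)                 # area under the ROC curve
print(f"Cohen's kappa = {kappa:.3f}, AUC = {auc:.3f}")
```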
Table 13 presents a comparative analysis between the proposed method and existing approaches in the literature. These results indicate that the Detectron2 model performs well in detecting and classifying ischemia and demyelination lesions, considering the small size and irregular morphology of the lesions. Its performance is similarly strong in identifying WMH lesions without classification (see Table 8 and Figure 18).

5. Discussion

This work introduced a deep learning-based approach for detecting, segmenting, and classifying White Matter Hyperintensities (WMHs) related to ischemia and demyelination diseases. According to the literature, most works focus primarily on the segmentation process [34,35,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126], including a MICCAI challenge that concluded in 2023 [75,121]. However, Liu et al. [38] noted that most methods are unsuitable for segmenting lesions of small size with complex boundary shapes and diffuse edges, as is the case for demyelination lesions or early-stage ischemia lesions. Our approach incorporates detection and classification, providing a comprehensive framework for lesion analysis.
Regarding the segmentation task explored in this work, the proposed UNet model performs well. As Table 11 shows, the DSC values of different works [34,38,100,101,102,103] range between 0.67 and 0.95, and the UNet network demonstrates its ability to extract the lesions’ feature maps and thereby obtain good predictions of the lesions. In our case, to improve the segmentation and reduce possible sources of noise that could affect the model, we selected only the range of slices of each volume that contain lesions with their corresponding annotations, achieving a mean Dice coefficient of 0.95 (see Table 10). Nevertheless, the model still has difficulty segmenting the smallest lesions.
Several studies have proposed the use of 3D UNets and their variants, such as Soleimani et al. [117], who used a 3D UNet for segmenting strokes and reported an IoU score of 0.88. Along the same lines, Rudie et al. [126] proposed a 3D UNet for the segmentation of tumors and WMHs and reported mean Dice values of 0.62 for WMH segmentation and 0.92 for tumors. This project did not use a 3D UNet because the lesions are not continuous across slices, and it is difficult to construct a lesion volume without including false information.
Concerning the SAM model, and to the best of our knowledge from the literature search performed at the time of writing, there are no studies on the segmentation of WMHs using SAM. There are, however, several works on the detection of brain tumors [127], breast lesions in ultrasound images [47,52], and retinal fundus images [48]. To compare our results, we selected the paper by Peivandi et al. [127] on brain tumors.
In this project, the SAM model was tested in automatic mode, and our results agree with several works on brain lesion segmentation. Our mean Dice values lie in the range 0.3 to 0.85, consistent with the Dice values of 0.53 to 0.88 reported for tumor segmentation in [127,128]. In our project, the highest Dice values occur when the lesions are larger and relatively homogeneously distributed in the image (see Figure 15).
Some recent studies, such as [46], have also suggested that SAM is more likely to succeed in medical image segmentation in prompt mode, where the user indicates the region of interest to be segmented. Our proposed method could easily be adapted so that the user provides the prompt for the brain lesion, which would improve the annotations and yield better masks for subsequent processing.
Regarding segmentation, detection, and classification with YOLO and Detectron2, the results from the YOLOv8 model show that it detects large lesions better than small ones and also has difficulty with classification. This may be due to the amount of data used: we used only 220 images for classification, whereas, for example, Zhang et al. [111] proposed a methodology for detecting stroke lesions using Faster R-CNN, YOLOv3, and SSD (Single Shot Detector) on a private dataset of 5668 brain MRI images and reported a best detection precision of 89.77%, although they only detected large lesions without classification.
In this project, however, we used YOLOv8 to test, evaluate, and compare the classification and detection of the lesions. The confusion matrix (see Figure 19) shows that it is difficult for the model to distinguish between the two types of lesions due to the similarities in their appearance on MRI scans and the size of the lesions. According to the literature [59], YOLO is not good at detecting and classifying small lesions.
For that reason, another approach was proposed in which YOLOv8 only detects WMH lesions, without classification; in this case the model reports a DSC value of 0.264 (see Figure 16 and Table 10). The YOLO model achieves better results here but still needs improvement. Some studies [67,129] dealt with detecting brain tumors, where the lesions are larger, such as gliomas, meningiomas, and pituitary tumors, using YOLOv7 and YOLOv5 and reporting accuracies of 99.5% and 90%, respectively; in addition, the datasets used in those works are larger than the one used in this project. This is therefore a good starting point for further experiments in future studies with additional refinement, more training data, or other architectures. In this sense, and in agreement with [62,130,131], YOLO could improve its detection results when combined with other models, e.g., SAM.
Regarding the Detectron2 model, we can see from the results that this proposed model performs well in detecting (0.94) and classifying lesions (0.98), including small ones and with irregular morphology (see Figure 24 and Table 9). One advantage over the other is that Detectron2 also allows the delineation of the shapes of the lesions; in this sense, the results acquired have more confidence.
These values agree with others reported in the literature, e.g., in [116], where the goal was detecting the WMH lesions using MaskRCNN from Detectron2, which reports a value of 0.93 for the stroke dataset and 0.83 for the WMH dataset (see Table 13). In our work, our result for segmentation has a value of DSC of 0.887, and for detecting the lesions, WMH is 0.94 (see Table 8 and Table 13).
It is also important to mention that the YOLOv8 and Detectron2 models were configured so that the detection threshold can be selected and adjusted for clinical use. Depending on the task, the threshold can be tuned to obtain segmented lesions with a customized sensitivity based on clinical requirements. In this project, the YOLOv8 threshold was set to 0.2 to obtain better detection results.
For Detectron2, threshold values of 0.5 and 0.8 were used; the value of 0.8 (see Figure 29) yields more confident detections, which is appropriate given the sensitivity of medical data. The possibility of selecting a detection threshold also increases transparency for physicians, who can see how the system is operating when it classifies lesions.
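For reference, a minimal sketch of how such thresholds can be set with the public Ultralytics and Detectron2 APIs is shown below; the weight file names, the test image path, and the two-class setting are placeholders for this project’s configuration, not the exact training scripts used.

```python
import cv2
from ultralytics import YOLO
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor

# YOLOv8: the confidence threshold is passed at inference time.
yolo_model = YOLO("best.pt")                           # trained weights (placeholder)
yolo_results = yolo_model.predict("flair_slice.png", conf=0.2)

# Detectron2 (Mask R-CNN): the score threshold is part of the config.
cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file(
    "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.WEIGHTS = "model_final.pth"                  # trained weights (placeholder)
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 2                    # e.g., ischemia vs. demyelination
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.8            # stricter clinical threshold
predictor = DefaultPredictor(cfg)
outputs = predictor(cv2.imread("flair_slice.png"))
print(outputs["instances"].pred_classes, outputs["instances"].scores)
```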
Therefore, based on the comparison against the expert's criteria (see Table 12 and Figures 30 and 31), the Detectron2 model is the better approach for detecting and differentiating ischemic and demyelinating brain lesions.
The Kappa value (0.809) and corresponding p-value (0.0534) (see Table 12) indicate substantial agreement between the expert criteria and the Detectron2 model, with marginal significance, whereas for YOLOv8 the Kappa value (0.0350) and p-value (0.0001) show poor agreement with the expert criteria, with a significant difference. In conclusion, the Detectron2 model performs better than YOLOv8; its metrics do not depart much from the expert criteria and remain relatively close to those values. In this sense, the model could be implemented in a clinical environment within a CAD system or used as a second opinion in diagnosis.
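As an illustration, the agreement reported above can be computed with scikit-learn; the label lists below are hypothetical stand-ins for the expert annotations and the model predictions on the evaluation set, not the actual study data.

```python
from sklearn.metrics import cohen_kappa_score

# Hypothetical per-lesion labels: 0 = ischemia, 1 = demyelination.
expert_labels = [0, 1, 1, 0, 0, 1, 0, 1, 1, 0]   # radiologist's criteria
model_labels  = [0, 1, 1, 0, 1, 1, 0, 1, 1, 0]   # e.g., Detectron2 predictions

kappa = cohen_kappa_score(expert_labels, model_labels)
print(f"Cohen's kappa: {kappa:.3f}")
# The p-values reported in Table 12 come from a separate significance test
# (e.g., a permutation test over the label assignments), which is not shown here.
```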

Limitations and Challenges

Some important limitations derived from this work are as follows:
  • The limited quantity of data for the specific pathologies studied (ischemia and demyelination). To improve segmentation and reduce possible sources of noise that can affect the model, we selected only the range of slices of each volume containing lesions with their corresponding annotations. However, as with all deep learning methodologies, achieving optimal accuracy depends on the availability of large-scale datasets to ensure robust model performance and generalization.
  • Although we used the public database provided by the MICCAI challenge, we could not use the combined data to classify the lesions because the public data does not contain labels for the pathologies studied in this project. For that reason, the segmentation, detection, and identification of WMH lesions were carried out as a complementary experiment.
  • The variability of the public data, which was acquired on different equipment, may introduce bias and affect model performance. Another data limitation is that the private dataset was annotated by experts from a single health institution; according to the literature, annotation quality depends on the physician's expertise, in addition to the artifacts and noise produced by the equipment.
  • A detailed exploration of GANs would allow the dataset to be enlarged with synthetic images and a more robust segmentation and classification model to be built. This idea was attempted in this project; however, the synthetic images were not good enough to be included in our dataset. Therefore, this is a motivation for further research on this theme in future works.
  • Computational resources: running deep learning models based on 3D networks requires high-performance GPUs, which were not available for this study. A related limitation is that it was not possible to work with the full 3D FLAIR volume, mainly because of the characteristics of the images in our private dataset.

6. Conclusions

Unlike conventional works that focus on a single model, this paper explored several approaches (U-Net, SAM, YOLOv8, and Detectron2), which allowed a better understanding and comparison of their effectiveness in different lesion-related tasks: classifying and differentiating ischemic from demyelinating disease and identifying white matter hyperintensities.
It was determined that the best approach for building an automatic model to identify, differentiate, and classify the lesions is a 2D approach, due to the nature of the lesions and their non-continuous presence across the slices of the 3D volume.
For lesion segmentation, the UNet network and the SAM model were used. In the first case, an average DSC value of 0.95 was obtained on the validation image set. For SAM, transfer learning was performed with four pre-trained models: three trained on a large natural-image dataset (approximately 11 million images with over 1 billion segmentation masks) and one recently trained on a collection of medical images. The validation segmentation results for this model range between 0.53 and 0.88 on average, in line with the existing literature, given that the model was released in April 2023 and experimentation with it in medical imaging is still ongoing.
Transfer learning was also used for lesion detection and classification through the YOLOv8 and Detectron2 models, both of which were pre-trained on natural images. Experimental results from this project show that the Detectron2 model is more effective at detecting and differentiating lesions, with precision values of 0.94 for detection and 0.98 for classification. For YOLOv8, the values are lower (0.4 and 0.66), primarily because of the lesion size and the small number of training images available for classification.
From a technical perspective, this project offers the scientific community a straightforward methodology for addressing the challenge of detecting small lesions with irregular morphology in brain images. It contributes a set of algorithms for building a support tool that helps train new physicians to better differentiate demyelinating and ischemic disease, since both lesion types can appear similar on MRI scans.
To the best of our knowledge, no work has yet been developed to automatically classify these specific types of brain lesions, i.e., those caused by ischemic and demyelinating disease. In that sense, the best-performing model in this project (Detectron2) could be used as a second opinion in a clinical environment, gathering as much feedback as possible to feed the model with a greater variety of images and lesion morphologies.
Based on the results and methodologies of this project, additional approaches are proposed for future work, such as super-resolution, especially to better detect small lesions in medical images. The results provide valuable insights into the application of transfer learning for detecting and classifying lesions characterized by similarity, small size, and irregular morphology.

Future Works

The immediate future work derived from this project is to develop an adequate CAD system based on the algorithms developed and to integrate it into a clinical environment, with the main goal of expanding and feeding the dataset. This would also allow the algorithms to be trained on a greater variety of data and would make it possible to assess usability, clinical interpretability, and physician feedback, integrating attention-based explanation mechanisms such as Grad-CAM.
Other future work is the development of an ensemble methodology combining the SAM model with Detectron2 and YOLOv8, with the main goal of creating a tool that enables better and more accurate annotations; e.g., YOLO or Detectron2 could detect the lesion, and the bounding-box coordinates of the detection could then be passed to SAM as a prompt to obtain a mask of the lesion only, as sketched below. This could improve the physicians' annotations and reduce the time they spend on them.
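A minimal sketch of this detector-to-prompt hand-off is given below, using the public Ultralytics and segment-anything APIs. The weight files and image path are placeholders, and the snippet illustrates the idea under those assumptions rather than the final ensemble.

```python
import cv2
import numpy as np
from ultralytics import YOLO
from segment_anything import sam_model_registry, SamPredictor

image_bgr = cv2.imread("flair_slice.png")                 # placeholder path
image_rgb = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB)

# 1. Detect candidate lesions with a trained YOLOv8 model (placeholder weights).
detector = YOLO("best.pt")
detections = detector.predict(image_bgr, conf=0.2)[0]
boxes = detections.boxes.xyxy.cpu().numpy()               # (N, 4) boxes as x0, y0, x1, y1

# 2. Use each detected box as a prompt for SAM to obtain a lesion mask.
sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")
predictor = SamPredictor(sam)
predictor.set_image(image_rgb)

lesion_masks = []
for box in boxes:
    masks, _, _ = predictor.predict(box=box, multimask_output=False)
    lesion_masks.append(masks[0])                          # boolean (H, W) mask

print(f"{len(lesion_masks)} lesion masks proposed for expert review")
```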
Another direction is to improve the detection of small lesions in medical images through super-resolution, GANs, and the exploration of 2.5D- and 3D-based networks. Better computational resources are also needed to allow more extensive experiments and hyperparameter refinement, especially for the models with low accuracy.

Author Contributions

Conceptualization, D.C. and V.L.; methodology, D.C.; software, D.C.; validation, D.C., V.L. and M.J.R.-Á.; formal analysis, V.L., M.J.R.-Á. and R.S.; investigation, D.C.; resources, R.S.; data curation, D.C.; writing—original draft preparation, D.C.; writing—review and editing, D.C., V.L. and R.S.; visualization, D.C., V.L. and R.S.; supervision, V.L. and M.J.R.-Á.; funding acquisition, D.C. and M.J.R.-Á. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The code and trained models generated in this project are available upon request on Google Drive: https://goo.su/KJy2hV (accessed on 20 January 2025) and on GitHub: https://github.com/dpcastillo/ISCDEMY (accessed on 20 January 2025). Appendix A (Table A2) lists the open-source codes used as a baseline for developing this project.

Acknowledgments

D.C. acknowledges the support from the Universitat Politècnica de València through Assistance Call Doctoral Student Mobility 2023. D.C. also acknowledges the research support from the Universidad Técnica Particular de Loja through the project PROY_INV_QU_2022_3576.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

  • Code, Processors, and Time
Table A1. The time, epochs, and processor used in training the models presented in this project.
Model | Epochs/Iterations | Time Processing (s) | Processor
UNet | 250 | 90,389 | Google Colab Pro: Tesla V100-SXM2-16GB
SAM-Vit-Base | 25, 50 | 189.6/epoch (Total = 9480) |
SAM-Vit-Large | 25, 50 | 189.6/epoch (Total = 9480) |
SAM-Vit-Huge | 25, 50 | 347.4/epoch (Total = 17,370) |
YOLOv8 (private data) | 100 | 30–40/epoch (Total = 4123) |
YOLOv8 (private + public data) | 100 | 180–190/epoch (Total = 18,954) |
Detectron2 (private data) | 500k iterations (250 epochs) | 4909 |
Detectron2 (private + public data) | 500k iterations (250 epochs) | 5569 |
Table A2. The list of open-source codes that were used as a baseline for the development of this project.
Model | Source/GitHub
UNet | https://github.com/Project-MONAI/MONAI (accessed on 20 January 2025)
UNet | https://github.com/leslie-zi-pan/wmh-segmentation (accessed on 20 January 2025)
SAM | https://github.com/facebookresearch/segment-anything (accessed on 20 January 2025)
SAM | https://github.com/bowang-lab/MedSAM (accessed on 20 January 2025)
SAM | https://medium.com/@sdwiulfah/brain-mri-segmentation-with-segment-anything-model-sam-16d0b4101a85 (accessed on 20 January 2025)
SAM | https://github.com/bnsreenu/python_for_microscopists/tree/master (accessed on 20 January 2025)
YOLOv8 | https://docs.ultralytics.com/ (accessed on 20 January 2025)
YOLOv8 | https://github.com/bnsreenu/python_for_microscopists/tree/master (accessed on 20 January 2025)
Detectron2 | https://github.com/facebookresearch/detectron2/tree/main (accessed on 20 January 2025)
Detectron2 | https://github.com/bnsreenu/python_for_microscopists/tree/master (accessed on 20 January 2025)
GAN | https://github.com/xinario/awesome-gan-for-medical-imaging (accessed on 20 January 2025)
NBA/Radiomics | http://cflu.lab.nycu.edu.tw/MRP_MLinglioma.html (accessed on 20 January 2025)

References

  1. World Health Organization. Neurological Disorders: Public Health Challenges; World Health Organization: Geneva, Switzerland, 2006. [Google Scholar]
  2. Marek, M.; Horyniecki, M.; Frączek, M.; Kluczewska, E. Leukoaraiosis—New Concepts and Modern Imaging. Pol. J. Radiol. 2018, 83, e76. [Google Scholar] [CrossRef] [PubMed]
  3. Merino, J.G. White Matter Hyperintensities on Magnetic Resonance Imaging: What Is a Clinician to Do? Mayo Clin. Proc. 2019, 94, 380–382. [Google Scholar] [CrossRef] [PubMed]
  4. White Matter Hyperintensities on MRI—Artefact or Something Sinister? Available online: https://psychscenehub.com/psychinsights/white-matter-hyperintensities-mri/ (accessed on 21 April 2022).
  5. Ammirati, E.; Moroni, F.; Magnoni, M.; Rocca, M.A.; Anzalone, N.; Cacciaguerra, L.; di Terlizzi, S.; Villa, C.; Sizzano, F.; Palini, A.; et al. Progression of Brain White Matter Hyperintensities in Asymptomatic Patients with Carotid Atherosclerotic Plaques and No Indication for Revascularization. Atherosclerosis 2019, 287, 171–178. [Google Scholar] [CrossRef] [PubMed]
  6. Ischemic Stroke: MedlinePlus. Available online: https://medlineplus.gov/ischemicstroke.html (accessed on 22 April 2022).
  7. Stroke|Mayfield Brain & Spine, Cincinnati, Ohio. Available online: https://mayfieldclinic.com/pe-stroke.htm (accessed on 22 April 2022).
  8. Types of Stroke | Johns Hopkins Medicine. Available online: https://www.hopkinsmedicine.org/health/conditions-and-diseases/stroke/types-of-stroke (accessed on 22 April 2022).
  9. Nour, M.; Liebeskind, D.S. Imaging of Cerebral Ischemia: From Acute Stroke to Chronic Disorders. Neurol. Clin. 2014, 32, 193–209. [Google Scholar] [CrossRef]
  10. Mehndiratta, M.M.; Gulati, N.S. Central and Peripheral Demyelination. J. Neurosci. Rural Pract. 2014, 5, 84–86. [Google Scholar] [CrossRef]
  11. Love, S. Demyelinating Diseases. J. Clin. Pathol. 2006, 59, 1151–1159. [Google Scholar] [CrossRef]
  12. Overview of Demyelinating Disorders—Brain, Spinal Cord, and Nerve Disorders—MSD Manual Consumer Version. Available online: https://www.msdmanuals.com/home/brain,-spinal-cord,-and-nerve-disorders/multiple-sclerosis-ms-and-related-disorders/overview-of-demyelinating-disorders (accessed on 26 April 2022).
  13. Leite, M.; Rittner, L.; Appenzeller, S.; Ruocco, H.H.; Lotufo, R. Etiology-Based Classification of Brain White Matter Hyperintensity on Magnetic Resonance Imaging. J. Med. Imaging 2015, 2, 014002. [Google Scholar] [CrossRef]
  14. Rachmadi, M.F.; Valdés-Hernández, M.d.C.; Makin, S.; Wardlaw, J.; Komura, T. Automatic Spatial Estimation of White Matter Hyperintensities Evolution in Brain MRI Using Disease Evolution Predictor Deep Neural Networks. Med. Image Anal. 2020, 63, 101712. [Google Scholar] [CrossRef]
  15. Diniz, P.H.B.; Valente, T.L.A.; Diniz, J.O.B.; Silva, A.C.; Gattass, M.; Ventura, N.; Muniz, B.C.; Gasparetto, E.L. Detection of White Matter Lesion Regions in MRI Using SLIC0 and Convolutional Neural Network. Comput. Methods Programs Biomed. 2018, 167, 49–63. [Google Scholar] [CrossRef]
  16. Park, G.; Hong, J.; Duffy, B.A.; Lee, J.M.; Kim, H. White Matter Hyperintensities Segmentation Using the Ensemble U-Net with Multi-Scale Highlighting Foregrounds. Neuroimage 2021, 237, 118140. [Google Scholar] [CrossRef]
  17. Eichinger, P.; Schön, S.; Pongratz, V.; Wiestler, H.; Zhang, H.; Bussas, M.; Hoshi, M.M.; Kirschke, J.; Berthele, A.; Zimmer, C.; et al. Accuracy of Unenhanced MRI in the Detection of New Brain Lesions in Multiple Sclerosis. Radiology 2019, 291, 429–435. [Google Scholar] [CrossRef] [PubMed]
  18. Rudie, J.D.; Mattay, R.R.; Schindler, M.; Steingall, S.; Cook, T.S.; Loevner, L.A.; Schnall, M.D.; Mamourian, A.C.; Bilello, M. An Initiative to Reduce Unnecessary Gadolinium-Based Contrast in Multiple Sclerosis Patients. J. Am. Coll. Radiol. 2019, 16, 1158–1164. [Google Scholar] [CrossRef] [PubMed]
  19. McKinley, R.; Wepfer, R.; Grunder, L.; Aschwanden, F.; Fischer, T.; Friedli, C.; Muri, R.; Rummel, C.; Verma, R.; Weisstanner, C.; et al. Automatic Detection of Lesion Load Change in Multiple Sclerosis Using Convolutional Neural Networks with Segmentation Confidence. Neuroimage Clin. 2020, 25, 102104. [Google Scholar] [CrossRef] [PubMed]
  20. Castillo, D.; Lakshminarayanan, V.; Rodríguez-Álvarez, M.J. MR Images, Brain Lesions, and Deep Learning. Appl. Sci. 2021, 11, 1675. [Google Scholar] [CrossRef]
  21. Li, Y.; Ammari, S.; Balleyguier, C.; Lassau, N.; Chouzenoux, E. Impact of Preprocessing and Harmonization Methods on the Removal of Scanner Effects in Brain MRI Radiomic Features. Cancers 2021, 13, 3000. [Google Scholar] [CrossRef]
  22. Kumar, A.; Upadhyay, N.; Ghosal, P.; Chowdhury, T.; Das, D.; Mukherjee, A.; Nandi, D. CSNet: A New DeepNet Framework for Ischemic Stroke Lesion Segmentation. Comput. Methods Programs Biomed. 2020, 193, 105524. [Google Scholar] [CrossRef]
  23. Hugo Lopez Pinaya, W.; Tudosiu, P.-D.; Gray, R.; Rees, G.; Nachev, P.; Ourselin, S.; Cardoso, M.J. Unsupervised Brain Anomaly Detection and Segmentation with Transformers. Proc. Mach. Learn. Res. 2021, 143, 596–617. [Google Scholar]
  24. Zhao, Z.; Alzubaidi, L.; Zhang, J.; Duan, Y.; Gu, Y. A Comparison Review of Transfer Learning and Self-Supervised Learning: Definitions, Applications, Advantages and Limitations. Expert Syst. Appl. 2024, 242, 122807. [Google Scholar] [CrossRef]
  25. Maier, O.; Schröder, C.; Forkert, N.D.; Martinetz, T.; Handels, H. Classifiers for Ischemic Stroke Lesion Segmentation: A Comparison Study. PLoS ONE 2015, 10, e0145118. [Google Scholar] [CrossRef]
  26. Weng, W.; Zhu, X. U-Net: Convolutional Networks for Biomedical Image Segmentation. IEEE Access 2015, 9, 16591–16603. [Google Scholar] [CrossRef]
  27. Kirillov, A.; Mintun, E.; Ravi, N.; Mao, H.; Rolland, C.; Gustafson, L.; Xiao, T.; Whitehead, S.; Berg, A.C.; Lo, W.Y.; et al. Segment Anything. In Proceedings of the IEEE International Conference on Computer Vision, Paris, France, 1–6 October 2023; pp. 3992–4003. [Google Scholar] [CrossRef]
  28. Ultralytics Home—Ultralytics YOLOv8 Docs. Available online: https://docs.ultralytics.com/ (accessed on 25 May 2024).
  29. Wu, Y.; Kirillov, A.; Massa, F.; Lo, W.-Y.; Girshick, R. Detectron2. Available online: https://github.com/facebookresearch/detectron2 (accessed on 17 July 2024).
  30. Hussain, S.; Anwar, S.M.; Majid, M. Segmentation of Glioma Tumors in Brain Using Deep Convolutional Neural Network. Neurocomputing 2018, 282, 248–261. [Google Scholar] [CrossRef]
  31. Nadeem, M.W.; al Ghamdi, M.A.; Hussain, M.; Khan, M.A.; Khan, K.M.; Almotiri, S.H.; Butt, S.A. Brain Tumor Analysis Empowered with Deep Learning: A Review, Taxonomy, and Future Challenges. Brain Sci. 2020, 10, 118. [Google Scholar] [CrossRef] [PubMed]
  32. Anwar, S.M.; Majid, M.; Qayyum, A.; Awais, M.; Alnowami, M.; Khan, M.K. Medical Image Analysis Using Convolutional Neural Networks: A Review. J. Med. Syst. 2018, 42, 226. [Google Scholar] [CrossRef] [PubMed]
  33. Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. In Proceedings of the Medical Image Computing and Computer-Assisted Intervention—MICCAI 2015, Munich, Germany, 5–9 October 2015; Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). Volume 9351, pp. 234–241. [Google Scholar] [CrossRef]
  34. Clèrigues, A.; Valverde, S.; Bernal, J.; Freixenet, J.; Oliver, A.; Lladó, X. Acute and Sub-Acute Stroke Lesion Segmentation from Multimodal MRI. Comput. Methods Programs Biomed. 2020, 194, 105521. [Google Scholar] [CrossRef]
  35. Guerrero, R.; Qin, C.; Oktay, O.; Bowles, C.; Chen, L.; Joules, R.; Wolz, R.; Valdés-Hernández, M.C.; Dickie, D.A.; Wardlaw, J.; et al. White Matter Hyperintensity and Stroke Lesion Segmentation and Differentiation Using Convolutional Neural Networks. Neuroimage Clin. 2018, 17, 918–934. [Google Scholar] [CrossRef]
  36. Mitra, J.; Bourgeat, P.; Fripp, J.; Ghose, S.; Rose, S.; Salvado, O.; Connelly, A.; Campbell, B.; Palmer, S.; Sharma, G.; et al. Lesion Segmentation from Multimodal MRI Using Random Forest Following Ischemic Stroke. Neuroimage 2014, 98, 324–335. [Google Scholar] [CrossRef]
  37. Ghafoorian, M.; Karssemeijer, N.; van Uden, I.W.M.; de Leeuw, F.-E.; Heskes, T.; Marchiori, E.; Platel, B. Automated Detection of White Matter Hyperintensities of All Sizes in Cerebral Small Vessel Disease. Med. Phys. 2016, 43, 6246. [Google Scholar] [CrossRef]
  38. Liu, M.; Wang, T.; Liu, D.; Gao, F.; Cao, J. Improved UNet-Based Magnetic Resonance Imaging Segmentation of Demyelinating Diseases with Small Lesion Regions. Cogn. Comput. Syst. 2024, 1–8. [Google Scholar] [CrossRef]
  39. Zhou, D.; Xu, L.; Wang, T.; Wei, S.; Gao, F.; Lai, X.; Cao, J. M-DDC: MRI Based Demyelinative Diseases Classification with U-Net Segmentation and Convolutional Network. Neural Netw. 2024, 169, 108–119. [Google Scholar] [CrossRef]
  40. Ma, J.; He, Y.; Li, F.; Han, L.; You, C.; Wang, B. Segment Anything in Medical Images. Nat. Commun. 2024, 15, 654. [Google Scholar] [CrossRef] [PubMed]
  41. Hu, X.; Xu, X.; Shi, Y. How to Efficiently Adapt Large Segmentation Model (SAM) to Medical Images. arXiv 2023, arXiv:2306.13731. [Google Scholar]
  42. Tolstikhin, I.; Houlsby, N.; Kolesnikov, A.; Beyer, L.; Zhai, X.; Unterthiner, T.; Yung, J.; Steiner, A.; Keysers, D.; Uszkoreit, J.; et al. MLP-Mixer: An All-MLP Architecture for Vision. Adv. Neural Inf. Process. Syst. 2021, 29, 24261–24272. [Google Scholar]
  43. Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An Image Is Worth 16x16 Words: Transformers For Image Recognition at Scale. arXiv 2020, arXiv:2010.11929. [Google Scholar]
  44. Mazurowski, M.A.; Dong, H.; Gu, H.; Yang, J.; Konz, N.; Zhang, Y. Segment Anything Model for Medical Image Analysis: An Experimental Study. Med. Image Anal. 2023, 89, 102918. [Google Scholar] [CrossRef]
  45. Huang, Y.; Yang, X.; Liu, L.; Zhou, H.; Chang, A.; Zhou, X.; Chen, R.; Yu, J.; Chen, J.; Chen, C.; et al. Segment Anything Model for Medical Images? Med. Image Anal. 2023, 92, 103061. [Google Scholar] [CrossRef]
  46. Cheng, D.; Qin, Z.; Jiang, Z.; Zhang, S.; Lao, Q.; Li, K. SAM on Medical Images: A Comprehensive Study on Three Prompt Modes. arXiv 2023, arXiv:2305.00035. [Google Scholar]
  47. Tu, Z.; Gu, L.; Wang, X.; Jiang, B.; Provincial, A. Ultrasound SAM Adapter: Adapting SAM for Breast Lesion Segmentation in Ultrasound Images. arXiv 2024, arXiv:2404.14837. [Google Scholar]
  48. Bhardwaj, R.B.; Haneef, D.A. Use of Segment Anything Model (SAM) and MedSAM in the Optic Disc Segmentation of Colour Retinal Fundus Images: Experimental Finding. Indian J. Health Care Med. Pharm. Pract. 2023, 4, 82–93. [Google Scholar] [CrossRef]
  49. Fazekas, B.; Morano, J.; Lachinov, D.; Aresta, G.; Bogunović, H. Adapting Segment Anything Model (SAM) for Retinal OCT. In Proceedings of the OMIA 2023, Vancouver, BC, Canada, 12 October 2023; Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). Volume 14096 LNCS, pp. 92–101. [Google Scholar] [CrossRef]
  50. Zhao, Y.; Song, K.; Cui, W.; Ren, H.; Yan, Y. MFS Enhanced SAM: Achieving Superior Performance in Bimodal Few-Shot Segmentation. J. Vis. Commun. Image Represent. 2023, 97, 103946. [Google Scholar] [CrossRef]
  51. Li, N.; Xiong, L.; Qiu, W.; Pan, Y.; Luo, Y.; Zhang, Y. Segment Anything Model for Semi-Supervised Medical Image Segmentation via Selecting Reliable Pseudo-Labels. Commun. Comput. Inf. Sci. 2024, 1964 CCIS, 138–149. [Google Scholar] [CrossRef]
  52. Ravishankar, H.; Patil, R.; Melapudi, V.; Annangi, P. SonoSAM—Segment Anything on Ultrasound Images. In Proceedings of the ASMUS 2023, Vancouver, BC, Canada, 8 October 2023; Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). Volume 14337 LNCS, pp. 23–33. [Google Scholar] [CrossRef]
  53. Jiménez-Gaona, Y.; Álvarez, M.J.R.; Castillo-Malla, D.; García-Jaen, S.; Carrión-Figueroa, D.; Corral-Domínguez, P.; Lakshminarayanan, V. BraNet: A Mobil Application for Breast Image Classification Based on Deep Learning Algorithms. Med. Biol. Eng. Comput. 2024, 62, 2737–2756. [Google Scholar] [CrossRef] [PubMed]
  54. Zhang, Y.; Shen, Z.; Jiao, R. Segment Anything Model for Medical Image Segmentation: Current Applications and Future Directions. Comput. Biol. Med. 2024, 171, 108238. [Google Scholar] [CrossRef] [PubMed]
  55. GitHub—YichiZhang98/SAM4MIS: Segment Anything Model for Medical Image Segmentation: Paper List and Open-Source Project Summary. Available online: https://github.com/YichiZhang98/SAM4MIS (accessed on 24 May 2024).
  56. Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You Only Look Once: Unified, Real-Time Object Detection. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788. [Google Scholar] [CrossRef]
  57. Almufareh, M.F.; Imran, M.; Khan, A.; Humayun, M.; Asim, M. Automated Brain Tumor Segmentation and Classification in MRI Using YOLO-Based Deep Learning. IEEE Access 2024, 12, 16189–16207. [Google Scholar] [CrossRef]
  58. Mortada, M.J.; Tomassini, S.; Anbar, H.; Morettini, M.; Burattini, L.; Sbrollini, A. Segmentation of Anatomical Structures of the Left Heart from Echocardiographic Images Using Deep Learning. Diagnostics 2023, 13, 1683. [Google Scholar] [CrossRef]
  59. Ragab, M.G.; Abdulkader, S.J.; Muneer, A.; Alqushaibi, A.; Sumiea, E.H.; Qureshi, R.; Al-Selwi, S.M.; Alhussian, H. A Comprehensive Systematic Review of YOLO for Medical Object Detection (2018 to 2023). IEEE Access 2024, 12, 57815–57836. [Google Scholar] [CrossRef]
  60. Aldughayfiq, B.; Ashfaq, F.; Jhanjhi, N.Z.; Humayun, M. YOLO-Based Deep Learning Model for Pressure Ulcer Detection and Classification. Healthcare 2023, 11, 1222. [Google Scholar] [CrossRef]
  61. Chen, J.L.; Cheng, L.H.; Wang, J.; Hsu, T.W.; Chen, C.Y.; Tseng, L.M.; Guo, S.M. A YOLO-Based AI System for Classifying Calcifications on Spot Magnification Mammograms. Biomed. Eng. Online 2023, 22, 54. [Google Scholar] [CrossRef]
  62. Baccouche, A.; Garcia-Zapirain, B.; Olea, C.C.; Elmaghraby, A.S. Breast Lesions Detection and Classification via YOLO-Based Fusion Models. Comput. Mater. Contin. 2021, 69, 1407–1425. [Google Scholar] [CrossRef]
  63. Santos, C.; Aguiar, M.; Welfer, D.; Belloni, B. A New Approach for Detecting Fundus Lesions Using Image Processing and Deep Neural Network Architecture Based on YOLO Model. Sensors 2022, 22, 6441. [Google Scholar] [CrossRef]
  64. Ünver, H.M.; Ayan, E. Skin Lesion Segmentation in Dermoscopic Images with Combination of YOLO and GrabCut Algorithm. Diagnostics 2019, 9, 72. [Google Scholar] [CrossRef]
  65. Kang, M.; Ting, C.-M.; Ting, F.F.; Phan, R.C.-W. ASF-YOLO: A Novel YOLO Model with Attentional Scale Sequence Fusion for Cell Instance Segmentation. Image Vis. Comput. 2024, 147, 105057. [Google Scholar] [CrossRef]
  66. Abdusalomov, A.B.; Mukhiddinov, M.; Whangbo, T.K. Brain Tumor Detection Based on Deep Learning Approaches and Magnetic Resonance Imaging. Cancers 2023, 15, 4172. [Google Scholar] [CrossRef] [PubMed]
  67. Mercaldo, F.; Brunese, L.; Martinelli, F.; Santone, A.; Cesarelli, M. Object Detection for Brain Cancer Detection and Localization. Appl. Sci. 2023, 13, 9158. [Google Scholar] [CrossRef]
  68. He, K.; Gkioxari, G.; Dollár, P.; Girshick, R. Mask R-CNN. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 42, 386–397. [Google Scholar] [CrossRef]
  69. Girshick, R. Fast R-CNN. In Proceedings of the 2015 IEEE International IEEE Conference on Computer Vision (ICCV), Santiago, Chile, 7–13 December 2015; pp. 1440–1448. [Google Scholar]
  70. Zhang, H.; Chang, H.; Ma, B.; Shan, S.; Chen, X. Cascade RetinaNet: Maintaining Consistency for Single-Stage Object Detection. In Proceedings of the 30th British Machine Vision Conference 2019, BMVC 2019, Cardiff, UK, 9–12 September 2019. [Google Scholar]
  71. Detectron2: A PyTorch-Based Modular Object Detection Library. Available online: https://ai.meta.com/blog/-detectron2-a-pytorch-based-modular-object-detection-library-/ (accessed on 25 May 2024).
  72. Chincholi, F.; Koestler, H. Detectron2 for Lesion Detection in Diabetic Retinopathy. Algorithms 2023, 16, 147. [Google Scholar] [CrossRef]
  73. Salh, C.H.; Ali, A.M. Automatic Detection of Breast Cancer for Mastectomy Based on MRI Images Using Mask R-CNN and Detectron2 Models. Neural Comput. Appl. 2024, 36, 3017–3035. [Google Scholar] [CrossRef]
  74. Dipu, N.M.; Shohan, S.A.; Salam, K.M.A. Brain Tumor Detection Using Various Deep Learning Algorithms. In Proceedings of the 2021 International Conference on Science and Contemporary Technologies, ICSCT 2021, Dhaka, Bangladesh, 5–7 August 2021. [Google Scholar] [CrossRef]
  75. Home—WMH Segmentation Challenge. Available online: https://wmh.isi.uu.nl/ (accessed on 30 June 2022).
  76. Bologna, M.; Corino, V.; Mainardi, L. Technical Note: Virtual Phantom Analyses for Preprocessing Evaluation and Detection of a Robust Feature Set for MRI-Radiomics of the Brain. Med. Phys. 2019, 46, 5116–5123. [Google Scholar] [CrossRef]
  77. Masoudi, S.; Harmon, S.A.; Mehralivand, S.; Walker, S.M.; Raviprakash, H.; Bagci, U.; Choyke, P.L.; Turkbey, B. Quick Guide on Radiology Image Pre-Processing for Deep Learning Applications in Prostate Cancer Research. J. Med. Imaging 2021, 8, 010901. [Google Scholar] [CrossRef]
  78. Um, H.; Tixier, F.; Bermudez, D.; Deasy, J.O.; Young, R.J.; Veeraraghavan, H. Impact of Image Preprocessing on the Scanner Dependence of Multi-Parametric MRI Radiomic Features and Covariate Shift in Multi-Institutional Glioblastoma Datasets. Phys. Med. Biol. 2019, 64, 165011. [Google Scholar] [CrossRef]
  79. Orlhac, F.; Lecler, A.; Savatovski, J.; Goya-Outi, J.; Nioche, C.; Charbonneau, F.; Ayache, N.; Frouin, F.; Duron, L.; Buvat, I. How Can We Combat Multicenter Variability in MR Radiomics? Validation of a Correction Procedure. Eur. Radiol. 2021, 31, 2272–2280. [Google Scholar] [CrossRef]
  80. Zhang, K.; Song, H.; Zhang, L. Active Contours Driven by Local Image Fitting Energy. Pattern Recognit. 2010, 43, 1199–1206. [Google Scholar] [CrossRef]
  81. Tustison, N.J.; Avants, B.B.; Cook, P.A.; Zheng, Y.; Egan, A.; Yushkevich, P.A.; Gee, J.C. N4ITK: Improved N3 Bias Correction. IEEE Trans. Med. Imaging 2010, 29, 1310–1320. [Google Scholar] [CrossRef] [PubMed]
  82. Avants, B.B.; Tustison, N.J.; Wu, J.; Cook, P.A.; Gee, J.C. An Open Source Multivariate Framework for N-Tissue Segmentation with Evaluation on Public Data. Neuroinformatics 2011, 9, 381–400. [Google Scholar] [CrossRef] [PubMed]
  83. Shah, M.; Xiao, Y.; Subbanna, N.; Francis, S.; Arnold, D.L.; Collins, D.L.; Arbel, T. Evaluating Intensity Normalization on MRIs of Human Brain with Multiple Sclerosis. Med. Image Anal. 2011, 15, 267–282. [Google Scholar] [CrossRef]
  84. Consortium, M. MONAI: Medical Open Network for AI, Version 0.8.0; Zenodo: Genève, Switzerland, 2021. [CrossRef]
  85. MONAIBootcamp2021/1. Getting Started with MONAI.Ipynb at Main Project-MONAI/MONAIBootcamp2021 GitHub. Available online: https://github.com/Project-MONAI/MONAIBootcamp2021/blob/main/day1/1.%20Getting%20Started%20with%20MONAI.ipynb (accessed on 11 October 2022).
  86. Gupta, V.; Veeturi, N.; Akkur, A. Brain Aneurysm Classification from MRI Images Using MONAI Framework. CS230, Stanford University. Available online: http://cs230.stanford.edu/projects_winter_2021/reports/70747002.pdf (accessed on 20 January 2025).
  87. Diaz-Pinto, A.; Alle, S.; Ihsani, A.; Asad, M.; Nath, V.; Pérez-García, F.; Mehta, P.; Li, W.; Roth, H.R.; Vercauteren, T.; et al. MONAI Label: A Framework for AI-Assisted Interactive Labeling of 3D Medical Images. Med. Image Anal. 2024, 95, 103207. [Google Scholar] [CrossRef]
  88. MONAI—Home. Available online: https://monai.io/ (accessed on 4 October 2022).
  89. LeCun, Y.; Bengio, Y.; Hinton, G. Deep Learning. Nature 2015, 521, 436–444. [Google Scholar] [CrossRef]
  90. Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative Adversarial Nets. Adv. Neural Inf. Process. Syst. 2014, 27, 2672–2680. [Google Scholar]
  91. Jung, E.; Luna, M.; Park, S.H. Conditional GAN with 3D Discriminator for MRI Generation of Alzheimer’s Disease Progression. Pattern Recognit. 2023, 133, 109061. [Google Scholar] [CrossRef]
  92. Alrashedy, H.H.N.; Almansour, A.F.; Ibrahim, D.M.; Hammoudeh, M.A.A. BrainGAN: Brain MRI Image Generation and Classification Framework Using GAN Architectures and CNN Models. Sensors 2022, 22, 4297. [Google Scholar] [CrossRef]
  93. Woldesellasse, H.; Tesfamariam, S. Data Augmentation Using Conditional Generative Adversarial Network (CGAN): Application for Prediction of Corrosion Pit Depth and Testing Using Neural Network. J. Pipeline Sci. Eng. 2023, 3, 100091. [Google Scholar] [CrossRef]
  94. Shao, S.; Wang, P.; Yan, R. Generative Adversarial Networks for Data Augmentation in Machine Fault Diagnosis. Comput. Ind. 2019, 106, 85–93. [Google Scholar] [CrossRef]
  95. Borji, A. Pros and Cons of GAN Evaluation Measures. Comput. Vis. Image Underst. 2019, 179, 41–65. [Google Scholar] [CrossRef]
  96. Parmar, G.; Zhang, R.; Zhu, J.Y. On Aliased Resizing and Surprising Subtleties in GAN Evaluation. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 11400–11410. [Google Scholar] [CrossRef]
  97. Detectron2/Configs/COCO-Detection/Faster_rcnn_R_50_FPN_3x.Yaml at Main Facebookresearch/Detectron2 GitHub. Available online: https://github.com/facebookresearch/detectron2/blob/main/configs/COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml (accessed on 25 May 2024).
  98. Detectron2/Configs/COCO-InstanceSegmentation/Mask_rcnn_R_50_FPN_3x.Yaml at Main Facebookresearch/Detectron2 GitHub. Available online: https://github.com/facebookresearch/detectron2/blob/main/configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml (accessed on 25 May 2024).
  99. Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 1137–1149. [Google Scholar] [CrossRef] [PubMed]
  100. Li, X.; Zhao, Y.; Jiang, J.; Cheng, J.; Zhu, W.; Wu, Z.; Jing, J.; Zhang, Z.; Wen, W.; Sachdev, P.S.; et al. White Matter Hyperintensities Segmentation Using an Ensemble of Neural Networks. Hum. Brain Mapp. 2022, 43, 929–939. [Google Scholar] [CrossRef]
  101. Sahayam, S.; Abirami, A.; Jayaraman, U. A Novel Modified U-Shaped 3-D Capsule Network (MUDCap3) for Stroke Lesion Segmentation from Brain MRI. In Proceedings of the 4th IEEE Conference on Information and Communication Technology, CICT 2020, Chennai, India, 3–5 December 2020. [Google Scholar] [CrossRef]
  102. Zhang, H.; Zhu, C.; Lian, X.; Hua, F. A Nested Attention Guided UNet++ Architecture for White Matter Hyperintensity Segmentation. IEEE Access 2023, 11, 66910–66920. [Google Scholar] [CrossRef]
  103. Farkhani, S.; Demnitz, N.; Boraxbekk, C.J.; Lundell, H.; Siebner, H.R.; Petersen, E.T.; Madsen, K.H. End-to-End Volumetric Segmentation of White Matter Hyperintensities Using Deep Learning. Comput. Methods Programs Biomed. 2024, 245, 108008. [Google Scholar] [CrossRef]
  104. Li, H.; Jiang, G.; Zhang, J.; Wang, R.; Wang, Z.; Zheng, W.-S.; Menze, B. Fully Convolutional Network Ensembles for White Matter Hyperintensities Segmentation in MR Images. Neuroimage 2018, 183, 650–665. [Google Scholar] [CrossRef]
  105. Wu, J.; Zhang, Y.; Wang, K.; Tang, X. Skip Connection U-Net for White Matter Hyperintensities Segmentation from MRI. IEEE Access 2019, 7, 155194–155202. [Google Scholar] [CrossRef]
  106. Liu, L.; Chen, S.; Zhu, X.; Zhao, X.M.; Wu, F.X.; Wang, J. Deep Convolutional Neural Network for Accurate Segmentation and Quantification of White Matter Hyperintensities. Neurocomputing 2020, 384, 231–242. [Google Scholar] [CrossRef]
  107. Rathore, S.; Niazi, T.; Iftikhar, M.A.; Singh, A.; Rathore, B.; Bilello, M.; Chaddad, A. Multimodal Ensemble-Based Segmentation of White Matter Lesions and Analysis of Their Differential Characteristics across Major Brain Regions. Appl. Sci. 2020, 10, 1903. [Google Scholar] [CrossRef]
  108. Lee, A.R.; Woo, I.; Kang, D.W.; Jung, S.C.; Lee, H.; Kim, N. Fully Automated Segmentation on Brain Ischemic and White Matter Hyperintensities Lesions Using Semantic Segmentation Networks with Squeeze-and-Excitation Blocks in MRI. Inform Med. Unlocked 2020, 21, 100440. [Google Scholar] [CrossRef]
  109. Zhou, P.; Liang, L.; Guo, X.; Lv, H.; Wang, T.; Ma, T. U-Net Combined with CRF and Anatomical Based Spatial Features to Segment White Matter Hyperintensities. In Proceedings of the 2020 42nd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), Montreal, QC, Canada, 20–24 July 2020; pp. 1754–1757. [Google Scholar] [CrossRef]
  110. Karthik, R.; Menaka, R.; Hariharan, M.; Won, D. Ischemic Lesion Segmentation Using Ensemble of Multi-Scale Region Aligned CNN. Comput. Methods Programs Biomed. 2021, 200, 105831. [Google Scholar] [CrossRef] [PubMed]
  111. Zhang, S.; Xu, S.; Tan, L.; Wang, H.; Meng, J. Stroke Lesion Detection and Analysis in MRI Images Based on Deep Learning. J. Healthc. Eng. 2021, 2021, 5524769. [Google Scholar] [CrossRef]
  112. Uçar, G.; Dandil, E. Automatic Detection of White Matter Hyperintensities via Mask Region-Based Convolutional Neural Networks Using Magnetic Resonance Images. In Deep Learning for Medical Applications with Unique Data; Academic Press: Cambridge, MA, USA, 2022; pp. 153–179. [Google Scholar] [CrossRef]
  113. Chen, S.; Sedghi Gamechi, Z.; Dubost, F.; van Tulder, G.; de Bruijne, M. An End-to-End Approach to Segmentation in Medical Images with CNN and Posterior-CRF. Med. Image Anal. 2022, 76, 102311. [Google Scholar] [CrossRef]
  114. Wang, J.; Wang, S.; Liang, W. METrans: Multi-Encoder Transformer for Ischemic Stroke Segmentation. Electron. Lett. 2022, 58, 340–342. [Google Scholar] [CrossRef]
  115. Khezrpour, S.; Seyedarabi, H.; Razavi, S.N.; Farhoudi, M. Automatic Segmentation of the Brain Stroke Lesions from MR Flair Scans Using Improved U-Net Framework. Biomed. Signal Process. Control 2022, 78, 103978. [Google Scholar] [CrossRef]
  116. Uçar, G.; Dandıl, E. Enhanced Detection of White Matter Hyperintensities via Deep Learning-Enabled MR Imaging Segmentation. Trait. Du Signal 2024, 41, 1–21. [Google Scholar] [CrossRef]
  117. Soleimani, P.; Farezi, N. Utilizing Deep Learning via the 3D U-Net Neural Network for the Delineation of Brain Stroke Lesions in MRI Image. Sci. Rep. 2023, 13, 19808. [Google Scholar] [CrossRef]
  118. Liew, S.-L.; Anglin, J.M.; Banks, N.W.; Sondag, M.; Ito, K.L.; Kim, H.; Chan, J.; Ito, J.; Jung, C.; Khoshab, N.; et al. A Large, Open Source Dataset of Stroke Anatomical Brain Images and Manual Lesion Segmentations. Sci. Data 2018, 5, 180011. [Google Scholar] [CrossRef]
  119. Rieu, Z.H.; Kim, J.Y.; Kim, R.E.Y.; Lee, M.; Lee, M.K.; Oh, S.W.; Wang, S.M.; Kim, N.Y.; Kang, D.W.; Lim, H.K.; et al. Semi-Supervised Learning in Medical MRI Segmentation: Brain Tissue with White Matter Hyperintensity Segmentation Using Flair MRI. Brain Sci. 2021, 11, 720. [Google Scholar] [CrossRef]
  120. Rekik, I.; Allassonnière, S.; Carpenter, T.K.; Wardlaw, J.M. Medical Image Analysis Methods in MR/CT-Imaged Acute-Subacute Ischemic Stroke Lesion: Segmentation, Prediction and Insights into Dynamic Evolution Simulation Models. A Critical Appraisal. NeuroImage Clin. 2012, 1, 164–178. [Google Scholar] [CrossRef] [PubMed]
  121. Kuijf, H.J.; Biesbroek, J.M.; De Bresser, J.; Heinen, R.; Andermatt, S.; Bento, M.; Berseth, M.; Belyaev, M.; Cardoso, M.J.; Casamitjana, A.; et al. Standardized Assessment of Automatic Segmentation of White Matter Hyperintensities and Results of the WMH Segmentation Challenge. IEEE Trans. Med. Imaging 2019, 38, 2556–2568. [Google Scholar] [CrossRef] [PubMed]
  122. Weeda, M.M.; Brouwer, I.; de Vos, M.L.; de Vries, M.S.; Barkhof, F.; Pouwels, P.J.W.; Vrenken, H. Comparing Lesion Segmentation Methods in Multiple Sclerosis: Input from One Manually Delineated Subject Is Sufficient for Accurate Lesion Segmentation. Neuroimage Clin. 2019, 24, 102074. [Google Scholar] [CrossRef] [PubMed]
  123. Praveen, G.B.; Agrawal, A.; Sundaram, P.; Sardesai, S. Ischemic Stroke Lesion Segmentation Using Stacked Sparse Autoencoder. Comput. Biol. Med. 2018, 99, 38–52. [Google Scholar] [CrossRef]
  124. Khademi, A.; Gibicar, A.; Arezza, G.; DiGregorio, J.; Tyrrell, P.N.; Moody, A.R. Segmentation of White Matter Lesions in Multicentre FLAIR MRI. Neuroimage Rep. 2021, 1, 100044. [Google Scholar] [CrossRef]
  125. Mortazavi, D.; Kouzani, A.Z.; Soltanian-Zadeh, H. Segmentation of Multiple Sclerosis Lesions in MR Images: A Review. Neuroradiology 2012, 54, 299–320. [Google Scholar] [CrossRef]
  126. Rudie, J.D.; Weiss, D.A.; Saluja, R.; Rauschecker, A.M.; Wang, J.; Sugrue, L.; Bakas, S.; Colby, J.B. Multi-Disease Segmentation of Gliomas and White Matter Hyperintensities in the BraTS Data Using a 3D Convolutional Neural Network. Front. Comput. Neurosci. 2019, 13, 84. [Google Scholar] [CrossRef]
  127. Peivandi, M.; Zhang, J.; Lu, M.; Zhu, D.; Kou, Z. Empirical Evaluation of the Segment Anything Model (SAM) for Brain Tumor Segmentation. arXiv 2023, arXiv:2310.06162. [Google Scholar]
  128. Zhang, P.; Wang, Y. Segment Anything Model for Brain Tumor Segmentation. arXiv 2023, arXiv:2309.08434. [Google Scholar]
  129. Paul, S.; Ahad, D.M.T.; Hasan, M.M. Brain Cancer Segmentation Using YOLOv5 Deep Neural Network. arXiv 2022, arXiv:2212.13599. [Google Scholar]
  130. Pandey, S.; Chen, K.-F.; Dam, E.B. Comprehensive Multimodal Segmentation in Medical Imaging: Combining YOLOv8 with SAM and HQ-SAM Models. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops 2023, Paris, France, 2–6 October 2023; pp. 2592–2598. [Google Scholar]
  131. Mohammad, S.; Hashemi, H.; Safari, L.; Dadashzade Taromi, A. Realism in Action: Anomaly-Aware Diagnosis of Brain Tumors from Medical Images Using YOLOv8 and DeiT. arXiv 2024, arXiv:2401.03302. [Google Scholar]
Figure 1. The flowchart of the proposed methodology: (a) the dataset, acquisition, number of images, preprocessing steps, and data augmentation methods applied to the dataset; (b) the DL methods, feature extraction, segmentation using UNET and SAM, and classification and detection using YOLOv8 and Detectron2.
Figure 2. Some examples of images used in the dataset for training.
Figure 3. Slices from the volume of a real patient with two lesions: ischemia (red circles) and demyelination (yellow circles). It is seen that the lesions are not continuous between slices. The lesions change in each slice, as well as their size, shape, and location. The different colors refer to the two types of lesions in the slice.
Figure 4. Examples of images before and after the noise and artifact reduction.
Figure 5. Examples of data augmentation. Some artifacts and noise are also generated.
Figure 6. Examples of data augmentation using SNGAN at 275 epochs. The synthetic images do not reproduce the ischemic and demyelinating lesions with sufficient fidelity.
Figure 7. The UNet model for segmenting brain lesions related to ischemia and demyelination.
Figure 8. Segment Anything Model (SAM) architecture with the data and trained models used in this project. The model consists of an image encoder to extract image embeddings, a prompt encoder, and a mask decoder to predict segmentation masks using the image and prompt embeddings. This figure was adapted from [57].
Figure 9. Architecture overview of YOLOv8 as applied in this project. The input image passes through the network, which jointly predicts bounding boxes and class probabilities, producing the input image annotated with bounding boxes and detection probabilities. This figure was adapted from [60].
Figure 10. Architecture overview of Detectron2—R50-FPN applied in this project. The input image passes to the Backbone Network, Region Proposal Network, and Box Head with Fast R-CNN for object identification.
Figure 11. Correlogram of lesion distribution and characteristics. (a) Spatial distribution of the lesions, showing their tendency to occur in specific brain regions. (b) Width and (c) height analyses of the lesion dimensions, indicating that the majority are small, with normalized sizes below 0.2. Despite the localization trends, there is no strong correlation between lesion occurrence and size.
Figure 12. Training and validation loss curves from UNet model. The training and validation loss curves converge after 100 epochs and stabilize around 0.35, indicating that the model has learned most of the features and is refining its predictions.
Figure 13. Examples of the lesion prediction using the UNet model. (a) Original images with their corresponding (b) masks and (c) prediction results.
Figure 14. The training and validation loss of the SAM model with each of the trained models: (a) vit-base, (b) vit-large, and (c) vit-huge.
Figure 15. Some examples of segmentation lesion predictions using the SAM model with 25 epochs of training and different processors. (a) MR Image. (b) Ground truth mask. (c) Predicted mask. The orange areas correspond to the WMH lesions.
Figure 16. Confusion matrix of the YOLO detection model using the pre-trained “yolov8n-seg.pt” model. On the right side, and below it, are some examples of the classification of the lesions.
Figure 17. The experimental results with the pre-trained “yolov8n-seg.pt” model. The graphs are concerned with the trend of training and validation loss scores over the epochs and their corresponding precision and recall metrics related to bounding box prediction, segmentation, and classification of lesions. The values of the loss or metric are plotted on the y-axis, and the epochs are represented on the x-axis.
Figure 18. Examples of detection of the lesions. The original image without detecting a lesion is shown on the left side, and the lesion prediction with model Detectron2 is shown on the right. In the center column, the classification done by the radiologist expert is shown; the “yellow” color refers to demyelination lesions and the violet color refers to ischemia lesions.
Figure 19. Graphs of loss and accuracy metrics for the lesion detection model. (a) Loss curves over training iterations illustrate the optimization of different loss components, including classification, bounding box regression, mask loss, and total loss. (b) Accuracy trends over training iterations, depicting the performance of the Fast R-CNN classifier and Mask R-CNN segmentation accuracy. The stability and convergence of these metrics indicate the model’s learning progress and effectiveness.
Figure 20. Graphs of false positive and false negative rates over training iterations. The false negative rate (orange) decreases steadily, indicating improved sensitivity. The false positive rate (blue) remains relatively stable, suggesting consistent precision in lesion detection. These trends highlight the model’s learning process and ability to refine segmentation accuracy over time.
Figure 21. Correlogram of the balanced number of instances (lesions) used for the YOLO model to detect and classify ischemia and demyelination. The bar plots at the top represent the distribution of lesion instances across classes. The scatter plots (x-y plane) illustrate the location and distribution of the lesions (scattered points) within the image, as well as the correlation between lesion height and width, providing insights into lesion size variability. This visualization includes information on the lesion characteristics.
Figure 22. Experimental analysis results, including training and validation metrics, precision–recall curves, and mean average precision (mAP) metrics. These graphs illustrate the model’s learning progression, performance across evaluation metrics, and effectiveness in detecting and segmenting lesions.
Figure 23. Confusion matrix of the YOLO classification to distinguish between ischemic and demyelination lesions using the pre-trained “yolov8n-seg.pt” model. The images on the top right side and lower sections are examples of the classification of the lesions with their corresponding confidence scores in predicting each lesion type. These results highlight the model’s effectiveness in lesion classification while revealing potential challenges in differentiating lesions with similar radiological characteristics.
Figure 24. Examples of detection and classification of the lesions. The image without a classification is shown on the left side, and the classification prediction of the Detectron2 model is shown on the right. The labels “ische” and “demy” refer to ischemia and demyelination, respectively. In the center column, the classification performed by the expert radiologist is shown; the “yellow” color refers to demyelinating lesions and the violet color refers to ischemic lesions.
Figure 25. Training performance metrics for classifying and detecting lesions using the Detectron2 model. (a) Loss curves over iterations, including classification loss (orange), bounding box regression loss (blue), mask loss (green), and total loss (brown), indicate the behavior convergence of the model. (b) Accuracy metrics over iterations, illustrating the classification performance of the Fast R-CNN and Mask R-CNN models. The increasing accuracy trends suggest improved learning stability and effectiveness in lesion detection and classification between ischemia and demyelination.
Figure 26. Metrics for the model’s false positive and false negative rates. The graph illustrates how the model refines its predictions over time: the false negative rate (red) decreases as the model improves sensitivity and correctly identifies more lesions, while the false positive rate (yellow) stabilizes, indicating improved precision in lesion classification.
Figure 27. Performance metrics of the classification model over training iterations: (a) accuracy, (b) recall, (c) precision, and (d) F1 score. All metrics maintain average values above 0.9, indicating high classification reliability.
Figure 28. Visual comparison of lesion detection and classification by the Detectron2 and YOLOv8 models against the expert radiologist. For the Detectron2 model, detection thresholds of 0.8 and 0.5 are used; for the YOLOv8 model, the detection threshold is 0.2.
Figure 29. Visual comparison of lesion detection and classification using Detectron2 with detection thresholds of 0.8 and 0.5. The threshold level changes the sensitivity of lesion detection.
Figure 30. Comparative analysis of detection and classification performance (accuracy, precision, recall, F1 score, sensitivity, and specificity) across three evaluations: expert criteria (blue bars), Detectron2 (orange bars), and YOLOv8 (green bars). The graph highlights the strong performance of the Detectron2 model, comparable in reliability to the expert criteria, and the lower performance of YOLOv8, particularly in recall and sensitivity.
Figure 31. ROC curve comparison for detection and classification of the lesions using the criteria by Experts (AUC = 0.976), Detectron2 (AUC = 0.929), and YOLOv8 (AUC = 0.524). The curve illustrates the trade-off between true positive rate (sensitivity) and false positive rate, highlighting the good performance of Detectron2, comparable to experts, while YOLOv8 shows limited discriminatory power for lesion classification. The dashed diagonal line represents a random classifier (AUC = 0.5).
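For reference, ROC curves and AUC values of this kind can be reproduced from per-lesion ground-truth labels and model confidence scores with scikit-learn. The snippet below is a minimal sketch; the label and score arrays are illustrative placeholders, not the study’s data.

```python
# Minimal sketch: ROC curve and AUC from per-lesion labels and confidence scores.
# y_true holds binary ground-truth labels; the *_scores arrays hold each model's
# confidence for the positive class (illustrative values only).
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score
import matplotlib.pyplot as plt

y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])            # expert ground truth
detectron2_scores = np.array([0.92, 0.15, 0.80, 0.88, 0.30, 0.10, 0.75, 0.40])
yolo_scores = np.array([0.55, 0.52, 0.48, 0.60, 0.50, 0.47, 0.51, 0.49])

for name, scores in [("Detectron2", detectron2_scores), ("YOLOv8", yolo_scores)]:
    fpr, tpr, _ = roc_curve(y_true, scores)
    auc = roc_auc_score(y_true, scores)
    plt.plot(fpr, tpr, label=f"{name} (AUC = {auc:.3f})")

plt.plot([0, 1], [0, 1], "k--", label="Random classifier (AUC = 0.5)")
plt.xlabel("False positive rate")
plt.ylabel("True positive rate (sensitivity)")
plt.legend()
plt.show()
```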
Table 1. Summary of the data used in this project.
Dataset | Origin Place | Scanner | FLAIR Voxel Size (mm3) | TR/TE (ms) | Volumes | Slice Images
Public | Singapore | 3T Siemens Trio Tim | 1.00 × 1.00 × 3.00 | 9000/82 | 30 | 550
Public | Utrecht | 3T Philips Achieva | 0.98 × 0.98 × 1.20 | 11,000/125 | 30 | 536
Public | Amsterdam | 3T GE Signa HDxt | 0.98 × 0.98 × 1.20 | 8000/126 | 50 | 1564
Private | HUTPL—EC | 1.5T Philips Achieva | 0.89 × 0.89 × 6.00 | 11,000/140 | 80 | 200
Total patient volume studies: 190
Total image slices: 2850
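As a point of reference for how such volumes are typically turned into 2-D slices for training, the sketch below loads a FLAIR volume with nibabel and extracts its axial slices. The file path and normalization step are illustrative assumptions, not the project’s actual pipeline.

```python
# Minimal sketch: extract and normalize axial FLAIR slices from a NIfTI volume (assumed path).
import numpy as np
import nibabel as nib

volume = nib.load("FLAIR.nii.gz")           # hypothetical file name
data = volume.get_fdata()                   # shape: (rows, cols, n_slices)

slices = []
for k in range(data.shape[-1]):
    sl = data[..., k]
    rng = sl.max() - sl.min()
    if rng > 0:                             # skip empty slices
        slices.append((sl - sl.min()) / rng)  # normalize to [0, 1]

print(f"Extracted {len(slices)} usable slices")
```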
Table 2. Hyperparameters for tuning the SNGAN deep learning model.
Images fed: 263 | Quantity of data generated: 560
Batch size: 32 | β1: 0.1, β2: 0.9
Image size: 128 | Latent vector: 100
Epochs: 400 | Loss function: Hinge/BCE
Best epoch/model for generated images: 275 | Activation function (Discriminator): LeakyReLU
Learning rate: 0.0001 | Activation function (Generator): ReLU
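For context on the SNGAN settings above (hinge loss, Adam with β1 = 0.1 and β2 = 0.9), the fragment below is a minimal PyTorch sketch of spectral normalization and the hinge GAN losses; the layer sizes and variable names are illustrative, not the generator and discriminator actually used.

```python
# Minimal sketch: spectral normalization and hinge GAN losses (illustrative sizes).
import torch
import torch.nn as nn
from torch.nn.utils import spectral_norm

# Spectral normalization is applied to discriminator layers, as in SNGAN.
disc_layer = spectral_norm(nn.Conv2d(1, 64, kernel_size=4, stride=2, padding=1))

def d_hinge_loss(real_logits, fake_logits):
    # Hinge loss for the discriminator.
    return (torch.relu(1.0 - real_logits).mean()
            + torch.relu(1.0 + fake_logits).mean())

def g_hinge_loss(fake_logits):
    # Hinge loss for the generator.
    return -fake_logits.mean()

opt = torch.optim.Adam(disc_layer.parameters(), lr=1e-4, betas=(0.1, 0.9))
```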
Table 3. Training parameters of the UNet network.
Optimizer: Adam | Epochs: 400 | Data/Train: 2650
Learning rate: 0.0001 | Batch size/Training: 128 | Batch size/Validation: 64
Dropout: 0.2 | Filter sizes: 16, 32, 64, 128, 256 | Weight decay: 1 × 10⁻⁵
Input channels: 2 | Output channels: 2 | Strides: 2
Loss function: Dice Loss | Residual units per level: 2 | Kernel: 3 × 3
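The UNet configuration in Table 3 maps closely onto MONAI’s residual UNet; the snippet below is a minimal sketch under that assumption and is not necessarily the exact implementation used in the study.

```python
# Minimal sketch: a MONAI residual UNet configured with the hyperparameters of Table 3.
import torch
from monai.networks.nets import UNet
from monai.losses import DiceLoss

model = UNet(
    spatial_dims=2,
    in_channels=2,                      # input channels
    out_channels=2,                     # output channels
    channels=(16, 32, 64, 128, 256),    # filter sizes per level
    strides=(2, 2, 2, 2),
    num_res_units=2,                    # residual units per level
    dropout=0.2,
)
loss_fn = DiceLoss(to_onehot_y=True, softmax=True)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, weight_decay=1e-5)
```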
Table 4. Training parameters of the SAM.
Optimizer: Adam | Epochs: 25 | Data/Train: 2850
Learning rate: 0.001 | Batch size/Training: 2 | Batch size/Validation: 1
Target size: 256 × 256 | Filter sizes: — | Weight decay: 0
SAM processors: sam-vit-base, sam-vit-large, sam-vit-huge, medsam
Loss function: Focal Loss
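The SAM variants listed above are available through the Hugging Face transformers library; the snippet below is a minimal sketch of how one of those checkpoints could be loaded and prepared for fine-tuning with a focal loss. The choice to freeze the image and prompt encoders is an assumption for illustration, and the training loop is omitted.

```python
# Minimal sketch: loading a SAM checkpoint and preparing it for fine-tuning (assumed pipeline).
import torch
from transformers import SamProcessor, SamModel
from monai.losses import FocalLoss

processor = SamProcessor.from_pretrained("facebook/sam-vit-base")
model = SamModel.from_pretrained("facebook/sam-vit-base")

# Assumption: freeze the image encoder and prompt encoder, fine-tune only the mask decoder.
for name, param in model.named_parameters():
    if name.startswith("vision_encoder") or name.startswith("prompt_encoder"):
        param.requires_grad = False

loss_fn = FocalLoss(gamma=2.0)
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-3, weight_decay=0
)
```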
Table 5. Parameters of the YOLO model.
Training parameters | Task: segment, classification | Epochs: 100 | Data/Train: 2650, 220
 | Patience (early stopping): 50 | Batch size/Training: 4 | Target size: 256
Optimization parameters | Optimizer: auto | Learning rate: 0.01 | Weight decay: 0.0005
 | Momentum: 0.937 | Warmup momentum: 0.8 | Warmup bias learning rate: 0.1
Pretrained models tested | yolov8x-seg.pt, yolov8n-seg.pt | IoU threshold (yolov8n-seg.pt): 0.2–0.7
Data augmentation parameters | Horizontal flip probability (fliplr): 0.5 | HSV hue: 0.015 | HSV saturation: 0.7
 | HSV value: 0.4 | Translate: 0.1 | Scale: 0.5
 | Auto augment: randaugment | Erasing: 0.4
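With the Ultralytics API, the parameters of Table 5 translate fairly directly into a single train() call; the snippet below is a minimal sketch under that assumption (the dataset YAML path is a placeholder, and optimizer selection is left to the library’s “auto” mode).

```python
# Minimal sketch: YOLOv8 segmentation training with the settings of Table 5 (placeholder data path).
from ultralytics import YOLO

model = YOLO("yolov8n-seg.pt")              # pretrained segmentation checkpoint
model.train(
    data="wmh_lesions.yaml",                # placeholder dataset config
    epochs=100,
    patience=50,                            # early-stopping patience
    batch=4,
    imgsz=256,
    optimizer="auto",
    lr0=0.01,
    weight_decay=0.0005,
    momentum=0.937,
    warmup_momentum=0.8,
    warmup_bias_lr=0.1,
    fliplr=0.5,
    hsv_h=0.015, hsv_s=0.7, hsv_v=0.4,
    translate=0.1, scale=0.5,
    auto_augment="randaugment",
    erasing=0.4,
)
```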
Table 6. Training parameters of the Detectron2 architecture.
Optimizer: Adam | Learning rate: 0.0001 | Iterations: 250 | Batch size/Training: 128 | Batch size/Validation: 64 | Dropout: 0.2
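For reference, a Mask R-CNN setup of this kind is typically assembled through Detectron2’s config system; the snippet below is a minimal sketch under that assumption. Note that Detectron2’s stock trainer uses SGD, so the Adam optimizer in Table 6 would require a custom trainer (omitted here), and the dataset names are placeholders.

```python
# Minimal sketch: Detectron2 Mask R-CNN configuration reflecting Table 6 (placeholders noted).
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultTrainer

cfg = get_cfg()
cfg.merge_from_file(
    model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
)
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(
    "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"
)
cfg.DATASETS.TRAIN = ("wmh_lesions_train",)   # placeholder registered dataset
cfg.DATASETS.TEST = ("wmh_lesions_val",)
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 2           # ischemia, demyelination
cfg.SOLVER.BASE_LR = 0.0001
cfg.SOLVER.MAX_ITER = 250
cfg.SOLVER.IMS_PER_BATCH = 2                  # images per step (illustrative; Table 6 batch sizes may use a different granularity)

trainer = DefaultTrainer(cfg)                 # default SGD-based trainer
trainer.resume_or_load(resume=False)
trainer.train()
```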
Table 7. DSC mean values for SAM segmentation.
Metric | Vit-Base | Vit-Large | Vit-Huge | MedSAM
Dice mean | 0.50 | 0.50 | 0.32 | 0.31
Table 8. Mean values of the loss and accuracy curves of the Detectron2 model for lesions.
Metric | Mean
loss_box_reg | 0.312
loss_cls | 0.120
loss_mask | 0.245
loss_rpn_cls | 0.017
loss_rpn_loc | 0.132
total_loss | 0.843
mask_rcnn/accuracy | 0.887
mask_rcnn/false_negative | 0.227
mask_rcnn/false_positive | 0.066
fast_rcnn/cls_accuracy | 0.948
fast_rcnn/false_negative | 0.159
fast_rcnn/fg_cls_accuracy | 0.840
Table 9. Statistics of the loss and accuracy curves of the Detectron2 model.
Metric | Instances | Mean
loss_box_reg | 3420 | 0.072
loss_cls | 3420 | 0.030
loss_mask | 3420 | 0.154
loss_rpn_cls | 3420 | 0.007
loss_rpn_loc | 3420 | 0.029
total_loss | 3420 | 0.300
mask_rcnn/accuracy | 3420 | 0.930
fast_rcnn/cls_accuracy | 3420 | 0.988
mask_rcnn/false_negative | 3420 | 0.100
mask_rcnn/false_positive | 3420 | 0.050
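Detectron2 logs these quantities to a metrics.json file in its output directory, so summary statistics like those in Tables 8 and 9 can be recomputed from the training log; the snippet below is a minimal sketch, assuming the default output path.

```python
# Minimal sketch: summarizing Detectron2 training metrics from metrics.json (assumed default path).
import json
import pandas as pd

records = []
with open("output/metrics.json") as f:        # default Detectron2 output directory
    for line in f:
        records.append(json.loads(line))      # one JSON record per logging step

df = pd.DataFrame(records)
cols = ["loss_box_reg", "loss_cls", "loss_mask", "total_loss",
        "mask_rcnn/accuracy", "fast_rcnn/cls_accuracy"]
print(df[cols].describe().loc[["count", "mean"]])
```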
Table 10. DSC values of the proposed lesion segmentation models.
Model/Metric | UNet | SAM (ViT Large) | YOLOv8 | Detectron2
DSC | 0.95 | 0.50 | 0.264 | 0.887
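The DSC values reported here follow the standard Dice definition, DSC = 2|A∩B| / (|A| + |B|); a minimal sketch of that computation for binary masks is shown below, with illustrative arrays.

```python
# Minimal sketch: Dice similarity coefficient (DSC) between two binary masks.
import numpy as np

def dice_coefficient(pred: np.ndarray, truth: np.ndarray, eps: float = 1e-7) -> float:
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    intersection = np.logical_and(pred, truth).sum()
    return (2.0 * intersection + eps) / (pred.sum() + truth.sum() + eps)

# Illustrative example with small masks.
pred = np.array([[0, 1], [1, 1]])
truth = np.array([[0, 1], [0, 1]])
print(f"DSC = {dice_coefficient(pred, truth):.3f}")   # 2*2 / (3 + 2) = 0.8
```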
Table 11. Models and DSC (Dice) metrics from WMH studies in the literature compared with the proposed models.
Study | Task | Dataset/Modality of Images | Model | DSC
Xinxin Li et al. [100] | Segmentation of WMHs | Public MICCAI + Private / T2 FLAIR, T1 | UNet and SE block | 0.89
Sahayam et al. [101] | Stroke lesion segmentation | ATLAS | U-shaped 3-D Capsule | 0.67
Liu et al. [38] | Demyelinating disease | Private / T2 FLAIR | UNet | 0.7381
Hao Zhang et al. [102] | Segmentation of WMHs | Public MICCAI + Private / T2 FLAIR, T1 | Nested Attention Guided UNet++ | 0.88
Farkhani, S. et al. [103] | Segmentation of WMHs | Public MICCAI + Private (LISA) | Attention UNet | 0.8543
Proposed model | Segmentation of WMHs | Public MICCAI + Private / FLAIR | UNet | 0.95
 | | | SAM (ViT Large) | 0.50
 | | | YOLOv8 | 0.246
 | | | Detectron2 | 0.887
Table 12. Classification metrics on a set of test images/lesions for comparison between the Expert, Detectron2, and YOLOv8 models.
Metric | Expert | Detectron2 | YOLOv8 | Comparison (Models vs. Expert) | Kappa | p < 0.05 (DeLong’s Test)
Accuracy | 0.976 | 0.928 | 0.523 | Expert vs. Detectron2 | 0.809 | 0.0534
Precision | 0.954 | 0.909 | 0.600 | | |
Recall | 0.997 | 0.952 | 0.142 | Expert vs. YOLOv8 | 0.035 | 0.0001
F1 score | 0.976 | 0.930 | 0.230 | | |
Sensitivity | 0.995 | 0.952 | 0.142 | Detectron2 vs. YOLOv8 | 0.035 | 0.0001
Specificity | 0.952 | 0.904 | 0.904 | | |
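Agreement values like the kappa scores in Table 12 can be computed with scikit-learn from paired per-lesion labels; DeLong’s test for comparing AUCs is not part of scikit-learn, so it would rely on an external implementation or a bootstrap, which is only noted here. The snippet below is a minimal sketch with illustrative labels, not the study’s data.

```python
# Minimal sketch: Cohen's kappa and summary metrics from paired per-lesion labels.
# Labels are illustrative; DeLong's test requires a separate implementation.
import numpy as np
from sklearn.metrics import cohen_kappa_score, precision_recall_fscore_support

expert = np.array([1, 0, 1, 1, 0, 0, 1, 0])       # 1 = ischemia, 0 = demyelination
detectron2 = np.array([1, 0, 1, 1, 0, 0, 1, 1])
yolov8 = np.array([1, 1, 0, 1, 1, 0, 0, 0])

for name, pred in [("Detectron2", detectron2), ("YOLOv8", yolov8)]:
    kappa = cohen_kappa_score(expert, pred)
    prec, rec, f1, _ = precision_recall_fscore_support(
        expert, pred, average="binary", zero_division=0
    )
    print(f"{name}: kappa={kappa:.3f}, precision={prec:.3f}, recall={rec:.3f}, F1={f1:.3f}")
```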
Table 13. Comparative analysis between the proposed method and approaches in the literature.
Author/Year | Lesion Type | MR Modality | Dataset | Methods | DSC
Li et al. [104], 2018 | WMH | T1-w and FLAIR | MICCAI 2017 WMH | U-Net | 0.80
Guerrero et al. [35], 2018 | WMH | T1-w and FLAIR | WMH (private) | CNN (uResNet) | 0.70
Wu et al. [105], 2019 | WMH | T1-w and FLAIR | MICCAI 2017 WMH | SC U-Net | 0.78
Clerigues et al. [34], 2020 | Stroke | T1, T2, FLAIR / DWI, CBF, CBV, TTP and Tmax | ISLES 2015 (SISS) and (SPES) | U-Net | 0.59 / 0.84
Liu et al. [106], 2020 | WMH, ischemic stroke | T1-w and FLAIR | MICCAI 2017 WMH (train), ISLES 2015 (test) | M2DCNN | 0.84
Rathore et al. [107], 2020 | WMH | T1, FLAIR | MICCAI 2017 WMH | ResNet + SVM | 0.80
Lee et al. [108], 2020 | Stroke | DWI | Acute Infarct (Asan Medical dataset) | U-Net + SE (squeeze-and-excitation) | 0.85
 | WMH | FLAIR | MICCAI 2017 WMH | | 0.77
Zhou et al. [109], 2020 | WMH | T1, FLAIR | MICCAI 2017 WMH | U-Net + CRF + Spatial | 0.78
Park et al. [16], 2021 | WMH | T1-w and FLAIR | MICCAI 2017 WMH | U-Net + highlighting foregrounds (HF) | 0.81
Karthik et al. [110], 2021 | Stroke | T1-w, T2-w, DWI and FLAIR | ISLES 2015 (SISS) | Multi-level RoI-aligned CNN | 0.77
Zhang et al. [111], 2021 | Stroke | DWI | Private | Faster R-CNN, YOLO v3, SSD | 0.89
Li et al. [100], 2022 | WMH | T1-w and FLAIR | MICCAI 2017 WMH | U-Net | 0.83
 | Stroke | | Chinese National Stroke Registry (CNSR) | | 0.78
Uçar and Dandıl [112], 2022 | MS | T2-w | MICCAI 2008 MS Lesion (1) | Mask R-CNN | 0.76
 | Brain tumors | | Private Brain Tumour dataset, TCGA-LGG (2) | | 0.88
 | MS + brain tumor | | (1) + (2) | | 0.82
Chen et al. [113], 2022 | Stroke | FLAIR | ISLES 2015 (SISS) | CNN Posterior-CRF (U-Net based) | 0.61
 | WMH | T1 and FLAIR | MICCAI 2017 WMH | | 0.79
Wang et al. [114], 2022 | Stroke | T1-w | ATLAS | U-Net | 0.93
 | | T1-w, T2-w, DWI and FLAIR | ISLES 2015 | | 0.79
 | | | ISLES 2018 | | 0.67
Khezrpour et al. [115], 2022 | Stroke | FLAIR | ISLES 2015 (SISS) | U-Net | 0.90
Zhou et al. [39], 2023 | Demyelinating NMOSD | MRI | Private | M-DDC (U-Net for pixel-level segmentation) | 0.71
Uçar and Dandıl [116], 2024 | WMH | FLAIR | ISLES 2015 (SISS) | Mask R-CNN | 0.83
 | | | | U-Net | 0.82
 | Stroke | | | Mask R-CNN | 0.93
 | | | | U-Net | 0.92
Liu et al. [38], 2024 | Demyelination | FLAIR | Private | U-Net | 0.73
Proposed | Demyelination | FLAIR | Private | Detectron2 classification | 0.98
 | Ischemia | | | Fast R-CNN | 0.94
 | WMH | | MICCAI 2017 | Mask R-CNN | 0.88