MRI Images, Brain Lesions and Deep Learning

Medical brain image analysis is a necessary step in Computer Assisted/Aided Diagnosis (CAD) systems. Advancements in both hardware and software in the past few years have led to improved segmentation and classification of various diseases. In the present work, we review the published literature on systems and algorithms that allow for classification, identification, and detection of White Matter Hyperintensities (WMHs) in brain MRI images, specifically in cases of ischemic stroke and demyelinating diseases. For the selection criteria, we used bibliometric networks. Out of a total of 140 documents, we selected 38 articles that deal with the main objectives of this study. Based on the analysis and discussion of the reviewed documents, there is constant growth in the research and proposal of new deep learning models that aim to achieve the highest accuracy and reliability in the segmentation of ischemic and demyelinating lesions. Models with high indicators (Dice Score, DSC: 0.99) were found, but with little practical application due to the use of small datasets and a lack of reproducibility. Therefore, the main conclusion is that multidisciplinary research groups should be established to overcome the gap between CAD developments and their complete utilization in the clinical environment.


Introduction
There are estimated to be as many as a billion people worldwide affected by peripheral and central neurological disorders [1,2]. These disorders include brain tumors, Parkinson's disease (PD), Alzheimer's disease (AD), multiple sclerosis (MS), epilepsy, dementia, neuroinfections, stroke, and traumatic brain injuries [1]. According to the World Health Organization (WHO), ischemic stroke and "Alzheimer disease with other dementias" are the second and fifth major causes of death, respectively [2].
Biomedical images give fundamental information necessary for the diagnosis, prognosis, and treatment of different pathologies. Of all the various imaging modalities, neuroimaging is a crucial tool for studying the brain [3][4][5][6][7][8][9]. Within neuroimaging, functional MRI is used for functional analysis of the brain [10]. Brain MRI image analysis is useful for different tasks, e.g., lesion detection, lesion segmentation, tissue segmentation, and brain parcellation on neonatal, infant, and adult subjects [3,11].
MRI is frequently used in the visual inspection of cranial nerves and of abnormalities of the posterior fossa and spinal cord [12], since it is less susceptible to image artifacts than CT.
This paper is organized as follows. Section 2 gives an outline of the selection criteria adopted for the literature review. Section 3 describes the principal machine learning and deep learning methods used in this application. Section 4 summarizes the principal constraints and common problems encountered in these CAD systems, and Section 5 concludes with a brief discussion.

Selection Criteria
The literature review was conducted using the recommendations given by Khan et al., [46] as well as the methodology proposed by Torres-Carrión [47,48]. We generated and analyzed bibliometric maps and identified clusters and their reference networks [49,50].
We also used the methods given in [51,52] to identify the strength of the research, as well as the authors and principal research centers that work on MRI images and use machine learning and deep learning for the identification of brain diseases.
In order to perform an appropriate search, it is important to focus on the real theoretical context of the research. For this reason, the method proposed by Torres-Carrión [48], a so-called "conceptual mindfact" (mentefacto conceptual), can help to organize the scientific thesaurus of the research theme [47]. Figure 2 describes the conceptual mindfact used in this work to focus and constrain the topic to MRI brain algorithms that differentiate ischemic and demyelinating diseases, and to obtain an adequate semantic search structure of the literature in the relevant scientific databases. Table 1 presents the semantic search structure [48] used as input for the search of specific literature (documents) in the scientific databases. The first layer is an abstraction of the conceptual mindfact; the second corresponds to the specific technicality, namely brain processing; the third level is relevant to the application, namely the ischemic and demyelinating diseases. The fourth level is the global semantic structure search.
The result of the global semantic structure search (Fig 2)
In order to analyze and answer the three central research questions of this work, the global search result of 140 documents was further refined. This refinement complied with the categories given by Fourcade and Khonsari [61] and was applied only to the "article" documents. The criteria were:
• aim of the study: processing of MRI brain images of ischemia and demyelinating disease for identification, detection, classification, or differentiation between them.
Applying the second set of selection criteria, we found 38 documents that were related to and in agreement with the items described above, and these were included in the analysis of this work.
For the analysis we used the VOSviewer software, version 1.6.15 [50], to construct and display bibliometric maps. The data used for this purpose were retrieved from Scopus due to its coverage of a wider range of journals [49,62].
In terms of citations and the countries of origin of these publications (Fig 5), we observe that the United States has a large number of citations, followed by Germany, India and the United Kingdom. This relationship was determined by analyzing the number of citations of the documents generated by each country, according to the affiliation of the authors, and, for each country, the total strength of the citation link [51]. The minimum number of documents of any individual country was five and the minimum number of citations a country received was one.
The with the year of publication and, the diameter of the points shows the normalization of the citations according to Van Eck and Waltman [52,63]. The purple points are the documents that have fewer than ten citations and yellow represents documents with more than 60 citations.
In Table 2, we list the ten most cited articles according to the normalization of the citations [63]. Waltman et al. [51] state that "the normalization corrects for the fact that older documents have had more time to receive citations than more recent documents" [51,64]. Table 2 also shows the dataset, methodology, techniques and metrics used to develop and validate the algorithms or CAD systems proposed by these authors.
In bibliometric networks, or science mapping, there are large differences between nodes in the number of edges they have to other nodes [50]. In order to reduce these differences, VOSviewer uses the association strength normalization [63], which is a probabilistic measure of the co-occurrence data.
The association strength normalization is discussed by Van Eck and Waltman [63], and here we construct a normalized network [50] in which the weight of the edge between nodes i and j is given by:
$$s_{ij} = \frac{2 m\, a_{ij}}{k_i k_j}$$
where $a_{ij}$ is the weight of the edge between nodes i and j in the original network, $s_{ij}$ is also known as the similarity of nodes i and j, $k_i$ ($k_j$) denotes the total weight of all edges of node i (node j) and m denotes the total weight of all edges in the network [50]:
$$k_i = \sum_j a_{ij} \quad \text{and} \quad m = \frac{1}{2} \sum_i k_i$$
For more information related to the normalization, mapping and clustering techniques used by VOSviewer, the reader is referred to the relevant literature [50,63,64].
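As an illustration of the normalization described above, the following minimal sketch computes association-strength similarities for a toy co-occurrence matrix. It assumes the usual form s_ij = 2 m a_ij / (k_i k_j), with k_i the total edge weight of node i and m the total edge weight of the network; the function name and the toy matrix are hypothetical and are not taken from VOSviewer.

```python
def association_strength(a):
    """Normalize a symmetric edge-weight matrix (list of lists, zero diagonal)."""
    n = len(a)
    k = [sum(row) for row in a]   # k_i: total weight of all edges of node i
    m = sum(k) / 2.0              # m: total weight of all edges in the network
    s = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and k[i] > 0 and k[j] > 0:
                s[i][j] = 2.0 * m * a[i][j] / (k[i] * k[j])
    return s


# Toy co-occurrence counts between three items (illustrative values only).
co = [[0, 3, 1],
      [3, 0, 0],
      [1, 0, 0]]
s = association_strength(co)
print(s[0][1], s[0][2], s[1][2])
```

Note that the weakly connected node 2 receives the same similarity to node 0 as the strongly connected node 1, which is the intended effect of correcting for differences in the total number of edges per node.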
From Table 2 it can be seen that the articles that are cited often deal with ischemic stroke rather than demyelinating disease. The methods and techniques used were support vector machines (SVM) [65] and random forests (RF) [32]; classical segmentation algorithms such as the Watershed algorithm (WS) [66]; deep learning techniques such as convolutional neural networks (CNN) [36,67]; and combinations of these: SVM-RF [22] and CNN-RF [20,68].

Stroke and Demyelinating Disease
In the following subsections, we discuss how artificial intelligence (AI), through ML and DL methods, is used in the development of algorithms for brain disease diagnosis, and its relation to the central theme of this review.

Machine learning and deep learning:
Machine learning and deep learning are part of the broader field of Artificial Intelligence (AI), which is defined as the ability of a computer to imitate the cognitive abilities of a human being [61]. There are two general approaches to AI: (1) cognitivism, related to the development of rule-based programs referred to as expert systems, and (2) connectionism, associated with the development of simple programs educated or trained by data [61,69]. Other applications of AI to medicine and health are not covered here, e.g., ophthalmology, where AI has had tremendous success; see [70][71][72][73][74][75].
Machine Learning (ML) can be considered as a subfield of artificial intelligence (AI). Lundervold and Lundervold [76] and Noguerol et al., [77] state that the main aim of ML is to develop mathematical models and computational algorithms with the ability to solve problems by learning from experiences without or with the minimum possible human intervention, in other words the model created will be able to be trained to produce useful outputs when fed input data [77]. Lakhani et al., [78] state that recent studies demonstrate that machine learning algorithms give accurate results for the determination of study protocols for both brain and body MRIs.

Support Vector Machine (SVM):
The SVM is an algorithm used for classification, regression and clustering tasks. It is driven by a linear function $w^\top x + b$, similar to logistic regression [80], with the difference that the SVM only outputs class identities and does not provide probabilities. The SVM classifies between two classes by constructing a hyperplane in a high-dimensional feature space [81]. The class identity is positive or negative when $w^\top x + b$ is positive or negative, respectively. To find the optimal separating hyperplane between classes, the SVM can use different kernels, which generalize the dot product [82,83]. More information and detail about the SVM are given in the literature [80][81][82][83].
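The decision rule of a linear SVM can be sketched in a few lines. The example below assumes that the weights w and bias b have already been learned; the numerical values and the function name are hypothetical, and training the hyperplane is not shown.

```python
def svm_predict(w, b, x):
    """Return +1 or -1 according to the sign of the linear function w.x + b."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score > 0 else -1


# Hypothetical, already-trained parameters for a 2-feature problem.
w = [2.0, -1.0]
b = -0.5

label_a = svm_predict(w, b, [1.0, 0.5])   # score = 2.0 - 0.5 - 0.5 = 1.0
label_b = svm_predict(w, b, [0.0, 1.0])   # score = -1.0 - 0.5 = -1.5
print(label_a, label_b)
```

Only the class identity is returned, not a probability, which is the difference from logistic regression noted above.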

k-Nearest Neighbors (k-NN):
The k-NN is a non-parametric algorithm (that is, it makes no assumption about the underlying data distribution) and can be used for classification or regression [80,84]. As a classifier, k-NN is based on a distance function (typically the Euclidean distance) and a voting function over the k nearest neighbors [85], given N training vectors. The value of k (the number of nearest neighbors) decides the classification of the points between classes. k-NN has the following basic steps: (1) calculate distances, (2) find the closest neighbors and (3) vote for labels [84]. More details of the k-NN algorithm can be found in references [80,85,86]. Programming libraries such as Scikit-Learn provide implementations of k-NN [84]. The k-NN has higher accuracy and stability for MRI data, but is relatively slow in terms of computational time [86]. As an aside, it is interesting to note that the nearest neighbor formulation might have been first described by the Islamic polymath Ibn al Haytham in his famous book Kitab al Manazir over 1000 years ago ("The Book of Optics", see [87])!
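The three basic steps of k-NN listed above (calculate distances, find the closest neighbors, vote for labels) can be sketched as follows. This is a minimal illustration using only the Python standard library; the feature vectors and labels are invented toy data, not values from any of the reviewed studies.

```python
import math
from collections import Counter


def knn_classify(train_x, train_y, query, k=3):
    # (1) Calculate the Euclidean distance from the query to every training vector.
    dists = [(math.dist(x, query), y) for x, y in zip(train_x, train_y)]
    # (2) Find the k closest neighbors.
    nearest = sorted(dists, key=lambda t: t[0])[:k]
    # (3) Vote for the label.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy 2-D feature vectors (e.g. two normalized intensity features per voxel).
train_x = [(0.1, 0.2), (0.2, 0.1), (0.9, 0.8), (0.8, 0.9), (0.85, 0.75)]
train_y = ["healthy", "healthy", "lesion", "lesion", "lesion"]

pred_low = knn_classify(train_x, train_y, (0.15, 0.15), k=3)
pred_high = knn_classify(train_x, train_y, (0.8, 0.8), k=3)
print(pred_low, pred_high)
```

Every query requires a distance to all N training vectors, which is one reason the method is relatively slow on large image volumes.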

Random Forest (RF):
This technique is a collection of Classification and Regression Trees [88]. A forest of classification trees is generated, where each tree is grown on a bootstrap sample of the data [89]. In that way, the RF classifier consists of a collection of binary classifiers where each decision tree casts a unit vote for the most popular class label (see Figure 8 (d)) [90]. More information is given elsewhere [91].
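The two ideas in this description, growing each tree on a bootstrap sample and letting each tree cast a unit vote, can be sketched as below. This is a deliberately simplified illustration: each "tree" is a one-split stump on a single feature, and the random feature subsetting used by full random forests is omitted. All function names and data are hypothetical.

```python
import random
from collections import Counter


def fit_stump(xs, ys):
    """Fit a one-split tree on a 1-D feature: (threshold, left_label, right_label)."""
    values = sorted(set(xs))
    best = None
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2.0
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        l_lab = Counter(left).most_common(1)[0][0]
        r_lab = Counter(right).most_common(1)[0][0]
        errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
        if best is None or errors < best[0]:
            best = (errors, t, l_lab, r_lab)
    if best is None:  # all feature values identical: predict the majority label
        lab = Counter(ys).most_common(1)[0][0]
        return (values[0], lab, lab)
    return best[1:]


def fit_forest(xs, ys, n_trees=25, seed=0):
    rng = random.Random(seed)
    n = len(xs)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        forest.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return forest


def predict_forest(forest, x):
    # Each tree casts one vote; the most popular class label wins.
    votes = Counter(l if x <= t else r for t, l, r in forest)
    return votes.most_common(1)[0][0]


xs = [1, 2, 3, 4, 7, 8, 9, 10]   # toy feature values
ys = [0, 0, 0, 0, 1, 1, 1, 1]    # toy class labels
forest = fit_forest(xs, ys)
print(predict_forest(forest, 2.0), predict_forest(forest, 9.0))
```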

k-Means Clustering (k-means):
The k-means clustering algorithm is used for segmentation in medical imaging due to its relatively low computational complexity [92,93] and minimum computation time [94]. It is an unsupervised algorithm based on the concept of clustering. Clustering is a technique of grouping pixels of an image according to their intensity values [95,96]. It divides the training set into k different clusters of examples that are near each other [80]. The properties of the clustering can be measured, for example, by the average Euclidean distance from a cluster centroid to the members of the cluster [80]. The input data for this algorithm should be numeric, with continuous values being better than discrete values, and the algorithm performs well when used with unlabeled datasets.
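A minimal sketch of k-means applied to pixel intensities is given below. It assumes a flat list of grey-level values and a simple deterministic initialization spread across the intensity range; both choices, and the function name, are illustrative assumptions rather than the procedure of any cited work.

```python
def kmeans_1d(values, k=2, n_iter=20):
    """Cluster scalar intensities; returns (centroids, labels)."""
    lo, hi = min(values), max(values)
    # Initialize centroids evenly across the intensity range.
    centroids = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    for _ in range(n_iter):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centroids[c]))
            clusters[nearest].append(v)
        # Move each centroid to the mean of its members (keep it if empty).
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
    return centroids, labels


# Toy intensities: dark background pixels and bright (hyperintense) pixels.
pixels = [10, 12, 11, 200, 205, 198, 9, 202]
centroids, labels = kmeans_1d(pixels, k=2)
print(centroids, labels)
```

No labels are needed at any point, which is why the method suits unlabeled datasets.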

Deep Learning methods
Deep Learning (DL) is a subfield of ML [97] that uses artificial neural networks (ANNs) to develop decision-making algorithms [77]. ANNs are neural networks that employ a learning algorithm [98] and infer rules from data; to do so, a set of training examples is needed. The concept is derived from the biological neuron (Figure 8 (e)). An artificial neuron receives inputs from other neurons, integrates the inputs with weights, and activates (or "fires" in the language of biology) when a pre-defined condition is satisfied [79]. There are many books describing ANNs; see for example [80].
The fundamental unit of a neural network is the neuron, which has a bias $w_0$ and a weight vector $w = (w_1, \ldots, w_n)$ as parameters $\theta = (w_0, \ldots, w_n)$ to model a decision:
$$f(x) = h(w^\top x + w_0)$$
using a non-linear activation function $h(x)$ [99]. The activation functions commonly used are the sign function $\operatorname{sign}(x)$, the sigmoid function $\sigma(x) = 1/(1 + e^{-x})$ and the hyperbolic tangent $\tanh(x)$.
An interconnected group of nodes comprises the ANN, where each node represents a neuron arranged in layers [76] and each arrow represents a connection from the output of one neuron to the input of another [90]. ANNs have an input layer which receives observed values, while the output layer represents the target (a value or class); the layers between the input and output layers are called hidden layers [79].
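A single artificial neuron can be sketched as follows, assuming the decision takes the usual form f(x) = h(w·x + w0) with bias w0, weight vector w and activation function h. The sigmoid is used as the default activation; the input and weight values are arbitrary illustrative numbers.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def neuron(x, w, w0, h=sigmoid):
    """Weighted sum of the inputs plus bias, passed through the activation h."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + w0
    return h(z)


x = [1.0, 2.0]
w = [0.5, -0.25]                            # weighted sum is 0.5 - 0.5 = 0.0
out_sigmoid = neuron(x, w, w0=0.0)          # sigmoid(0) = 0.5
out_tanh = neuron(x, w, w0=0.0, h=math.tanh)
print(out_sigmoid, out_tanh)
```

A layer is simply many such neurons sharing the same input, and a network is a stack of layers.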
There are different types of ANNs [100], and the most common are convolutional neural networks (CNNs) [101], recurrent neural networks (RNNs) [102], long short-term memory (LSTM) networks [103], and generative adversarial networks (GANs) [104]. In practice, these types of networks can be combined [100] with each other and with classical machine learning algorithms. CNNs are most commonly used for the processing of medical images because of their success in processing and recognizing patterns in vision systems [43].
CNNs are inspired by the biological visual cortex and are regarded as a variant of multi-layer perceptrons (MLPs) [43,105,106]. A CNN consists of a stack of layers: convolutional, max pooling and fully connected layers. Each intermediate layer is fed by the output of the previous layer; e.g., the convolutional layers create feature maps of different sizes, and the pooling layers reduce the size of the feature maps that are fed to the following layers. The final fully connected layers produce the specified class prediction at the output [43]. The general CNN architecture is presented in Fig 9. There is a compromise between the number of neurons in each layer, the connections between them, and the number of layers on the one hand, and the number of parameters that define the network on the other [43].
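The roles of the convolutional and max pooling layers can be illustrated with a small sketch in plain Python. It assumes a single-channel image, a fixed hand-chosen kernel (in a real CNN the kernel weights are learned), no padding, and a pooling window equal to its stride; the function names are hypothetical.

```python
def conv2d(image, kernel):
    """'Valid' 2-D cross-correlation of a single-channel image with one kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]


def max_pool(fmap, size=2):
    """Non-overlapping max pooling: keeps the largest value in each window."""
    return [[max(fmap[i + a][j + b] for a in range(size) for b in range(size))
             for j in range(0, len(fmap[0]) - size + 1, size)]
            for i in range(0, len(fmap) - size + 1, size)]


# 5x5 toy image with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hand-chosen kernel that responds to a dark-to-bright vertical edge.
kernel = [[-1, 1],
          [-1, 1]]

feature_map = conv2d(image, kernel)   # 4x4 feature map
pooled = max_pool(feature_map)        # 2x2 map after pooling
print(feature_map)
print(pooled)
```

The feature map is non-zero only where the edge is, and pooling halves its size while preserving that response, which is the size reduction described above.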

Computer-Aided Diagnosis in Medical Imaging (CADx system)
Computer-aided diagnosis has its origins in the 1980s at the Kurt Rossmann Laboratories for Radiologic Image Research in the Department of Radiology at the University of Chicago [117]. The initial work was on the detection of breast cancer [29,117,118], and the reader is referred to a recent review [119].
There has been much research and development of CADx systems using different modalities of medical images. CAD is not a substitute for the specialist, but it can assist or be an adjunct to the specialist in the interpretation of the images [34]. In other words, CADx systems can provide a "second objective opinion" [89,99] and support the final disease decision using image-based information and the discrimination of lesions, complementing a radiologist's assessment [120].
CAD development takes into consideration the principles of radiomics [40,121,122,123,124,125]. The term radiomics is defined as the extraction and analysis of quantitative features of medical images, in other words the conversion of medical images into mineable data with high fidelity and high throughput for decision support [40,121,122]. The medical images used in radiomics are obtained principally with CT, PET or MRI [40].
The steps utilized by a CAD system consist of [40]: (a) image data and preprocessing, (b) image segmentation, (c) feature extraction and qualification, and (d) classification (Fig 10).

Image Data:
The dataset is the principal component in developing an algorithm because it is the nucleus of the processing. Razzak et al., [109] state that the accuracy of the diagnosis of a disease depends upon image acquisition and image interpretation. However, Shen et al., [105] add the caveat that image features obtained with one method are not guaranteed to carry over to images acquired using different equipment [105,126,127]. For example, it has been shown that the methods of image segmentation and registration designed for 1.5-Tesla T1-weighted brain MR images are not applicable to 7.0-Tesla T1-weighted MR images [43,57,58].
There are different image datasets for brain medical image processing. In the case of stroke, the best-known datasets are ISLES (Ischemic Stroke Lesion Segmentation) [20,68] and ATLAS (Anatomical Tracings of Lesions After Stroke) [128]. For demyelinating disease there is no specific dataset, but datasets for multiple sclerosis are used, e.g., MSSEG (MS segmentation) [129]. Table 3 describes the datasets that have been used in the publications under consideration in this review.

Image Preprocessing:
There are several preprocessing steps necessary to reduce the noise and artifacts in the medical images, before the segmentation [34,130,131].
The preprocessing steps commonly used are (1) grayscale conversion and image resizing [131] to get better contrast and enhancement, (2) bias field correction to correct the intensity inhomogeneity [24,130], (3) image registration, a process for spatial alignment [130], and (4) removal of non-brain tissue such as fat, skull, or neck, which have intensities overlapping with those of brain tissues [21,130,132].

Image Segmentation:
In simple terms, image segmentation is the procedure of separating a digital image into different sets of pixels [31]. It is considered the most fundamental process, as it extracts the region of interest (ROI) through a semiautomatic or automatic process [133]. It divides the image into areas according to a specific description in order to obtain the anatomical structures and patterns of diseases.
Despotović et al., [130] and Merjulah and Chandra [31] indicate that the principal goal of medical image segmentation is to simplify the image and transform it into a set of semantically meaningful, homogeneous, and non-overlapping regions of similar attributes such as intensity, depth, color, or texture [130], because the segmentation assists doctors in diagnosing and making decisions [31].
To evaluate, validate and measure the performance of every automated lesion segmentation methodology compared to the expert segmentation [134] one needs to consider the accuracy (evaluation measurements) and reproducibility of the model [135].
The evaluation measurements compare the output of segmentation algorithms with ground truth in either a pixel-wise or a volume-wise basis [3].
Accuracy relates to the degree of closeness of the estimated measure to the true measure [135]. In the definitions below, TP, FP and FN denote the numbers of true positive, false positive and false negative voxels, respectively.
• Precision: a measure of over-segmentation, between 0 and 1, giving the proportion of the computed segmentation that overlaps with the reference segmentation [136,137]:
$$\text{Precision} = \frac{TP}{TP + FP}$$
This is also called the positive predictive value (PPV), with a high PPV indicating that a patient identified with a lesion does actually have the lesion [139].
• Recall, also known as Sensitivity: a metric between 0 and 1 that indicates under-segmentation; it measures the amount of the reference segmentation that overlaps with the computed segmentation [136,137]:
$$\text{Recall} = \frac{TP}{TP + FN}$$
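The overlap metrics can be computed directly from voxel-wise counts. The sketch below assumes the standard confusion-matrix definitions (precision = TP/(TP+FP), recall = TP/(TP+FN), Dice = 2TP/(2TP+FP+FN)) and binary masks flattened to lists of 0/1; the Dice score is included because it is the indicator quoted most often in this review. The masks are toy data.

```python
def overlap_metrics(pred, truth):
    """pred, truth: equal-length sequences of 0/1 voxel labels."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = 2 * tp + fp + fn
    dice = 2 * tp / denom if denom else 1.0   # two empty masks agree perfectly
    return {"precision": precision, "recall": recall, "dice": dice}


# Over-segmentation example: the computed mask covers the whole reference
# lesion (recall = 1) but also marks healthy voxels (precision < 1).
pred = [1, 1, 1, 1, 0, 0]
truth = [1, 1, 0, 0, 0, 0]
metrics = overlap_metrics(pred, truth)
print(metrics)
```

In this example TP = 2, FP = 2 and FN = 0, so a low precision with a perfect recall flags over-segmentation; the reverse pattern would flag under-segmentation.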
The overlap measures that are less often used are sensitivity, specificity (which measures the portion of negative voxels in the ground truth segmentation that are also identified as negative by the evaluated segmentation [140]) and accuracy, which according to García-Lorenzo et al., [135] and Taha and Hanbury [140] should be considered carefully because they penalize errors in small segments more than in large segments. These are defined as:
$$\text{Sensitivity} = \frac{TP}{TP + FN}, \qquad \text{Specificity} = \frac{TN}{TN + FP}, \qquad \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
where TP, TN, FP and FN are the numbers of true positive, true negative, false positive and false negative voxels.
• Average Symmetric Surface Distance (ASSD, mm): represents the average surface distance between two segmentations (computed to reference and vice versa), and is an indicator of how well the boundaries of the two segmentations align. ASSD is measured in millimeters, and a smaller value indicates higher accuracy [68,134,137].
The average surface distance (ASD) is given as:
$$\text{ASD}(X, Y) = \frac{1}{|X|} \sum_{x \in X} \min_{y \in Y} d(x, y)$$
where $d(x, y)$ is a 3D matrix consisting of the Euclidean distances between the two image volumes $X$ and $Y$, and ASSD is defined as [134]:
$$\text{ASSD}(X, Y) = \frac{\text{ASD}(X, Y) + \text{ASD}(Y, X)}{2}$$
• Hausdorff's distance (HD, mm): It is more sensitive than ASSD to segmentation errors appearing away from segmentation frontiers [137]. The Hausdorff measure is an indicator of the maximal distance between the surfaces of two image volumes (the computed and reference segmentations) [20,137]. HD is measured in millimeters and, like the ASSD, a smaller value indicates higher accuracy [134].
$$\text{HD}(X, Y) = \max\left\{ \max_{x \in X} \min_{y \in Y} d(x, y),\; \max_{y \in Y} \min_{x \in X} d(x, y) \right\}$$
where $x$ and $y$ are points of lesion segmentations $X$ and $Y$, respectively, and $d(x, y)$ is a 3D matrix consisting of all Euclidean distances between these points [134].
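The two surface-distance measures can be sketched on small point sets as follows. The example assumes that each segmentation surface is given as a list of 3-D points in millimeters, that ASD is the mean nearest-neighbor distance from one surface to the other, that ASSD is the mean of the two directed ASD values, and that HD is the larger of the two directed maxima. Other ASSD conventions (for example pooling all surface points before averaging) exist and give slightly different numbers.

```python
import math


def directed_distances(a, b):
    """For every point of a, the Euclidean distance to its nearest point of b."""
    return [min(math.dist(p, q) for q in b) for p in a]


def assd(a, b):
    d_ab = directed_distances(a, b)
    d_ba = directed_distances(b, a)
    asd_ab = sum(d_ab) / len(d_ab)
    asd_ba = sum(d_ba) / len(d_ba)
    return (asd_ab + asd_ba) / 2.0


def hausdorff(a, b):
    return max(max(directed_distances(a, b)), max(directed_distances(b, a)))


# Toy surfaces: the computed one has a stray point 3 mm from the reference.
reference = [(0, 0, 0), (1, 0, 0)]
computed = [(0, 0, 0), (1, 0, 0), (4, 0, 0)]

assd_mm = assd(reference, computed)
hd_mm = hausdorff(reference, computed)
print(assd_mm, hd_mm)
```

The single outlying point raises HD to 3 mm while ASSD stays at 0.5 mm, which illustrates why HD is the more sensitive of the two to isolated errors.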
• Intra Class Correlation (ICC): Is a measure of correlation between volumes segmented and ground truth lesion volume [137].
• Correlation with Fazekas score: A Fazekas score is a clinical measure of WMH, comprising of two integers in the range [0, 3] reflecting the degree of periventricular WMH and deep WMH respectively [137].
• Relative Volume Difference (VD, %): It measures the agreement between the segmented lesion volume and the ground truth lesion volume; a low VD means more agreement [139,141]:
$$\text{VD} = \frac{|V_s - V_g|}{V_g} \times 100$$
where $V_s$ and $V_g$ are the segmented and ground truth lesion volumes, respectively.
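A minimal sketch of the relative volume difference is shown below. It assumes the absolute, percentage form VD = |Vs - Vg| / Vg × 100, with lesion volumes obtained by counting lesion voxels and multiplying by a voxel volume; the sign convention and the 1 mm³ voxel size are assumptions, and some authors report a signed difference instead.

```python
def lesion_volume(mask, voxel_volume_mm3=1.0):
    """Volume of a binary mask (flat sequence of 0/1) in cubic millimeters."""
    return sum(mask) * voxel_volume_mm3


def relative_volume_difference(segmented, truth, voxel_volume_mm3=1.0):
    v_s = lesion_volume(segmented, voxel_volume_mm3)
    v_g = lesion_volume(truth, voxel_volume_mm3)
    return abs(v_s - v_g) / v_g * 100.0


segmented = [1, 1, 1, 0, 0]   # 3 lesion voxels
truth = [1, 1, 1, 1, 0]       # 4 lesion voxels
vd_percent = relative_volume_difference(segmented, truth)
print(vd_percent)
```

Because VD compares volumes only, two masks with equal volume but no spatial overlap would still score 0%, so it is normally reported alongside an overlap measure.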
Lastly, we define [135] the "reproducibility" which is a measure of the degree of agreement between several identical experiments. Reproducibility guarantees that differences in segmentations as a function of time result from changes in the pathology and not from the variability of the automatic method [135].
Tables 2 and 4 present the types of databases, modalities and evaluation measurements considered and applied to the results reported in the literature to date.

Feature extraction:
An ML or DL algorithm is often a classifier [113] of objects (e.g. lesions in medical images). Feature selection is a fundamental step in the processing of medical images; more specifically, it allows us to investigate which features are relevant for the specific classification problem of interest, and it also helps to obtain higher accuracy rates [42].
The task of feature extraction is complex because one has to find an algorithm that can extract a distinctive and complete feature representation. For that reason it is very difficult to generalize, which implies that one has to design a featurization method for every new application [99]. In DL, this process is also referred to as "hand-crafting" features [99].
The classification is related to the extracted features that are entered as input to an ML model [113], while a DL algorithm model uses pixel values in images directly as input information instead of features calculated from segmented objects [113].
In the case of processing stroke with CNNs the featurization of the images is a key application [68,142] and depends on the signal-to-noise ratio in the image, which can be improved by target identification via segmentation to select regions of interest [142].
According to Praveen et al., [143], a CNN learns discriminative local features and returns better performance than handcrafted features.
Texture analysis is a common technique in medical pattern recognition tasks to determine the features, and for that one uses second-order statistics or co-occurrence matrix features [40]. Mitra et al., [139], indicate that they derive local features, spatial features and context-rich features from the input MRI channels.
It is clear that DL algorithms, especially those that use a combination of CNNs and machine learning classifiers, are currently producing a marked transformation [144] in featurization and segmentation in medical image processing [76,142].
CNNs are highly useful for identifying compositional hierarchies of features: low-level features (e.g. edges), specific pattern forms and intrinsic structures (e.g. shapes, textures) can be learned [3], and spatial features can be generated from an n-dimensional array of essentially arbitrary size [37,108]. For example, the U-Net model proposed by Ronneberger et al., [145] employed parameter sharing between encoder-decoder paths for incorporating spatial and semantic data, which allows better segmentation performance [136]. There are now novel variants of the U-Net design; e.g. Bamba et al., [146] used a U-Net architecture with 3D convolutions that allows the use of an attention gate for the decoder to suppress unimportant parts of the input while emphasizing the relevant features.
There is considerable room for improvement and for the design of innovative networks (e.g., [147]).
The process of converting a raw signal into a predictor (automatization of the featurization) constitutes an advantage of the DL methods over others, which is useful when there are large volumes of data of uncertain relationship to an outcome [142], e.g. the featurization of acute stroke and the demyelinating diseases.

ML and DL classifiers applied to the diagnosis of ischemia and demyelinating diseases
In this subsection we discuss the different classifiers that have been utilized in the literature under consideration. Additional details, such as the datasets, the evaluation metrics of the algorithms and the tasks, are presented in Tables 2 and 4.
Different studies [15,65,79,158] related to stroke in its different types (see Table 4 and Fig 1) principally use ML classifiers to determine the properties of the lesion.
The classifiers most commonly used are SVM and Random Forest (RF) [158].
According to Lee et al., [158], the RF has some advantages over the SVM because it can be trained quickly and provides insight into the features that can predict the target outcome [158]; the RF can also automatically perform feature selection and provide a reliable feature importance estimate. Additionally, the SVM is effective only in cases where the number of samples is small compared with the number of features [79,158]. Along similar lines, Subudhi et al., [22] reported that the RF algorithm works better when one has a large dataset and that it is more robust when there is a higher number of trees in the decision-making process. They reported an accuracy of 93.4% and a DSC index of 0.94 in their study.
Huang et al., [65] present results that predict ischemic tissue fate pixel-by-pixel based on multi-modal MRI data of acute stroke using a flexible support vector machine algorithm [65]. Nazari-Farsani et al., [27] propose the identification of ischemic stroke through an SVM with a linear kernel and cross-validation, achieving an accuracy of 73% on a private dataset of 192 patient scans, while Qiu et al., [151], with a private dataset of 1000 patients for the same task, use only the Random Forest (RF) classifier and obtain an accuracy of 95%.
The combination of traditional classifiers such as SVM and RF with CNNs shows better results; e.g. [32,65,143] report DSC values between 0.80 and 0.86. Melingi and Vivekanand [131] reported that through a combination of Kernelized Fuzzy C-Means clustering and SVM they achieved an accuracy of 98.8% and a sensitivity of 99%.
A method for detecting the presence or absence of stroke using SVM and feed-forward backpropagation neural network classifiers is presented in [15]. To extract features for the segmentation of the stroke region, k-means clustering was used along with an adaptive neuro-fuzzy inference system (ANFIS) classifier, since the other two methods failed to detect the stroke region in brain images with weak edges. This resulted in an accuracy and precision of 99.8% and 97.3%, respectively.
The different architectural developments in DL models contribute to better evaluation and segmentation results. For example, Kumar et al., [136] proposed a combination of U-Net and fractal networks. Fractal networks are based on the repetitive generation of self-similar objects and rule out residual connections [136,166].

Common Problems in medical image processing for ischemia and demyelinating brain diseases
This section presents a brief summary of some common problems found in the processing of ischemia and demyelinating disease images.

The dataset
The limited availability of large datasets is a major problem in medical imaging studies, and there are few datasets related to specific diseases [27]. The lack of datasets is a challenge since deep learning methods require a large amount of data for training, testing and validation [27].
Another major problem is that, even though algorithms for ischemic stroke segmentation in MRI scans have been (and are) intensively researched, the reported results in general do not allow a comparative analysis because of the use of different databases (private and public) with different validation schemes [29,34].
The Ischemic Stroke Lesion Segmentation (ISLES) challenge was designed to facilitate the development of tools for the segmentation of stroke lesions [20,68,142]. The ISLES group [20,68] provides a set of stroke images, but there is a need to enrich the dataset with clinical information in order to get better performance with CNNs.
Another problem with the datasets is the need for accurately labeled data [37]. This lack of annotated data constitutes a major challenge for supervised ML algorithms [168], because the methods have to learn and train with limited annotated data, which in most cases contain weak annotations (sparse annotations, noisy annotations, or only image-level annotations) [144]. Therefore, collecting image data in a structured and systematic way is imperative [79], given the large databases required by AI techniques to function efficiently.
An example of good practice of health data (images and health information) is exemplified by the UK Biobank [169], which has health data from half a million UK participants. The UK Biobank aims to create a large-scale biomedical database that can be accessed globally for public health research. However, the access depends on administrator approval and payment of a fee.
Other difficulties that accompany the labeling of the images in a dataset include the lack of collaboration between clinical specialists and academics, patient privacy issues, and, most importantly, the costly and time-consuming task of manual labeling of data by clinicians [34].
With CNNs, overfitting is a common problem due to the small size of the training data [114], and it is therefore important to increase the size of the training data. One solution to this problem is the technique of "data augmentation", which according to [170] helps improve the generalization capabilities of deep neural networks and can be perceived as implicit regularization. For example, Tajbakhsh et al., [144,171] reported that the sensitivity of a model improves by 10 percentage points (from 62% to 72%) if the dataset is increased from a quarter to the full size of the training dataset. Various methods of data augmentation of medical images are reviewed in [172].
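A minimal sketch of geometric data augmentation is given below. It assumes 2-D image slices stored as nested lists and applies the same flip or rotation to the image and to its lesion mask so that the annotation stays aligned; the choice of transforms and the function names are illustrative and are not the methods reviewed in the cited works.

```python
def flip_horizontal(img):
    return [row[::-1] for row in img]


def flip_vertical(img):
    return [row[:] for row in img[::-1]]


def rotate90(img):
    """Rotate a 2-D array by 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]


def augment(image, mask):
    """Return the original pair plus three transformed (image, mask) pairs."""
    transforms = [lambda a: [row[:] for row in a],
                  flip_horizontal, flip_vertical, rotate90]
    return [(t(image), t(mask)) for t in transforms]


image = [[1, 2],
         [3, 4]]
mask = [[0, 1],
        [0, 0]]
pairs = augment(image, mask)
for img, msk in pairs:
    print(img, msk)
```

One annotated slice yields four training pairs here; intensity changes, elastic deformations and other transforms can extend the set further.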
In [141], however, cascaded CNN architectures are suggested as a practical solution to the problem of limited annotated data, because the proposed architecture tends to learn well from small datasets.
An additional but no less important problem is the availability of equipment for collecting the image data. Even though MRI is better than CT for stroke diagnosis [173], in some developing countries CT and MRI facilities are very limited and relatively expensive, and there is also a lack of trained technical personnel and information [34]. Even in developed countries there are disparities in the availability of equipment between urban and rural areas. These issues are discussed, for example, in a report published by the Organisation for Economic Co-operation and Development (OECD) [174].

Detection of lesions
It is known that brain lesions, e.g., stroke lesions and tumors, have a high degree of variability [8,64], and hence it is a hard and complex challenge to develop a system with great fidelity and precision. For example, the lesion size and contrast affect the performance of the segmentation [18].
In the case of WMHs and their association with a given disease such as ischemic stroke, demyelinating disease, or any other disorder, the set of features used to describe their appearances and different locations [14] plays a fundamental role in training any model with minimum error.

Computational cost
In medical image processing, computational cost is a fundamental factor, since ML algorithms often require a large amount of data to "learn" to provide useful answers [100], which increases computational costs. Different studies [110,113,175] report that training neural networks that are efficient and make accurate predictions has a high computational cost (e.g., time, memory, and energy) [110]. This problem is often a limitation of CNNs, owing to the high dimensionality of the input data and the large number of training images required [113]. However, graphics processing units (GPUs) have proven to be flexible and efficient hardware for ML purposes [100]. GPUs are highly specialized processors for image processing. General-purpose GPU (GPGPU) computing is a growing area and an essential part of many scientific computing applications. The basic architecture of a GPU differs considerably from that of a CPU.
Suzuki et al. [113,178] propose the use of a massive-training artificial neural network (MTANN) [179] instead of CNNs, because a CNN requires a huge number of training images (e.g., 1,000,000), whereas the MTANN requires only a small number (e.g., 20) because of its simpler architecture. They note that with a GPU implementation, an MTANN completes training in a few hours, whereas a deep CNN takes several days [113]; of course, this currently depends on the task as well as on the processor speed.
It has been proposed to use small convolutional kernels in 3D CNNs [144]. This architecture seems to be more discriminative without increasing the computational cost or the number of trainable parameters for the identification task [164].
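One common argument for small 3D kernels can be checked with simple arithmetic: two stacked 3x3x3 convolutions cover the same 5x5x5 receptive field as a single 5x5x5 convolution while using fewer weights. The sketch below works through this count. The channel width of 30 and the helper names are assumptions chosen for illustration, and the calculation is not claimed to reproduce the exact configurations used in [144] or [164].

```python
def conv3d_weights(kernel_size, in_channels, out_channels):
    """Number of weights (biases ignored) in one 3D convolutional
    layer with a cubic kernel of side kernel_size."""
    return kernel_size ** 3 * in_channels * out_channels


def receptive_field(kernel_sizes):
    """Receptive field along one axis of stacked stride-1 convolutions."""
    field = 1
    for k in kernel_sizes:
        field += k - 1
    return field


channels = 30  # assumed number of feature maps per layer, illustrative only

# One layer with a 5x5x5 kernel versus two stacked layers with 3x3x3 kernels.
single_large = conv3d_weights(5, channels, channels)       # 125 * 900 = 112500
stacked_small = 2 * conv3d_weights(3, channels, channels)  # 2 * 27 * 900 = 48600

print("receptive field, one 5x5x5 layer:", receptive_field([5]))
print("receptive field, two 3x3x3 layers:", receptive_field([3, 3]))
print("weights, one 5x5x5 layer:", single_large)
print("weights, two 3x3x3 layers:", stacked_small)
```

Under these assumptions, the stacked small kernels need 54/125 (about 43%) of the weights of the single large kernel, and the extra non-linearity between the two layers is one plausible reason such designs are described as more discriminative.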

Discussion and conclusions
The techniques of deep learning are going to play a major role in medical diagnosis in the future, and even with the high training cost, CNNs appear to have great potential and can serve as a preliminary step in the design and implementation of a CAD system [34].
However, brain lesions, and especially WMHs, vary significantly in size, shape, intensity, and location, which makes their automatic and accurate segmentation challenging [159]. For example, although experienced neuroradiologists consider stroke easy to recognize and differentiate from other WMHs, it can be a difficult task for general physicians, especially in rural areas or in developing countries where there are shortages of radiologists and neurologists. For that reason, it is important to employ computer-assisted methods as well as telemedicine [180]. In this regard, Mollura et al. [181] give some strategies for an effective and sustainable implementation of radiology in developing countries.
Our research has noted diverse approaches to the detection and differentiation of WMHs, especially in ischemic stroke and demyelinating diseases such as MS. These include methods such as support vector machines (SVM), neural networks, decision trees, and linear discriminant analysis.
In the ISLES 2015 [20] and ISLES 2016 [68] competitions, the best results for stroke lesion segmentation and outcome prediction were obtained using classic machine learning models, specifically Random Forest (RF), whereas in ISLES 2017 [68] the participants offered algorithms that use CNNs, but the overall performance was not much different from ISLES 2016. Despite this, the ISLES team states that deep learning has the potential to influence clinical decision making for stroke lesion patients [68].
So far, however, this potential has been shown only in the research setting and has not been carried over to a real clinical environment, in spite of the development of many CAD systems [100].
To identify stroke, according to Huang et al. [65], the SVM method provides better prediction and quantitative metrics than the ANN. They also note that the SVM provides accurate prediction with a small sample size [65,182]. Feng et al. [142] indicate that the biggest barrier to applying deep learning techniques to medical data is the lack of the large datasets needed to train DNNs.
Although various models trained with small datasets report good results (DSC values > 0.90) in their classifications or segmentations (see Table 4) [15,148,152], Davatzikos [183] recommends avoiding methods trained with small datasets because of replicability and reproducibility issues [77,183]. Therefore, it is important to have multidisciplinary groups [77,98,184] involving representatives from the clinical, academic, and industrial communities in order to create efficient processes that can validate the algorithms and hence approve or refute recommendations made by software [77]. Relatedly, algorithm development has to take into account that the real-life performance of clinicians differs from that of models.
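For reference, the Dice Score (DSC) quoted for these models measures the overlap between a predicted segmentation A and a reference segmentation B as DSC = 2|A ∩ B| / (|A| + |B|), ranging from 0 (no overlap) to 1 (perfect overlap). The following minimal sketch computes it for binary masks given as flat sequences. The function name, the toy masks, and the convention of returning 1.0 when both masks are empty are assumptions for this example and are not taken from the reviewed works.

```python
def dice_score(pred, truth):
    """Dice Score between two binary masks given as flat sequences of 0/1.

    DSC = 2 * |A intersect B| / (|A| + |B|)
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of elements")
    intersection = sum(p * t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    if total == 0:
        # Both masks are empty; returning 1.0 is an assumed convention.
        return 1.0
    return 2.0 * intersection / total


# Toy example: reference lesion of 3 voxels, prediction of 3 voxels,
# 2 voxels in common, so DSC = 2 * 2 / (3 + 3) = 0.667 (rounded).
truth = [0, 1, 1, 1, 0, 0]
pred = [0, 1, 1, 0, 1, 0]
print(round(dice_score(pred, truth), 3))
```

Because the score is a ratio over lesion voxels, a single misclassified voxel changes the DSC of a very small lesion far more than that of a large one, which is one reason results obtained on small test sets should be interpreted with caution.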
However, in other areas of medicine, for example ophthalmology, certain classifiers have been shown to approach clinician-level performance. Of further importance is the development of explainable AI methods, which have been applied in ophthalmology to correlate the areas of the image that the clinician uses to make decisions with those used by the algorithms to arrive at the result (i.e., the portions of the image that most heavily weight the neural connections) [71][185][186][187].
Thus, by actively involving multidisciplinary communities in clinical AI research, it is possible to cross the "valley of death" [100], namely the lack of resources and expertise often encountered in translational research. Such efforts also need to take into account that deep learning is currently a black box [43], where the inputs and outputs are known but the inner representations are not well understood. This is being alleviated by the development of explainable AI [72].
Even though there have been tremendous advances, there are only a few methods that are able to handle the vast range of radiological presentations of subtle disease states.
There is a tremendous need for large annotated clinical datasets, a problem that can be (partially) solved by data augmentation and by transfer learning methods [188,189], which are used mainly with different CNN architectures.
It is very important to note that processing diseases or tasks in medical images is not the same as processing general pictures of, say, dogs or cats. Nevertheless, it is possible to take a set of generic features already learned by a CNN for a specific task and transfer them as input features to classifiers focused on other medical imaging tasks. For examples in medical imaging, see [190][191][192][193].
Finally, it is important to keep in mind, as noted in [194], that like humans, the software is only as good as the data it is trained on. Therefore, research in medical image analysis and diagnosis must include both clinical and technical knowledge.

Fig 2. Conceptual Mindfact (Mentefacto conceptual) according to [47,48]. This allows the keyword identification for a systematic search of the literature in the scientific databases.

Tables

Table 1. Key Words used in the global semantic structure search

Key words for semantic structure search in Scopus database
TITLE-ABS-KEY ( ( ( ( magnetic* ) AND ( resonanc* ) AND ( imag* OR picture OR visualiz* ) ) OR mri OR mra ) AND ( ( brain* OR cerebrum ) AND ( ( ischemic AND strok* ) OR ( demyelinating AND ( disease OR "brain lesions" ) ) ) ) AND ( algorithm* OR svm OR dwt OR kmeans OR pca OR cnn OR ann ) ) AND ( "deep learning" ) OR ( "neural networks" ) OR ( "machine learning" ) OR ( "convolutional neural network" ) OR ( "radiomics" )

Table 2. List of the ten most cited articles according to the normalization of the citations [51]. The table also shows the central theme of research, the type of image, and the methodology used in the processing.