1. Introduction
The consumption of alcohol during pregnancy is a well-established cause of mental health problems in children [1,2]. Conditions that arise from this consumption are grouped under the concept of Fetal Alcohol Spectrum Disorder (FASD). This concept was coined in Canada, and its effects may include physical, mental, behavioral, and learning disabilities with possible lifelong implications. The most common conditions are Fetal Alcohol Syndrome (FAS), partial Fetal Alcohol Syndrome (pFAS), and Alcohol-Related Neurodevelopmental Disorder (ARND) [3,4]. FAS is the most severe consequence of maternal alcohol consumption, since affected infants may have intellectual disability. The most common outcomes include significant hyperactivity, language delay, and school learning problems. A higher rate of mental health problems is also observed in adolescence, as well as problematic use of alcohol and other drugs.
An earlier diagnosis of these conditions could mitigate the long-term health and psychosocial consequences, especially cognitive difficulties, thus improving the quality of life of young children and adolescents. In addition, people who are diagnosed with FASD report higher rates of unemployment, law-breaking, and other social behavior problems. These problems result in a higher cost to public welfare and society at large. Despite the importance of diagnosing FASD, there are currently no direct clinical characteristics that can be used to diagnose a child with FASD. The only distinct difference compared to other childhood or adolescent conditions is the mother’s alcohol consumption during pregnancy [5,6,7]. If there is no certainty about the mother’s alcohol consumption, some indirect effects associated with FAS, such as facial dysmorphia, growth deficiency, or central nervous system abnormalities, can be used as a proxy for the diagnosis. However, when none of these effects are present, the diagnosis becomes challenging [8].
In the state of the art, we find an increasing effort to develop tools based on specific behaviors and profiles to allow an early diagnosis of FASD; however, these tools fail to distinguish FASD specifically from other neurodevelopmental conditions [6,9]. Among these tools are Machine Learning (ML) [10,11] and Computer-Aided Detection (CAD) [12,13] approaches. Since such tools can improve FASD detection [14], this paper addresses the use of ML techniques for an improved FASD diagnosis. Specifically, this work focuses on developing computational algorithms based on Artificial Neural Networks (ANN) to classify children with FASD. The research question is whether an ANN model could be used for medical diagnosis. One important reason for testing ANN as a diagnostic tool is its ease of implementation and the simplicity of applying it to existing data. The model will be evaluated based on its performance, in terms of precision and completeness, and benchmarked against other machine learning techniques. The evaluation will assess the goodness of fit of the model using psychometric, saccadic eye movement, and diffusion tensor imaging (DTI) tests. It is believed that techniques based on machine learning are a good tool for grouping and comparing the clinical information obtained from a set of psychometric data, thus expanding the possibility of detecting differences in FASD patients. To our knowledge, the use of machine learning as a FASD diagnostic tool has rarely been tested. One recent exception is Zhang et al. (2019) [15]; however, these authors used support vector machine regression (SVMR) algorithms to analyze psychometric and DTI data and a linear regression for the rest of the data.
The rest of this paper is organized as follows. The next section presents the required background to understand this study. Section 3 describes the materials and methods used to carry out this study and the execution of the experiment. Section 4 presents the results obtained from the experiment. Section 5 discusses these results, and Section 6 provides the conclusions of this research.
4. Results
This research evaluated the performance of an ANN-based classification algorithm to detect FASD in children using the results of psychometric, DTI, and saccade tests, comparing the accuracy of the model with the SVMR developed in Reference [15]. The cross-validation process showed that the psychometric data model outperformed the rest. After this process, we saved the best model weights for each data type and then estimated the accuracy with all the data available. Table 4 shows the accuracy level for each data type and its comparison to the results obtained by Zhang’s model.
In the next subsections, the results of each test (Psychometric, Antisaccade, Prosaccade, Memory-Guided Saccade, and DTI) are shown and discussed.
4.1. Psychometric Data Classification
The data used for testing included psychometric tests (20 input features) associated with the analysis of social behavior, memory activities, language delay, and all altered behavioral factors in FASD children. In addition, the dataset included other developmental disorders, making the classification process more difficult. Two dense network models were used with this type of data. The first model achieved an accuracy of 75.55%; however, it did not show significant improvements despite varying the number of layers and the neurons per layer. In the second model, a feature layer was added, achieving an accuracy of 88.46%.
Once the network configuration was chosen, the training and testing (validation) behavior is shown by the model accuracy and loss functions. A high loss value indicates that the neural network performs poorly, and a low value indicates that it is doing a good job.
Figure 3a shows the accuracy for each dataset, that is, the number of correct predictions relative to the number of individuals evaluated. We attempted to find a model with no significant difference between the labeled data and the prediction. With the implemented configuration, accuracy increased on both the training and the validation data.
Figure 3b shows the loss function on the psychometric data. In this case, the loss function has a decreasing curve in both training and testing. This result suggests that the network may have some difficulty classifying data not used during training, though not to a significant degree.
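To make the relation between the loss value and classification quality concrete, the following minimal sketch computes a loss for two sets of predictions. The choice of binary cross-entropy, the function name `binary_cross_entropy`, and the example labels and probabilities are assumptions made for illustration; the loss function actually used in this study may differ.

```python
import math

def binary_cross_entropy(labels, probs, eps=1e-12):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)

# Hypothetical labels: 1 = FASD, 0 = control.
labels = [1, 0, 1, 0]
good_probs = [0.9, 0.1, 0.8, 0.2]  # confident predictions on the correct side
poor_probs = [0.4, 0.6, 0.3, 0.7]  # predictions leaning toward the wrong class

good_loss = binary_cross_entropy(labels, good_probs)
poor_loss = binary_cross_entropy(labels, poor_probs)
print(f"good predictions: {good_loss:.3f}, poor predictions: {poor_loss:.3f}")
```

Under these assumptions, the predictions that lean toward the wrong class yield the larger loss, which is the behavior described above.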
The confusion matrix (Figure 4) shows the performance of the model with test data. The model performs well in classifying children with FASD: true positives (children with FASD classified as FASD) account for 38.46% of the test samples, out of the 42.31% who are diagnosed with FASD, which represents 90.9% of the FASD cases. False positives (controls classified as FASD) account for 7.69% of the test samples. The model thus identifies FASD in most cases. For control patients, the model correctly classifies 50.0% of the test samples, out of the 57.69% who are controls (86.67% of the control cases), so the remaining 13.32% of control cases are misclassified. Finally, 3.85% of the test samples are patients with FASD who are classified as controls.
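The arithmetic behind these percentages can be sketched from raw counts. The example below assumes a hypothetical test split of 26 samples with 10 true positives, 1 false negative, 2 false positives, and 13 true negatives; these counts are not stated in the text and are chosen only because they are consistent with the reported shares (38.46%, 3.85%, 7.69%) and class proportions (42.31% FASD, 57.69% control). The function name `confusion_rates` is likewise illustrative.

```python
def confusion_rates(tp, fn, fp, tn):
    """Derive common rates (as fractions) from binary confusion-matrix counts."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,        # correct predictions over all samples
        "sensitivity": tp / (tp + fn),        # share of FASD cases detected
        "specificity": tn / (tn + fp),        # share of controls detected
        "precision": tp / (tp + fp),          # share of FASD predictions that are right
        "fasd_share": (tp + fn) / total,      # proportion of FASD samples
        "tp_share": tp / total,               # true positives over all samples
        "fp_share": fp / total,               # false positives over all samples
        "fn_share": fn / total,               # false negatives over all samples
    }

# Hypothetical counts (assumption): 26 test samples.
rates = confusion_rates(tp=10, fn=1, fp=2, tn=13)
for name, value in rates.items():
    print(f"{name}: {100 * value:.2f}%")
```

Reporting sensitivity and specificity separately, rather than accuracy alone, makes it visible when a model favors one class, as happens for the control class in some of the saccade tasks.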
4.2. Antisaccade Data Classification
The results from the antisaccade tasks correspond to the proportion of successful trials among those completed by each individual, measuring failures to inhibit the automatic saccade toward a peripheral target. Initially, there were 173 records, but we evaluated the model with 136 records. The antisaccade test comprises 15 characteristics, including correct and wrong trials, the number of express saccades, the interval between target appearance and saccade onset, velocity, amplitude, angle, and other measurements derived from those.
The confusion matrix (Figure 5) shows a high rate of correct FASD predictions relative to the amount of data evaluated (100%). However, some of the classification characteristics tend to confuse control children with children who may have FASD: 33.33% of control subjects were classified as FASD. These results make it necessary to evaluate the degree of FASD these children have and the reason for this classification.
4.3. Prosaccade Task Data Classification
Prosaccade tests measure reaction time, performance accuracy, viability, and parameters corresponding to the main sequence. In this case, 18 characteristics were measured, and the data collected are similar to those of the antisaccade test but also include duration, deceleration, acceleration, and skew. Of the 186 records used to classify FASD patients with these data, only 38% correspond to patients diagnosed with FASD. The model achieved an accuracy of 72.41%.
The confusion matrix (Figure 6) for this case shows that the model predicts FASD patients well but classifies control patients as controls in only 50% of the cases. This confusion matrix indicates that adjustments are needed to improve the accuracy of predicting control patients, or that the data should be verified. We should better understand the characteristics that differentiate control patients so that they are adequately identified during training and reflected in the test data.
4.4. Memory-Guided Saccade Task Classification Results
These tasks require following the targets in a given order. An error occurs when the subject cannot follow the order and initiates the saccade toward the second target rather than the first. In these tasks, 26 characteristics are measured: correct trials, times between movement and reactions, errors, distance, speed, and others [15]. We used 61 records for classification, achieving an accuracy of 88% on test data, one of the highest in the study.
The confusion matrix (Figure 7) for this model shows balanced performance for FASD and control patients, with more than 80% correct per class and, in the case of FASD, very few mistakes. It is important to note that the tests that achieved the best results involve an element of the individual’s memory. This result suggests that more attention should be paid to this type of test, including memory tasks.
4.5. Diffusion Tensor Imaging (DTI) Data Classification Results
This study measures the connectivity of the white matter in the corpus callosum. It comes from a structural MRI, so it does not involve tests. This study has 76 records, 54% classified as FASD and 46% as control. Having little data meant that the model did not initially achieve good accuracy (less than 50%). The Leaky-ReLU function was added in the intermediate layers for improvement, raising the accuracy to 75% with a combination of 4 hidden layers of 64 and 128 neurons and a feature layer. The model was trained for 100 epochs, compared to 50 epochs for the other models.
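For reference, the Leaky-ReLU activation mentioned above can be sketched as follows. The negative slope `alpha=0.01` is an assumed, commonly used default and not a value reported in this study; the function names are illustrative.

```python
def relu(x):
    """Standard ReLU: negative inputs are mapped to zero."""
    return x if x > 0 else 0.0

def leaky_relu(x, alpha=0.01):
    """Leaky ReLU: negative inputs keep a small slope `alpha` (assumed value)."""
    return x if x > 0 else alpha * x

# Hypothetical pre-activation values of one hidden neuron.
inputs = [-2.0, -0.5, 0.0, 1.5]
relu_out = [relu(v) for v in inputs]
leaky_out = [leaky_relu(v) for v in inputs]
print("ReLU:      ", relu_out)
print("Leaky ReLU:", leaky_out)
```

Unlike the standard ReLU, the leaky variant passes a small nonzero value for negative inputs, so neurons with negative pre-activations still receive a gradient during training. This is one plausible reason for the improvement observed on a small dataset, although the text does not establish it.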
The confusion matrix (Figure 8) shows that the percentage of misclassifications is the same for FASD and control. Relative to the percentage of samples of each type, we observed good performance.
5. Discussion
This paper evaluated the use of computational algorithms based on ANN to classify children with FASD for an improved medical diagnosis. The prediction model was estimated using numerical psychometric, DTI, and saccadic eye movement data. Unlike most current research, which implements ML using image-based data to predict FASD, the novelty of this paper is that it uses numerical data to estimate the model. Although Zhang et al. (2019) [15] collected and used the same data, our results show a better performance using ANN instead of SVMR or other methods.
First, for psychometric data, a basic network configuration with a Leaky-ReLU activation function, which did not require data normalization, achieved 75.5% accuracy on test data. Subsequently, the network configuration was changed by adding a feature layer. This layer allows identifying the essential characteristics for FASD diagnosis and contributes significantly to the classification, achieving 88% accuracy. It also produced the best results using K-fold cross-validation (K = 10 and K = 5), suggesting that this should be the direction for improving FASD diagnosis using ANN. For the rest of the data, a network configuration with a feature layer was used to evaluate the neural network model performance. Using the numerical data related to FASD (Saccade, Prosaccade, and DTI), the models achieved an accuracy greater than 70%. Despite the higher rates obtained by all of these configurations, the prediction is not yet considered safe for a medical diagnosis.
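The K-fold cross-validation mentioned above can be sketched as follows. This is a minimal illustration of how the records are partitioned, assuming contiguous folds over an already shuffled dataset. The function name `k_fold_indices` and the sample count of 26 are hypothetical, and the text does not state whether shuffling or stratification was used.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) index lists for k contiguous folds."""
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, val_idx
        start += size

# Hypothetical dataset size (assumption); K = 5 as in one of the reported runs.
folds = list(k_fold_indices(n_samples=26, k=5))
for fold_number, (train_idx, val_idx) in enumerate(folds, start=1):
    print(f"fold {fold_number}: {len(train_idx)} training, {len(val_idx)} validation")
```

Each record is used for validation exactly once, and the model is retrained on the remaining folds each time. Averaging the K validation scores gives a more stable performance estimate than a single split, which matters for datasets as small as those described here.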
The confusion matrices show that the model correctly predicts at least 71% of patients diagnosed with FASD, a group that represents around 50% of the samples. For control patients, the model correctly identifies over 77% of those who have not been diagnosed with FASD (except in the prosaccade task). These matrices suggest that the model must be recalibrated, that it has overlearned from the data of patients with FASD, or that some control patients may have undiagnosed FASD. These explanations have not been fully demonstrated; thus, the use of ANN as a diagnostic tool must be improved before its use can be suggested.
Results show that the estimated models perform better in all cases than those obtained in Zhang’s research. Overall, the ANN outperformed Zhang’s model by 14.22% (S.D. 10.58). Unlike Zhang’s model, the estimated model includes all the characteristics obtained from each study. This is a distinctive feature, since Zhang removed some characteristics on the view that they did not improve prediction performance. Our results show that the model indeed improved by including sex and age data: when sex and age were removed from the studies, the prediction performance decreased dramatically. In addition, we used a feature layer that allows the algorithm to distinguish the characteristics that contribute to improving the prediction of the pathology. Zhang used these variables to normalize the data in the Saccade, Antisaccade, and MGsaccade tasks, increasing the model’s accuracy from an average of 60% to 70%. This is a key difference, since our estimated model uses raw data, keeping the estimation process simple.
Zhang’s model used methods, especially SVMR, that require fewer parameters to optimize, which reduces the possibility of overfitting the training data and improves actual performance. According to the training results, in the case of ANN, unlike SVMR, the choice of configuration can change the model’s operation and performance in general. An increase in data allows ANN to improve its performance without changing parameters, and retraining improves the accuracy of the ANN model, unlike in the case of SVMR. However, SVMR is faster and more stable than ANN. Therefore, the choice of an ML model depends on how the data change and on other characteristics rather than only on the accuracy of each one.
One caveat is the occurrence of false positives, that is, diagnosing FASD in a person who does not have it, since the data of FASD and control patients can be similar. It is essential to mention that some patients are not correctly diagnosed with FASD or have cognitive problems similar to those of FASD, which can also lead to many false negatives. This is because data on alcohol consumption during pregnancy are not readily accessible: they depend on the mother’s declaration of consumption. Likewise, the early signs of FASD diminish over time and depend on the duration of alcohol exposure, the phase of gestation during which the ingestion occurred, and the amount ingested. Some patients who present neuropsychological and developmental deficits (between groups diagnosed with FASD and other pathologies) do not show significant differences in disabilities or behavioral problems. Therefore, the diagnosis of FASD cannot be made lightly and should be considered alongside other results, such as magnetic resonance imaging and eye movement, among others.
Future work should apply these types of ANN to predict other diseases or conditions that involve cognitive development issues, such as autism or ADHD. For instance, some cases of autism are classified as FASD; therefore, the treatment is not the correct one. The introduction of ANN would allow a better diagnosis, thus improving patients’ quality of life. The data are currently limited and the testing process is in its early stages. To our knowledge, no studies have gone in this direction.
6. Conclusions
There is currently an underestimation of the number of people with FASD. This syndrome affects a significant percentage of the world population. However, its diagnosis requires certainty about alcohol consumption during embryonic development. It could go undiagnosed in the control population due to the clinical similarities with other neurodevelopmental disorders. This issue is a critical factor for developing a new technique for FASD diagnosis, since studies based on image-based Artificial Neural Networks (ANNs) and Computer-Aided Detection (CAD) tools are invasive and cannot be repeated. For this reason, the use of numerical data from non-invasive tests can be an excellent alternative to diagnose FASD. This paper developed ANN algorithms to determine FASD likelihood in children using numerical psychometric, DTI, and saccadic eye movement data on children and young people diagnosed with FASD. It is essential to mention that the best results were obtained using psychometric data. It is crucial to carry out a more in-depth study of this test and even incorporate data from children with neurobehavioral disorders, which would help to better classify and identify some syndromes.
In conclusion, we suggest that ANN algorithms improve their performance with a suitable network configuration: increasing the number of neurons and hidden layers, changing the optimization functions, adding a feature layer, and selecting the activation algorithm used to incorporate new data into training. The combination with other data groups allows performance to increase even further, indicating that ANNs can be a good alternative to detect FASD. This paper evaluated the use of computational algorithms based on ANN to classify children with FASD for an improved medical diagnosis. As a result, this paper estimated the prediction model using numerical psychometric, DTI, and saccadic eye movement data. Although other studies, like Zhang et al. (2019) [15], collect and use the same data, our results show a better performance using ANN instead of SVMR or other methods. However, a deeper study of the different network configurations must be undertaken to improve performance.