An Intelligent Multicriteria Model for Diagnosing Dementia in People Infected with Human Immunodeficiency Virus

Abstract: Hybrid models based on Machine Learning can provide accurate dementia diagnoses in individuals with neurological disorders and cognitive complications caused by Human Immunodeficiency Virus (HIV) infection. This study proposes a hybrid approach that combines Machine Learning algorithms with the multicriteria method of Verbal Decision Analysis (VDA). Dementia, which affects many HIV-infected individuals, refers to neurodevelopmental and mental disorders. Some manuals standardize the information used in the correct detection of neurological disorders with cognitive complications. Among the most commonly used are the DSM-5 (Diagnostic and Statistical Manual of Mental Disorders, 5th edition) of the American Psychiatric Association and the International Classification of Diseases, 10th edition (ICD-10), published by the World Health Organization (WHO). The model is designed to explore the predictive power of specific data. Furthermore, a well-defined data set improves and optimizes the diagnostic models sought in the research.


Introduction
The continuous search for clinical diagnoses with greater accuracy has driven the use of hybrid models of computational technologies. These hybrid models use concepts of Artificial Intelligence (AI), Machine Learning (ML), and qualitative methods based on Verbal Decision Analysis (VDA), among others. Machine Learning (ML) algorithms have added agility to diagnostic processes and brought innovations to the field of health [1].
These hybrid models can be used to accelerate the diagnosis of neurological disorders such as cognitive and motor dementia in individuals infected with the Human Immunodeficiency Virus (HIV) and thus establish an earlier, more agile, and effective treatment [2].
HIV-Associated Dementia (HAD) is the foremost source of morbidity and mortality in infected individuals and is most common in the younger population with HIV. Research has shown that age is not a prevalent factor for predisposition to dementia. Many individuals who are 50 years old or more became infected when they were much younger and have lived with HIV asymptomatically for an extended period without seeking professional help or using antiretrovirals. When this age group shows signs and symptoms of the disease, such as a cognitive deficit that negatively affects their quality of life and life expectancy, it causes great concern to health professionals and relatives.
The difficulty of an early and accurate diagnosis of dementia, especially among HIV-infected people, is due to the rarity of symptoms at the beginning of the infection, which requires a prolonged observation period to reach a diagnosis. Finding various likely events and characteristics to establish a diagnostic pattern for symptoms associated with dementia in HIV-infected individuals is a significant challenge; however, this would allow for early treatment [3].
The social relevance of this research is due to the advantages of using Information and Communication Technologies to predict a high risk of dementia in people with HIV. These technologies can point out the components needed to assess problems by identifying the classified and ordered characteristics related to HAD. Furthermore, such technologies can be used to plan health strategies and referrals to specialists instead of neuropsychological tests, which are considered time consuming and inadequate for some patients.
Although research into HIV has evolved over the last three decades, AIDS still has no cure. In addition, this disease is stigmatized in society, and as an aggravating factor, HIV-infected individuals are discriminated against in many situations, especially in work environments. Furthermore, society considers competitiveness a differential in terms of an individual's profile and productive capacity, and consequently, AIDS has become an exclusion factor in the process of wealth generation [4].
People living with HIV have an increased risk of cognitive impairment due to the neurotropism of the virus for the Central Nervous System (CNS), which is considered the second most affected site after the lymphatic system [5].
However, an early diagnosis points to promising horizons for these individuals. Early clinical interventions help control HIV in the body and allow these people to live longer and have a better quality of life [6].
The main characteristic of Machine Learning (ML) is its ability to learn through experience (historical data) and consequently improve its performance [7]. Machine Learning algorithms require consistent and accurate data to define whether a diagnosis is positive or negative.
Decision support methods aim to assist in the decision-making process by simplifying the analysis of a problem and its alternatives and thereby justifying the choice of a particular action. A decision process using verbal methods analyzes the decisionmaker's preferences more reliably. The basis for decision making is the exposure of the pros and cons and their analysis [8].
Research into the use of machine learning and multicriteria methods to make diagnoses [9] has been carried out for predicting the diagnosis of Alzheimer's disease. A battery of tests was applied to patients with this diagnosis, and the data were used to structure a decision tree based on the characteristics that played the main role in the diagnosis; a scale of preference was then established by analyzing the resulting tree [10]. Another work applied the deductive method and the scanning research technique to a case study of the Wisconsin Breast Cancer data set. Here, the authors aimed to evaluate and compare the performance and effectiveness of machine learning models using TOPSIS [11]. On the other hand, the authors in [12] presented an automated model for cognitive workload assessment to accurately categorize the intensity of workload induced by multitasking situations. In [13,14], the authors employed machine learning techniques and a multicriteria decision method on the Pima Indian diabetes data set for the early diagnosis prediction of diabetes mellitus.
Technologies including machine learning and Multiple Criteria Decision Analysis cannot replace a physician's expertise, but they can support them in taking care of straightforward and time-consuming diagnostic tasks and assist in more demanding procedures.
The Verbal Decision Analysis (VDA) is a methodological approach of Multiple Criteria Decision Analysis (MCDA) that verbally supports solving decision-making problems more realistically from the decisionmaker's point of view. When applying VDA methods, one is not required to assign numeric degrees of preference to one criterion value relative to another. Thus, the procedure used to determine what is important to a decisionmaker in a given context is psychologically valid. In addition, operations that acquire the verbal preferences of a decisionmaker are more stable and consistent [15,16].
The present study proposes a hybrid model based on Machine Learning (ML) and associated with a method of Verbal Decision Analysis (VDA) in order to improve the predictive power of the data characteristics from a set of structured information. The diagnostic models of neurological disorders, such as cognitive and motor dementia, were applied in this work to people infected with HIV (Human Immunodeficiency Virus).
This work is divided into five sections. Section 1 provides the contextualization and motivation for the study. Section 2 describes a theoretical framework that addresses the disease and the concepts of the technologies used in the study. Section 3 presents the results of the hybrid model. Section 4 discusses the results. Section 5 gives the conclusion and an indication of future work on this subject.

Materials and Methods
This section discusses specific topics of AIDS and the consequent cognitive and motor dementia in people infected with HIV. In addition, it presents a model based on Information Technology (IT) methodologies to uncover the characteristics that are most determinant for the early diagnosis of dementia in people with HIV/AIDS.

AIDS/HIV and Cognitive Dementia Considerations
In the early 1980s, the world was surprised by the appearance of a pathological agent, until then little known, which caused a disease called Acquired Immunodeficiency Syndrome (AIDS). This is, to date, one of the most significant public health challenges, as it mainly affects young people and, to a lesser extent, the elderly. The spread of the disease is partly due to medications such as sexual stimulants and hormone replacements, injectable drugs, and the low acceptance of condom use [17].
The World Health Organization (WHO) estimates that approximately 40 million people are infected with HIV. From the first cases of HIV/AIDS in the early 1980s to 2019, 76 million people have been infected with HIV, and 35 million have died from AIDS-related illnesses. In the early stages of the disease, 5% of AIDS patients have cognitive and motor dementia. This percentage reaches 20% in the later stages of the disease [18][19][20].
Furthermore, a recent analysis by the Joint United Nations Program on HIV/AIDS (UNAIDS) shows the potential damage caused by the COVID-19 pandemic in low- and middle-income countries worldwide due to insufficient healthcare for HIV-infected individuals. To prevent the dissemination of COVID-19, measures that restrict the movement of people, such as lockdowns and the closure of borders, have been introduced, and these are impacting both the production and distribution of AIDS drugs and generating increased costs and supply problems. Generic antiretroviral drugs are used to treat people infected with HIV. The final cost of antiretroviral medicines exported from India, for example, has increased at least 25% compared to regular prices [20].
In the mid-1990s, pharmacological research developed antiretroviral drugs, and these initiated a new form of treatment known as Highly Active Antiretroviral Therapy (HAART). This therapy increased the quality of life of HIV-infected individuals. However, even with HAART, the rate of individuals with AIDS and dementia increased, as cognitive impairments can be influenced by unemployment, social exclusion, and feelings of depression, among others [21]. Notably, AIDS is a syndrome that seriously compromises the immune system of HIV-infected individuals, and consequently, they are prone to other diseases and even death. HIV acts mainly on the lymphatic system, which is responsible for the body's defenses. However, the damage caused to an individual's cognitive and motor skills comes from the fact that HIV also attacks the Central Nervous System (CNS) [22].
The presence of the virus can be detected in brain tissue and the cerebrospinal fluid early on and throughout the evolution of the disease, even in the absence of neurological symptoms [23]. Cognitive and motor alterations in individuals with AIDS show that HIV crosses the blood-brain barrier through a Trojan-horse-like mechanism using infected macrophages [24]. On accessing the brain, HIV infects glial cells, which secrete neurotoxins that cause neurons to die and, consequently, produce a clinical neurological deficit. In addition, the virus is free in the acellular cerebrospinal fluid because antiretroviral drugs have difficulty crossing the blood-brain barrier, making it difficult to fight the virus in this region [23,25]. Thus, the lack of access of antiretroviral drugs to the CNS and the cerebrospinal fluid gives the virus a certain freedom of action in these tissues, enabling the development of neurological disorders such as cognitive dementia [22]. Once the virus accesses the CNS, it can affect an individual's cognitive, sensory, and motor functions, leading to mental impairment with deficits in attention, learning, memory, information processing capacity, and problem solving [26].
Necropsy tests in HIV patients have indicated the presence of the virus in cortical and subcortical structures such as the frontal lobes, subcortical white matter, and basal ganglia [27]. The mechanisms that lead to cognitive impairment are not yet fully understood. However, neurotoxins released by microglia and periventricular macrophages cause the release of cytokines, which leads to modifications in the synaptic architecture of the cortex. The most common mechanism that leads to cell failure is apoptosis, or programmed cell death [28]. Individuals with an advanced degree of this disease have deficits in several cognitive domains.
HIV-infected and asymptomatic individuals may manifest mood disorders associated with psychomotor slowness and decreased verbal memory.Another form of manifestation is a decrease in verbal and nonverbal cognitive functioning but without any manifestation of mood disorders [29].
According to the American Academy of Neurology, neurological disorders associated with HIV infection comprise the following categories: "minor motor disorder" and "HIV-associated cognitive dementia" [30]. Based on these two criteria and on the research and observations published after the introduction of HAART, a new review of the nosology of HIV-associated neurological disorders was proposed. This established the following categories: asymptomatic neurological disorder, mild HIV-associated neurological disorder, and HIV-associated dementia [31].
The criteria for defining a diagnosis are the deficits in cognitive, behavioral, and motor functions. The essential characteristic of cognitive and motor dementia is the loss of intellectual and motor functions, which interferes seriously with social and occupational engagements [30]. At less advanced stages, a detailed neuropsychological assessment is necessary to determine the degree and nature of the cognitive impairment and to identify any psychological morbidities such as depression and anxiety [32].

Information Technologies
Breiman introduced the Random Forest method, and the technique has since become an essential data analysis tool. The algorithm is generally recognized for its accuracy and its ability to handle small sample sizes, high-dimensional feature spaces, and complex data structures [32,33]. Theoretically, however, the analyses are less conclusive, and despite its wide use in practical contexts, little is known about the mathematical properties of Random Forest. So far, most studies have focused on isolated parts of the algorithm; the main theoretical result offers an upper bound for the generalization error of forests in terms of the correlation and strength of the individual trees [34].
The bagging procedure and the Classification and Regression Trees (CART) split criterion play critical roles in the Random Forest [35]. Bagging, a contraction of bootstrap aggregating, is a general aggregation scheme that proceeds by generating subsamples from the original data set, building a predictor for each subsample, and deciding by averaging. Bagging is one of the most effective computationally intensive procedures to improve unstable estimates, especially for large and high-dimensional data sets, where finding a suitable model in one step is impossible due to the complexity and scale of the problem. The CART-split algorithm, in turn, builds the individual trees by choosing the best cuts perpendicular to the axes. At each node of the tree, the best cut is selected based on the Gini impurity (for classification) or the squared prediction error (for regression). In Breiman's Random Forest [33], each tree is grown until each leaf, a terminal node of the individual trees, contains no more than a fixed, preset number of observations (this parameter is usually between 1 and 5).
CART was chosen for the induction of decision trees in this study. This choice was motivated by the fact that the algorithm does not restrict the nature of the explanatory variables, generates results that are easy to interpret, and makes it easy to find interactions between the variables.
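The two ingredients described above, the Gini-based axis-aligned cut and bootstrap aggregation, can be illustrated with a minimal sketch. It is not the implementation used in the study: it assumes numeric features and classification only, uses one-split trees (stumps) as base learners, and omits the per-node random feature selection of a full Random Forest. The function names (`gini`, `best_split`, `fit_stump`, `bagging`) are hypothetical.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum_k p_k^2."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return (feature index, threshold) of the axis-aligned cut that
    minimizes the weighted Gini impurity of the two child nodes."""
    best = (None, None, float("inf"))
    n = len(rows)
    for j in range(len(rows[0])):
        for t in sorted({r[j] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[j] <= t]
            right = [y for r, y in zip(rows, labels) if r[j] > t]
            if not left or not right:
                continue  # a cut must leave observations on both sides
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best[2]:
                best = (j, t, score)
    return best[0], best[1]

def fit_stump(rows, labels):
    """One-split tree: predicts the majority class on each side of the cut."""
    majority = Counter(labels).most_common(1)[0][0]
    j, t = best_split(rows, labels)
    if j is None:  # no valid cut exists (e.g. all rows identical)
        return lambda x: majority
    left = Counter(y for r, y in zip(rows, labels) if r[j] <= t).most_common(1)[0][0]
    right = Counter(y for r, y in zip(rows, labels) if r[j] > t).most_common(1)[0][0]
    return lambda x: left if x[j] <= t else right

def bagging(rows, labels, n_trees=25, seed=0):
    """Fit one stump per bootstrap sample and aggregate by majority vote."""
    rng = random.Random(seed)
    n = len(rows)
    stumps = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        stumps.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx]))
    return lambda x: Counter(s(x) for s in stumps).most_common(1)[0][0]
```

For instance, `gini([0, 0, 1, 1])` evaluates to 0.5, the maximum impurity for two classes, while a pure node such as `gini([1, 1, 1])` evaluates to 0.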

Fundamentals of the "Logistic Regression" Algorithm
Logistic Regression, also known in the literature as "logit regression", "maximum entropy classification (MaxEnt)", or "log-linear classifier", is a Machine Learning algorithm for classification problems and predictive analysis based on probability concepts [36]. Logistic Regression is used to assign observations to a discrete set of classes. It is one of the most popular Machine Learning algorithms for binary classification: given a set of independent variables, it is used to predict a binary result (1 or 0, Yes or No, True or False) [32]. It has been applied successfully in various areas, such as Medicine [37], Finance [38], and Economics [39]. In this model, the probabilities describing the possible outcomes of a single trial are modeled using a logistic function, which expresses the relationship between a dichotomous response variable and a series of numerical (continuous or discrete) or categorical explanatory variables [37][38][39][40].
Logistic Regression is a type of linear model in which the log of the odds (logit) is used as the dependent variable. It predicts the probability of an event occurring by fitting data to a logit function [41]. The algorithm makes several assumptions, such as independence of the observations, normally distributed responses (logits) at every subpopulation level of the explanatory variable, and constant variance of the responses across all values of the explanatory variable. Intuitively, a transformation is applied to the response variable to produce a continuous probability distribution over the output classes bounded between 0 and 1; this transformation is called a "logistic" or "sigmoid" function [42].
A logistic function, or logistic curve, is an S-shaped (sigmoid) curve generated by Equation (1) [43]:

$$f(x) = \frac{L}{1 + e^{-k(x - x_0)}} \tag{1}$$

where $x_0$ is the value of the midpoint of the sigmoid curve, $L$ is the maximum value of the curve, and $k$ is the logistic growth rate or slope of the curve.
For values of $x$ in the real-number domain from −∞ to +∞, the standard logistic sigmoid function, i.e., $L = 1$, $k = 1$, $x_0 = 0$, produces the S curve shown in Figure 1, with $f$ approaching $L$ when $x$ approaches +∞ and $f$ approaching 0 (zero) when $x$ approaches −∞.
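As a small illustration of this behavior, the sketch below evaluates the logistic curve assuming its usual closed form, f(x) = L / (1 + e^(−k(x − x0))). The function name `logistic` and the numerically stable two-branch evaluation are choices made for this example only.

```python
import math

def logistic(x, L=1.0, k=1.0, x0=0.0):
    """Logistic curve f(x) = L / (1 + exp(-k * (x - x0))).

    The defaults L = 1, k = 1, x0 = 0 give the standard sigmoid.
    Two branches are used so that exp() never receives a large
    positive argument, which would overflow.
    """
    z = k * (x - x0)
    if z >= 0:
        return L / (1.0 + math.exp(-z))
    e = math.exp(z)
    return L * e / (1.0 + e)

# The curve passes through L / 2 at the midpoint x0 and flattens
# toward 0 and L at the two ends of the real line.
for x in (-10, -1, 0, 1, 10):
    print(x, round(logistic(x), 5))
```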
Logistic regression is more of a classifier than a regression technique, despite its name. This is especially true in medicine and the psychosocial sciences, where the focus is on predicting and explaining [44].
Moreover, for Logistic Regression, the response variable is quantitative; that is, the log of the odds of being classified in the i-th group of a binary or multiclass response [45]. The probability of a positive event or diagnosis can be estimated directly in Logistic Regression. In the case of a dependent variable $Y$ assuming only two possible responses or binary states of interest (True or False), there is also a set of $p$ independent variables, $X_1, X_2, \ldots, X_p$.
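The "log of the odds" mentioned above can be made explicit with a short worked relation. The notation is an assumption of this note: $P$ denotes the probability of the positive class, and $\operatorname{logit}$ denotes the standard log-odds transform.

```latex
\operatorname{logit}(P) = \ln\frac{P}{1 - P},
\qquad
P = \frac{1}{1 + e^{-\operatorname{logit}(P)}}
% Example: P = 0.8 gives odds 0.8 / 0.2 = 4 and logit(P) = \ln 4 \approx 1.386;
% P = 0.5 gives odds 1 and logit(P) = 0.
```

The second identity shows that the standard sigmoid is the inverse of the logit, which is why a quantity that is linear on the log-odds scale maps back to a probability between 0 and 1.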
The logistic regression model can be written as follows:

$$P(Y = 1 \mid X_1, \ldots, X_p) = \frac{\exp(\beta_0 + \beta_1 X_1 + \cdots + \beta_p X_p)}{1 + \exp(\beta_0 + \beta_1 X_1 + \cdots + \beta_p X_p)} \tag{2}$$

The coefficients $\beta_0, \beta_1, \ldots, \beta_p$ are the regression coefficients, which are estimated by maximum likelihood from the data set. The probability of $Y = 1$ for a new instance is then estimated by replacing each $\beta$ with its estimated counterpart and each $X$ with its realization for the new instance in Equation (2). The new instance is then assigned to class $Y = 1$ if $P(Y = 1) > c$, where $c$ is a fixed threshold, and to class $Y = 0$ otherwise. The commonly used threshold $c = 0.5$ produces what is called the Bayes classifier [46,47].
For discrimination of two groups, the classification rule is as follows:

$$\hat{Y} = \begin{cases} 1, & \text{if } P(Y = 1 \mid X_1, \ldots, X_p) > c \\ 0, & \text{otherwise} \end{cases}$$

In this study, $P$ can be interpreted as the probability of an individual with HIV having a positive diagnosis for high risk of dementia, given the characteristics qualified during the assessment by the health professional, represented by $X_1, X_2, \ldots, X_p$.
The objective is to find a set of weights. The negative log-likelihood is minimized over the defined training set using optimization techniques such as gradient descent or stochastic gradient descent [42]. Minimizing the negative log-likelihood also means maximizing the estimated probability $p_i$ of selecting the correct class. The loss function that measures the difference between the ground-truth label and the predicted class label is called the cross-entropy. If the prediction is close to the true label, the loss is low. Conversely, the resulting log loss is larger if the prediction is far from the true label.
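The model, the threshold rule, and the cross-entropy minimization described above can be sketched in a few lines. This is an illustrative implementation under stated assumptions, not the code used in the study: it uses plain batch gradient descent, and the learning rate, the number of epochs, and the function names (`predict_proba`, `log_loss`, `fit`, `classify`) are hypothetical choices.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(beta, x):
    """P(Y = 1 | x); beta[0] is the intercept, beta[1:] the slopes."""
    return sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], x)))

def log_loss(beta, X, y):
    """Mean cross-entropy (negative log-likelihood divided by n)."""
    eps = 1e-12  # keeps log() away from 0
    total = 0.0
    for xi, yi in zip(X, y):
        p = min(max(predict_proba(beta, xi), eps), 1.0 - eps)
        total -= yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total / len(X)

def fit(X, y, lr=0.1, epochs=2000):
    """Estimate the coefficients by batch gradient descent on the log loss."""
    n, p = len(X), len(X[0])
    beta = [0.0] * (p + 1)
    for _ in range(epochs):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            err = predict_proba(beta, xi) - yi  # derivative of the loss w.r.t. the linear term
            grad[0] += err
            for j, v in enumerate(xi):
                grad[j + 1] += err * v
        beta = [b - lr * g / n for b, g in zip(beta, grad)]
    return beta

def classify(beta, x, c=0.5):
    """Assign class 1 when the estimated probability exceeds the threshold c."""
    return 1 if predict_proba(beta, x) > c else 0
```

Lowering the threshold `c` trades specificity for sensitivity, which may matter in a screening setting; the value 0.5 is only the conventional default.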
Thanks to its simplicity and interpretability, logistic regression can occasionally outperform nonlinear models. Still, if the response variable is taken from a small sample, logistic regression models become unsatisfactory and perform poorly for binary responses [48].
The Logistic Regression can be categorized as follows:
i. Binomial or binary:
• This addresses situations where the observed result for a dependent variable can have only two possible types, "0" or "1" ("sick" or "healthy").
• It is usually referred to simply as "Logistic Regression".
• It predicts the probability that an observation falls into one of two categories of a dichotomous dependent variable, based on one or more independent variables, which can be continuous or categorical [46,49].
ii. Multinomial:
• This addresses situations where the dependent variable has three or more possible unordered categories.
• As in multiple regression, it is used when we want to predict the value of a variable, called the dependent variable (or sometimes the outcome, objective, or criterion variable), based on the values of two or more other variables.
• The variables used to predict the value of the dependent variable are called independent variables (or sometimes predictor, explanatory, or regressor variables) [41,44].
iii. Ordinal:
• This works with dependent variables whose categories are ordered.
• It is usually called "ordinal regression".
• It is used to predict an ordinal dependent variable, given the independent variables [48].

A Naive Bayes classifier is structured around the calculation of the posterior probability distribution $P(Y \mid X)$, where $Y \in \{y_1, y_2, \ldots, y_k\}$ is the random variable to be classified, with $k$ categories, and $X = (x_1, x_2, \ldots, x_p)$ is a set of $p$ discrete explanatory variables. In addition, to calculate the conditional probability $P(Y \mid X)$, this method assumes probabilistic independence between the explanatory variables, which facilitates the computational approach following Equation (3):

$$P(Y = y_c \mid X) = \frac{P(y_c) \prod_{j=1}^{p} P(x_j \mid y_c)}{P(X)} \tag{3}$$

Therefore, a Naive Bayes classifier is based on calculating the probability that a particular observation belongs to a specific category and classifies that observation into the most probable category, as shown in Figure 2.
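The Naive Bayes calculation described above can be illustrated with a minimal sketch for discrete explanatory variables. It is not the authors' implementation. Two assumptions are added for the example and are not taken from the text: probabilities are estimated from simple frequency counts, and add-one (Laplace) smoothing is applied so that an unseen value does not force a zero posterior. All function names and the toy data in the usage note are hypothetical.

```python
from collections import Counter, defaultdict

def fit_naive_bayes(X, y):
    """Collect the counts needed for priors and conditional probabilities.

    X is a list of tuples of discrete values, y the list of class labels.
    """
    class_counts = Counter(y)
    cond = defaultdict(Counter)  # cond[(j, c)][v]: rows of class c with x_j == v
    values = defaultdict(set)    # values[j]: distinct values seen for feature j
    for row, c in zip(X, y):
        for j, v in enumerate(row):
            cond[(j, c)][v] += 1
            values[j].add(v)
    return class_counts, cond, values

def posterior(model, x):
    """P(Y = c | x) for every class c, under the independence assumption."""
    class_counts, cond, values = model
    n = sum(class_counts.values())
    scores = {}
    for c, nc in class_counts.items():
        score = nc / n  # prior P(c)
        for j, v in enumerate(x):
            # Add-one smoothing: an assumption of this sketch, not of the text.
            score *= (cond[(j, c)][v] + 1) / (nc + len(values[j]))
        scores[c] = score
    total = sum(scores.values())  # plays the role of P(X)
    return {c: s / total for c, s in scores.items()}

def classify(model, x):
    """Return the class with the highest posterior probability."""
    post = posterior(model, x)
    return max(post, key=post.get)
```

A usage example with invented toy data: after `model = fit_naive_bayes([("yes", "low"), ("yes", "low"), ("no", "high"), ("no", "high")], ["pos", "pos", "neg", "neg"])`, the call `classify(model, ("yes", "low"))` selects the class whose training rows share those values.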

Fundamentals of the Verbal Decision Analysis
The Verbal Decision Analysis (VDA) assumes that most decision-making problems can be expressed in natural language and that the decision-making process is based on the representation of the problem [50]. Notably, this methodology addresses unstructured problems, characterized by the absence of logical and well-defined procedures to be applied in their solution.
These problems are qualitative, and therefore, it is difficult to organize, formalize, and numerically measure them. Furthermore, it is unusual to have all the data required to solve them. Thus, the analysis process is subjective, which requires collecting information from the decisionmaker. In this research, the decisionmakers were highly qualified health professionals with scientific experience and clinical practice with HIV/AIDS individuals.
According to Larichev [50], the methods that make up the structure of Verbal Decision Analysis are ZAPROS-III, ZAPROS-LM, PACOM, and ORCLASS, each with its own characteristics and applications. These methods require a large quantity of data to be analyzed by humans as follows:
1. Comparison of two assessments on a verbal scale by two criteria.
2. Assignment of multicriteria alternatives to decision classes.
3. Comparative verbal evaluation of alternatives according to different criteria.
These methods constitute models for DSS (Decision Support Systems) that help a decisionmaker classify or rank alternatives with multiple attributes.
Figure 3 shows a simplified visualization of the various methods that make up the Verbal Decision Analysis methodology according to the problems. This work uses the ZAPROS method, which adapts to the characteristics of the problem and allows the structuring of a decision rule for comparing the alternatives that does not change even if the set of alternatives is modified. In addition, it applies to problems with many alternatives [51,52].
In the Verbal Decision Analysis (VDA) structure, the ZAPROS-III-i method can be considered an evolution of ZAPROS-III and ZAPROS-LM. Like the ZAPROS-LM and PACOM methods, the ZAPROS-III-i method aims to order a group of alternatives from the most preferable to the least preferable [52].
Although ZAPROS-III-i applies a procedure similar to that of its predecessor to obtain the preferences, it implements modifications that make it more effective and more accurate in handling inconsistencies. Furthermore, the number of alternatives is substantially less than in ZAPROS-III and ZAPROS-LM.
The ZAPROS-III-i method is applied in three steps: Problem Formulation, Elicitation of Preferences, and Comparison of Alternatives [52].

Appl. Sci. 2021, 11, 10457

A. Problem Formulation:
Given:
1. $K = \{1, 2, \ldots, N\}$, representing a set of $N$ criteria;
2. $n_q$ represents the number of possible values on the scale of the q-th criterion ($q \in K$); for ill-structured problems, as in this case, usually $n_q \le 4$;
3. $X_q = \{x_{iq}\}$ represents the set of values of the q-th criterion, and this set is the scale of this criterion; $|X_q| = n_q$ ($q \in K$), where the values of the scale are ranked from best to worst, and this order does not depend on the values of other scales;
4. $Y = X_1 \times X_2 \times \cdots \times X_N$ represents the set of vectors $y_i$ (every possible alternative: hypothetical alternatives + real alternatives) such that $y_i = (y_{i1}, y_{i2}, \ldots, y_{iN})$, with $y_i \in Y$ and $y_{iq} \in X_q$, and $Q = |Y| = \prod_{q=1}^{N} n_q$.
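The formulation can be made concrete with a short sketch that builds the set of all alternatives as a Cartesian product of the criterion scales and counts its size as the product of the scale lengths. The criteria names and scale values below are hypothetical placeholders, not the criteria used in the study.

```python
from itertools import product
from math import prod

# Hypothetical criteria; each scale is ordered from best to worst value.
scales = {
    "A": ["A1", "A2", "A3"],
    "B": ["B1", "B2", "B3"],
    "C": ["C1", "C2", "C3"],
}

def all_alternatives(scales):
    """Every vector in Y = X_1 x X_2 x ... x X_N (one value per criterion)."""
    return list(product(*scales.values()))

def count_alternatives(scales):
    """|Y| computed as the product of the scale sizes n_q."""
    return prod(len(values) for values in scales.values())

alternatives = all_alternatives(scales)
print(len(alternatives), count_alternatives(scales))
print(alternatives[0], alternatives[-1])  # best and worst hypothetical alternatives
```

Because the count is a product over the criteria, adding one criterion with three values triples the number of alternatives, which illustrates why the number of criteria and scale values is kept small for problems of this kind.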

B. Elicitation of Preferences:
The scale of preferences for quality variations (Joint Scale of Quality Variations, JSQV) is constructed in this stage. The elicitation of preferences follows the order of steps shown in Figure 4 [53]. This structure is the same as that proposed in [54]; however, substages 2 and 3 (numbered on the left side of the figure) were merged into a single substage. Instead of eliciting the decisionmaker's preferences for the first reference situation and then establishing another scale of preferences for the second reference situation, we propose that the two substages be combined. The questions asked for the first reference situation are the same as those asked for the second. Thus, both situations are presented together and must be considered when answering each question, so as not to cause dependence of criteria. This change optimizes the process: instead of asking 2n questions, only n are asked. The questions about Quality Variations (QV) belonging to a single criterion are asked as follows: given a criterion A with $X_A = \{A_1, A_2, A_3\}$, the decisionmaker is asked about their preferences among the QV $a_1$-$a_2$, $a_1$-$a_3$, and $a_2$-$a_3$. Thus, there is a maximum of three questions for a criterion with three values ($n_q = 3$).
The question about preferences between two criteria is formulated differently, because decisionmakers had difficulty understanding, and were slow to answer, questions in which the QV involved different criteria. The question is asked by dividing the QV into two items. For example, given the set of criteria $K = \{A, B, C\}$, where $n_q = 3$ and $X_q = \{q_1, q_2, q_3\}$, and considering the pair of criteria A, B and the QV $a_1$ and $b_1$, the decisionmaker should analyze which imaginary alternative would be preferable: $(A_1, B_2, C_1)$ or $(A_2, B_1, C_1)$. However, the answer must be the same for the alternatives $(A_1, B_2, C_3)$ and $(A_2, B_1, C_3)$. If the decisionmaker answers that the first option is better, then $b_1$ is preferable to $a_1$, because it is preferable to have $B_2$ in the alternative rather than $A_2$.
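As a minimal sketch of the counting argument for a single criterion, the following Python snippet enumerates the within-criterion quality variations for a scale ordered from best to worst. The function name `quality_variations` and the lowercase labels are illustrative assumptions and are not part of the ZAPROS-III-i specification.

```python
from itertools import combinations


def quality_variations(scale):
    """Return every QV of one criterion as a (better, worse) pair.

    `scale` lists the criterion values ordered from best to worst,
    so each pair moves from a better value to a worse one.
    """
    return list(combinations(scale, 2))


# Criterion A with three values, mirroring the example in the text.
qv_a = quality_variations(["a1", "a2", "a3"])
print(qv_a)       # the pairs a1-a2, a1-a3 and a2-a3
print(len(qv_a))  # number of QV for n_q = 3
```

For a scale with $n_q$ values this yields $n_q(n_q - 1)/2$ quality variations, which is three when $n_q = 3$.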

C. Comparison of Alternatives:
In order to reduce the number of cases of incomparability, we applied the same structure as proposed in [54] but modified the substage that compares pairs of alternatives according to the one proposed in [55]. Figure 5 shows the structure of the process for comparing alternatives.
According to [31] and [15], Decision Support Systems (DSS) are computer-based systems that help the decisionmaker articulate and explore the implications of their preferences, leading to a decision grounded in an understanding of the problem. Figure 6 presents a flowchart with the steps to apply the ZAPROS-III-i/VDA.
A disadvantage of the method arises when there is exponential growth in the number of alternatives of the problem as well as the amount of information required to obtain the preferences. To control this complexity, the number of criteria and values of the treated criteria are limited.
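The growth in the number of alternatives can be made concrete with a short sketch. It assumes, for illustration only, that every combination of one value per criterion counts as a possible alternative; the function names and the example scales are hypothetical.

```python
from itertools import product
from math import prod


def count_alternatives(scale_sizes):
    """Number of possible alternatives: the product of the scale sizes n_q."""
    return prod(scale_sizes)


def enumerate_alternatives(scales):
    """List every alternative as one value drawn from each criterion scale."""
    return list(product(*scales))


# Three criteria with three values each (hypothetical labels).
scales = [["A1", "A2", "A3"], ["B1", "B2", "B3"], ["C1", "C2", "C3"]]
alternatives = enumerate_alternatives(scales)
print(len(alternatives))                # 27 alternatives for 3 criteria
print(count_alternatives([3] * 6))      # 729 alternatives for 6 criteria
```

Doubling the number of criteria from three to six raises the count from 27 to 729, which is why the number of criteria and scale values is kept small.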
The method proposes that the two reference situations generated by the AI Algorithm and the one corresponding to the health professionals' decisions are analyzed simultaneously to optimize the process and avoid criteria dependence [53].
According to Tamanini and Pinheiro [52,53], these modifications increased the comparability of the method, so that several alternatives defined as incomparable when applying the ZAPROS method in its pure form can now be compared directly or indirectly. However, these changes to the method do not alter its computational complexity [56].

Proposed Hybrid Model Used in this Study
There is great interest in applications of Artificial Intelligence (AI) in healthcare to enable early diagnosis and early treatment. Furthermore, AI can work with large amounts of data and complex systems and generate relevant results. Health events are more stochastic than deterministic; therefore, they need a centralized predictive control system that learns from its own experience, improving the same algorithms it uses to provide recommendations [57].
The main characteristic of Machine Learning technology, a subset of AI, is its learning from experiences recorded in historical data, and its performance improves according to this learning [58].
In this study, a large amount of structured data, composed of many rows and columns containing unique numbers or values, was converted to extract only the information able to improve the proposed model, which was refined over several test bases, one for each learning cycle. Each cycle's test base was drawn from the exploration of a large and varied amount of data structured in rows and columns. The learning cycle was repeated until the results were satisfactory, consequently making the algorithm more accurate.
The Dementia in People with HIV database was constructed using data collected from HIV-positive individuals aged fifty years and older whose mental and clinical conditions allowed them to participate [59]. The data collection and monitoring took place at the outpatient clinic of the Reference Hospital for Infectious and Parasitic Diseases in the city of Fortaleza, CE, Brazil. Data from 294 people were collected; unfortunately, due to the COVID-19 pandemic and the need for social isolation, the sample could not be enlarged further. For training and testing, we applied k-fold cross-validation (with k = 10), which performed the fitting procedure a total of ten times, with each fitting being performed on a training set. The instruments used to acquire the data were: Evaluation of Clinical and Immunological Aspects; International HIV-Associated Dementia Scale; Brief Symptom Inventory; and Social Support and Instrumental Activities of Daily Living Scale. The individuals at high risk of dementia were identified after data collection. A total of 70% of the data was used for training and 30% for the tests. Data collection began after approval by the Research Ethics Committee of the Federal University of Ceará, Brazil, under No. 3038967, respecting the ethical precepts of nonmaleficence, beneficence, autonomy, and justice for its participants.
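The 10-fold procedure can be sketched in plain Python as follows. This is an illustrative, non-stratified split over 294 record indices; the function name, the shuffling seed, and the way folds are assigned are assumptions, and the sampling performed by the Orange tool used in the study may differ (for example, by stratifying on the target).

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds.

    Indices are shuffled once, then dealt into k folds of nearly equal
    size; each fold serves as the test set exactly once.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


splits = list(k_fold_indices(294, k=10))
print(len(splits))                          # ten fitting rounds
print([len(test) for _, test in splits])    # fold sizes of 29 or 30 records
```

Each record is held out exactly once, so every fitting round trains on roughly 90% of the 294 records and is evaluated on the remaining fold.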
The main objective of this study is to propose a hybrid model based on Machine Learning, associated with a multicriteria Verbal Decision Analysis method, for the early diagnosis of dementia in HIV-infected individuals. The VDA method chosen was ZAPROS-III-i, as presented in Section 2.2.4, due to its presence in several multicriteria decision support models in the health area. In turn, three fundamental Machine Learning algorithms were applied that, according to [60], have been widely and successfully implemented in various areas, mainly in the health field, for classification and regression purposes. The Random Forest algorithm was presented in Section 2.2.1, Logistic Regression in Section 2.2.2, and the third Machine Learning algorithm, Naïve Bayes, in Section 2.2.3. This last algorithm was included to serve as a third option alongside Random Forest and Logistic Regression.
The hybrid model proposed in the present study is shown in Figure 7, where the operational steps of the process can be seen.
As can be seen in Figure 7, the process has four steps. The first step consists of visual research and data filters; these are compiled from the DSM-5 (Diagnostic and Statistical Manual of Mental Disorders, 5th edition) of the American Psychiatric Association and the ICD-10 (International Classification of Diseases, 10th edition), together with relevant information on cognitive and motor dementia from the HIV-infected individuals. This information is added to these individuals' clinical and behavioral data and to the research database, forming the Analytical Base Table (ABT).
In the second step, the ABT built in Step 1 is submitted for analysis by each of the Machine Learning algorithms used in this study: Random Forest, Logistic Regression, and Naive Bayes. It is essential to highlight that, in this study, the Orange Framework (version 3.29.3) was used to apply the Machine Learning algorithms with default parameters.
The Orange Framework is a set of machine learning and data mining tools for analyzing databases through Python scripting and visual programming. Orange is aimed at experienced users and programmers, as well as researchers using Data Mining. This framework is based on the C++ language, while allowing developers to work in the Python language [61].
Practitioners assessed patients before Machine Learning was used, applying an instrument called HIV-Associated Dementia, which reveals whether the patient is at high or low risk of dementia. This instrument has four items to assess memory and psychomotor disorders in people with HIV; the maximum score is 12 points, and the cutoff value used in this research was <10 points for high risk of dementia. This criterion is considered the gold standard and was used to define the Dementia target field. Thus, patients who scored less than 10 after the application of this instrument were classified as 1.0 (high risk), and those who obtained a score greater than 10 were rated 2.0 (low risk).
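The labeling rule described above can be sketched as a small function. The text specifies scores below 10 (high risk) and above 10 (low risk) but does not say how a score of exactly 10 was handled; the sketch assigns it to low risk purely as an assumption, and the names used are hypothetical.

```python
HIGH_RISK = 1.0   # label used in the database for high risk of dementia
LOW_RISK = 2.0    # label used in the database for low risk of dementia
MAX_SCORE = 12    # maximum score of the four-item instrument
CUTOFF = 10       # scores below this value indicate high risk


def dementia_risk_label(score):
    """Map an instrument score (0 to 12) to the target label."""
    if not 0 <= score <= MAX_SCORE:
        raise ValueError(f"score must be between 0 and {MAX_SCORE}")
    # Assumption: a score of exactly 10 is treated as low risk,
    # because the text only defines "<10" and "greater than 10".
    return HIGH_RISK if score < CUTOFF else LOW_RISK


print(dementia_risk_label(8.5))   # high-risk label
print(dementia_risk_label(11))    # low-risk label
```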
Regarding the database (information from the widget files on Orange), there are 294 instances (the rows being registered patients), 58 features (columns), one target (the Dementia field), and four meta-attributes (reason_for_admission, CV_start, TCD4_start, current_TCD4). In the database, there are 176 cases at high risk of dementia and 118 at low risk. Using the Orange tool, we selected the Dementia field as the target in the widget file, as shown in Figure 8.
The target variable is the one that defines whether or not the record corresponds to a high risk of dementia. To determine the risk for dementia, we analyzed the main symptoms or most recurrent characteristics presented in the decision tree generated by Random Forest. That is, if a particular case has the same characteristics identified with dementia equal to 1, it may indicate that the patient is at high risk of dementia.
Figure 9 shows the execution process of each Machine Learning algorithm in Step 2 of the hybrid model.
Step 2 of the hybrid model classifies the main characteristics of cognitive and motor dementia, the object of the study. Step 3 is based on the features identified in the previous step. An online questionnaire was designed to contain the main characteristics, their respective qualifiers (mild/moderate; severe; complete), and criteria (A, B, C) based on the International Classification of Functioning, Disability and Health (ICF) developed by the World Health Organization (2001), which aims to standardize the description of health and its clinical situation [62].
This form contains the main characteristics classified by the best performing algorithm, Random Forest, for application to decisionmakers. Due to the mandatory social isolation established during the COVID-19 pandemic, the questionnaire was structured in Google Forms and sent along with the Informed Consent Form (FICF) by email to specialized healthcare professionals with clinical experience and scientific research in infectious diseases. These professionals were selected through the Lattes Platform and were informed of the purpose of the work and of the benefits and safety of participating in the research. The questionnaire aims to obtain information for the formulation of criteria that facilitate the diagnosis of dementia in HIV-infected individuals. The professionals made their assessments freely, each according to their own experience. Some facilitating and hindering points were addressed with the aim of ordering which characteristics are considered most important in the process of assessing high risk for HAD. Features have multiple choices based on the degree of difficulty defined by the ICF.
The questionnaire addressed the characteristics of patients at high risk for HAD so that the professional, when answering the questions and carrying out their assessment, would qualify each question using one of three alternatives. Beyond collecting data that fit the concept under study, the data must be obtained in a form that allows the necessary treatment to be applied later to test the hypotheses. Therefore, it is essential to pay attention to the design of the collection instrument, the type of information it provides, and the type of analysis that can be conducted after obtaining the answers on the form. Once useful or relevant data for testing the hypotheses had been collected, the analysis model was confronted with the collected data, considering the requirements to put it into practice.
Finally, in the fourth step, the answers to the questionnaires were submitted to the ZAPROS-III-i method, which is implemented in the Aranaú tool, and then the Verbal Decision Analysis (VDA) was performed to define the order of the main HAD characteristics. These characteristics were used in the proposed model.
Thus, ZAPROS-III-i orders these characteristics according to the decisionmaker's preference and constructs a protocol structured in Machine Learning. Notably, the ZAPROS-III-i method is implemented in the multiplatform Aranaú tool. It is developed in Java and distributed in ".jar" format with portable code characteristics. Furthermore, it can be compiled and executed on different operating systems without the need to change the code [63].
Step 4 produces a list of critical features, ordered by the decisionmaker's preference, to support the diagnosis.

Results
This section gives the results obtained from the hybrid model using Machine Learning with the Random Forest algorithm, which showed the best performance, with an accuracy of 87.6%, compared to 83.9% for Logistic Regression and 83.5% for Naive Bayes.
Random Forest showed the best performance among the models. Precision indicates the proportion of correct classifications among all instances classified as high risk of dementia. Recall indicates the proportion of high-risk cases that were correctly identified, and F1 is the harmonic mean of Precision and Recall, as seen in Figure 10.
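For reference, the sketch below computes these measures from the four cells of a binary confusion matrix, taking high risk of dementia as the positive class. The counts are hypothetical and chosen only for illustration; they are not the study's results, and the function name is an assumption.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 for the positive class.

    tp/fp/fn/tn are the true-positive, false-positive, false-negative
    and true-negative counts. The sketch assumes tp + fp > 0 and
    tp + fn > 0, so no division by zero is handled.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts for illustration only; not the study's confusion matrix.
metrics = classification_metrics(tp=80, fp=20, fn=10, tn=90)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```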
The result obtained by Random Forest showed that the main characteristics for a high risk of dementia in people with HIV were depression, interpersonal sensitivity, age, phobic anxiety, viral load, and constructive capacity.
Criteria were based on the International Classification of Functioning, Disability and Health (ICF) for the six characteristics identified in the Orange Canvas. The characteristics of depression, interpersonal sensitivity, and phobic anxiety were placed in criterion A, which corresponds to emotional functions, interactions, and interpersonal relationships that present a barrier or difficulty. Constructive capacity was placed in criterion B, which concerns psychomotor functions, learning, and application of knowledge that present an obstacle or difficulty. Finally, age and viral load were placed in criterion C, which indicates that the immune response presents a barrier or difficulty.
According to the ICF manual [62], the assessment qualifiers specify the extent to which the functionality of a given feature is compromised. For this research, the Mild/Moderate qualifier corresponds to 0% to 49% of neurological impairment caused by HIV; Severe corresponds to 50% to 95%; and Complete indicates that the interference of the infection in cognition and behavior compromises from 96% to 100% of the functioning of the person living with the AIDS virus.
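The qualifier bands above can be expressed as a simple lookup. The sketch below is illustrative; the function name is hypothetical, and the treatment of non-integer percentages that fall between the stated bands (for example, 49.5%) is an assumption, since the text only gives whole-number limits.

```python
def icf_qualifier(impairment_pct):
    """Return the qualifier for a percentage of neurological impairment.

    Bands follow the text: 0-49 Mild/Moderate, 50-95 Severe,
    96-100 Complete. Assumption: fractional values between bands are
    assigned to the lower band.
    """
    if not 0 <= impairment_pct <= 100:
        raise ValueError("impairment_pct must be between 0 and 100")
    if impairment_pct < 50:
        return "Mild/Moderate"
    if impairment_pct < 96:
        return "Severe"
    return "Complete"


for pct in (10, 49, 50, 95, 96, 100):
    print(pct, icf_qualifier(pct))
```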
The questionnaire was analyzed by 40 health professionals; the largest age group (37.5%) was between 31 and 40 years old, 56.3% were women, 22.9% were physicians, 45.8% had specialist qualifications, 29.2% had an MSc, and 2.5% had PhD degrees related to infectious diseases. In addition, the items were based on the main characteristics identified by the Random Forest algorithm in cases of older people with HIV. We verified the relationship in the ICF for each of the characteristics according to functionality and disability. Finally, we added classification options to indicate their extent or magnitude.
The results of the questionnaire can be seen in Table 1.
After the Elicitation of Preferences, each professional's preferences generated the results shown in Table 1, which reflect the search for the ideal result that would significantly affect the detection of dementia in people with HIV; these were compared with the results obtained through decision trees. Then, the data with the values of criteria and alternatives acquired in the research were loaded, as shown in Figure 12. The Aranaú tool was used to define the order of preference of the main characteristics, established by the ZAPROS-III-i method, and their qualifiers most likely to determine the definitive diagnosis. ZAPROS-III-i compared the Quality Variations (QV), the minimum distances between two alternatives, and the Formal Index of Quality (FIQ), which shows the minimum number of qualifier pairs the decisionmaker compared. These data enabled the combination of vectors formed by "criteria × alternatives," that is, "rows × columns" of Table 1. The graphical representation in Figure 12 shows that the decisionmakers' most preferable characteristics have more outgoing arcs than incoming arcs.
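One way to read such a preference graph is to compare, for each node, the number of outgoing and incoming arcs. The sketch below does this for a small hypothetical graph in which an arc (x, y) means "x is preferred to y". It only illustrates how the figure can be read; it is not the procedure ZAPROS-III-i uses to build the final order, and the node names are invented.

```python
from collections import Counter


def rank_by_arc_balance(arcs):
    """Order nodes by (outgoing arcs - incoming arcs), highest first.

    Ties are broken alphabetically so the result is deterministic.
    """
    out_degree = Counter(src for src, _ in arcs)
    in_degree = Counter(dst for _, dst in arcs)
    nodes = set(out_degree) | set(in_degree)
    balance = {n: out_degree[n] - in_degree[n] for n in nodes}
    return sorted(nodes, key=lambda n: (-balance[n], n))


# Hypothetical preference arcs: alt1 dominates alt2 and alt3, alt2 dominates alt3.
arcs = [("alt1", "alt2"), ("alt1", "alt3"), ("alt2", "alt3")]
print(rank_by_arc_balance(arcs))  # most preferred node first
```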
Thus, the combination of the classifications resulted in the final order of preferences of the characteristics in descending order, namely:
10-Difficulty in voluntarily relating to other people (auditory and visual perception functions, as well as hallucinations or illusions) due to a high risk of developing dementia;
9-Difficulties in producing a confident and assertive temperament for a positive diagnosis of dementia;
8-Difficulty in reproducing an event or symbol (to imitate or copy a drawing);
7-Functions that allow volitional thought control such as rumination, affection, sadness, fear, and tension for a positive diagnosis of high risk of dementia;
6-I do not like to walk alone;
5-General mental functions of physiological and psychological mechanisms that encourage the individual to act persistently to satisfy needs;
4-Difficulty in focusing on an external stimulus or an internal experience for the necessary period of time (maintenance functions, change and division of attention, and distraction);
3-Difficulties in self-care (taking care of the body and parts of the body, such as skin, face, teeth, scalp, and genitals);
2-Specific memory functions to record and store information and retrieve it when needed in the short and long term, recent and remote. General mental functions of physiological and psychological mechanisms encourage the individual to act persistently to meet needs;
1-Body functions related to protection against foreign substances, including infections, through specific immune responses (immunodeficiencies, lymphadenitis).
Figure 12 above shows the graphic representation of the final order of preference of the main alternatives established by the ZAPROS-III-i method to positively diagnose the high risk of dementia in people with HIV.

Discussion
This section discusses the characteristics identified by the most accurate algorithm, Random Forest (87.6%), as most determinant for a high risk of dementia, organized in descending order according to the classification obtained by the VDA ZAPROS-III-i method.
When aided by AI computational algorithms that can show possible neurological changes caused by the AIDS virus, the health professional can optimize the clinical reasoning that leads to an early and accurate diagnosis of any possible Neurocognitive Disorders. Thus, health professionals can offer a safe, agile, and effective therapeutic follow-up in pursuit of improved quality of life and a good prognosis for HIV. Patients with untreated HIV have an average life expectancy of less than six months without dementia [64].
According to the results of this study, the immune response was considered the preferable characteristic for a high risk of dementia in individuals with HIV. Despite the advent of antiretroviral therapy capable of delaying immunodeficiency, people undergoing treatment are susceptible to neurological changes because the AIDS virus is neurotropic. In addition, other factors, such as viral resistance, drug toxicity, and the need for a high level of treatment adherence, are considered the main difficulties in reducing the viral load, affecting the patient's cognitive and behavioral aspects [64].
According to the Centers for Disease Control and Prevention (CDC), the main determinant of susceptibility to HIV-Associated Neurocognitive Diseases (HAND) is the degree of immunosuppression of the infected individual, indicated by a nadir of TCD4+ lymphocytes (LTCD4+) < 350 cells/mm³ or a current LTCD4+ count < 350 cells/mm³ [65,66].
Neuroinflammation can trigger different cognitive and motor declines. In addition, these are classified according to neuropsychological assessments and the impact of the disease on the performance of daily activities, clinical diagnosis, and analysis of cerebrospinal fluid and plasma. These methods reveal a more detailed knowledge of CNS changes during HIV infection and differentiate HAND from other CNS diseases [67-69].
In addition to monitoring medication adherence, health professionals need to know and observe the subtle signs of psychomotor changes reported and manifested by patients. When identified quickly, they can be reversed by HAART. Studies show that follow-ups based on a cognitive-behavioral approach generate confidence for patients to adhere to interventions guided by the professional, such as the continuous use of antiretrovirals, searching for information about their clinical condition, and improving the self-perception of the infected person and their social relationships [70].
Studies regarding social and emotional factors show that infected individuals who experience feelings of loneliness, negative thoughts about living with HIV, and low social support from family and friends become, as the disease progresses, more resistant to the guidelines and find it more difficult to carry out the treatment correctly. These social and emotional factors also hinder them in performing daily activities, even in the early stages of the disease. In addition, they intensify negative thoughts due to depression related to infection, anxiety, and thoughts of death [71-73].
Monitoring stress-induced behavioral changes is essential due to the progression of chronic neuroinflammation, neuroendocrine changes, and anomalous neurotransmission signaling. Furthermore, HIV-positive individuals with psychiatric disorders such as depression, interpersonal sensitivity, and phobic anxiety tend to present increased levels of inflammatory cytokines (TNF-alpha, IL-1beta, and IL-6) and alterations in the glutamatergic and monoaminergic pathways in the cortical and subcortical regions of the brain. When such symptoms go underdiagnosed, they can cause irreversible neurocognitive damage such as dementia [74].
The high risk of dementia is clinically significant when it affects the functioning of daily activities. Infected individuals with mild cognitive impairments can have occupational problems, even in the early stages of the disease. These individuals have difficulties in the learning process, which can be identified through low performance at work and in attention and memory, especially in performing instrumental activities of daily living, causing difficulties in self-care, such as caring for the body and its parts, such as skin, face, teeth, scalp, and genitals [75-77].
It is essential to emphasize that the analysis of cognitive and behavioral conditions related to HIV is still difficult and requires great attention from professionals and researchers. These conditions represent factors that influence a high risk of dementia, and therefore the health professional must adopt mechanisms to delay and control them in order to prevent the progression of dementia caused by neuroinflammation.

Conclusions
Identifying mental disorders is a complex process, given the subjectivities experienced by each individual, and when dealing with the immunosuppressed, it becomes something broader.The presence of HIV, in addition to the neurological impairment, can impact the psychological and social factors of the infected person, as it causes a mixture of feelings of fear, panic, suicidal tendencies, anxiety, and social isolation, which can interfere with adherence to antiretroviral therapy and the individual's quality of life.
The hybrid approach with the Machine Learning algorithm, Random Forest, combined with Verbal Decision Analysis, accurately obtained the characteristic signs of dementia.The application of the hybrid approach proved to be efficient and revealed ten characteristics that influence the prediction of a high risk of dementia in people with HIV.
When analyzed by health professionals, the characteristic "immune factor" came first, with functions that allow volitional control of thought, such as rumination, affection, sadness, fear, and tension, for a positive high-risk diagnosis with insanity in the last position.The importance of combining the algorithm and health professional's experience to obtain the diagnostic framework is essential, given the need for clinical experience and subjective responses.
A future work will apply the hybrid model to individuals with HIV/AIDS over eighteen years old to expand the universe of classifications and the risk of developing dementia in the younger population.In addition, we intend to incorporate other AI algorithms to classify patients and develop clinical protocols, using methodologies such as Bayesian networks and cognitive maps.Finally, another future work could apply only the Verbal Decision Analysis (VDA) of a decision to the bases presented.
As a limitation of the study, the participation of health professionals was affected, as the online submission of the questionnaire coincided with a high incidence of people infected and hospitalized for complications caused by SARS-CoV-2 during the COVID-19 pandemic.

2.2.3. Fundamentals of the "Naïve Bayes" Algorithm
Bayesian networks can be applied to classification problems using Bayesian classifiers. The simplest Bayesian Network structure is a classification algorithm called Naive Bayes, which can be viewed as the simplest case (k = 0) of the Bayesian classifier with k-dependence, that is, of Simple Bayesian Networks with K-Dependency [47]. Naive Bayes is therefore a simple probabilistic algorithm based on Bayes' theorem. It uses training data to build a probabilistic model from the evidence that specific characteristics provide in an extensive database. The algorithm assumes independence between the features of the model: given the class, each attribute is treated as unrelated to the other attributes of the database [47].
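The independence assumption described above can be illustrated with a minimal sketch of a categorical Naive Bayes classifier. This is not the implementation used in the study (which relied on the Orange tool); the function names, the Laplace smoothing, and the toy data with two yes/no attributes and the labels "high" and "low" are assumptions made purely for illustration.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class attribute-value frequencies."""
    class_counts = Counter(labels)
    # value_counts[(attribute_index, class_label)][value] -> count
    value_counts = defaultdict(Counter)
    # Distinct values seen per attribute, needed for Laplace smoothing.
    feature_values = defaultdict(set)
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            feature_values[i].add(value)
    return class_counts, value_counts, feature_values


def predict_proba(model, row):
    """Return P(class | row), treating attributes as independent given the class."""
    class_counts, value_counts, feature_values = model
    total = sum(class_counts.values())
    log_scores = {}
    for label, count in class_counts.items():
        score = math.log(count / total)  # log prior P(class)
        for i, value in enumerate(row):
            # Laplace-smoothed likelihood P(value | class); the product over
            # attributes is exactly the "naive" independence assumption.
            numerator = value_counts[(i, label)][value] + 1
            denominator = count + len(feature_values[i])
            score += math.log(numerator / denominator)
        log_scores[label] = score
    # Normalize the log scores into probabilities (Bayes' theorem).
    peak = max(log_scores.values())
    unnormalized = {c: math.exp(s - peak) for c, s in log_scores.items()}
    z = sum(unnormalized.values())
    return {c: v / z for c, v in unnormalized.items()}


# Hypothetical toy data: two yes/no attributes per record (invented values).
rows = [("yes", "yes"), ("yes", "no"), ("no", "yes"),
        ("no", "no"), ("yes", "yes"), ("no", "no")]
labels = ["high", "high", "low", "low", "high", "low"]

model = train_naive_bayes(rows, labels)
probs = predict_proba(model, ("yes", "yes"))
print({label: round(p, 3) for label, p in probs.items()})
```

Working in log space avoids numerical underflow when many attributes are multiplied together, and the smoothing term keeps an attribute value unseen in one class from forcing that class's probability to zero.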

Figure 7. Structure of the proposed hybrid model operating process.

Figure 9. Application of Machine Learning algorithms.

Figure 10. Test and score measures.

Figure 11 shows the Confusion Matrix generated in the Orange tool, which gives the number of instances correctly and incorrectly classified by Random Forest, the algorithm that stood out from the others trained and tested on the ABT data. Random Forest classified the data as high risk of HAD (1.0) with an accuracy of 94.9%.
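How summary measures of this kind are read off a binary confusion matrix can be sketched as follows. The counts below are invented for illustration and are not the study's results; the function name and the choice of the high-risk class (1.0) as the positive class are likewise assumptions.

```python
def confusion_metrics(tp, fn, fp, tn):
    """Derive common measures from the four cells of a binary confusion matrix.

    tp: high-risk instances predicted as high risk
    fn: high-risk instances predicted as low risk
    fp: low-risk instances predicted as high risk
    tn: low-risk instances predicted as low risk
    """
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,  # share of all instances classified correctly
        "recall": tp / (tp + fn),       # share of true high-risk cases detected
        "precision": tp / (tp + fp),    # share of high-risk predictions that are correct
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
metrics = confusion_metrics(tp=40, fn=2, fp=3, tn=55)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Reporting recall and precision for the high-risk class alongside overall accuracy makes it clear whether a single percentage refers to all instances or only to one class.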


Figure 12. Order of preference of the main characteristics for dementia in people with HIV.

Table 1. Main characteristics of high risk of dementia according to decision-makers.
8. Difficulties in self-care (taking care of the body and body parts, such as skin, face, teeth, scalp, and genitals).
9. Difficulties in producing a safe and assertive temperament for a positive diagnosis for dementia.
Difficulty concentrating on an external stimulus or an internal experience for the necessary period (functions of maintenance, change, and division of attention, distraction).