An Investigation into the Relationship among Psychiatric, Demographic and Socio-Economic Variables with Bayesian Network Modeling

The aim of this paper is to investigate, with Bayesian network modeling, the factors influencing the Beck Depression Inventory score, the Beck Hopelessness Scale score and the Rosenberg Self-Esteem score, as well as the relationships among the psychiatric, demographic and socio-economic variables. The data, collected from 823 university students, consist of 21 continuous and discrete psychiatric, demographic and socio-economic variables. After the continuous variables are discretized by two approaches, two Bayesian network models are constructed using the bnlearn package in R, and the results are presented via figures and probabilities. One of the most significant results is that in the first Bayesian network model, the gender of the students influences the level of depression, with female students being more depressive. In the second model, social activity directly influences the level of depression. In each model, depression influences both the level of hopelessness and the level of self-esteem in students; as the level of depression increases, the level of hopelessness increases, but the level of self-esteem drops.


Introduction
Bayesian networks (BNs), also called belief networks, are graphical structures used to represent knowledge and to reason under uncertainty. They combine graph theory and probability theory to explore relationships between variables [1]. The network structure is a directed acyclic graph (DAG) in which nodes represent random variables [2,3] and arcs represent direct dependencies between variables [4,5].
In psychiatry, scales are used as an aid in diagnosis. In this study, the Beck Depression Inventory (BDI), the Beck Hopelessness Scale (BHS) and the Rosenberg Self-Esteem Scale (RSES) are used to investigate the psychiatric characteristics of university students. Bayesian networks have been widely used in many domains, including engineering [6], customer behaviors [7][8][9], social behaviors [10,11], clinical decision support [12], systems biology [13,14] and ecology [15].
In this paper, the psychiatric variables, which are continuous, are discretized using two approaches: (i) expert knowledge in literature and (ii) a statistical method. Then, Bayesian networks are used to analyze the relationship among BDI, BHS, RSES, demographic variables and socio-economic variables.
The structure of this paper is as follows: In Section 2, the data used in this study are briefly described. In Section 3, the theory of Bayesian networks is summarized. In Section 4, the continuous data are discretized using two approaches. In Section 5, the data are analyzed, and the results are interpreted. Finally, in Section 6, the two methods are compared, and the study is concluded.

Bayesian Networks
Let G = (N, A) be a directed acyclic graph (DAG), where N is a finite set of nodes and A is a finite set of arrows between the nodes. The DAG characterizes the structure of the BN.
Each node $n \in N$ in the graph $G$ represents a random variable $X_n$. The set of variables associated with the graph $G$ is denoted by $X = (X_n)_{n \in N}$. A local probability distribution $p(x_n \mid x_{\mathrm{pa}(n)})$ is allocated to every node $n$ with parents $\mathrm{pa}(n)$. A BN for the random variables $X$ is the pair $(G, P)$, where $P$ is the set of local distributions for all variables in the network [19].
The absence of arrows in $G$ encodes conditional independence between the random variables $X$ through the factorization of the joint probability distribution,
$$p(x) = \prod_{n \in N} p(x_n \mid x_{\mathrm{pa}(n)}). \qquad (1)$$
The variables in Equation (1) may be both discrete and continuous, as shown in [20]. $N$ consists of $\Omega$ and $\Psi$ ($N = \Omega \cup \Psi$), where $\Omega$ and $\Psi$ represent the sets of discrete and continuous nodes, respectively. The random variables $X$ can be redefined as $X = (X_n)_{n \in N} = (I, J) = ((I_\omega)_{\omega \in \Omega}, (J_\psi)_{\psi \in \Psi})$, where $I$ and $J$ are the sets of discrete and continuous variables, respectively. To ensure the availability of exact local computation methods, discrete variables are not allowed to have continuous parents. The joint probability distribution then factorizes into a discrete part and a mixed part [21],
$$p(x) = p(i, j) = \prod_{\omega \in \Omega} p(i_\omega \mid i_{\mathrm{pa}(\omega)}) \prod_{\psi \in \Psi} p(j_\psi \mid i_{\mathrm{pa}(\psi)}, j_{\mathrm{pa}(\psi)}). \qquad (2)$$
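As a small illustration of this factorization, the sketch below evaluates the joint probability of a three-node discrete network as the product of its local distributions. The chain Gender -> Depression -> Hopelessness and every probability value in it are assumptions made up for the example (they are not estimates from the study), and Python is used here only for illustration.

```python
# Toy discrete BN: Gender -> Depression -> Hopelessness.
# All probabilities are made-up illustrative values, not results from the study.
p_gender = {"female": 0.5, "male": 0.5}
p_dep_given_gender = {
    "female": {"low": 0.6, "high": 0.4},
    "male": {"low": 0.7, "high": 0.3},
}
p_hop_given_dep = {
    "low": {"low": 0.9, "high": 0.1},
    "high": {"low": 0.5, "high": 0.5},
}


def joint(gender, dep, hop):
    """p(gender, dep, hop) as the product of the local distributions."""
    return (
        p_gender[gender]
        * p_dep_given_gender[gender][dep]
        * p_hop_given_dep[dep][hop]
    )


# Summing the factorized joint over all configurations must give 1.
total = sum(
    joint(g, d, h)
    for g in p_gender
    for d in ("low", "high")
    for h in ("low", "high")
)
print(total)
```

Because Hopelessness has Depression as its only parent, its local distribution does not involve Gender; this is how a missing arrow translates into a conditional independence statement.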

Structural Learning
There are two kinds of structure learning algorithms. The first is the search-and-score approach, which assigns a score to every candidate Bayesian network structure and chooses the structure with the highest score. The second is constraint-based structure learning, which performs a set of conditional independence tests on the data and uses the results to produce an undirected graph. With additional independence tests, this graph is converted into a Bayesian network [22]. In this study, the tabu search algorithm [23], a search-and-score method, is used.
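The search-and-score idea can be sketched as follows. The example scores two candidate structures on toy data with the BIC score and keeps the better one. This is an exhaustive comparison of two hand-picked candidates, not the tabu search used in the study, and the choice of BIC, the function name `bic_score` and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter


def bic_score(data, dag):
    """BIC of a discrete DAG; `dag` maps each node to a tuple of its parents."""
    n = len(data)
    levels = {v: {row[v] for row in data} for v in dag}
    score = 0.0
    for node, parents in dag.items():
        joint = Counter((tuple(row[p] for p in parents), row[node]) for row in data)
        parent_counts = Counter(tuple(row[p] for p in parents) for row in data)
        # Log-likelihood of the node given its parents (maximum likelihood fit).
        loglik = sum(c * math.log(c / parent_counts[pa]) for (pa, _), c in joint.items())
        # Free parameters: (levels - 1) for each parent configuration.
        n_params = (len(levels[node]) - 1) * math.prod(len(levels[p]) for p in parents)
        score += loglik - 0.5 * math.log(n) * n_params
    return score


# Toy data in which A and B agree 80% of the time.
data = (
    [{"A": 0, "B": 0}] * 40
    + [{"A": 1, "B": 1}] * 40
    + [{"A": 0, "B": 1}] * 10
    + [{"A": 1, "B": 0}] * 10
)
candidates = {
    "empty": {"A": (), "B": ()},
    "A -> B": {"A": (), "B": ("A",)},
}
scores = {name: bic_score(data, dag) for name, dag in candidates.items()}
best = max(scores, key=scores.get)
print(best, scores)
```

A heuristic such as tabu search explores the space of DAGs by single-arc additions, deletions and reversals instead of enumerating candidates, but it ranks structures with a score of this kind.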

Parameter Learning
After the structure of the BN has been learned from the data, the parameters can be estimated and updated. Two approaches are widely used in the literature: (i) maximum likelihood estimation and (ii) Bayesian estimation [24]. In this paper, Bayesian estimation based on Dirichlet priors is used.
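The difference between the two estimators can be seen in a minimal sketch for a single discrete variable. With a symmetric Dirichlet prior, the posterior mean adds a pseudo-count to every level before normalizing, so levels that were never observed still receive a small positive probability. The function name `estimate_cpt`, the pseudo-count of 1 and the toy observations are assumptions for illustration; they are not the prior settings used in the study.

```python
from collections import Counter


def estimate_cpt(values, levels, pseudo_count=0.0):
    """Estimate P(X = level) from observed values.

    pseudo_count = 0 gives the maximum likelihood estimate; a positive value
    gives the posterior mean under a symmetric Dirichlet prior with that
    pseudo-count per level (an assumed, illustrative prior).
    """
    counts = Counter(values)
    total = len(values) + pseudo_count * len(levels)
    return {lvl: (counts[lvl] + pseudo_count) / total for lvl in levels}


observed = ["low"] * 8 + ["normal"] * 2  # "high" is never observed
levels = ["low", "normal", "high"]

mle = estimate_cpt(observed, levels)
bayes = estimate_cpt(observed, levels, pseudo_count=1.0)
print(mle)
print(bayes)
```

In a full network the same computation is repeated for every node and every configuration of its parents.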

Data Discretization
Data discretization is a technique that converts continuous data into discrete data with a finite number of intervals, and it has become popular in many research areas, including data mining, machine learning, artificial intelligence and Bayesian networks. There are many reasons to discretize data, the most important of which are (i) reducing and simplifying the dataset, (ii) making modeling fast and easy, (iii) obtaining easily interpretable outputs [25] and (iv) meeting the requirements of statistical methods that operate on discrete data only, as in this study. As Figure 1 shows, the psychiatric variables are not normally distributed, which violates the assumption of Gaussian Bayesian networks (GBNs). In this case, we could specify a suitable conditional distribution for every variable and build a hybrid Bayesian network [24]. However, this approach requires prior knowledge, which is not available in our case. Therefore, we discretize the continuous variables and build discrete Bayesian networks. In this paper, we discretize the continuous variables in two ways, which are described in the next two sections.

Data Discretization by Expert Knowledge
One way to discretize continuous variables is to categorize them by following the studies in the literature. The intervals for the continuous variables in this study were defined by domain experts in previous studies. Therefore, we use this knowledge to discretize the psychiatric variables as follows:
Rosenberg Self-Esteem Scale Score: We discretize this variable following [28] into three levels: RSES scores below 15 indicate low self-esteem, 15-25 normal self-esteem and 26-30 high self-esteem.
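The RSES cut points quoted above translate directly into a lookup function. The sketch below is illustrative only; the function name `rses_level` is hypothetical, and the lower bound of 0 for a valid score is an assumption (the text states only the upper bound of 30).

```python
def rses_level(score):
    """Map an RSES total score to the three levels quoted in the text."""
    # Assumed valid range: 0-30 (the lower bound is not stated in the text).
    if not 0 <= score <= 30:
        raise ValueError(f"RSES score out of range: {score}")
    if score < 15:
        return "low"
    if score <= 25:
        return "normal"
    return "high"


scores = [9, 15, 22, 26, 30]
print([rses_level(s) for s in scores])
```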

Data Discretization by Statistical Methods
When no prior knowledge from domain experts is available, statistical discretization methods are used to discretize the continuous variables. Many such methods exist in the literature, including entropy-based discretization [29], error-based discretization [30], the one-rule discretizer (1RD) [31], equal frequency discretization [32] and information-preserving discretization [33].
In this paper, the information-preserving discretization method is used to discretize the continuous variables. The basic idea of this method is to initially discretize each variable into a large number of intervals, say $k_1$, so that the amount of information lost is kept to a minimum. Then, the algorithm iterates over the variables and, for each of them, merges the pair of adjacent intervals whose merging minimizes the loss of pairwise mutual information. Once every variable has $k_2 \le k_1$ intervals left, where $k_2$ is the number of intervals specified by the user, the algorithm stops [24]. One reason for using this method in this paper is that it enables us to specify the number of intervals. As the continuous variables in our study measure the severity of depression, hopelessness and self-esteem, it is reasonable to discretize them into three categories: low, normal and high. Another reason is that this method can be easily applied in R.
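The merging procedure described above can be sketched in a few lines. The code below is a simplified illustration and not the R implementation used in the study: it assumes equal-frequency bins as the initial discretization, ignores ties between equal values, and uses hypothetical function names and toy data throughout.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    pxy, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    return sum(c / n * math.log(c * n / (px[x] * py[y])) for (x, y), c in pxy.items())


def quantile_bins(values, k):
    """Initial step: assign each value to one of k equal-frequency intervals."""
    order = sorted(range(len(values)), key=values.__getitem__)
    bins = [0] * len(values)
    for rank, idx in enumerate(order):
        bins[idx] = rank * k // len(values)
    return bins


def merge(bins, j):
    """Collapse interval j + 1 into interval j and relabel the intervals above."""
    return [b if b <= j else b - 1 for b in bins]


def info_preserving_discretize(columns, k1, k2):
    """Start from k1 intervals per variable and merge adjacent pairs down to k2."""
    disc = {name: quantile_bins(vals, k1) for name, vals in columns.items()}
    for k in range(k1, k2, -1):  # k is the current number of intervals
        for name in disc:
            others = [disc[o] for o in disc if o != name]
            candidates = [merge(disc[name], j) for j in range(k - 1)]
            # Keep the merge that preserves the most total pairwise mutual information.
            disc[name] = max(
                candidates,
                key=lambda cand: sum(mutual_information(cand, o) for o in others),
            )
    return disc


# Toy continuous data: two variables, 30 observations each.
columns = {
    "a": [float(i) for i in range(30)],
    "b": [float((i * 7) % 30) for i in range(30)],
}
disc = info_preserving_discretize(columns, k1=6, k2=3)
print(disc["a"])
```

Merging only adjacent intervals keeps the resulting categories ordered, which is what allows the three final levels to be read as low, normal and high.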
When we discretize the psychiatric variables using the discretize function in R (see Appendix A), the intervals are obtained as follows:
1. Beck Depression Inventory Score:

Analysis
In this section, we analyze the datasets by constructing suitable BNs. All of the analysis is conducted in the statistical software R using the bnlearn [34], lattice [35], gRain [36] and Rgraphviz [37] packages.
First, to test the reliability of the psychiatric scales, Cronbach's alpha internal consistency coefficients [38] are checked. The coefficients for BDI, BHS and RSES are 0.89, 0.87 and 0.83, respectively; since all of them are greater than 0.70, the scales are considered reliable for use in the analysis.
Because two discretization approaches were used, we build two discrete BNs with the same learning method. We call the BN whose data were discretized by expert knowledge Bayesian Network 1 (BN1) and the BN whose data were discretized by information-preserving discretization Bayesian Network 2 (BN2).
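For reference, Cronbach's alpha for a scale with $k$ items is $\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_i s_i^2}{s_T^2}\right)$, where $s_i^2$ is the variance of item $i$ and $s_T^2$ is the variance of the total score. A minimal sketch of this computation follows; the function name and the toy item scores are hypothetical and unrelated to the study data.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha; `items` holds one list of respondent scores per item."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # total score per respondent
    item_var_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_var_sum / variance(totals))


# Three items answered identically by four respondents: perfectly consistent.
consistent = [[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]
print(cronbach_alpha(consistent))

# Less consistent toy answers give a lower coefficient.
noisy = [[2, 4, 3, 5], [3, 4, 2, 5], [2, 5, 3, 4]]
print(cronbach_alpha(noisy))
```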

Bayesian Network 1
We build the BN using R, and all the code used is given in Appendix B. Before learning the BN, we must specify a prior BN. Since no prior knowledge is available, we specify an empty DAG. In addition, we block all the arrows pointing towards Gender and Age, since no other variable can affect these two variables. Figure 2 shows the DAG of BN1, whose network score is −15,214.8. To summarize the significant results, Mother's_occupation is independent of all the other variables. This is because 93% of the mothers are unemployed.
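Blocking the arrows towards Gender and Age amounts to listing every arc that points into either node and forbidding it during the search. The sketch below builds such a list of (from, to) pairs for a subset of the variables; bnlearn's structure learning functions accept a blacklist of this from/to form, but the authors' actual call is in Appendix B and is not reproduced here. The helper name `build_blacklist` and the shortened node list are assumptions for illustration.

```python
def build_blacklist(nodes, root_nodes):
    """Return every (from, to) arc that points into one of the root nodes."""
    return [(src, dst) for dst in root_nodes for src in nodes if src != dst]


# Subset of the variables named in the text, for illustration only.
nodes = ["Age", "Gender", "Depression", "Hopelessness", "Self_esteem", "Social_activity"]
blacklist = build_blacklist(nodes, ["Age", "Gender"])

for arc in blacklist:
    print(arc)
```

Arcs leaving Age and Gender are not listed, so the search remains free to make them parents of other variables.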
Age and Gender are not influenced by any other variable, as we specified in the model. However, Age directly affects Student's_income, and Gender directly affects Depression, School_type, Type_of_accommodation, Social_activity and Smoking_status.
While Depression is dependent on Gender, the other two psychiatric variables Hopelessness and Self_esteem are conditionally independent of all demographic and socio-economic variables (excluding Mother's_occupation) given Depression. However, Hopelessness and Self_esteem are directly dependent on Depression.
Figure 3 represents the conditional probabilities of Depression given Gender, i.e., P(Depression|Gender). It is clear that female and male students have approximately equal probabilities of having severe depression (10%) and moderate depression (19%). While the probability of a male student having none or minimal depression is 43%, it is 27% for his female counterpart. However, female students are more likely to have mild depression (44%) than male students (29%). From Figure 3, it can be said that female students are more likely to have a higher level of depression than male students.
Figure 4 shows the bar charts of the conditional probabilities of Self_esteem given Depression. As the level of depression increases from none or minimal to severe depression, the probability of having a low level of self-esteem increases. For the depression levels none or minimal, mild, moderate and severe depression, the probabilities of students having low self-esteem are 0.01, 0.04, 0.15 and 0.33, respectively, while the probabilities of having high self-esteem are 0.35, 0.20, 0.08 and 0.11, respectively. Therefore, it can be concluded that the level of depression has a negative influence on self-esteem.
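Because self-esteem is conditionally independent of gender given depression in BN1, the effect of gender on self-esteem can be obtained by marginalizing over depression: P(low self-esteem | Gender) = Σ_d P(low self-esteem | d) P(d | Gender). The sketch below carries out this sum with the rounded probabilities quoted in the text. It is only an illustration of the calculation: the inputs are rounded (the male values sum to 1.01, so they are renormalized), the equal 10% and 19% values are applied to both genders as an approximation, and the outputs are therefore approximate as well.

```python
# Rounded conditional probabilities quoted in the text for BN1.
p_dep_given_gender = {
    "male": {"none_or_minimal": 0.43, "mild": 0.29, "moderate": 0.19, "severe": 0.10},
    "female": {"none_or_minimal": 0.27, "mild": 0.44, "moderate": 0.19, "severe": 0.10},
}
p_low_se_given_dep = {
    "none_or_minimal": 0.01,
    "mild": 0.04,
    "moderate": 0.15,
    "severe": 0.33,
}


def p_low_self_esteem(gender):
    """P(Self_esteem = low | Gender), marginalizing over Depression."""
    dist = p_dep_given_gender[gender]
    norm = sum(dist.values())  # renormalize: rounded values need not sum to 1
    return sum(p_low_se_given_dep[d] * p / norm for d, p in dist.items())


for gender in ("male", "female"):
    print(gender, round(p_low_self_esteem(gender), 4))
```

Under these rounded inputs, the probability of low self-esteem is roughly 0.08 for both genders and slightly higher for female students, which reflects the indirect path from Gender to Self_esteem through Depression.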

Bayesian Network 2
We now build BN2 in the same way as BN1. Again, we specify an empty DAG as the prior DAG and block all the arrows towards Age and Gender (see Appendix C). Figure 6 represents the DAG of BN2, and the network score of BN2 is −14,748.93.
To summarize the notable relationships, Age directly affects Student's_income, while Gender directly affects Type_of_accommodation, School_type, Social_activity and Smoking_status.
Of all the demographic and socio-economic variables, the only variable that has a direct influence on one of the psychiatric variables, namely Depression, is Social_activity. In addition, Self_esteem and Hopelessness are directly dependent on Depression, but both of them are conditionally independent of all the demographic and socio-economic variables (excluding Mother's_occupation) when the status of Depression is known.
Figure 7 is the bar chart of the conditional probabilities of Depression given Social_activity. It shows that the presence of Social_activity has a negative impact on Depression. While only approximately 17% of the students with social activity have a high level of depression, this proportion is about 28% in the students with no social activity. In addition, the proportion of students with a low level of depression is significantly lower in the students with social activity compared to those with no social activity.
Figure 8 is the bar chart of the conditional probabilities of Hopelessness given Depression. When a student is known to have a low level of depression, the probability that he/she has a low level of hopelessness is around 0.83, while the probability of his/her having a high level of hopelessness is merely 0.02. In the students with a high level of depression, the proportions of students with low and high levels of hopelessness are approximately 0.27 and 0.35, respectively. Therefore, it is concluded that there is a positive correlation between Depression and Hopelessness.
Figure 9 illustrates the conditional probabilities of Self_esteem given Depression. At first glance, it is noted that the students who have a high level of self-esteem dominate the others in each group. The probabilities that a student has high and low levels of self-esteem are about 85% and 2%, respectively, when he/she has a low level of depression, whereas these probabilities are 41% and 33%, respectively, when the student has a high level of depression. These numbers indicate that as the severity of Depression increases, the level of self-esteem decreases.

Discussion and Conclusions
In this paper, the relationship among the 3 psychiatric, 2 demographic and 16 socio-economic variables was investigated through BN modeling in R. The data were obtained from 823 university students via a survey.
In the analysis, the Beck Depression Inventory, the Beck Hopelessness Scale and the Rosenberg Self-Esteem Scale scores were first calculated from the answers of the students. Then, Cronbach's alpha internal consistency coefficients for these psychiatric scales were tested and found to be satisfactory. Next, the continuous data were discretized by two approaches, which led to two BN models (BN1 and BN2). Finally, after building the BNs with the tabu search algorithm, inference and queries were made through graphs and conditional probabilities.
To summarize the results, the occupational status of the mother is the only variable that has no relation with the other variables in both models.
According to BN1, the gender of the students directly influences the choices of students on school type, accommodation type, social activity and smoking status, while the age directly influences only the income of the student. The gender also directly affects the level of depression of students, and female students have a higher chance of being more depressive than male students. This finding is supported by similar studies [39][40][41][42].
According to BN2, similar to BN1, only the income of the students is directly dependent on the age. The gender has direct influence on accommodation type, school type, social activity and smoking status. In addition, depression is directly dependent on social activity. The presence of social activity has a negative effect on depression, which is also supported by [43,44]. Furthermore, self-esteem and hopelessness are directly dependent on depression.
Comparing BN1 and BN2, the network score of BN2 (−14,748.93) is greater than that of BN1 (−15,214.8), suggesting that BN2 is more likely to produce better predictions than BN1. While depression is directly dependent on gender in BN1, it is conditionally independent of gender given social activity in BN2. In each model, hopelessness and self-esteem are directly dependent on depression and conditionally independent of all the demographic and socio-economic variables excluding the occupational status of the mother, which is independent of all the variables.
In both models, one of the most important results is that as the level of depression rises, so does the level of hopelessness. This finding is consistent with other studies in the literature [45,46]. Another one is that, in contrast to hopelessness, the rise in the severity of depression results in lower self-esteem scores in students. This finding is in line with other studies on the relationship between depression and self-esteem [43,47,48].