1. Introduction
In connection with pubertal development and increasing social demands, the incidence of depression is higher in adolescence than in childhood and has increased significantly in recent years [1]. The prevalence of depressive disorders in Chinese children and adolescents aged 6–16 years is 3.0% [2], and 14.8% of adolescents experience high levels of depressive symptoms [3]. Given that depressive symptoms impair adolescents’ daily functioning and academic achievement, and have a long-term negative impact on their mental health and social adjustment [1], it is important to identify risk and protective factors for depression in Chinese adolescents and to develop effective prevention and intervention programs.
According to the social-ecological system theory [4], family and school play pivotal roles in the social development of adolescents. Identifying the risk and protective factors within the family and school environments, and understanding how these factors interact, can help identify students at high risk of mental health problems so that they receive focused attention and targeted intervention. It also supports the coordination of resources between families and schools to create favorable environments in both settings and to prevent and control mental health problems such as adolescent depression.
Previous studies conducted in China and Western countries have identified numerous family factors associated with depressive symptoms in children and adolescents, including low family socioeconomic status [5], negative events in early life [6], parental depression [7], parent-child communication problems [8], harsh parenting [9] and a negative family emotional climate [10]. A meta-analysis of factors related to depression in Chinese middle school students found that, among these family factors, parent-child communication problems and low family functioning and cohesion showed medium-to-high correlations with depression [11]. Conversely, positive family functioning can improve individual psychological resilience and positively predicts mental health outcomes [12].
Most Chinese adolescents spend their teenage years immersed in the educational system. Research has consistently demonstrated that experiences of social isolation and bullying in the school setting are important predictors of depression in children and adolescents [13,14]. Furthermore, academic pressure and poor academic performance are also significantly correlated with adolescent depression [15]. However, fostering a positive school atmosphere and providing support from teachers and peers can effectively promote students’ positive development [16,17]. Additionally, the availability of psychological education resources within schools plays a protective role in enhancing students’ mental health [18].
Risk and protective factors in the family and school environments concurrently influence adolescents’ mental health. Previous research has indicated that students with a familial history of depression can benefit from a positive relationship with the other parent and from school connectedness [19,20]. In this case, school protective factors have the potential to mitigate the negative impact of family risk factors. However, there might be other interaction patterns between school and family factors. For example, previous research has indicated that students facing higher family risks are less likely to benefit from school resources [21]. Therefore, a comprehensive understanding of the risk and protective factors and of their interactions can facilitate the development of targeted mental health education programs aimed at preventing and managing adolescent depression.
It should be noted that the risk and protective factors for depression in adolescents may differ by gender. The prevalence of depression is higher in females than in males in every age group from adolescence onward [22]. Because of gender inequality in most societies, females experience greater stress in daily life, and social pressures to conform to gender roles increase as children move through puberty [23]. In addition, females have a greater tendency to be concerned with their relationships with others and with others’ opinions of them; thus, they are at higher risk for depression when confronting conflicts in relationships [24]. There is also evidence that depression in adolescent boys is more strongly correlated with harsh parenting behaviors [11]. Thus, it is necessary to identify risk and protective factors for depression separately for adolescent girls and boys.
Regarding age differences, the prevalence of depression usually increases during adolescence and peaks in mid to late adolescence. Nevertheless, students in primary school have also been found to exhibit a relatively high prevalence of depressive symptoms [25]. The quantity and quality of risk factors for depression might vary at different stages of pubertal development and the socialization process. Therefore, this study also explored the risk and protective factors of depression for primary school students (11–12 years old) and middle school students (13–15 years old) separately.
In addition to gender and age, this study also attempted to compare the family and school risk and protective factors of depression in left-behind and non-left-behind students. Due to the imbalance of economic development, young adults in underdeveloped areas of China go to work in developed areas and leave their children in their hometowns. As revealed by a number of studies, left-behind children and adolescents are at higher risk of depression compared to their non-left-behind counterparts [26]. From the perspective of improving the mental health of left-behind adolescents, it is worth focusing on whether school resources can play a compensatory role for disadvantages in family resources, and further, which school factors effectively promote the mental health of left-behind adolescents.
In summary, although a large number of family and school factors have been found to be associated with depressive symptoms in adolescents, research gaps remain in this field. Firstly, most previous studies have examined individual factors or a limited set of factors related to depression, neglecting the joint effects of multiple family- and school-related factors. To our knowledge, association rule mining has not been used to address this issue. Secondly, given the differences in depression prevalence by gender, age and left-behind status, it is imperative to examine the risk and protective factors for depression among adolescents of different gender, age, and left-behind status to facilitate targeted intervention.
To deepen the understanding of the risk and protective factors of depression in young people and their joint effects, a cross-sectional investigation was conducted with primary school and middle school students as participants. Association rule mining, a data mining technique used to discover relationships or patterns between item sets or object sets in large datasets, was used to explore factor combinations associated with depressive symptoms for adolescents of different gender, age and left-behind status. Association rule analysis makes it easier to extract rules that relate multiple variables to one another and can thereby inform decision-making. The specific research aims were as follows: (1) to explore and compare risk factor combinations associated with depression and protective factor combinations related to nondepression for adolescent girls and boys; (2) to explore and compare risk factor combinations associated with depression and protective factor combinations related to nondepression for students in primary schools and middle schools; (3) to explore and compare risk factor combinations associated with depression and protective factor combinations related to nondepression for left-behind and non-left-behind students.
This study extends previous research by revealing the combined effects of multiple protective/risk factors in family and school domains and specifying protective/risk factors for subgroups of adolescents differing in age, gender and left-behind status with a novel machine learning approach. This study also contributes to the field of school psychoeducation by providing evidence for the identification of high-risk students and the development of targeted interventions.
2. Method
2.1. Participants and Procedure
This study received approval from the Ethics Committee of our institute. Students were recruited in two counties located in southern and northern China, both of which had a medium level of GDP within China. Cluster sampling was employed, and 38 primary schools and 11 middle schools were involved. Consent was obtained from students and their parents, ensuring ethical compliance. The paper-based questionnaires were administered to students during class sessions. A total of 2800 students were recruited, and 2445 of them provided complete data for analyses. Students were thanked for their participation and received a small token of appreciation. Among the participants, 1292 (52.8%) were girls and 1153 (47.2%) were boys. Among the primary school participants (n = 1164), there were 590 (24.1%) fifth graders and 574 (23.5%) sixth graders. The middle school sample (n = 1281) comprised 414 (16.9%) seventh-grade students, 418 (17.1%) eighth-grade students and 449 (18.4%) ninth-grade students. Additionally, there were 870 (35.6%) left-behind adolescents in the entire sample.
Data were entered using Epidata 2.1 software, with double-entry verification implemented to ensure data accuracy.
2.2. Measures
2.2.1. Demographic Questionnaire
Demographic information including gender, grade, and ethnicity was collected. In addition, data on family structure, parents’ educational attainment, left-behind status of children, family economic status and academic ranking were also gathered. The description of the variables is listed in Table 1.
2.2.2. Family Cohesion
The cohesion subscale of the Family Adaptability and Cohesion Evaluation Scales II-Family Version (FACES-II) [27] was utilized in the present study. The subscale contains 16 items measuring closeness among family members. An example item is “Family members experience a strong sense of emotional closeness”. Items were rated on a 5-point Likert scale ranging from 1 (almost never) to 5 (almost always), with higher scores indicating greater levels of family cohesion. The Chinese version of FACES-II has demonstrated satisfactory reliability and validity [28]. In the present study, Cronbach’s α was 0.82. As displayed in Table 1, participants were categorized as being at risk if their total score fell below the 25th percentile, while those scoring above the 75th percentile were considered to possess protective resources [29].
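The percentile-based categorization described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function name `categorize`, the "neutral" label for scores between the two cut points, the choice of the inclusive quantile method, and the toy scores are all assumptions made for the example.

```python
from statistics import quantiles

def categorize(scores, higher_is_protective=True):
    """Label each score 'risk', 'protective' or 'neutral' using the
    sample's own 25th and 75th percentiles (hypothetical helper)."""
    # quantiles(..., n=4) returns the three quartile cut points.
    q1, _, q3 = quantiles(scores, n=4, method="inclusive")
    if higher_is_protective:
        low_label, high_label = "risk", "protective"
    else:
        low_label, high_label = "protective", "risk"
    labels = []
    for s in scores:
        if s < q1:
            labels.append(low_label)
        elif s > q3:
            labels.append(high_label)
        else:
            labels.append("neutral")  # assumed label for the middle range
    return labels

# Hypothetical cohesion totals (higher = more cohesion).
cohesion = [30, 42, 48, 55, 58, 61, 64, 70, 76]
cohesion_labels = categorize(cohesion)

# The same helper with the direction reversed, for a scale on which
# higher scores indicate more risk.
reversed_labels = categorize(cohesion, higher_is_protective=False)
```

Scores exactly at a cut point are treated as neither risk nor protective here; the source does not state how ties were handled.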
2.2.3. Family Conflict
The 9-item conflict subscale of the Family Environment Scale [30] was used to measure family conflict. An example item is “Family members often blame and criticize each other”. The items are rated on a dichotomous scale (no = 0, yes = 1). Item scores are summed, with higher total scores indicating higher levels of family conflict. The Chinese version of this scale shows good psychometric properties [28]. Cronbach’s α was 0.63 in the present study. Scores above the 75th percentile were categorized as indicating risk, while scores below the 25th percentile were categorized as protective [29].
2.2.4. School Climate
Teacher support, peer support and autonomy support were measured using the school climate scale [31]. The scale consisted of seven items tapping teacher support (e.g., “teachers believe I can do well”), 13 items tapping peer support (e.g., “students care about one another”) and five items tapping opportunities provided for autonomy in the school (e.g., “students are given the chance to help make decisions”). Participants rated these items on a 4-point Likert scale ranging from 1 (“never”) to 4 (“always”), with higher scores indicating greater levels of teacher support, peer support or autonomy support. The Cronbach’s α values were 0.80, 0.81, and 0.76 for the three subscales, respectively. As shown in Table 1, scores below the 25th percentile on each of the three subscales were deemed to indicate a higher level of risk, while those above the 75th percentile were considered to indicate protective resources [29].
2.2.5. Mental Health Education Resources in Schools
Three questions were used to investigate the provision of mental health education in schools, specifically, “Does your school offer psychology courses?”, “Are there psychological counselors available in your school?” and “Does your school have a psychological consultation facility?”. The criteria for risk and protection in terms of mental health education resources are displayed in Table 1.
2.2.6. Adolescent Depression
The Depression Self-Rating Scale for Children (DSRSC) [32] was employed to assess depressive symptoms. The DSRSC is appropriate for assessing depressive symptoms in children aged 8 to 16 years. This scale comprises 18 items that are rated on a 3-point scale (0 = never, 1 = sometimes, 2 = often), with higher scores indicating greater levels of depressive symptoms. The Chinese version of the DSRSC has demonstrated robust psychometric properties, and a cutoff point of 15 has been proposed to differentiate adolescents at low and high risk of depression [33]. In this study, the internal consistency of the DSRSC was acceptable (α = 0.84).
2.3. Data Analyses
In machine learning, association rule mining is used to find frequent patterns, associations, correlations, or causal structures that exist between sets of items or objects in datasets [34]. By analyzing the correlations between multiple attributes in the data, it discovers valuable rules. The basic definitions are as follows:
Let $I = \{i_1, i_2, \ldots, i_m\}$ be an item set, where $i_j$ represents an item, that is, a prediction variable. Let $D = \{T_1, T_2, \ldots, T_n\}$ be a transaction set, where each $T$ is a set containing $k$ items, called a $k$-item set (e.g., {gender, depression} is a 2-item set), and $T \subseteq I$. The form of the association rule is $X \Rightarrow Y$, where $X \subset I$, $Y \subset I$, and $X \cap Y = \emptyset$. $X$ and $Y$ are both item sets, and they are referred to as the left-hand side (LHS) and right-hand side (RHS) of the association rule, respectively.
The strength of an association rule is evaluated through support, confidence and lift. The definitions and formulas of these evaluation parameters are shown in Table 2.
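Table 2 is not reproduced in this section. For reference, the standard textbook definitions of the three measures are given below, assuming a rule with left-hand side $X$ and right-hand side $Y$ evaluated over a transaction set $D$, and writing $\mathrm{support}(X)$ for the proportion of transactions that contain $X$; the notation in the original table may differ.

```latex
\begin{align}
\mathrm{support}(X \Rightarrow Y)
  &= \frac{\lvert \{\, T \in D : X \cup Y \subseteq T \,\} \rvert}{\lvert D \rvert} \\
\mathrm{confidence}(X \Rightarrow Y)
  &= \frac{\mathrm{support}(X \Rightarrow Y)}{\mathrm{support}(X)} \\
\mathrm{lift}(X \Rightarrow Y)
  &= \frac{\mathrm{confidence}(X \Rightarrow Y)}{\mathrm{support}(Y)}
\end{align}
```

A lift greater than 1 indicates that $X$ and $Y$ co-occur more often than would be expected if they were independent.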
To retain only useful rules, support and confidence thresholds must be specified. A rule $X \Rightarrow Y$ is considered valuable when its support and its confidence are greater than or equal to their respective thresholds. These two thresholds are called min_sup and min_conf: min_sup describes the minimum importance of association rules, and min_conf specifies the minimum reliability that association rules should meet.
However, the settings of min_sup and min_conf have a significant impact on the final result. If min_sup is too large, many potentially useful rules may be discarded. If it is too small, many redundant rules may be generated, making it difficult to study and discover association relationships. Therefore, taking a lift greater than 1 as the basic requirement for effective strong association rules, and following the principle of not omitting important rules, min_sup and min_conf were set through repeated threshold-setting experiments.
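The sensitivity to thresholds can be illustrated with a small experiment. The sketch below is not the authors' procedure: it enumerates candidate rules by brute force on hypothetical toy data, keeps only rules whose right-hand side is the outcome item, and counts how many survive each (min_sup, min_conf) pair with lift above 1. All item names, the transactions and the threshold grid are invented for illustration.

```python
from itertools import combinations

# Hypothetical toy data: each transaction is the set of binary
# attributes recorded as present for one student.
transactions = [
    {"high_conflict", "low_peer_support", "depressed"},
    {"high_conflict", "low_peer_support", "depressed"},
    {"high_conflict", "depressed"},
    {"low_peer_support"},
    {"high_conflict"},
    {"low_teacher_support", "depressed"},
    set(),
    set(),
]

def support(itemset):
    """Proportion of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def rules_for(rhs, max_lhs=2):
    """Enumerate rules LHS => {rhs} with 1..max_lhs items on the LHS."""
    items = sorted(set().union(*transactions) - {rhs})
    out = []
    for k in range(1, max_lhs + 1):
        for combo in combinations(items, k):
            lhs = frozenset(combo)
            s_lhs = support(lhs)
            if s_lhs == 0:
                continue  # LHS never occurs, so confidence is undefined
            s_rule = support(lhs | {rhs})
            conf = s_rule / s_lhs
            lift = conf / support({rhs})
            out.append((lhs, s_rule, conf, lift))
    return out

def count_rules(min_sup, min_conf, rhs="depressed"):
    """Number of rules meeting both thresholds and having lift > 1."""
    return sum(
        1
        for _, s, c, lift in rules_for(rhs)
        if s >= min_sup and c >= min_conf and lift > 1
    )

# Count surviving rules over a small grid of thresholds.
grid = {
    (ms, mc): count_rules(ms, mc)
    for ms in (0.1, 0.25, 0.4)
    for mc in (0.5, 0.8)
}
```

On this toy data the number of retained rules falls as either threshold rises, and no rule survives at the highest support threshold, which is the trade-off the paragraph above describes.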
This study used the Apriori algorithm [35] for association rule mining to identify risk factors related to depression and protective factors linked to nondepression by analyzing the relationship between predictive variables and outcomes. The implementation process was divided into two parts: (1) discover frequent item sets, defined as item sets with support greater than or equal to the given min_sup, and (2) generate association rules. Subsequently, effective strong association rules were obtained based on different support, confidence and lift parameters, and then the influencing factors behind the relevant results were analyzed.
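The two-step process described above can be sketched in plain Python. This is a minimal teaching version of Apriori under stated assumptions, not the software used in the study: the function names `apriori` and `generate_rules`, the restriction of the right-hand side to a single outcome item, the thresholds and the toy transactions are all hypothetical.

```python
from itertools import combinations

def apriori(transactions, min_sup):
    """Step 1: return {item set: support} for all frequent item sets."""
    n = len(transactions)

    def sup(itemset):
        return sum(itemset <= t for t in transactions) / n

    # Frequent 1-item sets.
    items = {i for t in transactions for i in t}
    level = {}
    for i in items:
        c = frozenset([i])
        s = sup(c)
        if s >= min_sup:
            level[c] = s
    frequent = dict(level)

    k = 2
    while level:
        prev = set(level)
        # Join: unions of two frequent (k-1)-item sets that have k items.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        level = {}
        for c in candidates:
            # Prune: every (k-1)-subset must itself be frequent.
            if all(frozenset(sub) in prev for sub in combinations(c, k - 1)):
                s = sup(c)
                if s >= min_sup:
                    level[c] = s
        frequent.update(level)
        k += 1
    return frequent

def generate_rules(frequent, rhs_item, min_conf):
    """Step 2: rules LHS => {rhs_item} with conf >= min_conf and lift > 1."""
    rhs = frozenset([rhs_item])
    rules = []
    for itemset, s in frequent.items():
        if rhs_item in itemset and len(itemset) > 1:
            lhs = itemset - rhs
            # Subsets of a frequent item set are frequent, so both lookups exist.
            conf = s / frequent[lhs]
            lift = conf / frequent[rhs]
            if conf >= min_conf and lift > 1:
                rules.append((lhs, rhs_item, s, conf, lift))
    return sorted(rules, key=lambda r: -r[4])  # strongest lift first

# Hypothetical toy data, one set of present attributes per student.
toy = [
    {"high_conflict", "low_peer_support", "depressed"},
    {"high_conflict", "low_peer_support", "depressed"},
    {"high_conflict", "depressed"},
    {"low_peer_support"},
    {"high_conflict"},
    {"low_teacher_support", "depressed"},
    set(),
    set(),
]
frequent = apriori(toy, min_sup=0.2)
rules = generate_rules(frequent, "depressed", min_conf=0.6)
```

On the toy data, the rule with the highest lift combines two risk factors on the left-hand side, which mirrors the kind of factor combination the analysis is designed to surface. Protective rules could be mined the same way by using a "nondepressed" outcome item as the right-hand side.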
5. Conclusions
In this study, the combinations of risk and protective factors associated with depression were examined across age, gender and left-behind status groups using association rule analysis. The results reveal strong associations of family relational factors and school-related relational factors with adolescent depression. When these risk factors are combined, the risk of depression in adolescents increases. Moreover, being female, being a middle school student, low family socioeconomic status, family structural risk, separation from parents, and a lack of mental health education resources at school exacerbate the negative impact of the aforementioned risk factors. Furthermore, the risk and protective factors for depression varied according to gender, age group and left-behind status.
5.1. Implications
The results of this study have some implications for the prevention and control of depression among adolescents. First, identifying the population at risk is a fundamental prerequisite for effective intervention. Our research revealed that common risk factors include high family conflict, low family cohesion, low peer support, and low teacher support. When these factors are combined, the likelihood of depression in adolescents increases substantially. The risk of depression increases further when these are combined with other factors such as being female, being a middle school student, being separated from parents, and having a low family socioeconomic status or family economic difficulties. These vulnerable teenagers require heightened attention and targeted support. Secondly, informed by the results, it is imperative to prioritize teacher-student relationships and peer relationships as important targets in school-based mental health education programs, aiming to create a harmonious and friendly campus atmosphere. At the same time, parent education should be included to cultivate a warm and positive family atmosphere for the positive development of adolescents. Furthermore, the focus of mental health education should be tailored to differences in gender, age, and left-behind status. Given that girls are more susceptible to interpersonal stress than boys, greater emphasis may be placed on addressing peer relationship issues for them. For left-behind children with limited family resources, particular attention should be given to fostering a supportive campus atmosphere and nurturing healthy relationships with teachers and peers. Additionally, apart from general mental health education, it is important to provide school-based mental health services such as group counseling sessions and individual therapy to adolescents at risk.
5.2. Limitations and Future Direction
This study has certain limitations. Firstly, the set of school and family environmental factors included in this study remains limited, making it difficult to reflect comprehensively the risk and protective factors across different groups. Future research should consider additional environmental factors associated with depression, such as parent-child relationships and academic stress. Secondly, this study focuses solely on external environmental risk and protective factors for adolescent depression. However, individual differences may interact with these environmental factors and influence psychological adaptation. Therefore, future studies should incorporate individual differences into analyses. Despite these limitations, the results of this study suggest that applying association rules to a large sample of student mental health data yields meaningful results. If additional data are accumulated or the survey scope is expanded in the future, applying this analysis method is not only feasible but also has the potential to generate more accurate and reliable psychological risk screening standards.