Efficacy and Effectiveness of Universal School-Based Wellbeing Interventions in Australia: A Systematic Review

The World Health Organisation defines health in terms of wellbeing, and wellbeing has become both a construct and a measure of impact in early intervention and prevention programs in schools. In Australia, schools report on their wellbeing initiatives and there is a plethora of government-funded wellbeing programs already in place in schools. However, education systems and stakeholders worldwide are facing significant challenges with mixed evaluation results of program impact and intervention effect. To better support students, schools, school-based healthcare workers, and community, it is important to know about the effectiveness of school-based programs; yet in the last decade, there has been no national appraisal of these programs in Australia. This systematic review aims to report on the effectiveness of Australian school-based wellbeing programs through a search of 13 databases. Out of 2888 articles, 29 met inclusion criteria. The results found that seventeen interventions comprising 80% of the total number of participants reported no statistically significant intervention effect on wellbeing outcomes. We argue that supporting wellbeing through robust program intervention is important as wellbeing presents both an indication of later onset of more serious mental health issues, and an opportunity for early intervention to break the trajectory leading to full disorder.


Introduction
This special issue of the International Journal of Environmental Research and Public Health offers the opportunity to problematize the meaning and application of wellbeing within an educational context. The purpose of this paper is to critically analyse the meanings and measures of wellbeing in school-based interventions and to provide an objective appraisal of intervention effect. Previous school-based reviews have reported on wellbeing as a secondary measure in programs whose primary measures were related to mental health and psychiatric disorder. Reviews focused solely on wellbeing, and based on a critical analysis of how wellbeing is measured within educational settings, are sparse. The purpose of this systematic review is to report on the efficacy and effectiveness of school-based wellbeing interventions that use validated wellbeing instruments to identify an intervention effect on the wellbeing outcomes of school children and adolescents. In so doing, we also aim to problematize the meaning of wellbeing and to establish how it can be measured through validated measuring instruments.
Wellbeing is a popular term that has entered the vernacular throughout the English-speaking world today. However, the meanings and multidimensional nature of wellbeing present a major challenge to researchers and healthcare workers to understand and measure Education Research found: "there is little clear evidence about the effectiveness of school-based wellbeing programs in terms of their impact on both students' wellbeing and on academic outcomes" [25] (p. 3).
Part of the challenge has been to identify the measures that relate to wellbeing: should wellbeing measures include all mental disorders that can be measured through diagnostic instruments; or should wellbeing include all measures related to psychosocial aspects? The OECD recommends measuring wellbeing using standardized and validated wellbeing measuring instruments [26]. These include but are not limited to the 'Psychological Well-Being Scale' and 'Flourishing Scale', where the latter is used due to the meaning of wellbeing being given as 'flourishing' [6]. Other key terms related to wellbeing include 'life satisfaction' [27], happiness [28][29][30], and 'resilience' [31][32][33]. As mitigating strength-based measures, protective factors and coping skills are associated with the ability to maintain wellbeing [34][35][36]. Recently, self-esteem [37] and self-efficacy [6] have been included as a measure of wellbeing.
There is a scarcity of systematic reviews of universal school-based interventions that address effectiveness in terms of multiple criteria: study quality, relative effect size, and statistical significance of intervention impact. Recently, there has been an increased research output, including systematic reviews and meta-analyses, of school-based mental health and wellbeing. The primary target measure in these reviews is mental health outcomes based on diagnostic instruments, while wellbeing measures are reported as secondary outcomes. An extensive summary of global school-based programs by Berger and colleagues reported that programs with long-term outcomes tended to implement cognitive behavioural therapy (CBT), social and emotional skills, and mental health literacy [38]. Other wellbeing reviews suggest that early intervention and participant age are key factors related to implementation effect [39,40]. On the other hand, a systematic review by Moore and colleagues, which featured no Australian interventions, reported that intervention effect and sustainability are structural factors related to school governance, rather than being defined by intervention characteristics [41]. However, targeted interventions have a greater statistically significant effect on intervention outcomes than universal interventions [42][43][44]. Given that most wellbeing programs are universal, they may show little or a small statistically significant intervention effect [43]. One multicomponent review of psychological and subjective school wellbeing identified one Australian study [45]. This review showed a small but significant improvement in psychological wellbeing remaining over time. Another multicomponent review based on positive psychology also suggests that a higher number of sessions is likely to yield positive results [39].
A review looking at the effect of interventions focusing on positive psychology on subjective wellbeing reported a small effect favouring teacher-delivered interventions [45], although there is debate around whether delivery of mental health programs is more effective when delivered by teachers, program staff, or healthcare workers [46]. That review did not include any Australian studies. Another systematic review examining the impact of interventions on subjective wellbeing found four Australian studies out of 55 and reported that only one third of interventions employed strong experimental designs and that positive results were mainly found in studies with a poor study design [39]. A review of 29 school-based mental health and emotional well-being programs identified only one study that was Australian from within the same search period. The study identified three key program themes: increased help-seeking, mental health literacy, and increased social and emotional wellbeing [47]. The studies showed promising results but suffered from weak study designs. An international review looking at outcomes related to mental health and well-being included 10 Australian studies and concluded that half of the studies in the review showed a positive impact. However, the review was limited to psychological, psychosocial, and subjective wellbeing, and did not provide statistical analyses of effect [36].
Many of the studies in these reviews had flawed research designs, which limits the generalizability and validity of their results. In addition, few Australian studies were included in these reviews. Collectively, reviews of wellbeing have provided partial direction for educators and researchers in that wellbeing is often secondary to mental health outcomes, is poorly defined and measured, and lacks clarity in terms of identifying program effectiveness. Australian studies feature minimally or are absent in global systematic reviews of school-based wellbeing programs. First, we aim to identify Australian wellbeing programs and second, we seek to establish effectiveness of programs specifically in terms of wellbeing outcomes as part of a universal intervention and prevention strategy for children and young people. In the present review we found 29 interventions within the Australian context, and our search terms focused on wellbeing outcomes alone (and its connecting measures, such as flourishing and resilience, for example). We also provide statistical measures of outcomes reported on validated measuring tools for the purpose of providing statistical clarity to the efficacy descriptions given in key reviews. The purpose of this systematic review is to address the second part of this statement, which is related to school-based wellbeing outcomes.

Search Strategy
The search strategy in this systematic review used Preferred Reporting Items for Systematic Reviews and Meta-Analyses [48] (Figure 1). Eleven databases were included in this review (A+Education, BEI, Bibliomap, Embase, Epistemonikos, ERIC, MEDLINE, PsycINFO, PubMed, Scopus, TRoPHI EPPI) and an additional four databases were used for cross-referencing (Campbell Systematic Reviews, Cochrane Central Register of Controlled Trials, Dissertations and Theses via Proquest, and DoPHER Database of Promoting Health Effectiveness Reviews). These were supplemented with internet searches on www.googlescholar.com, www.scirus.com, and www.alta-vista.com (accessed on 2 January 2023). The search strategy for the database searches is given in Table 1.

Key Word | Search String
Wellbeing | "wellbeing" OR "well-being" OR "well being" OR "mental wellbeing" OR "mental well-being" OR "mental well being" OR "subjective wellbeing" OR "subjective well-being" OR "subjective well being" OR "flourish*" OR "eudaimonia"
Clinical diagnosis | "mental health" OR "mental illness" OR "mental disorder" OR "psychiatr*" OR "psycholog*"
Negative emotional states | "social and emotional" OR "psychosocial"
Target group | "child*" OR "adolescen*" OR "school age*" OR "school-age*" OR "schoolchild*" OR "school child*" OR "school-child*" OR "youth" OR "young person" OR "student*" OR "pupil*"
Context | "school*" OR "school-based" OR "school based" OR "whole of school" OR "classroom*"
Country | Australia
Filters | English, Humans, from 1 January 2012 to 1 January 2022

Duplicates were removed and two authors (HG and AV) independently read the titles and abstracts. Full-text articles were then screened for eligibility. A third reviewer (SC) was used to resolve disagreement regarding eligibility. In total, 2298 results were obtained in the first database search.
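As an illustration of how the key-word groups in Table 1 could be assembled into a single database query, the following is a minimal Python sketch. It assumes that terms within a group are joined with OR and that groups are joined with AND; the review does not state how the groups were combined, so that operator, the abbreviated term lists, and the function names are illustrative assumptions only.

```python
# Abbreviated concept groups transcribed from Table 1.
# ASSUMPTION: groups are combined with AND; the source does not say so.
CONCEPTS = {
    "wellbeing": ["wellbeing", "well-being", "well being", "flourish*", "eudaimonia"],
    "target_group": ["child*", "adolescen*", "student*", "pupil*", "youth"],
    "context": ["school*", "school-based", "classroom*"],
}


def or_block(terms):
    """Quote each term and join the group with OR inside parentheses."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"


def build_query(concepts, country="Australia"):
    """Join the OR-blocks with AND and append the country term."""
    blocks = [or_block(terms) for terms in concepts.values()]
    blocks.append(country)
    return " AND ".join(blocks)


query = build_query(CONCEPTS)
print(query)
```

Individual databases differ in truncation and phrase syntax, so a string built this way would normally be adapted per database before use.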

Eligibility Criteria
Articles eligible for inclusion were school-based interventions that measured the impact on the wellbeing of young people and adolescents. Impact was measured through validated instruments for measuring wellbeing, flourishing, or eudaimonia as outlined by the OECD [26]. A broad reading of wellbeing was taken that includes happiness and resilience measures [12,49]. Studies were included that provided an effect for time and condition on pre- and post-intervention measures and a control group, either active, placebo, or waitlist [50]. If the intervention occurred outside of school grounds, such as nature walks or a sporting activity, then the study was included if it was organized through the school in terms of recruitment of participants and obtaining consent [38]. Articles were filtered for the English language and for publication between 2012 and 2022.
Studies that did not report on wellbeing outcomes were excluded. This review excluded unpublished doctoral theses, conference material, and articles without empirical data including letters, commentaries, memorandums, and opinion pieces, as they have not undergone a peer-review process. For programs that involved multiple publications, the first study published during the review period was taken [37]. For interventions with a non-school component, such as a pre-school/kindergarten component, only the school component was taken insofar as the school data could be extracted. There are many mental health-based programs currently running in Australian schools [38,51] that did not meet the inclusion criteria for systematic review.

Data Extraction and Analysis
In line with inclusion and exclusion criteria, data extraction was carried out using the PICOTS method [52]. Records were listed in an Excel sheet under the following categories: authors; year of publication; program name; population (age, mental health condition); intervention (study quality, delivery personnel, exposure, follow-up); outcome (intervention effect: all mental health outcomes, effect size, and assessment instrument used to measure effect); timing; and setting (universal school-based context or school-based external context, e.g., camp). Extraction of data related to intervention characteristics specific to wellbeing included: general wellbeing, emotional wellbeing, psychosocial wellbeing, social wellbeing, subjective wellbeing, coping styles, flourishing, life satisfaction, quality of life, protective factors, resilience, self-esteem, self-efficacy.
Data were extracted by two reviewers (HG and AV), who conducted blinded reviews of articles based on title and abstract search. The percentage of the coding decisions on which pairs of coders agreed was used to determine inter-coder reliability and was calculated as 90%. Differences were resolved by a third reviewer (SC).
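The inter-coder reliability figure described above is a simple percentage agreement. A minimal Python sketch of that calculation follows; the two lists of screening decisions are hypothetical examples, not data from this review, and the function name is an illustrative choice.

```python
def percent_agreement(coder_a, coder_b):
    """Percentage of coding decisions on which two coders agree."""
    if len(coder_a) != len(coder_b):
        raise ValueError("Both coders must rate the same set of records.")
    if not coder_a:
        raise ValueError("At least one coding decision is required.")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return 100.0 * matches / len(coder_a)


# HYPOTHETICAL decisions for ten records (not taken from the review).
reviewer_1 = ["include", "exclude", "exclude", "include", "exclude",
              "exclude", "include", "exclude", "exclude", "exclude"]
reviewer_2 = ["include", "exclude", "exclude", "exclude", "exclude",
              "exclude", "include", "exclude", "exclude", "exclude"]

agreement = percent_agreement(reviewer_1, reviewer_2)
print(f"Agreement: {agreement:.0f}%")
```

Percentage agreement does not correct for agreement expected by chance; chance-corrected statistics such as Cohen's kappa are a common alternative, although the review reports only the raw percentage.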

Risk of Bias
The studies were appraised using the Effective Public Health Practice Project (EPHPP) quality assessment tool [53]. This tool has previously been used to assess quality of wellbeing programs generally [54], and in one of the key systematic reviews of randomized and nonrandomised trials in Australia [42]. Assessment criteria were: selection bias, study design, confounding, blinding, data collection, and study attrition. Each criterion was rated 0, 1, or 2, where 2 was given for high quality. The maximum score is 12. Low, medium, and high study quality scores refer to the ranges 1-4, 5-8, and 9-12, respectively. We felt this method was historically appropriate to the Australian context, and it assesses bias in all studies under the same criteria, which is significant given the high number of quasi-experimental studies related to wellbeing in this review.
Each article was independently assessed for quality by two reviewers (HG and AV). Discrepancies were resolved using a third author, SC.
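The scoring scheme described in this section can be sketched as follows. This is an illustrative Python sketch only: the criterion keys and function names are hypothetical, the example ratings are invented, and a total of 0 is treated as low quality even though the stated low band begins at 1.

```python
CRITERIA = (
    "selection_bias", "study_design", "confounding",
    "blinding", "data_collection", "attrition",
)


def quality_score(ratings):
    """Sum the six criterion ratings (each 0, 1, or 2) into a 0-12 total."""
    missing = set(CRITERIA) - set(ratings)
    if missing:
        raise ValueError(f"Missing ratings for: {sorted(missing)}")
    for name in CRITERIA:
        if ratings[name] not in (0, 1, 2):
            raise ValueError(f"Rating for {name} must be 0, 1, or 2.")
    return sum(ratings[name] for name in CRITERIA)


def quality_band(score):
    """Map a total score to the bands 1-4 (low), 5-8 (medium), 9-12 (high)."""
    if not 0 <= score <= 12:
        raise ValueError("Score must lie between 0 and 12.")
    if score <= 4:  # ASSUMPTION: a total of 0 is also classed as low
        return "low"
    if score <= 8:
        return "medium"
    return "high"


# HYPOTHETICAL ratings for a single study.
example = {
    "selection_bias": 2, "study_design": 2, "confounding": 1,
    "blinding": 0, "data_collection": 2, "attrition": 1,
}
total = quality_score(example)
print(total, quality_band(total))
```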

Effect Size
Where data were available and extractable, effect size was calculated to obtain statistically significant effects for time and condition impact on pre- and post-intervention measures. Cohen's d was calculated using the difference between estimated means of the two conditions (intervention and control over time) divided by the baseline standard deviation of raw scores [55]. The thresholds for Cohen's d were 0.2, 0.4, and 0.7 for small, medium, and strong effect, respectively [56]. Where data could not be converted to Cohen's d, the effect size was reported verbatim as given by the authors in the manuscript.
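One way to read the effect size calculation described above is as a difference in pre-to-post change between conditions, standardized by the baseline standard deviation. The Python sketch below follows that reading; it is an assumption about the exact formula rather than a restatement of the cited method, the input values are hypothetical, and the "negligible" label for values under 0.2 is added here for completeness.

```python
def cohens_d(int_pre, int_post, ctrl_pre, ctrl_post, baseline_sd):
    """Difference in mean change (intervention minus control) over baseline SD.

    ASSUMPTION: this difference-in-change form is one interpretation of the
    description in the text, not a confirmed formula from the cited source.
    """
    if baseline_sd <= 0:
        raise ValueError("Baseline standard deviation must be positive.")
    change_intervention = int_post - int_pre
    change_control = ctrl_post - ctrl_pre
    return (change_intervention - change_control) / baseline_sd


def effect_label(d):
    """Label the magnitude using the thresholds 0.2, 0.4, and 0.7."""
    size = abs(d)
    if size >= 0.7:
        return "strong"
    if size >= 0.4:
        return "medium"
    if size >= 0.2:
        return "small"
    return "negligible"  # label not used in the review; added for completeness


# HYPOTHETICAL means and baseline SD for one outcome measure.
d = cohens_d(int_pre=40.0, int_post=44.0, ctrl_pre=40.0, ctrl_post=41.0,
             baseline_sd=10.0)
print(round(d, 2), effect_label(d))
```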

Results
Out of 2298 records, 29 (N = 29) met inclusion criteria for Australian school-based wellbeing interventions. All 29 studies scored between 4 and 10 (out of 12) corresponding to low, medium, and high study quality.

Intervention Characteristics
The 29 studies comprised a total sample size of n = 13,537 participants. Individual studies varied from 44 participants [57] to 3630 [58]. Fifteen interventions (n = 52% of total number of participants) were randomized controlled trials (RCTs) or cluster RCTs, two were non-randomized (n = 3%), and 12 (n = 45%) were quasi-experimental designs. Students' age ranged from 5 to 18 years, school grades 1-12, from both metropolitan and rural schools (Table 2). Emergent studies in this field sought to engage a whole-of-school and community approach in which parents actively take part in interventions for children with mental health issues [58,59].
Out of 29 interventions that measured wellbeing, 2 interventions comprising 3% of the total sample size had high study quality scores. One intervention was a social skill building program [60] that reported no significant effect on wellbeing outcomes. The other, a martial arts-based program [73,74], reported small effects on self-efficacy (F(2, 238) = 14.94, p < 0.001) but no significant improvements in measures of wellbeing. Both interventions were RCTs that had high exposure of 10-16 weeks and included 3-month follow-up.
Over half of the studies had medium quality scores (N = 15, n = 78% of the total sample). Within the medium range, 11 were RCTs and 4 were quasi-experimental designs. Eleven studies reported no significant effect on wellbeing. Interventions with low study quality scores (N = 12, n = 19% of the total number of participants) lacked blinding of participants and/or delivery personnel, were not randomized, had uneven exposure within clusters and control, or no control group, and used one measure (self-report) rather than having measures objectively verified using a range of instruments and assessors (parent, teacher, and clinician). Low study quality interventions were dominated by quasi-experimental designs (N = 12, n = 16%) compared to RCTs (N = 3, n = 3%).
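Throughout these results, N denotes a number of studies while n gives the corresponding share of all participants. The following minimal Python sketch shows how both figures could be derived for any grouping variable; the three studies listed are hypothetical and the field names are illustrative, not the review's extraction sheet.

```python
from collections import defaultdict


def summarise(studies, key):
    """Return {group: (number of studies, % of all participants)}."""
    total = sum(study["participants"] for study in studies)
    if total == 0:
        raise ValueError("Total number of participants must be positive.")
    counts = defaultdict(int)
    people = defaultdict(int)
    for study in studies:
        counts[study[key]] += 1
        people[study[key]] += study["participants"]
    return {group: (counts[group], round(100 * people[group] / total))
            for group in counts}


# HYPOTHETICAL studies (not taken from Table 2).
studies = [
    {"design": "RCT", "participants": 600},
    {"design": "quasi-experimental", "participants": 300},
    {"design": "quasi-experimental", "participants": 100},
]

summary = summarise(studies, "design")
for design, (n_studies, share) in summary.items():
    print(f"{design}: N = {n_studies}, n = {share}%")
```

Because the share is weighted by sample size, a small number of large trials can account for most participants, which is why N and n can diverge sharply.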

Intervention Effect
Seventeen of 29 interventions (n = 80% of the total number of participants) reported no statistically significant effect overall on measures related to wellbeing or on other related outcome measures (Table 3). Within the group that reported no significant effect, 11 were RCTs or CRCTs (n = 47% of total number of participants) and six were quasi-experimental designs (n = 33%).
Eight of 29 interventions reported a significant small effect on outcome measures, but only three reported an effect on wellbeing outcomes. All three were quasi-experimental designs and involved acceptance and commitment therapy (ACT) (FS d = 0.20, p = 0.57) [65], psychoeducation (GSE d = 0.314, p < 0.01) [72], and resilience-building in students with unhealthy perfectionism (CINSS ηp2 = 0.11, p < 0.001) [83]. Two of the interventions had low study scores, while the other had a low-medium study score due to absence of blinding, sampling, randomization, and attrition.
Three interventions reported a statistically significant medium intervention effect, but only one reported on wellbeing outcomes. An intervention based on ACT reported a significant medium effect on flourishing (FS d = 0.47, p = 0.030) [84]. However, in the absence of data collected for the control group, the outcome measures may not necessarily have achieved significance.
One intervention reported a large effect on social and emotional wellbeing (SEW ηp2 = 0.16, p < 0.01) [61]. This intervention was based on building social-emotional development, well-being, and academic achievement. However, the sample was small, and the participants came from one school.

Intervention Duration and Follow-Up
Intervention duration varied across interventions from 3 weeks [83] to seven days in outdoor camp [85]. The majority (N = 21, n = 66%) of interventions lasted between six and ten weeks. Of the two interventions that showed the highest effect, one ran for 10 weeks [61] and the other had uneven exposure across clusters. There was no significant correlation between the duration of exposure and study effect.
Eighteen out of 29 interventions had a follow-up, which varied from eight weeks [72] to 24 months [58]. There was no correlation between the period or number of follow-ups and study effect.

Delivery Mode
The delivery mode varied across interventions. Fourteen interventions (N = 14, n = 78%) were delivered by the schoolteacher or staff (with and without training), including the school nurse and school psychologist. Program staff delivered 25 interventions (N = 25, n = 32%) and included psychologists, student psychologists, and researchers. In some publications, it was unclear whether the psychologist was a school counsellor or an external psychologist belonging to project staff. It is assumed that the allocation sequence was adequately generated in all studies, unless it was an interrupted series [76], or in circumstances where the school chose the intervention and control groups [85] rather than adhering to a blinded randomization process.

Wellbeing Outcomes
Five different types of wellbeing were identified corresponding to measures of general wellbeing, emotional wellbeing, psychosocial wellbeing, social wellbeing, and subjective wellbeing. Wellbeing outcomes are also measured by other terms directly related to wellbeing that include eudaimonic forms, such as flourishing, resilience, life satisfaction, and quality of life. Wellbeing measures also include an individual's capacity to recover from adverse events and these are measured in terms of protective factors, coping styles, self-esteem, and self-efficacy. Table 4 summarizes the wellbeing outcome(s) that each intervention targeted, and the measuring instrument used to measure each outcome.

Wellbeing
Out of 29 school-based interventions, 8 interventions measured wellbeing through validated measuring instruments (CA, FS, RWBS, SWEM, WEMWBS), including wellbeing risk (K10) [57]. Out of 8, one intervention reported a significant medium effect on wellbeing and was related to music (WEMWBS d = 0.26, p < 0.08) [71]. Two interventions measured emotional wellbeing: one involved a whole-school approach (MHI d = −0.24, p = 0.02) [58] and the other a music intervention (El Sistema) [76], but both reported no significant impact on emotional wellbeing outcomes. Four interventions measured subjective wellbeing: one involved ACT [65], one resistance training [80], and two positive psychology [64,67]. Only the ACT intervention [65] showed a small effect on subjective wellbeing measures (FS d = 0.20, p = 0.57). One music-based intervention measured psychosocial wellbeing [76] and was based on El Sistema-inspired music for largely disadvantaged groups. It measured three areas of wellbeing as well as protective factors and reported a large effect on social wellbeing outcomes (CA d = 0.28, p = 0.06).

Flourishing
Flourishing is also a measure of wellbeing. A small effect on flourishing was evident using an online positive psychology intervention (SWEMWBS d = 0.26, p = 0.02) [63], and a medium effect was found for an intervention that combined ACT with self-determination theory (FS d = 0.47, p = 0.030) [84].

Resilience
Resilience was measured in 7 out of 29 wellbeing interventions using several instruments (CD-RISK, CYRM, Kidscope, RS, RYDM, SDQ, SEARS). Two interventions involved outdoor-related activities, namely, football [73][74][75] and wellbeing warriors [73,74], but reported no significant effect on resilience outcomes. A third intervention using behaviour activation and emotional regulation techniques [70], and a fourth study focusing on self-efficacy, resilience, and coping strategies [72], both reported no significant effects on resilience outcomes. One intervention based on psychoeducation [82] reported small effects on competence, relatedness, and autonomy, but not on resilience or wellbeing. A comprehensive intervention based on social and emotional skills-building focused on the areas of emotional literacy, personal strengths, positive coping strategies, problem-solving strategies, stress management and emotional regulation, and help-seeking with peer support [57]. In this study, resilience was measured in five areas (Resilience Internal Assets (RYDM); Resilience School Resources 1 (RYDM); Resilience School Resources 2 (RYDM); Resilience Cooperation and Communication (RYDM); Resilience Class Connectedness (RYDM)). This study reported no significant intervention effect on any of these measures.

Quality of Life
Quality of life and life satisfaction measures (MSLSS, PedsQL 4.0, SLSS) are associated with wellbeing. Two interventions that used social skills building techniques [60,78], and a third that reduced screen time using self-determination theory [62], each reported no significant effects on quality of life or life satisfaction. However, a friendship building skills program for depression reported a small but significant effect on life satisfaction (MSLSS d = 0.2, p < 0.01) [78].

Self-Esteem
Self-esteem was associated with wellbeing outcomes and measured using PSDQ, CS-FEI, and RSES. Two interventions based on music therapy [77] and resistance training [80], respectively, reported no effect on self-esteem. Two psychoeducation-based interventions, one based on emotional freedom techniques [81] and the other on a CBT-based health and education module [82], also reported no effect on self-esteem.

Self-Efficacy
Self-efficacy has recently been associated with wellbeing [6] and was measured in three interventions using CYRM, GSE, and SES. Two interventions reported a small effect, one based on martial arts (F(2, 238) = 12.14, p < 0.001) [73,74] and the other based on building resilience in regional youth (GSE d = 0.314, p < 0.01) [72]. A third intervention based on an outdoor youth program reported a significant medium-to-large effect on self-efficacy (F = 20.38, p < 0.001) [79].

Protective Factors and Coping Styles
Both protective factors and coping style impact on the ability of an individual to sustain wellbeing, and three interventions reported measures (CA, Kidscope, RYDM) in these areas. One was based on music therapy [76] and reported no significant effect on wellbeing. Another intervention involved self-care techniques related to overcoming obstacles, media, and mastery over life, which also reported no significant effect on wellbeing [72]. A third intervention directly targeted protective factors using school-based health promotion, but also reported no significant effect on wellbeing outcomes [31].

Discussion
The purpose of this systematic review was to examine the effect of intervention on the wellbeing of young people and adolescents in primary and secondary educational contexts. Focus was given to Australia as its situation is unique in that, despite the high number of government initiatives that support wellbeing, nationally, there has been a 50% decline of wellbeing and other mental health measures since 2007 [86] as well as high relapse rates [87]. Further, a critical review of previous global systematic reviews conducted during the same search period excluded many Australian school-based interventions, although the study with the largest analysis of Australian cases included 10 interventions during the same search period.
Our study identified 29 school-based interventions that measured wellbeing outcomes in a total of 13,537 participants. The main findings of this systematic review are that 17 interventions, comprising 80% of wellbeing measures, found that there was no intervention effect, regardless of the type of intervention implemented. Eight interventions (n = 15%) reported a small effect, and three interventions (n = 7%) reported a medium-sized effect on wellbeing outcomes. One intervention reported a large effect on mental health outcomes. This outcome is consistent with other systematic reviews that reported low outcome measures showing largely small effects for universal school-based preventative interventions [32,44,50]. This review supports the claim that large universal prevention interventions with a small effect can produce meaningful improvements at population levels [41,88].
The study quality of interventions was generally low to medium, with only two interventions achieving high study quality scores, and both were RCTs. One intervention was a social skills-building program and the other was a martial arts-based program. The latter reported small effects on self-efficacy but not significant improvements in measures of wellbeing. Fifteen interventions had medium study quality scores and 12 studies had low study quality. As previous research has shown, however, school-based interventions are challenged on many levels [42,50], predominantly in achieving blinding for participants and implementation personnel, and where the study is based only on children's self-report. Another reason for lower study quality scores is inadequate generation of sequence, including removing control groups post intervention, overlapping of intervention and waitlist control groups, or where schools were allowed to choose their intervention curriculum topics, including duration. Therefore, reported high intervention effects need to be weighed against study quality. This review found a general association between reported high study impact with generally low study quality scores, which supports the findings of international reviews of school-based interventions [36].
Interventions that showed the highest effects on wellbeing were mixed in type. The highest effect on wellbeing was reported for a social and emotional wellbeing program that combined social and emotional development with academic achievement. Two interventions reported small to medium effects on wellbeing outcomes, and these were based on ACT and psychoeducation. Combining wellbeing intervention with school-based learning therefore emerged in this review as an effective implementation strategy for achieving a positive intervention effect. This finding is supported by recent research suggesting that long-term positive outcomes may be achieved by programs that incorporate mental health literacy [38].
There were insufficient data to report on an association between delivery personnel and intervention effect. However, a recent systematic review shows that teacher-delivered interventions with training and/or professional development are effective for implementation of school-based interventions [89]. In addition, no association was found between intervention duration and effect; however, sustainability and duration are considered beneficial to producing long-term results in students [38,90]. While interventions with prolonged but low exposure did not reveal a beneficial long-term effect, it may be that high exposure and duration have a significant effect on wellbeing intervention effectiveness. Further research is needed, however, to explore whether effects last through follow-up assessments.
This review found that few Australian school-based interventions produced a significant effect on wellbeing outcomes as measured through validated measuring instruments. Music-based interventions and, to a lesser extent, ACT-based interventions reported significant small-to-medium effects on wellbeing outcomes. Flourishing measures showed the greatest impact from ACT and self-determination theory-based interventions. A small significant effect on life satisfaction was reported from a web-based positive psychology program. Martial arts and outdoor youth programs reported a significant medium-to-large effect on self-efficacy. Although resilience is closely associated with wellbeing, no interventions reported a significant effect on resilience measures. Likewise, no intervention effects were reported on self-esteem measures.
Part of the rationale for focusing on Australian school-based wellbeing interventions was the lack of Australian studies included in previous systematic reviews. The Australian Educational Leader suggests that despite the vast number of programs that measure wellbeing in Australia, programs may be excluded from global systematic reviews because of low study quality, suggesting that "more high-quality program evaluations are needed across Australia" [91] (p. 43). Therefore, while a high proportion of interventions (80%) reported no statistically significant outcomes on wellbeing, this review supports findings in other international reviews, suggesting that program fidelity and rigour are needed in program design across school-based interventions [36]. In addition, we draw attention back to the WHO definition of mental health, which ties mental health to wellbeing. Mental disorders among 16-24-year-olds in Australia have risen by 50% since 2007, according to a recent 'National Study of Mental Health and Wellbeing' [86]. Following the COVID-19 pandemic, schools are faced with both challenges and opportunities to change the way we approach wellbeing. Placing wellbeing as a primary measure, rather than as a measure that is secondary to a broad range of mental health interventions, may be the opportunity we need to establish wellbeing measures as an effective early detection measure for the onset of major mental health issues.
The implications for school-based personnel are considerable: for teachers who have to address and support students with social and emotional issues in their classroom, these findings indicate that they have limited tools and intervention programs that work. On the other hand, a measure of low wellbeing from a validated measuring instrument may present both an indication of later onset of a more serious mental health issue, and an opportunity for early intervention to break the trajectory leading to full disorder. Learning to measure wellbeing outcomes using validated wellbeing instruments requires a certain level of mental health training. School healthcare staff may also require training to understand which implementation criteria produce more favourable outcomes for students. Finally, there are also implications for families. As previous research has shown [91], family involvement with schools tends to produce better outcomes for young people. Given the minimal impact of mental health programs in schools, alternative mechanisms may be needed, such as stronger cooperation between schools and families to find wrap-around pathways of support for young people's wellbeing.
There were several limitations to this review. First, the search period was 2012-2022, and the last three years of that period were disruptive to schools because of COVID-19 pandemic measures. These measures prevented research from being conducted in Australian schools, and much of the data collected may therefore be representative of pre-pandemic levels of wellbeing. Future studies might focus specifically on post-pandemic measures of wellbeing in school-based settings, which are likely to reveal greater mental health and wellbeing needs. Second, one of the outcomes, measures of wellbeing (happiness), could not be analysed because of a lack of studies reporting on happiness measures. Third, there was significant heterogeneity among interventions, which varied in terms of research design, engagement metrics, and research methodologies, including data collection, analysis, and reporting. Due to this high heterogeneity, aggregating levels of efficacy using meta-analysis was not feasible. Fourth, statistical calculation of effect was not possible in a small number of studies because the data were unavailable (either in published form or through contact with authors). Consequently, the calculated effect sizes may not be exact, even though the effect reported for each study (in terms of a significant or non-significant measure) was based on each author's reported results verbatim. Fifth, many interventions reported data based on self-report or from one source (such as only the child report, or only the parent report), which may be prone to acquiescence bias, resulting in false positives. These measures were not always concurrently verified with teacher, parent, and clinician measures. Sixth, the inclusion criteria were restricted to articles published in peer-reviewed journals, which excludes ongoing programs running in Australian schools that have neither published their intervention findings nor been evaluated. Finally, this review found only one intervention related to Aboriginal children. Further, while some interventions did focus on other minority groups, there were no interventions that focussed specifically on ethnic minority groups. As such, wider search terms may be needed to include a wider set of disadvantaged groups, who are known to experience low wellbeing outcomes.

Conclusions
Wellbeing is a term that is attached to a range of school-based interventions related to child and adolescent health, mental health, mental disorder, and psychological states. In this review, we aimed to narrow the definition of wellbeing to specific measurable criteria, thereby providing an analysis of wellbeing outcomes in school-based interventions. This systematic review found that most interventions (80%) did not report a statistically significant effect on students' wellbeing outcomes. Yet, there is an increasing burden on schools to manage the wellbeing of students. Therefore, we suggest that wellbeing be utilized more usefully as an early detection measure for mental health and mental disorders. Rather than treating wellbeing as a secondary measure that appears in all health and mental health programs, we suggest that researchers, healthcare workers, and school staff may be able to implement more successful intervention strategies by targeting wellbeing outcomes as an early detection, intervention, and prevention strategy for mental health and mental disorders in children and young people.