A Multivariate Approach to a Meta-Analytic Review of the Effectiveness of the D.A.R.E. Program

The Drug Abuse Resistance Education (D.A.R.E.) program is a widespread but controversial school-based drug prevention program in the United States as well as in many other countries. The present multivariate meta-analysis reviewed 20 studies that assessed the effectiveness of the D.A.R.E. program in the United States. The results showed that the effects of the D.A.R.E. program on drug use did not vary across the studies with a less than small overall effect while the effects on psychosocial behavior varied with still a less than small overall effect. In addition, the characteristics of the studies significantly explained the variation of the heterogeneous effects on psychosocial behavior, which provides empirical evidence for improving the school-based drug prevention program.


Introduction
Drug abuse is a prevalent problem affecting young generations worldwide [1]. In response to this issue, many drug prevention programs have been implemented in schools. The Drug Abuse Resistance Education (D.A.R.E.) program is the largest school-based drug prevention program in the United States and is also used in other countries [2]. The D.A.R.E. program originated in 1983 from a local drug prevention program jointly sponsored by a school district in Los Angeles and the city police department. By 2007, more than 36 million school children around the world, including 26 million children in the United States, had participated in the school-based drug prevention program [2]. D.A.R.E. America [2] also reported that, in the past three years, about 1,000 communities started D.A.R.E. programs in their schools; as a result, more than 75 percent of American school districts and 43 countries around the world now incorporate a D.A.R.E. program. The increasing number of school districts adopting the D.A.R.E. program speaks to its long-lasting reputation, and the program became so popular that one day each year has been declared National D.A.R.E. Day by United States Presidential Proclamation since 1988.
The D.A.R.E. program was designed to help elementary and junior high school students resist peer pressure to experiment with drugs, tobacco, and alcohol. The D.A.R.E. program aims to reduce drug abuse among children by providing them with information that encourages them to make healthy decisions. Its effectiveness has been assessed by its two major outcomes: (a) the reduction of drug use, which includes tobacco, alcohol, marijuana, and other illicit drugs; and (b) the improvement of psychosocial behavior, which includes social skills (i.e., peer-pressure resistance), self-esteem, attitudes towards drug use, attitudes towards police, and family bonding. The program is normally taught by a police officer, and the core curriculum has 17 lessons, usually offered once a week for 45 to 60 minutes [3]. This typically results in an expensive program. According to Dukes et al. [4], the average cost per uniformed police officer approached $50,000 per year, and the cost per student was at least $100. In recent years, the annual federal expenditures on the D.A.R.E. program reached $750 million [5]. Nonetheless, parents were positive about the D.A.R.E. program because they viewed the D.A.R.E. officers as effective educators [6]; classroom teachers also gave high ratings to teacher-officer interaction, role-playing exercises, and the graduation ceremony [7].
Considering the tremendous investment of time and money in the D.A.R.E. program, these inconsistent findings necessitate a conclusive synthesis of the research to assess the effectiveness of the program. To date, only two published research syntheses or meta-analytic reviews have focused solely on evaluating the effectiveness of the D.A.R.E. program [5,10]. Unfortunately, the two meta-analyses have some limitations. For example, Ennett et al. [10] examined only eight studies, and four of them were annual reports produced exclusively for the D.A.R.E. agencies, a practice that has been called into question [5,37]. West and O'Neal [5], on the other hand, reviewed the effects of the D.A.R.E. program only on drug use. Additionally, neither of the two meta-analyses explored the relationships between the study characteristics and the outcome measures. Finally, the two reviews analyzed the outcomes either independently [10] or as one simple sum of drug use measures [5]. Because the two major outcomes, drug use and psychosocial behavior, are conceptually distinct but related to one another in practice, a multivariate meta-analysis [38] is a more appropriate approach for analyzing the multiple outcomes simultaneously.
The purpose of this multivariate meta-analytic review was to: (a) quantitatively synthesize updated evaluation studies of the D.A.R.E. program, and (b) simultaneously synthesize all the outcomes of the D.A.R.E. program. Specifically, this review addressed the following three research questions: (a) Did the effects of the D.A.R.E. program on the outcomes vary across the studies? (b) What was the overall effect of the D.A.R.E. program on the outcomes? (c) What study characteristics explained the variation of the effects of the D.A.R.E. program on the outcomes?

Literature Search
The literature search used the terms "Drug Abuse Resistance Education," "D.A.R.E.," and "school-based drug prevention program" as keywords. After applying the criteria for inclusion described below, 20 studies were selected for this meta-analysis.

Study Inclusion Criteria
The first criterion for inclusion required the study to have sufficient quantitative information for calculating the outcome measure or the effect size of the outcome: Cohen's d [39]. Cohen's d is the standardized mean difference between the treatment group and the control group. That is,

$$d = \frac{\bar{x}_t - \bar{x}_c}{s_p} \qquad (1)$$

where $\bar{x}_t$ is the mean of the treatment group, $\bar{x}_c$ is the mean of the control group, and $s_p$ is the pooled standard deviation. The second criterion required the study to use an experimental or quasi-experimental design, because these designs are more rigorous and provide more valid research results than less controlled designs. The third criterion required the study to evaluate at least one of the outcomes on drug use and psychosocial behavior. The fourth and final criterion called for studies where the effect of the D.A.R.E. program could be independently evaluated; that is, the study had to provide a D.A.R.E. treatment group and a comparable control group.
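As an illustration of the effect size defined above, the following minimal Python sketch computes Cohen's d from group summary statistics. The pooled standard deviation formula is the conventional one and is an assumption here, since the review does not spell it out; the function names and the numbers are hypothetical.

```python
import math


def pooled_sd(n_t, sd_t, n_c, sd_c):
    """Conventional pooled standard deviation (assumed; not stated in the review)."""
    numerator = (n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2
    return math.sqrt(numerator / (n_t + n_c - 2))


def cohens_d(mean_t, mean_c, s_p):
    """Standardized mean difference between treatment and control groups."""
    return (mean_t - mean_c) / s_p


# Hypothetical summary statistics, not taken from any reviewed study.
s_p = pooled_sd(n_t=100, sd_t=1.0, n_c=100, sd_c=1.0)
d = cohens_d(mean_t=3.1, mean_c=3.0, s_p=s_p)
print(round(d, 2))
```

With equal group standard deviations of 1.0, the pooled value is also 1.0, so d is simply the raw mean difference.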

Recorded Variables
Outcome measures. The outcome measures for the present review were two sets of effect sizes: one for drug use and the other for psychosocial behavior. Each effect size in the former set was the average Cohen's d across all the available drug use outcomes (i.e., tobacco use, alcohol use, marijuana or other illicit drug use), and each effect size in the latter set was the average Cohen's d across all the available psychosocial behavior outcomes (i.e., peer-pressure resistance, self-esteem, attitudes towards drug use, attitudes towards police, or family bonding). Different effect size measures calculated from various statistical methods in the studies were converted to Cohen's d. In line with the previous reviews [5,10], the Cohen's d values of the outcomes across the studies were calculated at the longest follow-up, which ranged from 0 (i.e., right after the program) to 10 years.
Study characteristics. The following study characteristics were recorded for the analysis: name of first author, year of publication, sample size, statistical method (e.g., descriptive statistics, general linear models, and multilevel models, listed in increasing order of methodological rigor), year of D.A.R.E. curriculum, follow-up time, proportion of female participants, and proportions of ethnic groups. The selection of the study characteristics was partially guided by the previous reviews and partially based on common information available in the studies.
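The review does not list the conversion formulas it used. As a hedged sketch, the following Python snippet shows two widely used conversions to Cohen's d (from an independent-samples t statistic and from a correlation coefficient) and the averaging of d across a study's available outcomes. The function names and input values are hypothetical.

```python
import math


def d_from_t(t, n_t, n_c):
    """Cohen's d from an independent-samples t statistic (standard conversion)."""
    return t * math.sqrt(1.0 / n_t + 1.0 / n_c)


def d_from_r(r):
    """Cohen's d from a point-biserial correlation (standard conversion)."""
    return 2.0 * r / math.sqrt(1.0 - r ** 2)


def mean_d(ds):
    """Average d across the outcomes available in one study."""
    return sum(ds) / len(ds)


# Hypothetical study reporting three drug use outcomes in different metrics.
outcome_ds = [
    d_from_t(t=2.0, n_t=50, n_c=50),  # e.g. tobacco use reported as t
    d_from_r(r=0.6),                  # e.g. alcohol use reported as r
    0.2,                              # e.g. marijuana use already reported as d
]
study_d = mean_d(outcome_ds)
print(round(study_d, 2))
```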

Coding Procedure
Each value of the variables of the study characteristics and outcome measures needed to be recorded or coded from the 20 studies. A concurrent double coding was performed independently by the researchers. Each researcher spent more than forty hours, equivalent to five full-time work days, on coding the 20 studies. Then, the researchers engaged in extensive discussions to compare every coded item. No variable was finalized until reaching an agreement.

Statistical Analysis
Descriptive analysis was first conducted for each of the two outcomes, drug use and psychosocial behavior, by calculating the unweighted mean effect size of the outcome. According to Cohen's guideline [39], d = 0.20, 0.50, and 0.80 are considered small, medium, and large effects, respectively. The 95% confidence intervals of the effect sizes were also computed. The confidence intervals showed whether the effect sizes were heterogeneous across the studies.
In terms of inferential analysis, Hedges and Olkin's [40] Q-statistic was computed for each of the two outcomes, drug use and psychosocial behavior. The test for the Q-statistic provided statistical evidence for the heterogeneity of the 20 studies. If the test was significant, a random-effects model was tested, and the weighted mean effect size was calculated to provide a more valid estimate of the mean effect size than the unweighted mean effect size from the descriptive analysis.
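The descriptive step can be sketched in Python as follows. The large-sample variance of d used for the confidence interval is the common approximation and is an assumption here, as is the binning of Cohen's benchmarks into labels; all names and sample sizes are hypothetical.

```python
import math


def variance_of_d(d, n_t, n_c):
    """Common large-sample approximation to the sampling variance of d (assumed)."""
    return (n_t + n_c) / (n_t * n_c) + d ** 2 / (2.0 * (n_t + n_c))


def confidence_interval_95(d, n_t, n_c):
    """Normal-theory 95% confidence interval for one study's d."""
    se = math.sqrt(variance_of_d(d, n_t, n_c))
    return d - 1.96 * se, d + 1.96 * se


def unweighted_mean(ds):
    """Plain average of the study effect sizes."""
    return sum(ds) / len(ds)


def interpret(d):
    """Label |d| against Cohen's 0.20 / 0.50 / 0.80 benchmarks (binning assumed)."""
    size = abs(d)
    if size < 0.20:
        return "less than small"
    if size < 0.50:
        return "small"
    if size < 0.80:
        return "medium"
    return "large"


# Hypothetical study with 200 participants per group.
low, high = confidence_interval_95(d=0.10, n_t=200, n_c=200)
print(round(low, 3), round(high, 3), interpret(0.10))
```

An interval that contains zero, as in this hypothetical case, indicates that the study's effect cannot be distinguished from no effect at the 5% level.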
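A minimal Python sketch of this inferential step follows. The Q-statistic is computed with inverse-variance weights. The review does not say which between-study variance estimator it used, so the DerSimonian-Laird moment estimator below is an assumption, and the effect sizes and variances are hypothetical.

```python
import math


def q_statistic(ds, vs):
    """Fixed-effect weighted mean, homogeneity statistic Q, and its degrees of freedom."""
    ws = [1.0 / v for v in vs]
    mean = sum(w * d for w, d in zip(ws, ds)) / sum(ws)
    q = sum(w * (d - mean) ** 2 for w, d in zip(ws, ds))
    return mean, q, len(ds) - 1


def random_effects(ds, vs):
    """Random-effects mean using the DerSimonian-Laird tau^2 (estimator assumed)."""
    ws = [1.0 / v for v in vs]
    _, q, df = q_statistic(ds, vs)
    c = sum(ws) - sum(w * w for w in ws) / sum(ws)
    tau2 = max(0.0, (q - df) / c)          # between-study variance, floored at 0
    ws_re = [1.0 / (v + tau2) for v in vs]  # weights now include tau^2
    mean_re = sum(w * d for w, d in zip(ws_re, ds)) / sum(ws_re)
    se_re = math.sqrt(1.0 / sum(ws_re))
    return tau2, mean_re, se_re, mean_re / se_re


# Hypothetical effect sizes and sampling variances for three studies.
ds = [0.0, 0.2, 0.4]
vs = [0.01, 0.01, 0.01]
fixed_mean, q, df = q_statistic(ds, vs)
tau2, mean_re, se_re, z = random_effects(ds, vs)
print(round(q, 2), df, round(tau2, 3), round(mean_re, 2), round(z, 2))
```

Under the null hypothesis of homogeneity, Q is referred to a chi-square distribution with the returned degrees of freedom.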
In the case of heterogeneous effect sizes, the study characteristics were entered into a weighted regression model to explain the variation in the heterogeneous effect sizes. Following Hedges' [41] suggestion, the standard error used in the t-test for each individual regression coefficient was adjusted as follows:

$$\text{Adjusted s.e.} = \frac{\text{s.e.}}{\sqrt{MS_{Error}}} \qquad (2)$$

where s.e. is the original standard error given by common computer programs, and $MS_{Error}$ is the mean square error from the analysis of variance for the regression given by the computer programs. Note that some study characteristics had missing values that resulted from unavailable information, and they were replaced by means [42] because the mean is the best single replacement value when no other information is available [43-45].

Descriptive Analysis
Table 1 summarizes the 20 studies with the recorded variables. The unweighted mean effect sizes were 0.05 (ranging from -0.08 to 0.36) and 0.10 (ranging from -0.09 to 0.38) for drug use and psychosocial behavior, respectively. According to Cohen's [39] interpretation, both mean effect sizes were less than small, although the mean effect size for psychosocial behavior was larger than that for drug use.

[Table 1. Summary of the 20 studies with the recorded variables.]

Figure 1 shows the 95% confidence intervals of the effect sizes for drug use and psychosocial behavior, respectively. From the confidence interval plots, we can see that the effect sizes across the studies were more heterogeneous for psychosocial behavior than for drug use.

Inferential Analysis
Test of homogeneity. Under the null hypothesis $H_0: \theta_1 = \dots = \theta_{20} = \theta$, the values of Hedges and Olkin's [40] Q-statistic, $Q_{Total}$, were 13.34 with df = 17 (p = 0.71) and 96.61 with df = 12 (p < 0.0001) for drug use and psychosocial behavior, respectively. The homogeneity test results showed that the effect sizes across the 20 studies were statistically heterogeneous for psychosocial behavior, but not for drug use. This inferential finding was consistent with the descriptive finding demonstrated in the confidence interval plots above. When a random-effects model for psychosocial behavior was tested under $H_0: \theta = 0$, z = 2.92 (p < 0.01) indicated that the weighted average effect size of the 20 studies from the random-effects model was statistically different from zero but was still 0.10, a less than small effect.
Weighted regression analysis. Because the effect sizes were heterogeneous for psychosocial behavior, a weighted regression analysis was conducted to identify the study characteristics that explained the heterogeneity. Table 2 displays the estimated coefficients of the significant characteristics of the studies from the weighted regression analysis with the adjusted standard errors (Eq. 2). From Table 2 we can see that five of the study characteristics significantly explained most of the variation of the effect sizes ($R^2$ = 89.8%). Specifically, the longer the follow-up time (B = -0.21, t = -2.49, p < 0.02) and the more rigorous the statistical method (B = -0.13, t = -5.75, p < 0.001) a study used, the smaller the effect of the D.A.R.E. program on psychosocial behavior; whereas the later the D.A.R.E. year (B = 0.04, t = 2.58, p < 0.02), the more White students (B = 0.01, t = 4.02, p < 0.002), and the more Black students (B = 0.01, t = 2.47, p < 0.03) a study had, the larger the effect of the D.A.R.E. program on psychosocial behavior.
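As a quick arithmetic check on the random-effects test, the two-sided p-value for the reported z = 2.92 can be obtained from the standard normal distribution. The short Python sketch below uses only the standard library; the function name is hypothetical.

```python
import math


def two_sided_p_from_z(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


p = two_sided_p_from_z(2.92)  # z value reported for psychosocial behavior
print(p < 0.01)
```

The resulting p-value falls below 0.01, in line with the reported significance level.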
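The standard error adjustment attributed to Hedges [41] divides the standard error printed by general-purpose regression software by the square root of the mean square error. A minimal Python sketch follows, assuming inverse-variance weights and a single predictor for simplicity (the review's model had several predictors); all names and data values are hypothetical.

```python
import math


def weighted_simple_regression(xs, ys, vs):
    """Weighted least squares of effect size on one study characteristic."""
    ws = [1.0 / v for v in vs]  # inverse-variance weights (assumed)
    sw = sum(ws)
    x_bar = sum(w * x for w, x in zip(ws, xs)) / sw
    y_bar = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(ws, xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sse = sum(w * (y - intercept - slope * x) ** 2
              for w, x, y in zip(ws, xs, ys))
    ms_error = sse / (len(xs) - 2)             # mean square error from the ANOVA
    se_software = math.sqrt(ms_error / sxx)    # what standard software prints
    se_adjusted = se_software / math.sqrt(ms_error)  # the adjustment
    return {"slope": slope, "sxx": sxx, "ms_error": ms_error,
            "se_software": se_software, "se_adjusted": se_adjusted}


# Hypothetical data: follow-up time in years, effect size, sampling variance.
follow_up = [0, 1, 2, 5, 10]
effect = [0.30, 0.22, 0.15, 0.08, 0.02]
variance = [0.01, 0.02, 0.01, 0.02, 0.015]
fit = weighted_simple_regression(follow_up, effect, variance)
print(round(fit["slope"], 3), round(fit["slope"] / fit["se_adjusted"], 2))
```

The adjustment is needed because general-purpose software treats the weights as known only up to a constant and rescales the standard errors by the residual variance, whereas in meta-analysis the weights are the actual inverse sampling variances. In the single-predictor case the adjusted standard error reduces to the reciprocal square root of the weighted sum of squares of the predictor.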

Discussion
By including more updated studies and analyzing the study characteristics related to the outcomes of the D.A.R.E. program on both drug use and psychosocial behavior, the present multivariate meta-analysis provided a more comprehensive review than previous ones on the effectiveness of the D.A.R.E. program; therefore, the present review helps us to better understand the widespread, expensive, but controversial D.A.R.E. program. The results of the present review revealed that the effects of the D.A.R.E. program on drug use were homogeneous but less than small, which confirmed the findings in the literature [5,9-11,13]. The present review also demonstrated that the effects of the D.A.R.E. program on psychosocial behavior were less than small but heterogeneous, which may explain why the D.A.R.E. program is still implemented in schools, welcomed by the parents, accepted by the communities, and supported by the government [6,7], despite some evidence that the D.A.R.E. program is not successful in reducing drug use among children.
For the heterogeneous effects of the D.A.R.E. program on psychosocial behavior, the present review found that the study characteristics explained most of the variation of the effects. The heterogeneous effects suggest that some studies showed larger effects than others. By examining the specific characteristics of the studies that had larger effects, as was done in the weighted regression analysis, future program implementations can learn from those effective studies to improve the program effects. Among the significant study characteristics, follow-up time and statistical method were negatively related to the effects, and D.A.R.E. year, percent of White participants, and percent of Black participants were positively related to the effects.
These findings have some important implications. First, the validity of long-term effects might be threatened by maturation and history. This point was also noted in the previous reviews [5,10], but it was not analyzed there. Second, more rigorous statistical methods that control for confounding variables could provide smaller but more accurate estimates of the effect size. Similar methodological concerns about research design and sampling were also mentioned in Ennett et al. [10]. Third, specific, culturally tailored D.A.R.E. programs might be needed to increase the effectiveness of the D.A.R.E. program for non-White and non-Black minorities. This implication is particularly meaningful for implementing the program effectively worldwide, and it would be interesting to explore the effectiveness of the D.A.R.E. program in other places, such as Canada or European countries, where the course is taught jointly with psychologists or specialists in different aspects of mental health and pedagogy. Fourth, the D.A.R.E. program has undergone several revisions since its inception [16]. The new D.A.R.E. program uses D.A.R.E. police officers as facilitators for student participation rather than as lecturers [16]. Other significant revisions integrate high technology into the enhanced curriculum, which includes internet safety, driving under the influence, cyberbullying, and so on [2]. As such, it can be anticipated that the new revisions of the D.A.R.E. program would produce more effective outcomes. Therefore, it would be desirable to conduct a follow-up meta-analytic review on the effectiveness of the new D.A.R.E. programs. Last, it is worth noting that the heterogeneity of the effects of the D.A.R.E. program on psychosocial behavior might come from sources other than the study characteristics investigated in the present review. An example of such additional sources could be flaws in the implementation of the program.
In sum, the effects of the D.A.R.E. program appear to differ between drug use and psychosocial behavior. The results of the present review provide an evidence-based interpretation of the inconsistent conclusions found in previous research on the D.A.R.E. program. This study found that, on the one hand, the D.A.R.E. program had a less than small effect on reducing drug use (Cohen's d = 0.05); on the other hand, the school-based drug intervention program also had a less than small effect on improving psychosocial behavior (Cohen's d = 0.10). The analysis from this review also identified areas in the new versions of the D.A.R.E. program that need improvement. More important, however, would be for the new versions of the D.A.R.E. program to translate the improved psychosocial behavior into students' actual reduction of drug use, the ultimate outcome.