4.3. Data Collection
The evaluation process was designed by the authors, who attended several early shows to ensure that data were collected as outlined in a written guide provided to the theater company administering the project. Survey responses were collected between October 2018 and February 2019. In total, the evaluation was conducted at 32 out of 40 performances. For the remaining eight performances, audiences were deemed too large (over 200 attendees) for the evaluation to be properly managed by the few theater company staff in attendance. Schools or youth centers signed up for the performance, and attendance was mandatory for students. As mentioned, an extra effort was made to promote the project to schools with student populations deemed to be at risk of exposure to extremism (predominantly schools in the larger cities), but participation was offered and open to all schools. All schools and youth centers that expressed interest in the performance received an invitation. Before beginning the survey, participants were informed about the purpose of the study. No deception was employed, and participants were informed about how their data would be stored and used. Informed consent was collected from attendees; where the consent box was not ticked, the data were excluded from the study.
The data collection process centered on a pen-and-paper survey distributed to attendees and overseen by the theater company. Randomization of participants to condition (control or treatment) was achieved by randomly handing each attendee one of two different “evaluation envelopes” as they entered the performance. Before the performance began, all attendees were asked to open the envelope, which contained an initial set of questions (on paper, pen provided) and a smaller, sealed envelope that participants were instructed not to open. Participants were instructed not to talk to each other while filling out the survey. For half the attendees (control condition), the initial questions addressed the aims of the project and background questions about previous exposure to extremism. For the other half (treatment condition), the questions focused on demographic information, including age, gender, and postcode, and questions about previous exposure to extremism. After the conclusion of the performance and workshop, attendees were asked to open the sealed envelope and answer the questions within. For the attendees who had initially answered questions concerning the project’s goals, the second set of questions comprised the demographic questions that treatment condition participants had completed prior to the performance. For those who had initially answered the demographic questions, the smaller envelope contained the questions operationalizing the project aims. Control group participants thus answered the questions measuring the factors the project aimed to influence (e.g., political tolerance) only before the performance began, whereas attendees in the treatment condition answered the same questions only after the conclusion of the performance. Comparing the control and treatment conditions therefore makes it possible to isolate the causal impact of the project. Attendees typically spent between 10 and 15 min completing the survey both before and after the intervention, which lasted approximately 90 min. This method permitted meaningful analysis of the immediate effects of the intervention, but longitudinal research will be required to assess whether these effects persist over time.
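To make the design’s causal comparison concrete, the following is a minimal analysis sketch rather than the authors’ actual script: it assumes a hypothetical file survey_responses.csv with a condition column (“control”/“treatment”) and a tolerance column holding each participant’s political tolerance score, and estimates the immediate effect as the between-group difference in means.

```python
import pandas as pd
from scipy import stats

# Hypothetical data frame: one row per participant; 'tolerance' is the mean of
# the three political tolerance items (five-point scale). Control participants
# answered before the performance, treatment participants after, so a simple
# between-group comparison estimates the immediate effect of the intervention.
df = pd.read_csv("survey_responses.csv")  # assumed file name

control = df.loc[df["condition"] == "control", "tolerance"].dropna()
treatment = df.loc[df["condition"] == "treatment", "tolerance"].dropna()

effect = treatment.mean() - control.mean()  # difference in group means
t_stat, p_value = stats.ttest_ind(treatment, control, equal_var=False)  # Welch's t-test

print(f"Estimated effect: {effect:.2f} (t = {t_stat:.2f}, p = {p_value:.3f})")
```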
4.4. Sample Characteristics
The survey was completed by 2156 participants. After data cleaning (excluding participants who failed to tick the informed consent form, who straight-lined all responses, or who were adults/teachers), the final N was 1931, with participants aged 13–20. Of these, 976 were in the control condition (50.5%) and 955 were in the treatment condition (49.5%). A total of 50% of participants were female, 44% male, and 6% preferred not to answer. A total of 538 participants were aged 13–14 (33%), 717 aged 15–16 (40%), 392 aged 17–18 (22%), and 105 aged 19–20 (5%). Mean age was 15.5. A total of 64% of participants were enrolled in public primary schools (n = 1188), 22% in secondary youth education (STX, HF, HTX, HHX) (n = 396), and the rest in private primary schools (n = 73), vocational schools (n = 111), higher education (n = 24), or other (n = 55). Participants were from across Denmark, with a majority (1207) from Copenhagen and surrounding areas, and 724 from the provinces.
A comparison of the control and treatment groups on demographic variables confirmed that the randomization produced similar groups. T-tests and two-proportion z-tests showed no significant difference between the control and treatment groups in age (M = 15.44, SD = 1.64 versus M = 15.54, SD = 1.68; t(1796) = −1.28, p = 0.19), in the proportion of those residing in cities (64% versus 62%; z = −0.84, p = 0.40), females (55% versus 52%; z = 1.36, p = 0.17), or students in primary school (66% versus 64%; z = −0.78, p = 0.43). Likewise, there was no difference in self-estimated benefit from the performance between the control (M = 3.89, SD = 0.92) and treatment conditions (M = 3.92, SD = 0.91; t(1757) = −0.86, p = 0.38).
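As an illustration of how the reported balance checks on proportions can be reproduced, the sketch below runs a two-proportion z-test on the share of female participants per condition; the female counts are assumptions back-calculated from the reported percentages and are for illustration only, not the authors’ code.

```python
from statsmodels.stats.proportion import proportions_ztest

# Hypothetical balance check on gender: number of female respondents per
# condition (assumed counts, back-calculated from the reported 55% and 52%)
# against the condition sizes reported in the text.
female_counts = [537, 497]   # assumed, for illustration only
group_sizes = [976, 955]     # control and treatment group sizes

z_stat, p_value = proportions_ztest(female_counts, group_sizes)
print(f"z = {z_stat:.2f}, p = {p_value:.2f}")  # a non-significant result indicates balance
```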
4.5. Operationalization: Constructs and Measures
To assess the effect of the performance against the evaluation standard, we included in the survey measures of political tolerance, political efficacy, ability to recognize extremist recruitment tactics, confidence in knowing what to do if exposed to extremism, perceived legitimacy of violence, and participant satisfaction with the performance. As control variables, we included measures of gender, age, location, and previous exposure to extremism.
Political tolerance involves accepting the political rights of others, such as freedom of speech, even for groups that one disagrees with or fears [59]. Tolerance of difference has been identified as a potentially protective factor against violent extremism, curbing “us and them” thinking and heightening acceptance of the unfamiliar. Efforts to enhance tolerance (or reduce intolerance) feature in P/CVE efforts across a range of contexts [60,61,62]. Political tolerance was measured using three items (α = 0.79), based on measures used previously by Peffley, Knigge, and Hurwitz [63], that capture tolerance of allowing a political/religious group disliked by the participant to make a speech in their city (M = 3.94, SD = 1.09), hold a meeting in their neighborhood (M = 3.43, SD = 1.14), and use Facebook to recruit to their group (M = 3.51, SD = 1.23).
Political efficacy refers to people’s trust in government and belief that they understand and can influence politics. Studies have shown that higher levels of political efficacy may not only reduce the psychological factors associated with support for radicalization to violence [64], but also increase willingness to participate in counter-extremism activities in specific contexts [65]. The measures selected focused on internal efficacy, meaning an “individual’s self-perceptions that they are capable of understanding politics and competent enough to participate in political acts” [66]. Two commonly used items were utilized [67,68], namely perceptions concerning the degree to which: (i) politics and government seem so complicated that someone like the participant cannot really understand what is going on (M = 3.62, SD = 1.15), and (ii) people like the participant do not have any say about what the government does (M = 2.65, SD = 1.33). A combined scale showed relatively low internal consistency (α = 0.35), so in the analysis we examined the items separately as well as combined. Both political tolerance and political efficacy were measured on a five-point Likert scale, with options “strongly agree”, coded as (1), “tend to agree” (2), “neither agree nor disagree” (3), “tend to disagree” (4), and “strongly disagree” (5).
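Scale reliability is reported throughout as Cronbach’s alpha. As a hedged sketch of how such a coefficient is computed (not the authors’ code), the function below derives alpha from a participant-by-item matrix of scores; the variable names are assumptions.

```python
import numpy as np

def cronbach_alpha(item_scores) -> float:
    """Cronbach's alpha for an (n_participants, n_items) matrix of item scores."""
    items = np.asarray(item_scores, dtype=float)
    k = items.shape[1]                              # number of items in the scale
    item_variances = items.var(axis=0, ddof=1)      # variance of each item
    total_variance = items.sum(axis=1).var(ddof=1)  # variance of the sum score
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# For a two-item scale such as the internal efficacy measure, a low alpha
# (here reported as 0.35) would support analyzing the items separately.
```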
To measure the impact of the project on participants’ ability to recognize extremist recruitment methods, participants were presented with eight possible methods and asked to assess on a five-point scale how likely extremist groups were to use each method. Some of the options were genuine methods drawn from the radicalization literature, such as offering excitement and status [69,70]. Others were unassociated with extremist recruitment, such as providing detailed, fact-based arguments and giving people time to research the issues independently. The internal consistency of this scale was α = 0.70. Research has shown that there is a range of online and offline pathways to membership of extremist groups [71,72], and enhancing the ability of youth to recognize recruitment methods is therefore important for many P/CVE initiatives.
The authors were unable to identify a suitable existing measure of confidence in responding to extremism and therefore designed three items (α = 0.60) based on the project content and discussions with P/CVE practitioners. Participants were asked to indicate the degree to which they agreed that they: (i) could recognize extremist ideas if they came across them (M = 3.75, SD = 1.00), (ii) would know what to do if they heard extremist conversations in school/everyday life (M = 3.41, SD = 1.07), and (iii) would know where to get help if an extremist was trying to exploit them or a friend (M = 3.36, SD = 1.25). Options were “strongly agree”, coded as (1), “tend to agree” (2), “neither agree nor disagree” (3), “tend to disagree” (4), and “strongly disagree” (5). The capacity to recognize extremism, and to understand how to access support, is important not only for enhancing protective factors around the individual but also for wider approaches to extremism, which increasingly ask the public to report suspicious behaviors or concerns about radicalization to the authorities [73,74,75].
To measure the degree to which the project affected the perceived legitimacy of political violence, a list experiment randomly assigned participants (within both the control and treatment conditions) to indicate how many of either four or five statements they agreed with. Inspired by Dinesen and Sønderskov [76], the authors included one statement that they expected most people would agree with (“I like watching movies”), one statement they expected most people would not agree with (“I want to work as a garbage collector”), two statements they expected people to disagree about (“In schools they ought to teach more music and dancing” and “In schools they ought to teach more religious values”), and, in the five-item condition, also the critical item (“It can be justified to use violence against public authorities or politicians”). We chose this critical item (the technique allows for only one) because the Danish authorities were particularly concerned about direct threats to democracy. Participants were instructed to indicate how many, rather than which specific, statements they agreed with. The order of items was randomized. The mean difference between the four- and five-item conditions was interpreted as the proportion of participants endorsing the critical item.
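The list-experiment estimator described above can be written out explicitly: the prevalence of agreement with the critical item is estimated as the mean number of endorsed statements in the five-item condition minus the mean in the four-item condition. The sketch below (assumed variable names, not the authors’ code) computes this difference in means together with its standard error.

```python
import numpy as np

def list_experiment_estimate(counts_five_item, counts_four_item):
    """Difference-in-means estimator for the prevalence of the critical item.

    Each argument holds, per participant in that condition, the number of
    statements they reported agreeing with (0-5 or 0-4, respectively).
    """
    five = np.asarray(counts_five_item, dtype=float)
    four = np.asarray(counts_four_item, dtype=float)
    estimate = five.mean() - four.mean()  # estimated share endorsing the critical item
    # Standard error of a difference between two independent means
    std_error = np.sqrt(five.var(ddof=1) / len(five) + four.var(ddof=1) / len(four))
    return estimate, std_error
```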
A single measure of participant satisfaction with the project was used, asking participants to select the degree of benefit they felt from hearing about the formers’ experiences (M = 3.90, SD = 0.91). There were five options: (1) “no benefit”, (2) “small benefit”, (3) “average benefit”, (4) “large benefit”, and (5) “very large benefit”.
Control variables: standard items were used to capture gender and age. Location was measured by having participants report their postal code of residency. Previous exposure to extremism was measured using three items (α = 0.61) focused on participants’ experiences of: someone they know “liking extremist content on the internet” (M = 0.56, SD = 0.76), encountering extremist propaganda in their school or neighborhood (M = 0.34, SD = 0.64), and, finally, having concerns about extremism in relation to someone they know (M = 0.41, SD = 0.67). Participants were asked whether they had experienced these situations (1) never, (2) once, or (3) more than once.