Citizenship Education for Political Engagement: A Systematic Review of Controlled Trials

Citizenship Education could play a pivotal role in creating a fairer society in which all groups participate equally in the political process. But strong causal evidence of which educational techniques work best to create political engagement is lacking. This paper presents the results of a systematic review of controlled trials within the field based on transparent search protocols. It finds 25 studies which use controlled trials to test causal claims between Citizenship Education programs and political engagement outcomes. The studies identified largely confirm accepted ideas, such as the importance of participatory methods, whole school approaches, teacher training, and doubts over whether knowledge alone or online engagement necessarily translate into behavioral change. But the paucity of identified studies also points both to the difficulties of attracting funding for controlled trials which investigate Citizenship Education as a tool for political engagement and real epistemological tensions within the discipline itself.


Introduction
Despite the critical democratic role Citizenship Education could and should play in encouraging and enabling political engagement, there remains a dearth of robust evidence as to "what works" (Geboers et al. 2013). Whilst academic interest in approaching the issue through robust methodologies is growing, as this Special Issue is testament to, the field lacks a sense of how many of the multitude of available evaluations can truly be considered reliable members of the evidence base. This paper is therefore the beginning of an attempt to consolidate controlled trial evidence of the causal efficacy of Citizenship Education to produce politically active citizens. This review focuses exclusively on controlled trials (ideally randomized) as a robust method for measuring cause and effect. This is not to suggest that other methods have no value in understanding citizenship education, for controlled trials are certainly limited in their explanatory power, scope, and scalability, but controlled trials represent a frequent omission in the current evidence base which is difficult to compensate through other methods. Campbell (2019) points precisely to this in a recent literature review entitled "What Social Scientists Have Learned About Citizenship Education", and similar reviews by Bramwell (2020) and Manning and Edwards (2014) are also suggestive of a lack of controlled trials. As no systematic review of controlled trials within this area has yet been undertaken, we do so here for explorative purposes, to see how many studies of this kind exist and what aspects of Citizenship Education they address. The aims of this review are therefore two-fold: scoping and mapping, as described by Grant and Booth (2009) in their typology of reviews. These translate into two simple research questions:

1. What is the size and scope of the available research literature documenting control trials of Citizenship Education for political engagement?
2. What type of education initiatives have been described in the literature identified in (1) and what do their findings show?
Whilst we offer some discussion of pedagogical approaches, program delivery methods, and the political outcomes realized, we do not attempt a meta-analysis or grand conclusions in response to the narrower question of what exactly the causal relationship between Citizenship Education and political engagement is, as this would require us to go well beyond the capacity of the evidence we found.

Citizenship Education for Political Engagement
As far back as Addams ([1902] 2002), Dewey (1923), and Marshall (1950), thinkers have recognized that social justice is not guaranteed by mere legal rights but requires active and informed participation in decision making. In other words, social justice must be asserted through the ballot box and an active civil society. A strong participatory democracy (Barber 2003) grounded in equality in political engagement (Dahl 2008;Verba et al. 1995) is therefore a prerequisite for a truly inclusive society. In such a democracy, individuals from all parts of society vote and express their views within their communities to promote the kind of society they wish to see. Crucially, the health of democracies relies on political engagement from citizens of all social backgrounds. Yet in western democracies, in particular in the UK and the US, we see a recurring pattern in which the most privileged social groups are also the most politically active, and consequently able to direct political decision making toward their own interests and priorities (Dalton 2017;Verba et al. 1995). Conversely, disadvantaged groups, which should have the most to gain from asserting their democratic power, have become alienated from a political realm which is not seen as addressing their concerns or speaking their language (Bovens and Wille 2017). One hope of disrupting this vicious circle of political socialization, which reproduces and exacerbates inequalities, is to use education to politically engage all young people, regardless of social backgrounds, during their formative years (Hoskins et al. 2017;Hoskins and Janmaat 2019).
In principle, the subject in which to address young people's political engagement at school is Citizenship Education (alternatively known as Civic Education or Civics, in the US). However, not every conception of citizenship promoted by national education systems encourages active political engagement. In some cases, the co-option by nationalistic agendas (Starkey 2018) might stress compliance, quiet obedience, or intolerance, whilst in others the subject is simply deprioritized (Burton et al. 2015) or depoliticized (from the students' point of view, at least) through the use of a thin liberal conception of citizenship which protects the status quo. As an example of the latter, government policy on Citizenship Education in England has departed in more recent years from an agenda of political participation toward character education and moral responsibilities (Weinberg 2020). Despite this, many teachers and third-sector Citizenship Education organizations have tried to keep the original political focus alive, and it is this interpretation of the concept of Citizenship Education as a tool for encouraging political engagement which is of interest to us in this paper.

Why Control Trials?
Empirical research on Citizenship Education for political engagement has advanced rapidly in recent years, particularly in relation to the analysis of large international datasets such as the IEA International Citizenship and Civic Study (cross-sectional and comparative data) and even some longitudinal datasets at the national level, such as the Citizenship Education Longitudinal Survey (England). This allows for the analysis of varying degrees of exposure to diverse forms of learning citizenship across educational pathways and different education systems. For example, Hoskins and Janmaat (2019) find an association between exposure to Citizenship Education in schools in England and voting intentions at age 16 and, particularly encouragingly, some indication that disadvantaged students appear to benefit the most (Hoskins and Janmaat 2019). However, Hoskins et al. (2012) also warn that Citizenship Education does not always have this positive effect, and it is in establishing exactly "what works" that the picture becomes far less clear, not least because modes of delivery, program design, and implementation can vary considerably within the same education system. One attempt to parse apart different pedagogical approaches to Citizenship Education has been through the conceptual distinction between acquisition and participatory models of learning (Sfard 1998), with some within the field suggesting that the evidence weighs more heavily on the success of the participatory approaches (Hoskins et al. 2021). For example, there is very strong evidence that an open classroom method of learning, which would be considered an inherently participatory approach, is associated with political engagement (Torney-Purta 2002; Campbell 2008; Hoskins et al. 2012; Quintelier and Hooghe 2012; Keating and Janmaat 2016; Knowles et al. 2018), positive attitudes towards political engagement (Hoskins et al. 2021; Geboers et al. 2013, p. 164), critical thinking (Ten Dam and Volman 2004), citizenship skills (Finkel and Ernst 2005), political knowledge (Hoskins et al. 2021; McDevitt and Kiousis 2006), and political efficacy (Hoskins et al. 2021). Such evidence is certainly highly suggestive, but does it demonstrate causation?
In reality, convincingly establishing causation between different types of Citizenship Education programs and political engagement outcomes is something that can only be approached by degree. There is no panacea, and the notion of unequivocal demonstrable causality falls apart on metaphysical as well as methodological grounds. Nevertheless, there are pragmatic criteria (such as the Bradford Hill criteria (Hill 2015)) which can be turned to when making a case for or against the existence of a causal relationship. Different methodological approaches allow for different elements of such criteria to be invoked. For example, theory-led approaches may allow for plausible causative mechanisms to be revealed, whilst longitudinal data may allow one to show that the suspected cause temporally precedes the implied outcome. Away from the analysis of secondary data, many small-scale evaluations of specific Citizenship Education initiatives combine these two principles by explaining the theoretical basis of the program and then administering surveys to participants before and after the program. A successful example of this is Oberle and Leunig (2016), who used this approach to suggest that using simulation games in Citizenship Education classes can lead to improved knowledge about the European Union's political processes and increasing levels of trust, in particular for more socioeconomically deprived groups.
But controlled trials can add unique value to this mix of methods, as they have a characteristic not available to other methods (we should note here that Oberle and Leunig themselves acknowledge that control groups would have strengthened their study). For whilst statistical techniques applied to data may attempt to retrospectively estimate the effect of both observed and omitted variables (i.e., unobserved heterogeneity), they cannot be expected to satisfactorily reconstruct the counterfactual. In other words, what would have happened if the participants had not received the educational treatment? By comparison, a randomized controlled trial (RCT) comes as close as is possible outside of laboratory experiments to reconstructing the counterfactual by introducing a control group whose members are subject to the same measurements (normally pre- and post-intervention) as the treatment group but are not exposed to the treatment itself. Given sufficient numbers, the statistical expectation is that the random allocation of individuals to the control or treatment group minimizes any difference between the groups other than their exposure to the treatment, with the highest level of confidence requiring multiple trials carried out by independent research teams, each with large numbers of participants. In this review we also include studies in which the allocation of participants or participant-groups is not strictly random, as Citizenship Education initiatives are frequently compelled to make use of existing organizational structures, such as classes within schools. This clearly weakens the method to some extent but can still be a useful step toward making a causal argument if the groups have comparable baseline characteristics and are in the same environment.
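The logic of random allocation described above can be illustrated with a small simulation. The following is a minimal sketch rather than a model of any study in this review: the confounder ("prior political interest"), the effect size, the enrolment probabilities, and the function name `simulate` are all assumptions introduced purely for illustration. It compares the estimated effect under random allocation with the estimate obtained when more interested students self-select into the program.

```python
import random
import statistics


def simulate(n=20000, true_effect=0.5, randomized=True, seed=0):
    """Return the difference in mean outcomes (treatment minus control)
    for one simulated trial. All quantities are hypothetical."""
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n):
        # Unobserved prior political interest: a hypothetical confounder.
        interest = rng.gauss(0, 1)
        if randomized:
            # Random allocation: a coin flip, independent of interest.
            in_treatment = rng.random() < 0.5
        else:
            # Self-selection: interested students enrol far more often.
            in_treatment = rng.random() < (0.8 if interest > 0 else 0.2)
        # Outcome depends on interest, on the treatment, and on noise.
        outcome = interest + (true_effect if in_treatment else 0.0) + rng.gauss(0, 1)
        (treated if in_treatment else control).append(outcome)
    return statistics.mean(treated) - statistics.mean(control)


rct_estimate = simulate(randomized=True)
self_selected_estimate = simulate(randomized=False)
print(f"randomized allocation:    {rct_estimate:.2f}")
print(f"self-selected allocation: {self_selected_estimate:.2f}")
```

Under these assumptions the randomized estimate should fall close to the simulated effect of 0.5, whereas the self-selected estimate is inflated because the two groups already differ in prior interest before any teaching takes place.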
As Connolly et al. (2017, p. 14) put it, "What RCT's offer, therefore, is not just the opportunity to provide robust evidence relating to whether a particular program is effective or not, but also-and over time-the creation of a wider evidence base that allows for not only the comparison of the effectiveness of one program or educational approach over another but also for how well any particular program works in specific contexts and for differing subgroups of learners". Yet control trials have also been contested within education research. Connolly et al. (2018a) identify four underlying criticisms: (1) that RCTs are not possible, on a practical level, to undertake; (2) that they ignore context; (3) that they seek to generate universal laws of cause and effect; and (4) that they are inherently descriptive and do not advance theoretical understanding. But the subsequent analysis by these authors of over 1000 RCTs of educational initiatives casts doubt over each of these criticisms, demonstrating that controlled trials can be undertaken, can acknowledge context by including process evaluation and differentiating effects on subgroups, can discuss the limitations of the generalizability of findings, and can be both rooted in theory and make arguments for the future development of theory. However, Connolly et al. (2018a) also note that the extent to which particular studies address these concerns can vary, and the debate within educational research continues. Each of these points of contestation is as applicable to Citizenship Education research as it is to educational research in general, to which might be further added the particularly acute influence of Paulo Freire's critical pedagogy (Freire 1996) on Citizenship Education for political engagement (Crawford 2010) and by association his scrutiny of research power dynamics and wariness of techniques associated with positivism and the reinforcement of structures of control (Freire 1982; Brydon-Miller 2001).
We do not resolve these debates in this paper, but simply note them as an important context prior to presenting the results of the systematic review.

Search Protocols
Our approach is similar to that of Sant (2019), who recently undertook an exploratory systematic review within a related field, though focusing on conceptualizations rather than controlled trials. The systematic review begins with searches for standardized terms (known as protocols) in all appropriate academic databases before the articles were screened manually. We operationalized our focus on controlled trials within the search protocol through the inclusion of the term "controlled trial" as well as the common variant "control trial". The abbreviation for randomized controlled trials, "RCT", was found to be largely redundant given the previous terms and was left out, as it leads to the inclusion of studies on Rational Choice Theory. We also include the terms "citizenship" or "civic" along with "education", capturing what we believe to be the most common signifiers within the field. Admittedly, there is now a proliferation of different terminology used for Citizenship Education, both in schools and also non-formal learning within the youth and third sectors, so our coverage cannot be considered complete. Variants such as Global Citizenship Education, Education for European Citizenship, and Education for Democratic Citizenship each have slightly different meanings and associations, but by including the words "education" and "citizenship/civic" as free floating search terms rather than joining them into a phrase (i.e., "citizenship education" or "civic education"), our searches should at least include studies which use alternative phrasings of this type.
The word "political" is also included to narrow the results to those studies concerned with education as a route to political engagement rather than the nationalistic or liberal (depoliticized) conceptions of Citizenship Education described previously. As with all the qualifier terms, and no more so than with the word "political", the mere use within a search protocol does not guarantee that the resulting articles reflect the meaning of the words in the way we would wish them to. The false inclusion of articles by the protocol, whereby studies do not, for example, measure what we consider to be political outcomes, is dealt with during the manual screening process explained in the next section and is only problematic insofar as it necessitates subjectivity and injects some inefficiency into the review process. Of far more concern were false exclusions, whereby the protocol, when applied to a database, does not return articles which actually do describe control trials of Citizenship Education for political engagement.
Indeed, it soon became apparent that searching conventional academic databases and indexes was producing sparse results. To give one example, the Web of Science produced only three results which fulfilled the criteria, two of which passed the manual screening. This trend was widely repeated with 40 other relevant databases, yielding just 125 results, with only four of these passing the manual screening. This appears to be due to the inability of most databases to perform full text searches on many of the articles and our requirement of four search terms to properly specify what we were looking for. We therefore turned to Google Scholar, where the same protocol produced results of an entirely different order of magnitude (>13,000) and included all the studies from the conventional academic databases previously searched that had successfully passed the manual screening stage. Although Google Scholar is far more restrained in the sources it draws from than a conventional Google search, its coverage is much wider than curated academic databases, is inherently multidisciplinary, and makes use of semantic search algorithms which attempt to return results corresponding to the meaning of the search terms rather than only literal matches. All of this contributes to a liberal return of results, but with a trade-off in accuracy and reproducibility, and makes Google Scholar rather less systematic than is ideal for a systematic review, as also documented by Gusenbauer and Haddaway (2020). But these same authors note the popularity of semantic search engines for exploratory research. Moreover, despite the shortcomings, this study is illustrative of their undoubted appeal in this regard, as it was only Google Scholar that allowed us to find the studies we eventually selected, albeit combined with considerable manual screening.
Researchers will find that an immediate problem which arises when taking this more inclusive route is that the number of results can exceed the capacity for manual screening. In our case, the inspection of the results showed them to be dominated by medical studies of little relevance, RCTs being far more prevalent within medical research. Therefore, after some experimentation, we found that by using some medical terms as disqualifiers we were able to reduce the search results to a manageable number of 2620 articles, which progressed to the manual screening stage.
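The structure of the search protocol described above can be sketched as a generic Boolean query. This is an illustration only: the function names are our own, the exact operator syntax differs between databases, and the two excluded terms ("patients", "clinical") are hypothetical placeholders, since the medical disqualifiers actually used are not listed here.

```python
def quote(term):
    """Wrap multi-word terms in quotes so they are matched as a phrase."""
    return f'"{term}"' if " " in term else term


def build_query(any_groups, all_terms, excluded_terms):
    """Combine OR-groups, required terms, and excluded terms into one string."""
    parts = []
    for group in any_groups:
        # At least one term from each group must appear.
        parts.append("(" + " OR ".join(quote(t) for t in group) + ")")
    # Free-floating terms that must all appear somewhere in the record.
    parts.extend(quote(t) for t in all_terms)
    # Disqualifier terms, written with a leading minus sign.
    parts.extend("-" + quote(t) for t in excluded_terms)
    return " ".join(parts)


query = build_query(
    any_groups=[["controlled trial", "control trial"], ["citizenship", "civic"]],
    all_terms=["education", "political"],
    excluded_terms=["patients", "clinical"],  # hypothetical medical disqualifiers
)
print(query)
```

Keeping "education" separate from "citizenship" and "civic", rather than joining them into a single phrase, mirrors the decision described above to capture variants such as Global Citizenship Education.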

Manual Screening
All 2620 articles identified by the amended search protocol were then screened manually, first by title, then by abstract, and then by full text where necessary. The process through which a decision was made as to whether to include an article in the final list can be conceived of as a set of criteria, some of which are objective in nature and therefore simple to apply, and some of which unavoidably require more subjective judgments. We briefly list the criteria below and provide some examples of the more subjective judgments which were made in implementing the final two criteria.

1. Article returned by search protocol. Results were not filtered by date, though the oldest study identified as fulfilling all of the subsequent criteria below was published in 2006.

2. Article provides sufficient detail in English (or has an accessible English translation available) on which to make assessments for all other criteria. A certain amount of detail of the study is required in order to make an informed judgment. If a study was briefly outlined in an article with references to a more adequate description elsewhere, then it was included on the basis of the secondary source. It should be noted that the search itself biases results toward English language articles, as the search terms entered were in English.

3. Article is not a representation of a study which has already been identified. Although Google Scholar is efficient in nesting multiple versions of the same article within a single result item, occasionally multiple accounts of the same evaluation were found (e.g., a policy paper and academic article), in which case the most complete account was selected.

4. Study uses control groups to produce quantitative data to which statistical testing is applied. Studies which do not use control groups, use comparison groups only for qualitative purposes, or do not deploy statistical testing on results were excluded. However, no stipulations were made on sample size, and allocation to control groups did not need to be random.

5. Study evaluates an education scheme. Whilst the interdisciplinary nature of Google Scholar allows for studies to be included which have not been published in education journals, it creates a slight issue during screening in having to decide what represents an educational program. In the case of Citizenship Education, it is not appropriate to limit a review to initiatives which take place within formal learning environments such as a school. Rather, we must make a wider but more subjective judgment as to whether the scheme involved a process of systematic formative instruction rooted in pedagogy. In practice, this meant the exclusion of short-term positive reinforcement or suggestive "nudge" mechanisms such as those studied by Aker et al. (2011), Bond et al. (2012), and Costa et al. (2018). Similarly, real-life exposure to political events outside of a learning framework was also excluded, though some studies of this type may nevertheless be instructive for the design of future educational programs. For example, Wong and Wong (2020) undertook an interesting RCT involving exchange students during the Umbrella Movement in Hong Kong, but the experience was not situated within an educational framework, and to include such studies would imply the review should also look at the effect of other life experiences on politicization and begin to broaden the topic away from our core concern.

6. Study measures political outcomes. Given that one of the gaps in the evidence base is an accepted theory of change for instigating political participation, we take a broad approach to political engagement that includes both political actions (protesting in all the diversity of ways this occurs, both online and offline; voting in elections at different levels; and contacting and volunteering for political parties) and the competences (attitudes, values, knowledge, and skills) that enhance the quality of the engagement and enable competent political behavior. The list of possible knowledge, skills, attitudes, and values that this could encompass is vast, but a useful delineation which resonates with our own understanding is the Council of Europe (2018) reference framework for democratic culture. In practice, this amounted to the exclusion of initiatives aimed at developing teamwork or individual character traits, which featured prominently in the search results but for the most part had little direct relevance to political engagement (e.g., Siddiqui et al. 2019; Connolly et al. 2018b; Silverthorn et al. 2017; Siddiqui et al. 2017; Kang 2019). We also found several studies dealing with conflict resolution, community cohesion, and reducing violent behavior, but these were again screened out, as their concern was generally restricted to harmonious societal relations rather than active political behavior (e.g., Niens et al. 2013; Chaux et al. 2017; Enos 2013), though we acknowledge that counter-arguments could be made here.
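The screening criteria above can be read as an ordered sequence of checks, with a record excluded at the first criterion it fails. The sketch below is a hypothetical illustration of that logic, not the tool used in the review: the `Record` fields, criterion labels, and example records are assumptions, and in practice the judgments for criteria 5 and 6 were made by human readers rather than stored as simple flags. Criterion 1 (returned by the search protocol) is taken as given for any record that reaches this stage.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    """One search result with the reviewer's judgments (hypothetical fields)."""
    title: str
    english_detail: bool            # criterion 2
    duplicate_of_included: bool     # criterion 3
    control_group_with_stats: bool  # criterion 4
    educational_program: bool       # criterion 5 (subjective judgment)
    political_outcome: bool         # criterion 6 (subjective judgment)


# Criteria are applied in order; each pairs a label with a pass condition.
CRITERIA = [
    ("2: sufficient detail in English", lambda r: r.english_detail),
    ("3: not a duplicate account", lambda r: not r.duplicate_of_included),
    ("4: control group with statistical testing", lambda r: r.control_group_with_stats),
    ("5: evaluates an education scheme", lambda r: r.educational_program),
    ("6: measures political outcomes", lambda r: r.political_outcome),
]


def first_failed_criterion(record: Record) -> Optional[str]:
    """Return the label of the first criterion failed, or None if included."""
    for label, passes in CRITERIA:
        if not passes(record):
            return label
    return None


# Hypothetical example records, for illustration only.
nudge_study = Record("Text-message voting reminder", True, False, True, False, True)
included_study = Record("Mock election curriculum", True, False, True, True, True)

print(first_failed_criterion(nudge_study))     # excluded at criterion 5
print(first_failed_criterion(included_study))  # None, so the study is included
```

Recording the first failed criterion, rather than a simple include or exclude flag, makes it possible to report how many articles were lost at each stage of screening.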

What Types of Programs Have Been Tested by Control Trials?
In total, 25 controlled trials which test political outcomes deriving from educational initiatives have been identified (Table 1). To structure the discussion of these studies we group the RCT articles based on different approaches that have been considered, within the international practitioner field of citizenship educators, to be successful in teaching Citizenship Education (UNESCO 2015). The first three categories describe different strategies for delivering Citizenship Education within schools. School-based Citizenship Education can either be delivered as a stand-alone program, as a cross-curricular approach, or as a holistic whole school approach which influences multiple aspects of school life under a guiding ethos. Underpinning each of these three is teacher training, which can itself be the focus of initiatives and therefore represents our fourth category. However, Citizenship Education does not only happen within schools, and any initiative outside of the education system (e.g., by NGOs or community groups) is referred to as "non-formal", and the articles on such programs comprise our fifth category. Our final two categories could occur both in non-formal programs and in the various aspects of school life. These two themes describe initiatives with a clear participatory learner-centered approach (category six) and those looking to unlock the potential of digital techniques, generally within online environments (category seven). Our categories should not be considered mutually exclusive parts of a comprehensive typology, but rather as useful ways to present the results which reflect common practitioners' vocabulary. To avoid repetition in the discussion below, we focus upon the most illustrative studies for each category, with Table 1 representing a more thorough categorization, in which some articles are tagged as belonging to more than one category.

School-Based Program (Stand-Alone)
The classroom is the theatre in which specific teaching practices play out, and it is the specific activities within the classroom which most immediately come to mind when thinking of Citizenship Education. Representative of this is the Student Voice program (Syvertsen et al. 2009), in which students practice civic skills, debate political issues, and connect their own community interests to the platforms of candidates before simulating the process through mock elections. Teachers invite local candidates and journalists into the schools for question-and-answer sessions with students. The RCT was of 1670 high school students in 80 social studies classrooms and found significant effects of the program on various self-reported political measures, such as the ability to cast an informed vote, knowledge of the voter registration process, belief that their vote matters, communication with others at school about politics, sense of civic obligation, and media use and analysis. This alone is quite persuasive evidence that the type of basic participatory good practices long spoken about in the field (Hoskins et al. 2012) can show signs of causal efficacy under control trial conditions. Yet some programs have gone beyond this standard good practice and produced intriguing results in doing so. Notably, the study by McDevitt and Kiousis (2006) of the Kids Voting program appears to show that incorporating the students' home environments as part of the learning environment may bring an added effect. The Kids Voting program included experiential learning based on group-problem solving, peer discussion, and cooperative activities, and in many ways is somewhat analogous to the previously described Student Voice Program. 
However, what seems to be unique to this program is that it includes activities for the children to complete with their families, such as creating a family election album, roleplaying in which students act as political reporters interviewing family members, and a children's ballot where students can cast a vote at the same polling stations as their parents. The analysis of 491 students aged 16-18 years old suggests that the interplay of influences from school and family magnified the effects of the election-based curriculum and sustained them in the long term, resulting in an increased probability of voting for students when they reached voting age.
However, not all school-based activities will be as successful as hoped, and given the publication bias toward positive results, it is extremely useful to have control trial evidence of the possible limits of some approaches. For example, a promising interactive environmental program which, as in the previous study, involved activities for children to complete with their own families, was run in the UK. Yet Goodwin et al. (2010) found in their study of 448 primary school students in 27 primary schools that there were no effects compared with the control group on behavior, and an extended version of the program did not yield positive results. There is no clear reason why the program did not produce better results, though the vagaries of context and implementation can be difficult to appreciate from a distance. The authors themselves note that the awareness of the control group also rose during this period, which would seem to suggest contextual complications.
Continuing on a cautionary note is the study by Green et al. (2011), who strongly question the assumption that knowledge alone leads to attitudinal or behavioral change. They undertook an RCT of an enhanced civics curriculum involving 1000 students aged 15 to 16 in 59 high schools. The curriculum looked to increase their awareness and understanding of constitutional rights and civil liberties, and although the students displayed significantly more knowledge, no corresponding changes in their support for civil liberties were found. The association between knowledge and behavior change has been critiqued before, not least from the stance of critical pedagogy, which suggests that the assimilation of knowledge can lead to a passive acceptance of the status quo, but to have such clear control trial evidence of the inability of knowledge alone to lead to political behaviors is of real value.

Cross-Curricular Approach
Whilst the efficacy of the acquisition of knowledge alone is widely doubted, the significance of skills development is a much more contested area, and one study provides evidence that learning environments which consistently encourage social skills can encourage political engagement. Holbein (2017) addresses this by testing the hypothesis that the targeted development of social and emotional skills can in itself lead to behavioral changes in political engagement. The study looks at the impact of a wide program of interventions to develop social and emotional skills including parent training, peers training, stories, films, games, roleplays, and joint reading activities. The study involving 812 students across 55 schools seems to point toward the importance of the quality of social interaction within the learning environment for the development of these skills rather than the valorizing of a single activity. The finding is quite striking, as it seems to indicate that the early development of psychosocial skills leads to a noticeable increase in long-term voter turnout.
In some jurisdictions, schools can decide to run Citizenship Education itself across the curricula, traversing traditional subject areas. One successful example of this was the science and civics instruction used to promote sustainable development in the article by Condon and Wichowsky (2018), who studied a program for 11-14 year-olds aimed at developing citizen-scientists in the US. The program was based on a real-world, community improvement, and problem-based inquiry that focused on reducing the unnecessary use of resources. It gets students to monitor the use of gas, electricity, and water in their home and in their school and to conduct experiments to identify if they can reduce consumption. The clustered RCT included 551 students across 13 schools and found that integrating science and civics into a unit about community water conservation improved engagement in both areas.

Whole School Approach
The ultimate elevation of school-based Citizenship Education from a single subject, and even beyond a cross-curricular approach, is the whole school approach (Gibb 2016). Given that the practice itself is less common, we are fortunate to have the experiment by Gill et al. (2018) involving a U.S. charter school, which uses controlled trial principles to evaluate the effects of a whole school approach driven by the unique mission and strategy of the organization. One of the more distinctive features of these types of schools in the U.S. context is that they are publicly funded but independent from officials, and yet still have a core civic mission. The specific school studied, "Democracy Prep", has educated more than 5000 students across multiple campuses in New York, and its mission statement is "to educate responsible citizen scholars for success in the college of their choice and a life of active citizenship". This school facilitates the learning of citizenship throughout its curricula, including experiential learning (visiting legislators, attending public meetings, testifying before legislative bodies, and running get-out-and-vote campaigns during elections) and more traditional knowledge-based activities like writing essays on civics and governance. To give just one specific example, during the final year students develop a "change the world" project in which they investigate a real-world social problem, design a method for addressing the issue, and then implement their plan. By taking advantage of the random allocation of 1060 students (due to oversubscription) into the charter school, Gill et al. (2018) found that those who were admitted to the school went on to have an increased probability of future voting. This is very important evidence that a school, by adopting a civic mission and civic ethos which then allows citizenship to flow into all aspects of school life, can motivate tangible differences in political behaviors.

Teacher Training
Any Citizenship Education scheme is only as good as its implementation, and it can be easy to overlook the differences in the capabilities and enthusiasm of teachers to deliver programs. Indeed, there are two studies which provide some indication that investing in the development of teachers really can make a difference. For example, Andersson et al. (2013) showed that an initial teacher training on education for sustainable development (ESD) led to positive effects regarding the attitudes, perceptions, felt personal responsibility, and desire to contribute toward sustainable development among the student-teachers. This comes from an analysis of parallel-panel data surveys of 404 student-teachers which included a control group but was not randomized.
Whether or not well-intentioned teachers are then able to pass this on to their own students is of course another question. But the Facing History program studied by Barr et al. (2015) suggests that arming teachers with conceptual tools and teaching materials can result in observable changes in the students. The program was evaluated in the US through an RCT amongst 14-16 year-olds (n = 1371), which found that when teachers who had received this training and been given the materials brought the program into the classroom, it promoted respect and tolerance for the rights of others among the students, an increased awareness of prejudice and discrimination, and a sense of civic efficacy. Whilst untangling the training of the teachers from the classroom methods they then implement is difficult, these examples provide some evidence that quality teacher training should at least be a component of introducing effective political Citizenship Education into the classroom.

Non-Formal Education
Stepping momentarily away from schools, we now consider some controlled trials which looked at interventions outside of the formal education system and are therefore referred to as "non-formal". Some of these non-formal programs look at the effect of community or group-level initiatives on the political engagement of the individual. For example, Blattman et al. (2011) used a clustered RCT to evaluate a community empowerment program in Liberia across more than 230 communities. Their study measured respect for human rights, equality, civic participation, and community cohesion, and the findings showed modest increases in the first two but little change in the latter two. The authors also stress that the observed impacts did not always occur in expected ways, which perhaps highlights the complexity of operating in the community and the relative lack of control organizers have over such socially dynamic environments when compared to a school setting.
More encouragingly, in the UK, a Cabinet-Office-funded evaluation of the National Citizen Service program by Booth et al. (2014) yielded some positive results. The National Citizen Service runs over five phases, from residential inductions to community-based action projects. Though initially restricted to self-reported attitudes, the quasi-experimental study goes on to measure overall increases in community engagement, volunteering, and intention to vote amongst 7379 of the 15 to 17 year-olds in the study.
Other non-formal initiatives looked at the effect of providing basic information to adults. For example, Pang et al. (2013) investigated the effects of training women in China on their voting rights for village committee elections. Involving 700 adults, the RCT demonstrated that the women who had received the training not only had a greater knowledge of their rights but were also more likely to exercise these rights. The authors are clear that the study shows that the lack of basic knowledge in rural villages is a barrier to voting in village committee elections. Barros (2017) also looked at the effect of providing basic information on the importance of voting, concluding that the participants studied in Portugal could be encouraged to vote if this led to their valuing the act itself, a phenomenon the author terms warm glow voting. These results appear to nuance the previous observation that knowledge does not lead to action, by showing that, in specific contexts, and in applied settings rather than in the classroom, basic timely information can make a difference. However, as acknowledged in the latter experiment, it is the value placed on the act as a result of a greater understanding, rather than merely the knowledge itself, which is ultimately responsible for motivating the action.
An interesting project that operated as a hybrid between formal and non-formal education and combined knowledge acquisition with participatory approaches was conducted in Peru (Agurto and Torres 2020). This project combined knowledge acquisition on financial literacy and life skills training on leadership, public speaking, and team-work with sending students as ambassadors into the community as change makers to support the provision of basic bank accounts and financial inclusion for disadvantaged communities. The project involved 131 students from a university scholarship program and led to an increased level of self-efficacy, empowerment, and community engagement for female students.
Finally, Bowen and Kisida (2018) looked at different perceptions of civil rights after Holocaust museum visits. They report a positive impact on students' desires to protect civil rights and liberties across 865 students participating in an RCT in 15 middle and high schools. However, the effects are limited and seem to stop short of behavioral change, with no significant evidence that the intervention affected students' sense of civic obligation, empathy, willingness to take on roles as upstanders, or inclinations toward civil disobedience. This study is therefore more consistent with the notion that knowledge alone, even when affecting students, has its limits in triggering political mobilization. There are also notable interactions with gender, ethnicity, and social class, which should serve as a warning of the danger of drawing universal conclusions from controlled trials and a reminder of the benefit of obtaining large sample sizes, so that these finer-grained analyses can be conducted.

Participatory Approaches
Many of the initiatives described by the articles identified in this review have made use of participatory techniques to a greater or lesser extent, among which we can include regular discussions, debates, and simulation exercises (such as mock elections and trials) (Hoskins et al. 2012). For example, Kawashima-Ginsberg (2013) found evidence for the efficacy of exactly these practices in a controlled trial analysis of 10 to 16 year-old pupils' scores on the national civics assessment test. For brevity, we will not repeat the description of other studies with common participatory elements described under different headings, but would encourage readers interested in this theme to look at the studies by Syvertsen et al. (2009); McDevitt and Kiousis (2006); Gill et al. (2018); and Condon and Wichowsky (2018).
That said, special attention under this heading is given to a couple of articles which are particularly instructive. Firstly, a very thorough participatory approach was studied by Ozer and Douglas (2013). This program in the U.S. tested the difference that participating in youth-led research has for the young people involved. The approach is learner-centered at every stage, with the research topics selected by the students themselves, and consequently included a diverse range of topics, such as: prevention of school drop-out; stress related to family, academics, or peers; improving the school lunch; cyber-bullying; improving teaching practices to engage diverse students; and improving inter-ethnic friendships at the school. The RCT study involved 401 students at five high schools and found that attending these participatory research elective classes during the school day was associated with increases in the students' sociopolitical skills and motivation to influence their schools and communities. The indication that learner-led approaches such as this may circumvent the previously discussed disconnect between knowledge and motivation to act is a primary attraction of participatory methods over more acquisition-based approaches.
Secondly, the study by Feldman et al. (2007) is unusual in that it was able to isolate the effects of various elements of a Student Voice program. The program as a whole was quite participatory in that teachers were given a framework of election-based activities but could deviate significantly based on student interests. Overall, the program produced increased interest, knowledge, and efficacy with regard to politics, as measured across 22 U.S. high schools, each of which had a control group. But the authors were also able to show that it was political discussion within classrooms which was the primary driver of this change, more so than other eye-catching activities within the program, such as actually meeting the election candidates.

Digital
Perhaps the timeliest studies are those which evaluate the emergence of online learning environments. The findings across this section suggest that the digital world is similar to the offline world and that it is high engagement actions, in this case the student-led creation of content, that lead to changes in attitudes and behavior.
A study by Smith et al. (2009) stresses the importance of active participation in online environments. They conducted a novel RCT of online discussions in moderated chatrooms using a large mixed-age market research panel in the UK (n = 6009) and found that only those who posted content showed evidence of developing their opinions through discussion. This is contrasted with those who spent time reading the message boards but did not actively post themselves, and who subsequently showed no discernible change in opinions. Strandberg (2015) carried out a similar online deliberation RCT with 70 adults in Finland, finding some alleviation of the polarization of opinion as well as some effect on the participants' feelings of efficacy.
The importance of social support within online environments is taken up by Levy et al. (2015), who studied a sample of 309 US high school students, out of which one class was instructed to keep political blogs to document their thoughts on the unfolding election. The authors find that the "bloggers" developed greater political interest and confidence in their political skills and knowledge, even when compared to their peers in other government courses. However, the authors also note that some students got frustrated at the lack of responses to their blog posts, pointing toward the importance of a receptive audience within a community of learning if this technique is to be further developed. On this same point, Margetts et al. (2009) showed that a mechanism can be built into online environments which simulates the social support and pressure of collective action. Their controlled trial found that among 668 adults, it was those who had received positive feedback from supportive participants who were more likely to go on to sign more online petitions. But a note of caution is sounded by Vissers et al. (2012) to those who assume that online political activity necessarily translates into offline action. Their RCT study on Belgian university students found that learning activities run online on climate change only influenced online behavior and did not change offline behavior.
Yet the evidence that an online environment can develop core political skills is stronger. A study from Hong Kong, China (Chan 2019), looked specifically at the use of a digital storytelling program run through the online platform Facebook for the development of civic identity and skills. Though the program was not explicitly political, we include this RCT, involving 87 participants aged 16 to 24 outside the formal education system, as it showed evidence of improvements in relevant skills and dispositions, namely enhanced critical thinking, along with an accompanying decline in ethnocentric views. The article by Kawashima-Ginsberg (2012) also demonstrates how online methods can stimulate political skill development. Using assessment scores to evaluate an iCivics computer-based teaching module, the study showed through a clustered RCT of 1526 students aged 12 to 15 in 42 schools in the US that the program was effective in improving the grades students received for writing a persuasive letter to a newspaper.
During the course of our searches, we also came across several papers which help to explain why there have not been more RCTs in Citizenship Education. These largely reflect the more general concerns of applying RCTs to education research discussed previously, but with specific reference to Citizenship Education. Some of these underline the valid, practical concerns that the demands of running a satisfactory RCT are too exacting and expensive. Bakker and Denters (2012) point out that the ideal of a classical experiment is generally unachievable, as the number of subjects in each of the treatment and control groups really has to be quite large to even out the variance in all relevant characteristics, and this is without considering whether the true unit of analysis should be the collective rather than the individual (the clustering of students within classes and schools should at least be taken into account). In a similar vein, Shek et al. (2012) note that it is very expensive to conduct randomized group trials in an adequate variety of settings to demonstrate the generalizability of a program outside a specific set of conditions. Yet there is also more fundamental epistemological and ontological resistance. Mathison (2009) questions whether certain assumptions might be part of a neoliberal ideology of efficiency and commoditization within education, including the notion that accountability is necessarily good if linked to competitive marketplace practices or narrow econometric thinking. Postcolonial critiques, such as that given by Singh et al. (2018), point out that the relationship of the researcher-researched has been compared to that of the colonizer-colonized, particularly when reliant on the types of standardized measures which are a feature of all RCTs. For such reasons, decoloniality has tended to favor the transparency and inclusiveness of qualitative or participatory research praxis. Yet we also found the argument that RCTs can be a part of progressive post-positivism. Shek et al. (2012) suggest in their evaluation of a youth development course in Hong Kong that post-positivism can be understood as embracing the multiplicity of available methods, rather than valorizing certain qualitative approaches, whilst Singh et al. (2018) go on to reject the idea that quantitative paradigms are impermeable to reflexivity and decoloniality and begin to demonstrate how the methodological principles of controlled trials can be more reflectively administered so as to properly acknowledge oppression. Bakker and Denters (2012) note the parallels between experiments and action research as a reason for optimism, in that both actively interfere in reality. This points to a possible path toward rehabilitation for controlled trials if they follow action research tenets to place the disadvantaged group as the primary stakeholder and client, which may involve minimizing the influence of preconceived policy and academic agendas. Bakker and Denters (2012) go on to suggest that the design experiment methodology represents a way forward (the term "design" referring to the blueprint of a new instrument that is to be developed during the research process). Stoker and John (2009) similarly indicate that if experimentation is stripped of its black box dogmatism and researchers try to directly observe and understand apparent change, then comparison groups can still play an important role in providing policy makers with the type of evidence they respond to.

Conclusions
The number of controlled trials which truly address Citizenship Education for political engagement is unsurprisingly small. Not only does the field have a history of institutional abandonment and co-option, but there is some reluctance within the research community to fully embrace controlled trials. This concern is based on a desire to promote the interests of powerless and unrepresented groups, but those who champion controlled trials also share that same goal and see those groups as poorly served by a lack of understanding as to which educational methods really do work to break the cycle of political socialization which reproduces and exacerbates inequalities. Reconciling these epistemological tensions within the field will doubtless be an ongoing theme over the coming years.
It would be premature to draw overly concrete conclusions, given the very limited evidence base, but the general picture is one which appears to broadly confirm the existing knowledge in the field rather than revealing new findings, underlining the role of controlled trials in ensuring that an existing educational method is effective. The starkest gap in the evidence base is geographical, with 17 out of the 25 studies being from the US or UK and only four studies evaluating projects from the Global South, with two of these from China. This is particularly important, given that there can be no safe assumptions that findings in one cultural context will stand in another.
The studies identified are quite evenly split between those which aim to improve knowledge and skills and those which seek to change attitudes or behaviors. These two domains do not necessarily cross-pollinate, and many of the studies which showed enhanced cognitive learning did not show changes in behavior, a point made most explicitly by Green et al. (2011). However, the studies do suggest that this view requires some nuance, as it seems that the provision of basic knowledge on civic duties, such as how to vote and why it is important, may initiate changes in attitudes and behaviors in circumstances in which this base awareness is lacking (Pang et al. 2013; Syvertsen et al. 2009). Likewise, the teaching of psychosocial or noncognitive skills, even when separated from political education, appears to yield promising results (Holbein 2017). But most of the studies which led to changes in attitudes or behaviors were essentially participatory. The clearest examples of this participatory approach are perhaps Ozer and Douglas's (2013) study of a participatory research class and McDevitt and Kiousis's (2006) study of simulated political discussions within families. There are also signs that the participatory approach to attitudinal and behavioral change is applicable to online interventions, with the evidence being that active engagement (as opposed to passive viewing) and peer feedback mechanisms play a similarly critical role online as they do offline (Smith et al. 2009; Strandberg 2015; Margetts et al. 2009), though whether online engagement translates into offline action remains in doubt (Vissers et al. 2012). The evidence also supports the effectiveness of a whole school approach (Gill et al. 2018) and the necessity of quality teacher training (Barr et al. 2015).