Is It Working? An Impact Evaluation of the German “Women Professors Program”

Abstract: The Women Professors Program, which was initiated in Germany in 2008, aims to increase the proportion of women professors and to promote structural change in favour of gender equality at higher education institutions (HEIs). It is one of the central gender equality policies in higher education in Germany. The present study evaluates the impact of the program by estimating its causal effects on the proportion of women professors. By adopting a quasi-experimental approach and using a unique dataset (a long-term census of German HEIs), the study shows that the proportion of women professors increased more than would have been expected in the absence of the program. Although the evaluation includes preliminary estimates of mechanisms driving the described impacts, the integration of context factors and mechanisms into the assessment of the impact of gender equality policies remains a desideratum. The study shows that the program is working, and it contributes to redressing the lack of impact studies on gender equality in science and research.


Introduction
As in many other countries, women are still underrepresented in leadership positions at higher education institutions (HEIs) in Germany. In 2007, the proportion of women among the highest-ranking professors (Grade A according to the She Figures) in Germany was only 12%, which was considerably below the European average of 19% (European Commission 2009, p. 76). Ten years later, this proportion had increased to 20%. The Women Professors Program was initiated in Germany in 2008 with the aim of increasing the proportion of women professors and fostering structural change in favour of gender equality at HEIs (Zippel et al. 2016). One of the central gender equality policies in higher education in Germany, the program is financed jointly by the Federal Government and the Laender (federal states). An amount of 150 million euros was made available for the first and second phases of the program (2008-2017). Funding was increased to 200 million euros for the third phase (2018-2022). The program combines structural changes (the implementation of a gender action plan) with financial incentives to appoint women to professorships. The first two program phases have been successfully evaluated (Löther and Glanz 2017; Zimmermann 2012).
The long duration of the program gives rise to the question of its impact on gender equality and on the representation of women in leadership positions. Is the program working? How can the impact of such a gender equality program be measured?
Focusing on the central objectives of the program, this study assesses its impact on the proportion of women professors at HEIs in Germany. In the next section, I elaborate the theoretical and methodological framework of the study by reviewing the literature on impact evaluation and gender equality policies. In Section 3, I outline the program theory and context. This is followed in Section 4 by a description of the research design and data basis of the study. After presenting the empirical results in Section 5, 1 I will discuss them and draw some conclusions regarding the opportunities and challenges of impact evaluations of gender equality programs. My main argument is that the program is working in that the proportion of women professors increased more than would have been expected in the absence of the program, but further investigation is needed to understand the mechanisms driving the impacts.

Gender Equality Policies and Impact Evaluation
Impact evaluations seek to understand the role of an intervention in producing measured and described changes (Fletcher 2015, p. 9). Estimating the counterfactual is the dominant principle for evaluating impact: "Evaluators assess the effects of social programs by comparing information about outcomes for program participants with estimates of what their outcomes would have been had they not participated" (Rossi et al. 2004, p. 234). Randomized experiments are still the gold standard for measuring the counterfactual, but for practical, ethical, or political reasons, quasi-experimental designs are often used to establish causal inference in evaluation studies (Khandker et al. 2012; Steiner et al. 2009). However, evaluation research has gone beyond this approach. Evaluation studies, and the social sciences in general, regard causation as "a probabilistic rather than a deterministic concept" (Steiner et al. 2009, p. 77), and aim to identify differing patterns before and after interventions rather than to perform definitive causal analysis (Stepan-Norris and Kerrissey 2016, p. 229). Instead of attributing outcomes to an intervention, evaluations investigate the way in which an intervention contributed to outcomes (Gates and Dyson 2017, p. 31).
Gates and Dyson (2017) also pointed to the growing acknowledgment "that there are multiple ways to think about causal relationships". They described five different ways of thinking about causality, including a generative logic, which "verifies a theory-based explanation of how causal processes happen by showing how mechanisms work within particular contexts to generate outcome patterns" (Gates and Dyson 2017, p. 36). This reference to mechanisms leads to another critique of impact evaluations. Müller and Albrecht (2016, p. 291) criticized the fact that, to date, impact evaluations had examined the effects of programs, but that there was a lack of knowledge about the mechanisms driving the effects. How the programs work thus remains a black box. Program model evaluation and theory-driven evaluation are approaches to studying the mechanisms that link the elements, context, and outcomes of a program (Astbury and Leeuw 2010; Funnell and Rogers 2011; Rogers 2000). Müller and Albrecht (2016, p. 291) suggest combining both approaches: "Thus, from our point of view the future of impact evaluation has to be a mixture in the sense of combining approaches of black box and white box evaluation. Consequently, we suggest an integrative approach of impact evaluation that covers the estimation of overall intervention effects and the identification of the key drivers of effects by using adequate designs and methods of empirical research."
In the present study, I will follow this suggestion by analyzing the impact of the gender equality program using a quasi-experimental design and by exploring how quantitative data can be used to obtain knowledge about the mechanisms that drive the impact.
Evaluations of the impact of gender equality policies in higher education are still scarce (Kalpazidou Schmidt et al. 2017, p. 27; Stepan-Norris and Kerrissey 2016, p. 228f.; Timmers et al. 2010, p. 720). As noted by Kalev et al. (2006, p. 590), "whereas there has been a great deal of research on the sources of inequality, there has been little on the efficacy of different programs for countering it". One decade later, Stepan-Norris and Kerrissey (2016, p. 228) made a similar observation, namely, that "far fewer studies empirically examine how interventions may matter". Even more striking is the lack of studies measuring quantitative effects (Timmers et al. 2010, p. 722f.). Whereas evaluations in the development policy context often use quasi-experimental designs (Burde and Linden 2012; d'Agostino et al. 2016; Grillos 2018; World Bank n.d.), evaluation studies on gender equality programs in science and research rely mostly on qualitative methods, surveys without control groups and pre-test/post-test measurement, and descriptive and cross-sectional data (Höppel 2016; Munir et al. 2014; Williamson 2016).
The lack of studies on the impact of gender equality interventions in higher education is due partly to the complexity of the educational system. The combination of individual, social, and structural influence factors complicates the detection of causalities (Wroblewski et al. 2007, p. 19). It is difficult to attribute changes in gender equality to programs rather than to "wider contextual trends and factors" (Kalpazidou Schmidt et al. 2017, p. 27). Data availability and financial and time restrictions, which reduce the possibility of using control groups and pre-post measurement, also contribute to the lack of impact studies. Furthermore, impact studies on gender equality have to deal with success indicators. Gender impact and gender equality cannot be limited to the increase in the proportion of women in leadership positions. Studies on women in senior management, for example, have shown that the increase in the proportion of women in these positions has not necessarily changed structural barriers (Neale and Özkanlı 2010; Peterson 2011). As Peterson (2011, p. 626) noted, "gender equality aspects of a qualitative character have to be taken into consideration to measure real change".
There are very few impact evaluations that endeavor to measure quantitative effects. With regard to effects on the level of individual participants, two evaluations of Austrian programs for the promotion of women scientists formed a control group comprising those applicants who were not funded (see Pohn-Weidinger and Grasenick 2011; Wroblewski et al. 2007, pp. 344ff.). A research design such as this requires a sufficiently large group of rejected applications, as may be the case in individual funding schemes (see Heidler 2016).
With regard to impacts at the level of HEIs, Timmers et al. (2010, p. 732) investigated the efficacy of gender equality measures at all 14 universities in the Netherlands. They examined the correlations between implemented measures (which they classified according to individual, cultural, and structural perspectives) and changes in the Glass Ceiling Index and in the proportion of women among academic staff. In conclusion, they noted that "the relationships between policy measures and the reduction of the glass ceiling and between the policies in the cultural perspective and the increase in the proportion of women among professors reveal that the applied policy measures have been effective" (Timmers et al. 2010).
Stepan-Norris and Kerrissey (2016) evaluated the impact of an intervention, the ADVANCE Institutional Transformation Initiative, implemented at the University of California, Irvine (UCI). In order to estimate the counterfactual, they used as a control group seven other University of California campuses where no initiatives of that scale had been implemented. Using descriptive statistics, t-tests, and regression analysis of data on women's faculty representation, hiring, and retention, they could show that "ADVANCE awards have the potential to increase women's representation over and above what may occur otherwise. We find that although all the UC campuses increased women's representation, UCI under the ADVANCE program made the biggest gains" (Stepan-Norris and Kerrissey 2016).
Contrary to these studies attesting to the efficacy of gender equality policies, an econometric study on the impact of the Athena SWAN charter, an initiative aimed at advancing women's careers, in UK medical schools arrived at a negative conclusion: "Despite a general increase in female employment and widespread adoption of the standards of Athena SWAN amongst UK medical schools there is no evidence yet to suggest that either the introduction of the Athena SWAN charter or the announcement of NIHR to tie future funding to Athena SWAN silver status has led to a measurable improvement in the careers of females employed in UK medical schools" (Gregory-Smith 2015, p. 21). The aforementioned study compared medical schools at universities with and without an Athena SWAN award by using panel data on the proportion of women faculty and calculating fixed effects and difference-in-differences. Gregory-Smith's findings contrast in part with those of Caffrey et al. (2016), who used a realistic evaluation approach to assess the effects of Athena SWAN in academic medicine. Based on in-depth interviews, focus groups, and participant observation, Caffrey et al.'s qualitative study at one HEI found that, although participants reported positive outcomes (for example, the creation of social space to address gender inequity), the program's implementation had reinforced gender inequalities because female staff had to carry out "a disproportionate amount of Athena SWAN work, with potential negative effects on individual women's career progression". Moreover, participants perceived the impact of the program to be "undermined by wider institutional practices, national policies, and societal norms", and early career researchers reported difficulties accessing program initiatives. The more differentiated results may stem from the different research designs and also from the fact that Gregory-Smith did not integrate his impact evaluation into a theoretical framework of gender equality. Without program theory, it remains unclear whether the Athena SWAN charter, which was implemented in the whole university, can increase the percentage of women faculty in the medical schools, as investigated by Gregory-Smith.
The literature review reveals that there is a lack of impact studies on gender equality in science and research, and especially of studies that calculate quantitative effects. With the aim of contributing to filling this gap in research, I will evaluate the quantitative impact of the Women Professors Program in Germany using a quasi-experimental design to estimate effects. Thus, my first research question is: Has the program had an impact on the proportion of women professors? Furthermore, the literature on evaluation research and gender equality programs reflects on the limitations of impact studies. To understand whether and in what way a program works, we need to combine different approaches to assess its impact. Exploring the program theory holds the possibility of looking for mechanisms and context factors of the program. Taking these reflections into account, I seek to understand mechanisms and context factors, and to examine how these elements can be integrated into the quantitative research design. Hence, my second research question is: What are the mechanisms that contribute to the effects? My overall aim is to estimate the impact of a gender equality program and to understand the knowledge gains and the limitations of a quantitative impact evaluation of gender equality policies.

Program Theory and Context
Rossi et al. (2004, p. 64) define program theory as "the set of assumptions about the manner in which a program relates to the social benefits it is expected to produce and the strategy and tactics the program has adopted to achieve its goals and objectives". The program theory for the Women Professors Program (see Figure 1) shows that the program aims to produce structural changes and to improve gender equality at HEIs by providing start-up funding for professorships filled by newly recruited women. Funding is subject to the positive assessment of a gender action plan.2 If an external evaluation committee assesses the gender action plan positively, the HEIs have the right to apply for start-up funding (for five years) for up to three tenured professorships filled by newly recruited women. The funding can be used for regular professorships, or for so-called "early appointments to a professorship" (Vorgriffsprofessuren), that is, appointments to professorships that have been created in addition to currently occupied chairs that are due to become vacant in the foreseeable future on the retirement of the incumbent. In the case of the funding for a regular professorship, the funds already allocated for this position in the HEI's budget, which are freed up by the program funding, must be spent on gender equality measures.
The aim of the program is to improve gender equality in higher education and to promote structural change. Furthermore, the program seeks to increase the participation of women at all qualification levels, and especially in leadership positions and professorships. The present study focuses on the program objective to "increase the proportion of women professors". I will examine whether changes in women's representation in professorships can be causally attributed to the Women Professors Program. In doing so, I will focus on meso-level effects, that is, effects at the level of the individual universities, and on the macro-level impact, that is, the impact on the higher education system as a whole.
Concerning the mechanisms, the program theory presupposes that an increase in the proportion of women professors can be achieved both through the professorships supported by the start-up funding and through the implementation of a gender action plan. The start-up funding may give HEIs an incentive to appoint women as professors. The implementation of the gender action plan may have a more indirect effect by raising awareness, increasing transparency in recruitment procedures, and other similar measures. Moreover, start-up funding also has an indirect effect: in the case of regular professorships, the resources freed up by the external funding of the professorship must be invested in gender equality measures and thus contribute to the implementation of the gender action plan. This possible mode of action is of great importance, as the funding of regular professorships is the dominant form of funding in the program, accounting for 60% of the funding in the first program phase (PP I) and 69% of the funding in PP II (Löther and Glanz 2017, p. 32). I will analyze whether the start-up funding has had direct and/or indirect effects on the proportion of women professors.
Contextual factors may foster or hinder the impact of a program. The Women Professors Program is part of a bundle of innovative gender equality efforts initiated since 2006, which also includes the Research-Oriented Standards on Gender Equality of the German Research Foundation (DFG), Germany's leading research funder, and the gender equality standards of the German Excellence Initiative, which is aimed at making Germany a more attractive research location.3 All three measures support a strategy shift toward more structural change in organisations rather than focusing on women scientists (see Weber 2017; Riegraf and Weber 2013; Simon 2013; Zippel et al. 2016). On the other hand, precarious working and qualification conditions, such as the high proportion of fixed-term contracts, the almost exclusive orientation of scientific careers toward obtaining a professorship, and the intensified competition for professorships, are factors that may run counter to the objectives of the Women Professors Program. Women are more frequently affected by fixed-term and part-time contracts (see Beaufaÿs and Löther 2017; Courtois and O'Keefe 2015; Nikunen 2014). Furthermore, they have fewer occupational and non-occupational resources at their disposal with which to strengthen their position in the competition for academic advancement. Funken et al. (2013, p. 50f.) described this situation as a control paradox: for women early career researchers, increased competition together with greater efforts to increase gender equality in German science and research leads to an almost cynical situation where gender equality programs suggest that they have good academic career opportunities, but the intense competition acts as a structural brake on equal opportunities. When interpreting the results of the impact study, these context factors must be taken into consideration.

Research Design and Data Basis
The research approach that I have chosen for the evaluation of the impact of the Women Professors Program is a rigorous impact analysis based on the potential outcome model and the estimation of the counterfactual (Khandker et al. 2012; Meyer et al. 2008; Rossi et al. 2004; Rubin 1979; Wolbring 2014). In order to be able to causally attribute change (here: the increase in the proportion of women professors) to an intervention (here: the Women Professors Program), the results must be compared with the conditions that would be expected if the program had not existed: "The impact measurement must therefore take the counterfactual into account" (Meyer et al. 2008, p. 23f.). Because the counterfactual cannot be observed, evaluators use various methods to estimate it (Khandker et al. 2012; Legewie 2012; Morgan and Winship 2010). Experiments with random assignment to the treatment and control groups are viewed as the "gold standard" for assessing causal effects (Rossi et al. 2004, p. 237). However, in many evaluations, especially in the field of education, this research design is not possible (see Steiner et al. 2009 for experiments in evaluation).
The Women Professors Program was conceptualized as a competitive procedure. Participation of HEIs in the program is determined by self-selection (submission of a gender action plan) and external selection (assessment of the gender action plan by the evaluation committee). Due to these circumstances, the counterfactual was estimated using a quasi-experimental design. At the level of individual HEIs (1), the non-participating HEIs were the control group. The estimate was made by means of pre- and post-treatment measurements and the calculation of difference-in-differences. I calculated the differences between the proportion of women professors before and at the latest possible date during the intervention 4 for HEIs participating in the program (treatment group) and non-participating HEIs (control group). At the level of the higher education system (2), the counterfactual status was estimated by extrapolating the increase in the proportion of women professors during the years immediately preceding the introduction of the program to the duration of the program.
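The system-level estimate in step (2) can be illustrated with a minimal sketch. It assumes a simple linear extrapolation of the average annual pre-program increase; the text does not specify the exact extrapolation method, and the function name and all figures below are hypothetical placeholders, not data from the study.

```python
def extrapolate_counterfactual(pre_series, target_year):
    """Linearly extend the pre-program trend to target_year.

    pre_series maps year -> proportion of women professors (in %)
    for the years immediately preceding the program.
    """
    years = sorted(pre_series)
    first, last = years[0], years[-1]
    if first == last:
        raise ValueError("need at least two pre-program years")
    # Average annual change over the pre-program window.
    annual_change = (pre_series[last] - pre_series[first]) / (last - first)
    # Project that trend forward from the last pre-program observation.
    return pre_series[last] + annual_change * (target_year - last)


# Hypothetical illustration only: made-up shares, not Destatis data.
pre_program = {2004: 10.0, 2007: 11.5}
observed_2015 = 17.0
counterfactual_2015 = extrapolate_counterfactual(pre_program, 2015)
# Estimated effect = observed value minus extrapolated counterfactual.
system_level_effect = observed_2015 - counterfactual_2015
print(counterfactual_2015, system_level_effect)
```

A linear trend is only one possible choice; any such estimate rests on the assumption that the pre-program trend would have continued unchanged without the program.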
Information about the participation of HEIs in the program was based on administrative data provided by the DLR Project Management Agency, 5 which implements the program on behalf of the Federal Ministry of Education and Research (BMBF). The ministry published lists of positively evaluated universities after the decisions were made in 2008, 2009, 2013, and 2014. Data on academic staff were obtained from the Federal Statistical Office (Destatis), which provides long time series broken down by individual universities, categories of staff (e.g., professors), and gender.

4 The program was still ongoing when the evaluation took place.
5 Project management agencies implement projects in a professional and organizational capacity, for example, by providing advice, administrative handling, and technical support for projects. The contracting authorities are mainly ministries at federal and/or state level. The project management agencies are located at research institutions and other organizations. The DLR Project Management Agency, which is part of the German Aerospace Center (DLR), is an example of such an agency.
The treatment group comprised HEIs whose gender action plans had been positively assessed, and which were thus participating in the program. The control group consisted of those HEIs that did not submit a gender action plan or whose plan had not been positively assessed. They were assigned to the control group via a combination of self-selection (non-application) and external selection (non-positive assessment), although the data available 6 did not allow me to distinguish between self-selection and external selection. Taking the two program phases into account, there were four groups in total:
1. No application or not positively assessed
2. Positively assessed only in Program Phase I
3. Positively assessed only in Program Phase II
4. Positively assessed in both program phases
The analysis was carried out both by comparing all four groups, and in a binary comparison of the treatment group of participating HEIs (Groups 2-4 combined) and the control group of non-participating HEIs (Group 1). Unless otherwise stated, the treatment group thus included those HEIs whose gender action plans were positively assessed in Program Phase I and/or II.
The study population comprised all HEIs that could potentially have participated in the program. The 268 member HEIs of the German Rectors' Conference (HRK) were included in the sample as an approximation of the population, supplemented with five HEIs whose gender action plans were assessed positively but who are not members of the HRK (see Löther and Glanz 2017, p. 75f.). Thus, the evaluation covered almost 95% of academic staff and students at all German HEIs. The data for the population of HEIs were therefore available almost in the form of a census survey. Table 1 gives an overview of the population and the participation of HEIs in the Women Professors Program.

Comparison of the Treatment and Control Groups
Due to the long time series of higher education statistics, the data on the proportion of women professors at the individual HEIs were available in a panel design. The data were based on the "professorships" category. This category includes the highest positions of tenured professorships as well as non-tenured junior professorships. As data showing the junior professorships separately were not always available for individual HEIs, this group could not be excluded from the calculation of the proportion of women professors. Using the difference-in-differences approach, I calculated the difference between the proportion of women professors at t1 (2007) and t2 (2015). t1 was the year before the start of the program; t2 was the last year for which data were available during the evaluation. A causal effect of the program may be assumed if the proportion of women professors at participating HEIs increased more than that at non-participating HEIs.
6 Information on the number of HEIs that applied but were not positively assessed was available differentiated by federal state and HEI type but not by individual HEI.
Soc. Sci. 2019, 8, 116
Table 2 shows that the proportion of women professors at participating HEIs increased by an average of 6.4 percentage points between 2007 and 2015, whereas at non-participating HEIs it rose by only 4.6 percentage points, and thus by 1.8 percentage points less than at participating HEIs. HEIs that were successful in both program phases increased the proportion of women professors particularly strongly (by 6.9 percentage points). Thus, an effect of 1.8 percentage points between 2007 and 2015, or an annual average of 0.22 percentage points, in increasing the proportion of women professors could be attributed to the Women Professors Program.
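The arithmetic behind this estimate can be retraced in a few lines. The sketch below uses only the rounded average changes quoted in the text (6.4 and 4.6 percentage points); the function and variable names are illustrative and not taken from the study.

```python
def difference_in_differences(change_treatment, change_control):
    """Excess change in the treatment group over the control group."""
    return change_treatment - change_control


# Average changes 2007-2015 in percentage points, as quoted in the text.
change_participating = 6.4
change_non_participating = 4.6
years = 2015 - 2007  # t2 minus t1

effect = difference_in_differences(change_participating, change_non_participating)
annual_effect = effect / years
print(round(effect, 1), round(annual_effect, 3))
```

With these rounded inputs the effect is 1.8 percentage points, or about 0.225 percentage points per year. The text reports 0.22, which presumably reflects unrounded underlying values.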

Selection Bias
Estimating the counterfactual by comparing treatment and control groups presupposes that there are no systematic differences between the groups that may influence the results. In contrast to random assignment to treatment and control groups, such differences in observed and unobserved characteristics, and thus selection biases, are to be expected in the case of self-selection and external selection. Hence, the possibility of selection bias must be borne in mind in the case of the Women Professors Program. As can be seen from Table 3, the treatment and control groups did indeed differ in terms of HEI type and size, as well as regional distribution.
The most significant difference between the treatment and control groups was the HEI type and, related thereto, the HEI size, operationalized as the number of students enrolled. Universities 7 were overrepresented among those HEIs whose gender action plans were positively evaluated, and universities are, on average, larger than other HEIs. In addition, HEIs from federal states in which the position of Equal Opportunities Officer is the incumbent's main job participated disproportionately often in the program. The reasons for these differences between participating and non-participating HEIs are the financial and human resources available for the preparation of a successful gender action plan and for the administrative implementation of the program (see Löther and Glanz 2017, pp. 23ff.). There are hardly any conspicuous features in the distribution between the western and eastern German states and Berlin. The Glass Ceiling Index (GCI) in 2007 was higher for participating HEIs than for non-participating HEIs. This contradicts the assumption that the HEIs that participate in the program are primarily those that were already pursuing a successful gender equality policy beforehand.
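For orientation, the GCI used in this study is calculated by dividing the proportion of female students by the proportion of female professors, so that higher values indicate a stronger glass ceiling effect. A minimal sketch of that ratio follows; the function name and the input values are hypothetical and are not figures from the study.

```python
def glass_ceiling_index(share_women_students, share_women_professors):
    """Ratio of women's share among students to their share among professors.

    Both shares must use the same unit (e.g., percent).
    Values above 1 mean women are less represented among professors
    than among students.
    """
    if share_women_professors <= 0:
        raise ValueError("share of women professors must be positive")
    return share_women_students / share_women_professors


# Hypothetical illustration only: made-up shares in percent.
gci = glass_ceiling_index(50.0, 25.0)
print(gci)
```

Under this definition, a higher GCI at participating HEIs in 2007 means a larger gap between women's representation among students and among professors before the program began.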

7 Universities have the right to confer doctoral degrees. Universities of applied sciences focus on teaching professional skills. They specialize in specific fields, mainly engineering, computational sciences, business, and social work.
The difference-in-differences approach used in this evaluation is a way of statistically controlling selection effects (Khandker et al. 2012, pp. 71-78; Caspari 2009; Legewie 2012, p. 135f.), assuming that the unobserved characteristics are constant over time and do not correlate with the intervention. According to Khandker et al. (2012, p. 76), however, there are doubts about these assumptions. Therefore, a subgroup analysis and a regression analysis were conducted to clarify whether the described differences between the treatment and control groups influenced the effect of the Women Professors Program on the increase in the proportion of women professors.
The subgroup analysis (see Table 4) revealed that the differences between the treatment and control groups did not distort the effect of the Women Professors Program. In almost all subgroups, the proportion of women professors rose more strongly at the participating HEIs (the treatment group) than in the control group. Only in the case of colleges of art and music did the treatment and control groups not differ. In Berlin, the higher increase among non-participating HEIs was probably due to the small number of cases.
The regression analysis (Table 5) confirmed that participation in the Women Professors Program had an independent effect on the increase in the proportion of women professors from 2007 to 2015, irrespective of the type of HEI or the regional distribution. In Model 1, participation in the Women Professors Program resulted in a 1.3 percentage point higher increase in the proportion of women professors (non-standardized regression coefficient). However, this influence was not statistically significant. At the same time, the regression analysis showed that the regional distribution and the proportion of women professors at the beginning of the program exerted a large and statistically significant influence. The proportion of women professors increased at the HEIs in eastern German states to a much lesser extent than at the HEIs in the western German states or Berlin. The variable "Eastern German HEI" led to a 2.1 percentage point lower increase (Model 1). HEIs that had a low proportion of women professors in 2007 were able to increase this proportion more strongly by 2015 than HEIs whose proportion of women professors was already high in 2007. The regression analysis thus showed, on the one hand, that the Women Professors Program had an independent effect on changes in the proportion of women professors. On the other hand, the program did not have the strongest influence compared to other variables included in the model. Overall, the variables examined explained only a small proportion of the variance in the changes in the proportion of women professors that existed between the HEIs.

8
In most Laender, the equal opportunities officer is elected among members of the HEI, and the incumbent is fully or partially released from her other tasks and duties (time-off model). In some Laender (e.g., Berlin and Lower Saxony), the post of equal opportunities officer is externally advertised, and it is the incumbent's main job (main-job model).

9
The Glass Ceiling Index (GCI) is a relative index comparing the proportion of women in lower positions to the proportion of women in top academic positions; see European Commission-DG Research (2016, p. 89f.). Here, I calculated the GCI by dividing the proportion of female students by the proportion of female professors: the higher the value, the stronger the glass ceiling effect.
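The Glass Ceiling Index as operationalized in this study (proportion of female students divided by proportion of female professors) can be illustrated with a minimal sketch. The function name and the example values are hypothetical and serve only to show the arithmetic; they are not taken from the dataset.

```python
def glass_ceiling_index(share_female_students: float,
                        share_female_professors: float) -> float:
    """GCI as defined in this study: female student share / female professor share.

    Both shares must use the same unit (both in percent or both as fractions).
    A value of 1 means no difference; higher values mean a stronger glass ceiling.
    """
    if share_female_professors <= 0:
        raise ValueError("share of female professors must be positive")
    return share_female_students / share_female_professors


# Hypothetical HEI (illustrative values only): 48% female students,
# 16% female professors.
example_gci = glass_ceiling_index(48.0, 16.0)
print(round(example_gci, 2))  # 3.0
```

A hypothetical HEI with equal shares at both levels would score exactly 1.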
In this study, the regression analysis did not aim to explain changes in the proportion of women professors, but rather to estimate the impact of the Women Professors Program on these changes. The independent influence of the Women Professors Program was confirmed by the regression analysis. Both the regression analysis and the subgroup analysis (Table 4) revealed the particular significance of the program for the eastern German HEIs that participated in the Women Professors Program: the proportion of women professors increased strongly compared to the non-participating HEIs, and the effect of the Women Professors Program was particularly high in this region.
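How a participation dummy enters a linear regression can be illustrated with a deliberately simplified sketch. It assumes a single predictor (participation coded 0/1) and no further covariates, which is not the specification of Table 5; in that simplified case the non-standardized coefficient equals the difference between the group means. All data values and names below are hypothetical.

```python
def ols_slope_intercept(xs, ys):
    """Ordinary least squares for one predictor: y = intercept + slope * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("predictor has no variance")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical data: 1 = participating HEI, 0 = non-participating HEI;
# outcome = increase in the proportion of women professors (percentage points).
participates = [1, 1, 1, 0, 0, 0]
increase = [5.0, 7.0, 6.0, 4.0, 5.0, 3.0]

slope, intercept = ols_slope_intercept(participates, increase)

mean_treated = sum(y for d, y in zip(participates, increase) if d == 1) / 3
mean_control = sum(y for d, y in zip(participates, increase) if d == 0) / 3

# With a single 0/1 predictor, the slope is the difference in group means
# and the intercept is the control-group mean.
print(round(slope, 6), round(mean_treated - mean_control, 6))
```

Once further predictors such as HEI type or region are added, the coefficient is adjusted for them and no longer needs to match the raw difference in means.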

Contextual Factors
Contextual factors that foster or hinder gender equality could not be operationalized in such a way that they could be integrated into the analysis. Data on employment conditions differentiated by individual HEIs were not available. Moreover, as German HEIs typically do not recruit professors from within their own institutions, the employment conditions at a particular HEI (for example, the proportion of temporary or part-time academic staff) cannot be linked to the proportion of women professors at the institution.
The evaluation of reports on the implementation of the German Research Foundation's (DFG) Research-Oriented Standards on Gender Equality (German Research Foundation (Deutsche Forschungsgemeinschaft DFG)) could be considered to be an operationalization of beneficial contextual factors, at least for HEIs with the right to confer doctorates (i.e., universities). In fact, there were correlations between these assessments and participation in the Women Professors Program: in both program phases, universities participating in the Women Professors Program were rated higher on average in the DFG assessment. 11 However, as the reports assess the implementation of the Standards on Gender Equality, these assessments constitute an outcome variable and cannot be included in the regression analysis as a factor influencing the increase in the proportion of women professors.
The TOTAL E-QUALITY (TEQ) award and certification as a "family-friendly higher education institution" (familiengerechte Hochschule) after successfully undergoing the "Audit familiengerechte Hochschule" offer limited possibilities to assess the influence of other gender equality policies. 12 These two certificates were included in a second model in the regression analysis (see Table 5). The analysis showed that receipt of the TEQ award or certification as a family-friendly HEI had only a minor influence on the increase in the proportion of women professors.

Mechanisms
With the aim of understanding the mechanisms driving the effects of the program, I examined which program elements produced the described effects. According to the program theory (Figure 1), an increase in the proportion of women professors can be achieved both through the professorships supported by the start-up financing and through the implementation of a gender action plan. A first approach to understanding the mechanisms of the program is to look at the start-up funding. A high influence of this part of the program may be assumed if the proportion of women professors at an HEI increases with the number of funded professorships.
For both program phases, there was a weak to medium correlation (r = 0.285) between the number of funded professorships and the increase in the proportion of women professors. However, the effect was apparent only for regular professorships (r = 0.263); no correlation, and thus no effect (r = 0.079), was observed for so-called "early appointments to a professorship" (Vorgriffsprofessuren). The lack of correlation in the case of these professorships may be due to the lower number of HEIs using this type of appointment (79 HEIs vs. 129 HEIs for regular professorships). Furthermore, more than half of the HEIs that appointed women to regular professorships used solely this type of funding, whereas this was the case for only a quarter of the HEIs that appointed women to additional professorships. This finding suggests that the promotion of appointments has both direct incentive effects and indirect effects via the financing of gender equality measures and the implementation of the gender action plan. With the available data, it is not possible to weight the two effects. Although the above correlations do not establish causation, they suggest an effect of the start-up funding on the increase in the proportion of women professors.
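The correlation coefficients reported above are presumably Pearson correlations between the number of funded professorships per HEI and the increase in the proportion of women professors; this is an assumption, as the type of coefficient is not specified. A minimal sketch of the computation follows. The data values are hypothetical and do not reproduce the reported coefficients.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equally long samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a sample with zero variance has no defined correlation")
    return sxy / sqrt(sxx * syy)


# Hypothetical HEIs: number of funded professorships and increase in the
# proportion of women professors (percentage points).
funded_professorships = [0, 1, 1, 2, 3, 3]
increase_in_share = [4.0, 5.5, 3.0, 6.0, 5.0, 7.5]

r = pearson_r(funded_professorships, increase_in_share)
print(f"r = {r:.3f}")
```

A coefficient of this kind describes only the linear association; as noted above, it does not by itself establish causation.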
In the present quantitative part of the evaluation, it was not possible to analyze how exactly the funding works in the HEIs. However, the qualitative part of the evaluation provided some starting points for future research (Löther and Glanz 2017): the appointment procedures followed the usual rules. Only after completion of the appointment procedure, and only where the recruitment conformed to the program rules, did the HEIs decide to apply for funding under the Women Professors Program. The funding does not make the HEIs more competitive in recruiting women, because it cannot be used to provide higher remuneration or better equipment to the recruited professor. Nevertheless, the recruitment committees might feel some pressure or incentive to hire women in order to obtain program funding. Concerning the indirect effects, the evaluations reveal discursive effects on gender awareness. Moreover, the program strengthens the university management's responsibility for equality. Additionally, many gender action plans provide for measures for transparent and gender-oriented recruitment procedures. These are ways in which the program might have affected the recruitment of women professors. To obtain an in-depth understanding of how the program works in the HEIs, qualitative data on recruitment procedures and other decision-making processes are needed.

Real and Expected Proportion of Women Professors
The impact of the program at the level of the higher education system was estimated by means of a time series analysis (see Rossi et al. 2004, pp. 291ff. on time series analysis in evaluations), comparing the expected and the actual proportion of women professors. The expected proportion of women professors is the counterfactual situation that would have existed in the absence of the Women Professors Program.
The time series analysis was based on data for the "professorships" category in higher education statistics provided by the Federal Statistical Office (Destatis). 13 As the first appointment of women to tenured professorships is funded under the Women Professors Program, the proportion of women professors was calculated without non-tenured junior professorships. 14 Whereas the analysis at the level of individual HEIs included only the members of the German Rectors' Conference, data for the time series analysis included the total number of professors at all German state, private, and church-run HEIs (universities, universities of applied sciences, colleges of art and music). The proportion of women professors in the two samples did not differ.

In order to determine the estimator for the expected proportion of women professors, the average annual increase before the start of the program was extrapolated to the program duration. The year 1985 was chosen as the starting point for the time series because, in that year, an equality mandate ("to eliminate the disadvantages existing for women scientists") was included in the German Higher Education Framework Act. Since then, the promotion of gender equality has been a task of HEIs. Furthermore, 1985 was the year in which the first women's representatives were appointed at German HEIs, namely at the Universities of Hamburg, Kassel, and Oldenburg (Blome et al. 2014, p. 97). The proportion of women in permanent professorships rose from 5.1% in 1985 to 16.0% in 2007 (see Figure 2). The average annual increase was 0.47 percentage points. To calculate the expected proportion of women professors, this value was multiplied by the number of years since the start of the program and added to the initial value in 2007. This resulted in an expected proportion of women professors of 19.7% for the year 2015. Thus, the actual proportion of women professors (22.1%) was 2.3 percentage points 15 higher than expected; the average annual difference was 0.29 percentage points.

Retirements could be an interfering factor in these calculations. On average, women professors are younger than men professors and thus, the proportion of men professors is higher among retirees. However, this pattern did not affect the time series analysis: retirements of professors peaked in 2005/2006 and dropped dramatically after 2009. As a consequence, one would rather expect a higher increase in the proportion of women professors in the years before the program, when retirement rates were high. In fact, retirements did not disturb the time series analysis in such a way that the impact of the program was lower.

13
Available online: https://www.destatis.de/DE/Publikationen/Thematisch/BildungForschungKultur/Hochschulen/PersonalHochschulen.html. The "professors" category includes the salary groups C2-C4 and W1-W3 professors as well as full-time visiting professors.

15
The fact that the figure was 2.3% rather than 2.4% is due to rounding differences.
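The extrapolation described in this section can be retraced with a minimal sketch. It uses only the rounded figures reported in the text (16.0% in 2007, an average annual pre-program increase of 0.47 percentage points, and an actual proportion of 22.1% in 2015); the constant and function names are hypothetical. Because the inputs are rounded, the result for the expected proportion differs slightly from the reported 19.7%.

```python
BASE_YEAR = 2007
BASE_SHARE = 16.0           # % women in tenured professorships in 2007 (from the text)
PRE_PROGRAM_INCREASE = 0.47  # average annual increase 1985-2007, in percentage points
ACTUAL_SHARE_2015 = 22.1     # actual proportion in 2015 (from the text)


def expected_share(year: int) -> float:
    """Counterfactual share: the pre-program trend continued linearly from 2007."""
    return BASE_SHARE + PRE_PROGRAM_INCREASE * (year - BASE_YEAR)


expected_2015 = expected_share(2015)
gap = ACTUAL_SHARE_2015 - expected_2015   # actual minus expected
annual_gap = gap / (2015 - BASE_YEAR)     # average per program year

print(f"{expected_2015:.2f} {gap:.2f} {annual_gap:.2f}")  # 19.76 2.34 0.29
```

With these rounded inputs, the gap of about 2.34 percentage points and the annual average of about 0.29 are consistent with the reported 2.3 and 0.29 percentage points.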
The effects estimated by means of time series analysis are not attributable solely to the Women Professors Program. Simultaneous gender equality policies, in particular the DFG's Research-Oriented Gender Equality Standards, may also have contributed to this change. Because the various initiatives and programs were implemented during the same time period, their effects cannot be differentiated in the time series analysis.
Contextual factors that may run counter to the objectives of the Women Professors Program could not be integrated into this analysis either. During the period under review, the proportion of temporary and part-time scientific staff below the level of professor increased, and the ratio of early career researchers to professorships deteriorated (Konsortium Bundesbericht Wissenschaftlicher Nachwuchs 2017). Employment conditions and increasing competition are among the reasons why more women than men end their scientific careers after obtaining a doctorate (Funken et al. 2015). It may be assumed that the proportion of women professors would have increased more had these conditions not existed. However, these contextual factors could not be quantified for the time series analysis.

Discussion and Conclusions
By using long time series data on academic staff differentiated by individual HEIs, it was possible to carry out an impact evaluation of the German Women Professors Program employing a quasi-experimental research design. At the level of individual HEIs, I calculated changes in the proportion of women professors at participating and non-participating HEIs before and during the second program phase, thus using pre- and post-treatment measurements and difference in differences to estimate the counterfactual. The analysis showed that participating HEIs were able to increase the proportion of women professors by an average of 6.4 percentage points between 2007 and 2015, whereas the figure for non-participating HEIs was 4.6 percentage points. Thus, an effect of 1.8 percentage points between 2007 and 2015 and an annual average of 0.22 percentage points in increasing the proportion of women professors can be attributed to the Women Professors Program. Observable selection effects such as type of institution and the region did not influence this result. At the level of the German higher education system as a whole, I estimated the counterfactual by means of time series analysis. This analysis showed that the proportion of women professors rose more strongly than would have been expected based on the annual growth rate during the years preceding the introduction of the program. Concerning my first research question, the impact evaluation demonstrated a causal effect of the Women Professors Program on the increase in the proportion of women professors.
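The difference-in-differences estimate summarized above can be retraced with a minimal sketch based on the rounded group means reported in the text. The function and variable names are hypothetical. With rounded inputs the annual average comes out at 0.225 percentage points; the reported 0.22 presumably rests on unrounded values.

```python
def difference_in_differences(increase_treated: float,
                              increase_control: float) -> float:
    """Effect estimate: increase in the treatment group minus increase in the control group.

    Each increase is itself a difference (share in 2015 minus share in 2007),
    so the result is a difference of differences, in percentage points.
    """
    return increase_treated - increase_control


YEARS = 2015 - 2007

# Mean increases in the proportion of women professors, 2007-2015 (from the text).
effect = difference_in_differences(increase_treated=6.4, increase_control=4.6)
annual_effect = effect / YEARS

print(f"{effect:.1f}")         # 1.8
print(f"{annual_effect:.3f}")  # 0.225
```

The estimate carries a causal interpretation only under the assumption stated earlier, namely that unobserved differences between the two groups are constant over time.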
One could argue that an effect of 0.22 percentage points (comparison of participating and non-participating HEIs) and 0.28 percentage points (time series analysis) as an annual average is small compared to the amount of money spent on the program. However, in relation to the average annual increase of 0.76 percentage points since the start of the program, the effect of the program is about one third of the regular increase and thus quite high. Furthermore, the program aims not only to increase the percentage of women professors but also to foster structural changes in favour of gender equality. This study focused on the quantitative effect of the program, but qualitative aspects should be taken into consideration when measuring change towards gender equality (Peterson 2011; Neale and Özkanlı 2010).
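The statement that the program effect amounts to about one third of the regular increase can be checked with a short sketch that uses only the figures given in this paragraph; the variable names are hypothetical.

```python
AVERAGE_ANNUAL_INCREASE = 0.76  # percentage points per year since the program start

# Annual program effects reported in the text, in percentage points.
annual_effects = {
    "difference in differences": 0.22,
    "time series analysis": 0.28,
}

# Share of the overall annual increase that each estimate attributes to the program.
shares = {name: value / AVERAGE_ANNUAL_INCREASE
          for name, value in annual_effects.items()}

for name, share in shares.items():
    print(f"{name}: {share:.0%}")
# difference in differences: 29%
# time series analysis: 37%
```

Both estimates bracket one third, which is the sense in which the effect is described as about one third of the regular increase.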
Facilitating and impeding contextual factors could not be satisfactorily included in the analysis. Their influence thus remains an open research question and the integration of such factors into the quantitative analysis is a research desideratum. Nevertheless, interactions can be described between the Women Professors Program, the DFG's Research-Oriented Standards on Gender Equality, and the gender equality standards of the German "Excellence Initiative" (see Heidler 2017; Schacherl et al. 2015). Combining the results of the impact evaluation of the Women Professors Program with those of other studies reveals distinct mechanisms of the different initiatives. First, the way in which the initiatives foster gender equality depends on the context, in particular, the type of HEI. University managers attribute the greatest influence on their gender equality policies to the DFG's Standards on Gender Equality, followed by the Women Professors Program and the gender equality standards of the Excellence Initiative. Universities of applied sciences are neither members of the DFG nor are they eligible for funding under the Excellence Initiative. This is why managers of this type of HEI consider the Women Professors Program, in particular, to be a strong driving force (Schacherl et al. 2015, p. 177).
Furthermore, the program theories of the aforementioned initiatives differ. The Research-Oriented Standards on Gender Equality link gender equality to the award of research funding by the DFG, Germany's leading funding organization. The members of the DFG, the universities, comply with the standards on a voluntary basis. However, the required reporting on gender equality increases the pressure on HEIs that apply for funding. As an effect of the standards, "university administrators and other key players regard successful gender equality policy as being linked to the prestige and competitiveness of the university" and "gender equality has increasingly come to be regarded as a prerequisite for innovative and excellent research" (see Heidler et al. 2017; Heidler and Reichwein 2018, p. 8). The Women Professors Program, on the other hand, exerts an effect through financial incentives and (in the case of regular professorships) the freeing up of financial resources for gender equality measures. HEIs have described the (indirect) provision of financial resources for gender equality and awareness-raising and reflection on gender equality as the most frequent positive changes brought about by the program (Löther and Glanz 2017, p. 64). The present impact analysis shows that start-up funding for the appointment of new women professors has had both direct effects and indirect effects (via the funding of gender equality measures for regular professorships) on increasing the proportion of women professors.
Concerning the second research question, it must be stated that the integration of facilitating and impeding contextual factors into the quantitative analysis remains a matter for further research. Nevertheless, combining the impact evaluation with other studies and research designs revealed the way in which different gender equality initiatives contribute to the increase in the proportion of women professors. By basing the impact evaluation on the program theory, it was possible to assess some mechanisms that link the start-up funding for professorships, gender equality measures, and the increasing proportion of women professors. Future research is needed to deepen our knowledge of the mechanisms that drive the effects of gender equality programs in higher education. In particular, we want to know how exactly these programs work within the HEIs. This research should bear in mind that mechanisms might work differently under specific conditions, characterized, for example, by the institution type, region, and previous experiences in gender equality. Case studies, document analysis, and a survey have provided insights into the different ways in which the Women Professors Program has been adopted at HEIs: depending on the status of the gender equality policy at the institution, the program has been used flexibly to test new instruments and expand activities or to implement measures for the first time and establish gender equality structures (Löther and Glanz 2017).
Another challenge of the impact evaluation was the question as to how the two levels of impact (the increase in the proportion of women professors and structural changes) interact. To further investigate quantitative effects, structural changes, and mechanisms, one must start with an in-depth analysis of the program theory. This analysis enables a better understanding of how, and under what circumstances, the program produces specific results in the HEIs (see Gates and Dyson 2017, p. 36; Haunberger and Baumgartner 2017). The impact evaluation presented here offers a basis for these investigations. The analysis shows that it is possible to estimate causal effects of a gender equality policy program using a quasi-experimental research design, thus realizing an evaluation approach that is only rarely applied in the policy field of gender equality in science and research.

2
In the second program phase (2013-2017), HEIs that had participated in the first phase had to submit documentation of the implementation of the first gender action plan.

14
Junior professorships have existed in Germany since 2002.

Funding:
The evaluation of the Women Professors Program was funded by the German Federal Ministry of Education and Research, grant number 01FP1501 (2015-2017).
1
The results will be published in German in Zeitschrift für Evaluation, issue 1/2019.

Table 1.
Overview of All HEIs, Submissions, Positive Assessments, and Funding.

Program Phase I Program Phase II Number Percentage of All HEIs (%) Number Percentage of All HEIs (%)
Source: DLR Project Management Agency administrative data (see Löther and Glanz 2017, p. 23).

Table 2.
Comparison of Means: Increase in the Proportion of Women Professors between 2007 and 2015 by Participation in the Women Professors Program.
Source: Federal Statistical Office; DLR Project Management Agency administrative data; own calculations. Note: PP I = Program Phase I; PP II = Program Phase II. Significance test: Comparison of four groups: p < 0.05; comparison of participating and non-participating groups: p < 0.05. The differences were significant at the 5% level.

Table 3.
Characteristics of Participating and Non-Participating HEIs (HEI Type, Regional Distribution, HEI Size, and Proportion of Women Professors).
Note: PP I = Program Phase I; PP II = Program Phase II.

Table 4.
Comparison of Means: Increase in the Proportion of Women Professors between 2007 and 2015 by Participation in the Women Professors Program; Calculated Separately for Types of HEIs, Region, and Equal Opportunities Officer Job Model.

Table 5.
Linear Regression: Changes in the Proportion of Women Professors between 2007 and 2015.