Implementation and Impact of a Summer Mathematics Bridge Program for Multilingual Learners: Evidence from Randomized Controlled Trials

Chu, Haiwen; Hamburger, Leslie; DePiper, Jill Neumayer

doi:10.3390/educsci16050796

by Haiwen Chu *, Leslie Hamburger and Jill Neumayer DePiper *

WestEd, 730 Harrison Street, San Francisco, CA 94107, USA

* Authors to whom correspondence should be addressed.

Educ. Sci. 2026, 16(5), 796; https://doi.org/10.3390/educsci16050796 (registering DOI)

Submission received: 26 January 2026 / Revised: 13 May 2026 / Accepted: 15 May 2026 / Published: 19 May 2026

(This article belongs to the Section Curriculum and Instruction)

Abstract

We report results from the implementation and impact evaluations of a three-week summer bridge mathematics program. This program was designed to challenge and support rising ninth grade students and in particular Multilingual Learners. We report the extent to which implementing teachers enacted the activities as intended and note challenges that implementing teachers reported with the ambitious program for mathematics learning in a summer setting. For the impact study, we implemented two experiments with assignment at the student level: a classic randomized controlled trial with 37 students in the analytic sample and a delayed-intervention randomized controlled trial with 114 students in the analytic sample. Our impact analysis was a single-level linear regression model with demographic and prior math achievement covariates. While the impact on student math outcomes was positive, ranging from 0.03 to 0.19 standard deviations, these effects were not statistically significant due to sample attrition.

Keywords:

mathematics learning; summer learning; Multilingual Learners

1. Introduction

The focus of this study is on students who are bureaucratically designated as “English Learners” in the United States, as defined by federal law. Few studies have considered innovative designs for summer learning to accelerate and amplify their mathematics achievement despite rapid demographic shifts, persistent distances in mathematical achievement, and antiquated approaches to mathematical learning.

This group of English Learners is a large and growing proportion of the secondary school population. Nationally, the enrollment in grades 6–12 of English Learners increased by more than half between 2011 and 2021, to reach more than 2.2 million (Irwin et al., 2024). In California, where this study took place, the population of students who are legislatively defined as “long-term” English Learners was more than 200,000 students in the 2023–24 school year, or just under half (47%) of all English Learners in grades 6–12 (English Language Proficiency Assessment, 1999; California Department of Education, n.d.).

While this population growth has been extensive, persistent differences can be observed in standardized achievement outcomes. For eighth graders in the 2023–24 school year, only 1% of more than 23,000 “long-term” English Learners in California met or exceeded standards on the state math test, compared to more than 37% of 216,000 students who had never been classified as an “English Learner” (California Department of Education, n.d.). Nationally, on the 2024 National Assessment of Educational Progress [NAEP] in 8th grade mathematics, the distance in achievement between English Learners (5% proficient or above) and their non-English Learner peers (29% proficient or above) was 24 percentage points (NAEP Data Explorer, n.d.).

Despite their long tenure in U.S. schools, “English Learner”-classified students have not had adequate access within education systems to challenging and well-supported opportunities to learn mathematics. Examining these distances in achievement requires a critical look at the opportunities English Learners are offered to learn challenging mathematics. It has been well-documented how math curriculum materials tend to frame English Learners monolithically and in terms of deficits, rather than focusing on their immense potential to become multilingual participants in learning (de Araujo & Smith, 2022; Chu & Hamburger, 2022; de Araujo et al., 2018). Furthermore, despite concerted efforts across the nation to move toward more ambitious models of learning and teaching, many instructional materials primarily serve as a “delivery mechanism” (Choppin et al., 2022) for reproducing received procedures mechanically. This transmission-focused orientation stands in contrast to developing new approaches and generating mathematical ideas from that exploration, an orientation known as the “thinking device” relationship between teachers and instructional materials, which is further in line with efforts to increase the rigor of mathematical learning opportunities in United States classrooms (e.g., Stein et al., 2009). Given other patterns of course placement amounting to leveled and exclusionary tracking (Umansky, 2016) and negative agency in terms of course placement and access (Thompson, 2017), English Learners are frequently denied the high-challenge, high-support learning that they deserve and require in mathematics.

In a meta-analysis of 75 peer-reviewed studies published between 2000 and 2015 on mathematics teaching and learning with K-12 ELs, the authors found only two intervention studies with students in grades K-12 (de Araujo et al., 2018). Orosco et al. (2013) studied the math problem solving of six second grade students and found that a math comprehension strategy intervention improved students’ abilities to solve complex word problems. Crawford (2013) studied a middle-grades math intervention that focused on math vocabulary and illustrations to conceptualize language in an online program. Analysis of pre- and post-assessments for students in the experimental and comparison groups found that all students made significant gains.

Against this general curriculum and course placement context, we can consider more specifically the summer as a time to offer ambitious learning opportunities. In a recent meta-analysis of mathematics in summer programs, Lynch et al. (2023) estimated the weighted impact of summer math programming to be +0.10 on math achievement, an estimate in line with previous syntheses that drew upon earlier and less methodologically rigorous studies (Cooper et al., 2000; Lauer et al., 2006). Further study of program characteristics showed that a large majority (78%) of the 37 included studies focused on remediation, whereas those programs that had a mathematics-specific focus (rather than including other academic subject areas) were estimated to have a between-study effect size (+0.18) that was marginally significant. Meanwhile, reiterating the problematic and undemanding nature of many instructional materials, programs that implemented textbook exercises had a between-study effect size of −0.11 that was statistically significant. This meta-analysis did not, however, disaggregate effect sizes for subgroups such as English Learners. It is against the backdrop of mathematics focus in summer programs for English Learners that we developed our summer intervention.

1.1. Intervention Design Principles

In order to counter these prevailing conditions in mathematics education and to better serve Multilingual Learners in particular, we approached the design of a summer bridge program with a more ambitious, socioculturally driven vision of mathematics learning and teaching. Our iterative design, development, and improvement work was further grounded in a design-based research approach that enacted three key principles (Anderson & Shattuck, 2012; Moore et al., 2018; Chu & Hamburger, 2022):

Center learning on cross-cutting concepts that tie together mathematical ideas;
Design for participation through sustained talk and reciprocal interactions and multiple opportunities for growth over time;
Focus purposefully on language to increase student choice and agency.

For two summers leading up to the experiments described in this study, we intensively studied implementation in a small number of settings with field trials involving teachers and a smaller number of students. These cycles of iterative development informed both the design principles above and the practical design of the materials. In those two development summers, we tested and iteratively improved the usability and feasibility of the materials by observing their use and working closely with the testing teachers. We solicited their feedback both to enhance the materials themselves and to strengthen their educative nature, so that teachers with less intensive in-person support could use them. Below, we unpack these design principles as they were built into the final version of the intervention in the experiments.

1.1.1. Centering on Cross-Cutting Concepts

First, a focus on cross-cutting concepts is even more ambitious than having students solve individual problems by developing their own approaches and methods. Individual problems and cases contribute to students’ growing understanding of critical concepts, such as equivalence and transformation that span different subdomains of mathematics, such as algebra, geometry, and statistics (Cook et al., 2022; Chu & Jackson, 2025). For example, two equations such as x + 2 = 5 and x = 3 are equivalent because they can be transformed into each other by performing the same operation on both sides. Meanwhile, two geometrical figures are similar when they share the same aspect ratio (e.g., 4:3) or a common scale factor transforms one into another. Rather than considering mathematics as a parade of procedures, our approach offers students opportunities to understand how the development of mathematical ideas, procedures, and practices is driven by a core set of generative ideas, such as equivalence and transformation or patterns of growth and change.

This approach to conceptual understanding, which emphasizes connections across subdomains of mathematics, is compatible with and extends notions of cognitive demand: moving beyond mere memorization or mechanical execution of procedures without connections, toward making more connections or doing mathematics in the sense of creating new approaches in the face of substantial ambiguity (Stein et al., 2009). Indeed, the connection potential for students as they explore mathematics is directly related to the framing of instructional materials as more of a “thinking device”: not a means of transmitting established procedures but rather a means of co-developing them and, in the case of cross-cutting concepts, of connecting them across seemingly disparate domains of mathematics, such as algebra and geometry above.

1.1.2. Designing for Participation

Second, consonant with a sociocultural approach, the development of knowledge is a consequence of, not a prerequisite for, quality learning opportunities (Walqui & van Lier, 2010; Walqui et al., 2025). By participating in activity, students develop knowledge through interaction with their peers, eventually incorporating into their individual repertoires what they first experienced in the social plane. Such quality learning opportunities require careful planning to ensure that all students have meaningful opportunities to participate that change over time as they become more central participants in classroom activity (Hamburger & Chu, 2019, 2025).

Our approach to designing for participation thus unfolds on three instructional scales: the micro-, meso-, and macro-levels of student experience (Chu & Hamburger, 2022). While the micro-level focuses on the interface between individual turns and interactions, the meso-level concerns how discrete activities connect with broader sequences of activities. Finally, the macro-level encompasses connections between modules, each of which may take a week of instruction, and the overall course.

At the level of individual turns building into interactions, quality participation consists of sustained talk and reciprocal interactions. Talk is sustained when individual students are able to elaborate in depth and at breadth about their mathematical ideas. Then, as they interact with one another, those interactions are reciprocal when they build on, challenge, revise, and ultimately negotiate shared understandings. These quality interactions are powerful when they are focused on the learning of mathematics. It is these quality interactions that are embedded in bounded classroom activities that purposefully structure students’ turn-taking within small groups, requiring that all students contribute to a shared discussion.

1.1.3. Focusing Purposefully on Language

Finally, in terms of language, our approach to designing the intervention was driven by the idea that educators’ implicit theories of language are often directly enacted within their classroom practice, often to students’ detriment (Heritage et al., 2015). Viewing language as a patterned system of systems and a vehicle for thinking (Vygotsky, 2012), we design for language with the aim of expanding student autonomy over time, while integrating all modes of speaking, listening, reading, and writing harmoniously. We design sequences of activities with the idea that as students develop more sophisticated conceptual understandings and practice, their uses of language will grow more monologic, authoritative, and technical (Derewianka & Jones, 2023). In short, if students understand how language works, they will be able to put language to work for them in learning mathematics (Chu & Evans, 2025).

One key element of integrated language development is that it approaches language as a system of systems for making and sharing meaning, rather than something outside of students’ activity that they need to learn about. The design of the small group activities also fully integrated speaking, listening, reading, and writing in meaningful ways that focused on the exploration of mathematics.

1.2. Intervention Design Components

We built these design principles into the three components of the intervention:

Rich, engineered texts that students read and make sense of with a partner;
Small group activities that require extensive quality interactions with peers;
Writing extension activities.

1.2.1. Rich, Engineered Texts for Partner Reading

For example, in the introductory module “Patterns of Growth and Change,” students read a variety of interesting texts that began by contextualizing patterns in nature, architecture, and clothing. Other texts were more technical in nature, offering students different ways to distinguish between different numerical patterns of growth within sequences.

1.2.2. Small Group Activities Requiring and Supporting Peer Interactions

Students also sorted a wide variety of images into categories for which they generated labels that they wrote on sticky notes. Then, groups of students engaged in problem solving activities about growing geometrical patterns that they modeled as numerical sequences.

1.2.3. Writing Extension Activities

The module culminated with students writing in extended ways about other patterns and sequences as they made connections across a wide variety of different numerical sequences that grew in complexity and the kinds of contextual explanations for growth that could be described in greater detail and explained through explicit links to the underlying context.

Put together, these components addressed the design principles because the focus was on a cross-cutting concept of patterns of growth and change, which is an enduring theme in secondary mathematics, whether in algebra, geometry, or statistics. The activities were designed so that all students could participate and contribute to meaning-making and sharing activities that fostered their growing participation over time. Finally, the approach to language was inductive, starting with students’ everyday notions of patterns of repetition and building toward more sophisticated uses of language to describe change, make predictions, and explain growth in contextualized ways. The language development explicitly built from oral and dialogic uses of language toward more written and monologic ways of using language in increasingly authoritative and technical ways. Similar designs were included for the other two modules of the intervention, which focused on the study of graph theory and cases of equivalence and transformation that cut across algebra, geometry, and statistics (Chu et al., 2025).

1.3. Professional Learning and Educative Materials for Implementing Teachers

In addition, knowing that the ambitious nature of the intervention would require substantial teacher expertise, we integrated educative elements into the teacher-facing materials so that implementing educators would fully understand the shifts they were being asked to enact with students (e.g., Davis et al., 2017). Knowing further that no amount of print materials by themselves would be sufficient to fully support the development of teacher expertise, we designed a three-day in-person workshop in which teachers experienced the very kinds of quality learning experiences they were to facilitate for their students, analyzed how those opportunities enacted important theoretical ideas such as a sociocultural emphasis on participation in activity, and prepared and planned how to implement them.

This professional learning workshop also included an orientation to the educative features of the curriculum, including extensive student work samples, vignettes of student talk and action from previous field trials, and conceptual briefs on key topics such as definitions (Chu et al., 2022), language (Chu & Evans, 2025), and equivalence in school mathematics (Chu & Jackson, 2025). Teachers worked through selected activities that they would be doing with students, to experience the connected design principles of conceptual focus, participation by design, and language focus and to reflect on and plan for how they would implement the activities with students effectively. For example, they experienced activities in which they observed a group of their colleagues unscrambling the terms of a visually growing sequence of geometrical figures before participating in a jigsaw project in which different groups became experts on different patterns that they then taught to others. The same activities were part of the intervention curriculum.

1.4. Research Questions

We pose two connected research questions about the intervention, a three-week summer bridge program named Reimagining and Amplifying Mathematics Participation, Understanding and Practices (RAMPUP). RAMPUP was designed to promote conceptual understanding, participation by design, and a purposeful focus on language for English Learners.

To what extent did teachers implement the RAMPUP intervention as designed?
What was the impact of participating in RAMPUP compared to the comparison condition of:
- Participating in business-as-usual remedial mathematics instruction for all students?
- Participating in a delayed cohort who experienced summer learning loss among English Learners?
- Participating in a delayed cohort who experienced summer learning loss for all students?

2. Materials and Methods

2.1. Implementation Study Design and Data

To study implementation of the intervention at a scale of 16 sections at six campuses in two districts, we collected the following triangulated sources of data: attendance records and observations from professional learning workshops, pre- and post-surveys administered to teachers before and after the three-day professional learning workshop, implementation logs, daily teacher reflections, exit interviews, classroom observations, and student work samples.

For the three-day professional learning workshop conducted in each district, we collected attendance of the implementing teachers and took observation notes of their interactions with one another. Each day comprised five hours of activity for all participants and one additional hour for implementing teachers to familiarize themselves with the materials. Teachers who participated in the workshop included teachers who were not implementing that summer but wanted to learn more about the innovative approach to learning that was designed into the intervention, and teachers who may implement the intervention in future summers. All participants took a survey at the beginning and end of the institute, with questions addressing knowledge about instructional design and language development as well as beliefs about English Learners.

During the summer, implementing teachers completed daily logs that listed every activity included in the modules. Each activity within the logs had a response field for teachers to report whether they had implemented the activity as written, modified it, or not implemented it. There were also open response fields for teachers to elaborate and to specify any additional activities that they added to supplement the day’s activities for students. Daily reflection questions included in the teacher’s manual were closely tied to content and activities, rather than being generic reflection questions. Each day’s reflection focused on teachers’ observations of student thinking, language development, and other ideas that were surprising to them. We collected both kinds of data daily from teachers through the SurveyMonkey digital platform. At the end of the summer, we conducted an “exit interview” over Zoom with each of the teachers to have them reflect on their experiences and to identify key advice and feedback for future implementation.

In order to understand the nuances of classroom implementation and in particular student talk within small groups, we observed each of the 12 implementing teachers once during the summer. Specifically, we chose to observe a full class period (2–3 h, depending on the site) either in the first module (patterns) or the last module (equivalence). During the class period, individual observers took notes that recorded student and teacher talk verbatim, while taking digital photographs of ephemeral work products or moments, such as a sorting task in progress that would not otherwise be captured in students’ written work. Scratch notes taken in the classroom were timestamped periodically so that we could isolate time segments of meaningful activity, either as designed into the intervention modules or added by teachers. It was this segmentation of the class period into distinct activities that we used to corroborate teacher reports from their implementation logs.

Finally, teachers collected all student work, including the booklets they had been supplied and any scratch or graph paper that they had written on, as well as other collaborative products such as posters. These were collated by section and shipped to the research team. For the purposes of this article, we do not engage in a deep analysis of student work, but rather selectively look at student work in the relevant sections to assess the extent to which activities that were reported as implemented can be traced into students’ written work artifacts.

2.2. Impact Study Design and Data

Our original experimental design, adapted from an evaluation conducted by Snipes et al. (2015), was a delayed-intervention randomized controlled trial (RCT). Because of scheduling constraints that we describe below, we ended up implementing two different experimental designs in summer 2024. These designs were necessitated by the real-world requirements and constraints faced by districts, as it was logistically impossible in one district to carry out the delayed-intervention design that we believe is more ethical.

District A had a year-long schedule with a summer break of six full weeks interspersed with holidays that lasted from early June to the middle of July. Therefore, we shifted to a classic RCT in which we compared the summer bridge program to business-as-usual remedial mathematics provided through the Advancement Via Individual Determination [AVID] Algebra Readiness program (AVID, 2023). The AVID program was designed to cover 6th through 8th grade standards in 15 four-hour “units” in order to prepare students with foundational readiness skills. The summer session in District A, however, was 14 days of instruction in two-hour blocks, so in the comparison condition at most half of the intended content would have been covered. We conducted a convenience content review of the published sample unit from the AVID Algebra Readiness program. Using the criteria identified by Choppin et al. (2022), we determined that it was more on the “delivery mechanism” end of the continuum. More broadly, the fact that all of the mathematics standards were at the middle school level reflects the program’s remedial intent when implemented with rising 9th grade students.
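The "at most half" bound on coverage in the comparison condition follows from the figures just given. As a worked check, assuming every scheduled hour was spent on AVID content (an idealization, not something the logs can confirm):

```latex
\underbrace{15 \times 4\ \text{h}}_{\text{AVID as designed}} = 60\ \text{h}, \qquad
\underbrace{14 \times 2\ \text{h}}_{\text{District A session}} = 28\ \text{h}, \qquad
\frac{28}{60} \approx 0.47
```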

In District B, we carried out our delayed-intervention RCT as originally designed, comparing students in the first staggered cohort with students in the second cohort. Within the second cohort of students, the comparison condition was summer learning loss associated with not taking any math classes or participating in summer math learning programs (cf. Snipes et al., 2015). In District B, the intervention was implemented over 13 sessions, each of which was scheduled to be three hours in length.

Because of small sample size in District A, our primary research questions are different for the two districts. In District A, the group of interest is all students, as there were only 24 English Learners in the group of recruited students and three English Learners in the analytic sample. In District B, the primary research question is about the English Learners in the sample, while an exploratory research question is about all students. Both studies were pre-registered at the Registry of Efficacy and Effectiveness Studies (the study in District A is registered as 18643.2v1, and the one in District B is registered as 18643.1v1).

Randomization in both districts was conducted following a procedure adapted from Snipes et al. (2015) to create a spreadsheet for assignments:

First, we established the expected number of students in each of the study sites.
Then for each site, we created a set of unique “slots” equal to the expected students plus 10. We did this separately for English Learners and for students who are English-only or reclassified fluent English proficient at each of the sites.
For each of these subgroups, we created a random variable and assigned a random number to each of the slots. We then determined the median of the random number variable and assigned the intervention condition (RAMPUP in District A, first session in District B) to all slots with values below the median and the comparison condition (business-as-usual remedial math in District A, second session in District B) to all slots with values at or above the median.
Finally, as students applied by site determined by their future high school, they were placed in the next available slot and assigned treatment status based upon the slot into which they were placed.

The advantage of this randomization method is two-fold. First, students and their families learn their status immediately. Second, it enables recruitment to continue up until the beginning of the program. District staff members completed the placement and notification process using a spreadsheet created by the research team. A script in the R statistical software platform, version 4.4.3, is publicly available to generate the randomization spreadsheet for a specified number of sites (randomizeR, 2024). All other statistical analyses were also conducted in R.
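For readers who want to see the mechanics, the slot-based procedure described above can be sketched in Python. This is an illustration only and is not the published R script: the function names `build_slots` and `place_applicants`, the `seed` argument, the site label, and the applicant identifiers are all hypothetical.

```python
import random
from statistics import median


def build_slots(expected_students, extra=10, seed=None):
    """Return condition labels for one site-by-subgroup block, in slot order."""
    rng = random.Random(seed)
    n_slots = expected_students + extra  # expected enrollment plus 10 spare slots
    draws = [rng.random() for _ in range(n_slots)]
    cutoff = median(draws)
    # Below the median: intervention; at or above the median: comparison.
    return ["intervention" if d < cutoff else "comparison" for d in draws]


def place_applicants(applicant_ids, slots):
    """Give each applicant, in order of application, the next available slot."""
    if len(applicant_ids) > len(slots):
        raise ValueError("more applicants than prepared slots")
    return dict(zip(applicant_ids, slots))


# Hypothetical example: one site, with separate blocks for the two subgroups.
blocks = {
    ("Site 1", "English Learner"): build_slots(20, seed=1),
    ("Site 1", "English-only or reclassified"): build_slots(40, seed=2),
}
assignments = place_applicants(
    ["s001", "s002", "s003"], blocks[("Site 1", "English Learner")]
)
```

Because every slot's condition is fixed before anyone applies, a student's status is known as soon as the student is placed, and new applicants can be added until the slots run out.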

District staff recruited a total of 175 students in District A and 253 students in District B to participate in the experiment. Participants were recruited by district staff who went to eighth grade mathematics classes to give informational sessions, and all parents completed a consent process for their children to participate in the intervention, including random assignment. While we targeted a large proportion of students to be English Learners, the only exclusion criterion was that we did not accept any students who were mandated to attend credit recovery or remediation classes as determined by local district policy.

Of these students, however, only 24 out of the 175 students in District A were current English Learners, while approximately one third (87) of the 253 students in District B were English Learners. Given these initial sample sizes, we conducted a power analysis at the 0.05 significance level with power of 0.80, assuming that covariates could explain 60% of the variance, for different subgroups of students. Based upon 80 students in the English Learner sample from District B, the minimum detectable effect size (MDES) was 0.40. For the full sample in District A of approximately 160 students, the MDES was 0.28. For the full sample of 240 students in District B, the MDES was 0.23 (Dong & Maynard, 2013).
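The reported MDES values can be approximated with the standard closed-form expression for an individually randomized design. The sketch below is not the cited power-analysis tool: it assumes equal allocation to the two conditions and a normal approximation to the multiplier, and the function name `mdes` and its parameter names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def mdes(n, r2=0.60, p_treated=0.5, alpha=0.05, power=0.80):
    """Approximate MDES for a two-tailed test with individual random assignment.

    Assumptions (not from the source): equal allocation (p_treated = 0.5) and
    a normal approximation to the multiplier instead of the t distribution.
    """
    z = NormalDist()
    multiplier = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)  # roughly 2.8
    return multiplier * sqrt((1 - r2) / (p_treated * (1 - p_treated) * n))


for n in (80, 160, 240):
    print(n, round(mdes(n), 2))  # 0.40, 0.28, 0.23
```

Under these assumptions the approximation matches the three values reported above to two decimal places.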

The outcome measure for the study was the Mathematics Initiative Readiness Assessment (MIRA), a reliable measure of readiness for high school mathematics, which was found to have internal consistency of 0.83 as measured by Cronbach’s alpha (Briggs, 2024). The MIRA is a computer-administered test that was given at the beginning and end of the summer session to serve as a pre- and post-test of student mathematics achievement.
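For reference, Cronbach's alpha is computed from item-level scores as k/(k − 1) times one minus the ratio of summed item variances to the variance of total scores. A minimal Python sketch follows; the function name `cronbach_alpha` and the tiny data set are hypothetical and are not MIRA data.

```python
from statistics import variance


def cronbach_alpha(scores):
    """scores: one row per respondent, each row a list of item scores."""
    k = len(scores[0])                    # number of items
    items = list(zip(*scores))            # transpose to one tuple per item
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical responses: three items answered identically by each respondent,
# so the items are perfectly consistent and alpha is 1 (up to rounding).
toy_scores = [[1, 1, 1], [0, 0, 0], [1, 1, 1], [0, 0, 0]]
toy_alpha = cronbach_alpha(toy_scores)
```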

2.3. Analytic Approaches

We estimate the impact of the intervention, which was individually assigned to students, on the math outcome measure by conducting a linear regression. The variable of interest is the treatment group (intervention or comparison), and to improve the precision of estimates we include covariates for prior achievement and demographic characteristics.
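One way to write a single-level model of this kind is shown below. The notation is illustrative and is not taken from the pre-registration: Y_i is the math outcome for student i, T_i is 1 for the intervention group and 0 for the comparison group, A_i is prior math achievement, X_i is a vector of demographic indicators, and ε_i is the residual.

```latex
Y_i = \beta_0 + \beta_1 T_i + \beta_2 A_i + \boldsymbol{\gamma}^{\top} \mathbf{X}_i + \varepsilon_i
```

In this form, β_1 is the covariate-adjusted difference in the outcome between the two groups, which is the impact estimate.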

Our approach to estimating the impact of the intervention is slightly different across the two experiments by design and circumstance. First, due to logistical problems with enrollment rosters across multiple feeder districts that send rising 9th grade students to summer programming in District A, the pre-intervention administration of MIRA was compromised and could not be completed. We therefore do not have directly comparable pre-test scores from MIRA for District A, and so relied instead on the state standardized math test, the 8th grade Smarter Balanced Assessment Consortium (SBAC) math test, both as a covariate and to determine whether groups were equivalent at baseline. For District B, although we had not pre-registered SBAC test scores to establish baseline equivalence, we performed checks where required by attrition. In both districts, the SBAC tests were used as covariates in the final regression model. Because the intervention and comparison groups were concurrently run in District A, the outcome was simply the MIRA post-test score.

For District B, because the experiment is a delayed-intervention RCT, the outcome variable is different for the two groups. For the intervention group (first cohort), the outcome variable is the post-test on the MIRA which was taken at the end of the three-week program. For the comparison group (second cohort) who participated in the intervention in the second window, the outcome variable is the pre-test on the MIRA, which represents the counterfactual condition of no summer mathematics learning. The two cohorts were scheduled in two consecutive three-week windows, so the post-test for the intervention group took place the last two days of the program (a Thursday and Friday) while the pre-test for the comparison group took place the following week on the first two days of the program (Monday or Tuesday).
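The cohort-dependent outcome for District B can be made concrete with a short Python sketch. The record layout, field names, and function name here are hypothetical and are chosen only to illustrate the rule described above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StudentRecord:
    cohort: int                  # 1 = first session (intervention), 2 = second session (comparison)
    mira_pre: Optional[float]    # MIRA score at the start of the student's own session
    mira_post: Optional[float]   # MIRA score at the end of the student's own session


def analytic_outcome(student):
    """Select the score used as the outcome in the delayed-intervention design."""
    if student.cohort == 1:
        # Intervention group: post-test taken at the end of the first window.
        return student.mira_post
    if student.cohort == 2:
        # Comparison group: pre-test taken at the start of the second window,
        # before these students had received any of the program.
        return student.mira_pre
    raise ValueError(f"unknown cohort: {student.cohort}")


first = StudentRecord(cohort=1, mira_pre=10.0, mira_post=14.0)
second = StudentRecord(cohort=2, mira_pre=11.0, mira_post=15.0)
outcomes = [analytic_outcome(first), analytic_outcome(second)]
```

Both selected scores are measured within a few days of each other, which is what allows the second cohort's pre-test to stand in for a summer without the program.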

3. Results

3.1. Implementation Evaluation

3.1.1. Professional Learning Workshops

We report results from each of the data sources before corroborating our claims across them. In terms of attendance, a total of 27 teachers attended the separate three-day workshops we conducted in two districts. In total, 11 out of the 12 implementing teachers attended all three days of the workshop, and a single teacher in District A missed the first day because of issues securing a substitute. During the observations of professional learning, all elements of the workshops were implemented as designed, as captured by the agenda for each day and noted in presenter or observer logs.

Teacher professional learning included working through many of the activities that students experience, including solving problems that required systematic listing and the construction of cases and arguments. Some implementing teachers struggled with the more open-ended nature of the problems they were to solve, including how to approach complex problems by considering simpler cases and linking cases that build inductively to generate a coherent, meaningful sequence. Other teachers struggled with record-keeping as they worked with manipulatives and tracked growing answers within a sequence of related problems. Although the facilitator tried to name at a high level useful strategies that would also benefit students (e.g., listing systematically, considering simpler cases, shifting representations), observers noted that for a majority of observed teachers these approaches did not appear to be familiar or easy to pick up.

That said, teacher reflection at the end of the three-day workshops did state changes in beliefs across the three focal areas of concepts, participation, and language with a sampling of reflections below, in response to the prompt, “I used to think… and now I know…” adapted from Elmore (2011):

“I used to think that students would become bored with lessons which don’t require exact answers and now I know that students can be challenged and engaged with problems that require thinking rather than achieving the correct answer.”
“I used to think that it is very hard to teach students who are not fluent in English, and now I know if I structure the interactions it gets easier and students might learn more.”
“I used to think I needed to give EL students all materials in their native language to participate, and now I know that EL (and all students) benefit from engagement/discussion in activities that will help them understand the concepts.”

Furthermore, the pre- and post-surveys included multiple-choice knowledge items about quality learning opportunities with English Learners. Among the 21 respondents who had matched pre- and post-responses, the average percent correct increased by 37 percentage points, from 44% correct to 81% correct. There was substantial variation across questions, however, with by-question improvement ranging from 24 to 48 percentage points. This variation reflects greater growth on items about the role of language in learning (an increase from 19% to 67% correct) as compared to quality learning opportunities in general (an increase from 67% correct to 90% correct). In addition, participants changed in their beliefs as captured by Likert scale responses to statements, with the most movement in response to “English Learners need to build their basic language skills before they can understand disciplinary language.” While just over half (11 out of 21, or 52%) of matched respondents disagreed with that statement at the beginning of the workshop and none strongly disagreed, by the end of the three-day workshop a larger majority disagreed or strongly disagreed (13 out of 21, or 62%), and five respondents (24%) strongly disagreed. In short, there was evidence that the professional learning workshop changed teachers’ knowledge and beliefs.

3.1.2. Classroom Implementation of Intervention

In terms of implementation with students, there was a high degree of reported implementation of the 20–30 activities within each module, with well over 80% of activities reported as “implemented, as written” by teachers (Figure 1). Where teachers reported modifications, these were primarily in the form of supplemental activities or media, such as icebreakers or short videos sourced from streaming services that were about the topic students were learning. Teachers also reported adding physical experiences such as climbing stairs (directly relevant to a problem context) or giving students manipulatives. One teacher wrote, “Students need dominos or real stairs to count combinations. I provided stairs right outside my class for 2 groups and dominos for other two groups. We made all possible combinations for group of 6 dominos/stairs.” (Teacher B4). This statement seems to reflect a belief that students would not be able to construct and inscribe representations of the ideas without first having a physical experience. Other reported modifications were in terms of instructional format, as some teachers reported shifting away from small group discussions and more toward whole-class presentations and discussions.

Teachers’ responses to the daily reflection questions generally expressed their desire to challenge students but also their unfamiliarity with this more ambitious, open-ended form of teaching. In some instances, teachers were surprised by the sophistication of student thinking, as with one sorting activity having to do with graphs of alphabetic codes (modulo 26).

Most groups chose to Sort and Label based only on the appearance of the graphs. Only 1 group attempted to make a distinction between the graphs that were shifted by addition as opposed to multiplication. I was particularly impressed because the group was able to confirm that one of the images was definitely shifted by multiplication because there was a point at the origin.
(Teacher B5)

For the most part, teachers faithfully reported student thinking, including trouble that students had and how the teachers tried to address these difficulties by simplifying the task at hand or directly teaching procedures or approaches. For example, one teacher reported that while the students did share data and examples about the number of vertices, edges, and faces of a connected graph, the relationships between those quantities were presented through a video: “After students worked individually, they were allowed to come to the whiteboard, draw their versions and count V/E/F. After the activity we watched YouTube video about Euler’s Formula.” (Teacher B4).

On the other hand, in some groups, students did do the work of discovering the formula:

Before the step it up activity students believed the pattern to be E/2 = V and V + 1 = F. After the activity the students were able to spot that the pattern better fit E − V + 1 = F.
(Teacher B6)

In other cases, although students were given the formula, they were supported to do further investigations and sense-making around the conditions under which the relation applies:

Students checked their math with the Euler’s characteristics to make sure their numbers were correct and asked for help when they were wrong. They also noticed that these numbers were no longer “correct” when the shape was broken into separate pieces. One student got −1 face and was completely amused.
(Teacher B4)

More broadly in terms of participation, teachers generally saw value in group work, such as the small group sorting tasks that generated new categories and labels. There may, however, be some ascriptions of more fixed dispositions and forms of participation, as one teacher described:

The students that are good with leading enjoyed taking the lead and explaining to their group how to do this activity. I saw a leader in each group explaining the process to their classmates.
(Teacher B10)

In terms of classroom observations, we conducted 12 complete observations in the summer of 2024, of which eight aligned perfectly with the teachers’ implementation logs. In the four sessions with mismatches, the mismatches ranged from 33% to 75% of observed activities, and all were classified by the observer as “implemented but modified” as opposed to the “implemented, as written” reported by the teacher. The actual nature of those modifications was mixed in terms of quality.

For example, in a “Novel Ideas Only” task, students first brainstorm in small groups a list of ideas in response to a prompt (in this case, “When I heard the word ‘pattern’ I think of…”). Then, as a whole group, they are supposed to share so that each group reads the prompt and then their list of ideas, but they do not repeat any ideas that have already been mentioned, in order to encourage all groups to listen carefully. Teacher A2, in implementing this task, had each small group share out one idea at a time, while still not repeating ideas, which can be much harder for students to follow in terms of flow. While she had noted that task as “implemented, as written,” we determined from the observer notes that it was modified. Because of potential issues with the flow of the activity and not knowing whether a group was done with generating new ideas, this modification did not necessarily improve the task as originally designed and written.

To give another example of modification that may not have carried out the spirit of the activity, teacher B6 was observed to implement one of the readings with the students discussing the reading after the entire passage was complete, in contrast to the modeled directions which had students periodically pause with a partner along the way in order to make sense of the text as it was unfolding. While the teacher marked this task as implemented as written, based upon observer notes we judged that it was modified, with potentially less sense-making opportunity for students.

By contrast, some modifications may have improved learning opportunities. Teacher A2 had students, after they had sorted and labeled different pictures with everyday patterns on them, go to the document camera to share their categories publicly with the whole class, requiring each student to speak. This modification may be beneficial for students, who were developing confidence to present their results to the whole class while also seeing the ideas of all the other groups.

All told, there was complete agreement between observations and logs in a large majority of cases (84% or 57 out of the 68 observed activities). From these data, we can be reasonably confident that teachers made their best efforts to implement the materials as written or intended.

In our coding and discussion of the classroom observation data, we did notice some issues with the quality of implementation, which we categorized as: proceduralizing, note-giving, inconsistency in connecting one activity to the following ones, and in many cases no clear modeling of the steps and structure that students were supposed to follow within relatively complicated or intricate small group activities.

While the level of discourse between students was relatively high within small groups of students, there was a large amount of inconsistency in terms of mathematical ideas being highlighted and connected to the main idea of the session of the module. This pattern was the case in the lesson observation of teacher B8, who had marked all four observed tasks as “implemented as written.” The observer noted that in the implementation, the teacher did not attend to the higher-level ideas from each task or connect one task to the next through public, whole-class discussion.

By contrast, on the higher end, in one observed session led by teacher B1, students were modeling handshakes in a variety of representations, and a total of five different representational approaches were shown by the teacher to the class using a document camera (artistic sketches, circular diagrams with arrows, horizontal rows of “bumps,” arrays of counts, and matrices with checks and double-counting). Connections and correspondences between different approaches, however, were not made, a practice thought to be powerful in developing students’ mathematical understanding and whole-class discussions (Stein et al., 2008). And in the classroom of teacher B5, who had described students’ work in sorting graphs in the context of secret alphabetic codes and invertible functions, the sorting task was observed to be a lively task in which all students contributed and their interactions were both sustained and increasingly reciprocal in terms of building on, challenging, negotiating, and co-constructing ideas and labels for different groups of graphs. These varied in focus, from overall geometrical features to the algebraic form of the underlying functions.

Finally, with regard to the writing extension activities, which were meant to be a culminating synthesis of each module’s activities, drawing upon daily reflections that students had completed throughout, teachers reported that they implemented the writing extension activities with students for a majority of the modules, with the last module most likely to be omitted given logistical issues at the end. We randomly sampled a few sections to read through student work and were able to verify that students did indeed complete writing activities. An initial analysis of the quality of their writing, however, suggested that for many of them the writing was more a flat recount of what they had done, rather than an opportunity for them to make connections across different cases or problems they had explored with an eye to other mathematical topics as we had intended. A more comprehensive analysis of the quality of student writing as related to other implementation variables is outside the scope of this article but could provide more evidence of the quality of the implementation in terms of desired student outcomes.

3.2. Impact Evaluation

Below we present the characteristics of the two conditions in the two districts at baseline (Table 1). We then apply the What Works Clearinghouse [WWC] v5.0 standards for group designs (WWC, 2022) to assess attrition bias and establish, where necessary, baseline equivalence.

3.2.1. District A

Attrition Bias. In District A, as seen in Table 2, total attrition was 79% and differential attrition was 26%, so it is not possible to meet the What Works Clearinghouse (WWC) group design standards without reservations. Below, we address baseline equivalence that can ameliorate this large attrition if the analytic samples are equivalent at baseline (WWC, 2022).
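As an illustration of how the two attrition figures are defined, the sketch below computes overall and differential attrition from group counts. The counts in the example are hypothetical placeholders, not the values in Table 2, and the function name is ours.

```python
def attrition_rates(randomized_i: int, randomized_c: int,
                    analytic_i: int, analytic_c: int) -> tuple[float, float]:
    """Return (overall, differential) attrition as proportions.

    Overall attrition is the share of all randomized students missing from
    the analytic sample; differential attrition is the absolute difference
    between the two groups' attrition rates.
    """
    attrition_i = 1 - analytic_i / randomized_i
    attrition_c = 1 - analytic_c / randomized_c
    overall = 1 - (analytic_i + analytic_c) / (randomized_i + randomized_c)
    differential = abs(attrition_i - attrition_c)
    return overall, differential


# Hypothetical counts for illustration only (not the study's Table 2).
overall, differential = attrition_rates(
    randomized_i=100, randomized_c=100, analytic_i=60, analytic_c=50
)
print(f"overall = {overall:.0%}, differential = {differential:.0%}")
```

Both quantities are then compared jointly against the WWC attrition boundaries; that lookup is not reproduced here.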

Baseline Equivalence. For the 8th grade Smarter Balanced Assessment Consortium (SBAC) math scaled score, Hedges’ g = −0.24, meaning intervention students had lower achievement at baseline than comparison students. We will make a statistical adjustment by including the SBAC math score as a covariate in the final model. With this adjustment, the study is eligible to meet WWC group design standards with reservations.

Main Impact Analysis. We regressed student achievement outcomes on covariates and group assignment:

Y_{i} = a_{0} + b_{1} T_{i} + \sum b_{C} C_{i} + e_{i} \qquad (1)

In the above equation, i indexes students, Y is the outcome of mathematics achievement, T is a dichotomous variable indicating assignment to the intervention or comparison group, and C is a vector of covariates comprising prior academic achievement and demographic characteristics (gender and race/ethnicity). The key covariate here is the SBAC math 8th grade scaled score, which we recentered at the sample mean and rescaled into standard deviation units. Because the sample was predominantly Latino (26 out of 37 students), with seven Asian students, one Black student, two White students, and one student of two or more races, we only use a dichotomous flag to indicate Latino ethnicity.
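To make the structure of Equation (1) concrete, the following is a minimal sketch of a single-level ordinary least squares fit in plain Python. It is not the authors' software, which the article does not name. The data are simulated, the covariate set is reduced to a standardized SBAC score and a Latino flag, and every numeric value in `simulate` is an assumption chosen only for illustration.

```python
import random


def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(k):
            if r != col:
                f = A[r][col]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]


def simulate(n, effect, noise_sd, seed):
    """Simulated (hypothetical) data: intercept, T, standardized SBAC, Latino flag."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        t = i % 2                              # alternate assignment for simplicity
        sbac = rng.gauss(0.0, 1.0)             # prior achievement in SD units
        latino = 1 if rng.random() < 0.7 else 0
        X.append([1.0, float(t), sbac, float(latino)])
        y.append(0.1 + effect * t + 0.5 * sbac - 0.1 * latino
                 + rng.gauss(0.0, noise_sd))
    return X, y


X, y = simulate(n=200, effect=0.20, noise_sd=0.5, seed=0)
a0, b1, b_sbac, b_latino = ols(X, y)
print(f"estimated treatment coefficient b1 = {b1:.2f}")
```

Here `b1` plays the role of the coefficient on T in Equation (1). A full analysis would also report standard errors and p-values, which this sketch omits.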

The coefficient of interest is on the treatment dummy variable, which is estimated at 0.20 with a standard error of 0.13, meaning that it is not statistically significant (p = 0.14) (see Table 3). The point estimate of 0.20 would be generally considered a small effect, but remains non-significant. SBAC math scores, however, were a significant predictor of the summative outcome of achievement as measured by the MIRA. Indeed, using the adjusted means from the regression model and the pooled standard deviation yields a Hedges’ g of 0.19.
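For readers who want to see the conversion spelled out, the sketch below shows the standard Hedges' g calculation: a difference in (adjusted) means divided by the pooled standard deviation, multiplied by a small-sample correction. The input values are hypothetical, since the group means, standard deviations, and group sizes are not given in the text here, so the result is not meant to reproduce the reported 0.19.

```python
import math


def hedges_g(mean_i: float, mean_c: float,
             sd_i: float, sd_c: float,
             n_i: int, n_c: int) -> float:
    """Hedges' g: standardized mean difference with small-sample correction."""
    pooled_sd = math.sqrt(
        ((n_i - 1) * sd_i ** 2 + (n_c - 1) * sd_c ** 2) / (n_i + n_c - 2)
    )
    # Small-sample correction factor, 1 - 3 / (4N - 9), with N = n_i + n_c.
    omega = 1 - 3 / (4 * (n_i + n_c) - 9)
    return omega * (mean_i - mean_c) / pooled_sd


# Hypothetical inputs for illustration only.
g = hedges_g(mean_i=0.25, mean_c=0.0, sd_i=1.1, sd_c=0.9, n_i=20, n_c=17)
print(f"Hedges' g = {g:.2f}")
```

With small samples the correction factor matters: for a total of 37 students it shrinks the uncorrected standardized difference by roughly 2%.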

3.2.2. District B English Learners

Attrition. As shown in Table 4, among English Learners in District B, total attrition was 47% and differential attrition was 5.8%, which lies outside the optimistic boundary required by the secondary mathematics protocol (WWC, 2015). Therefore, the study does not meet WWC group design standards without reservations, and we must assess baseline equivalence.

Baseline Equivalence. Using the baseline measures on the MIRA pre-test (Table 1), we obtain Hedges’ g = 0.02. For the SBAC covariate, however, the two groups are not equivalent at baseline, with Hedges’ g = 0.26, so the study cannot meet WWC Group Design Standards. These differences suggest that while the groups were equivalent on the proximal measure, they were not equivalent on the state standardized measure, with the intervention group scoring higher in a way that cannot be accounted for by statistical adjustments. While this study does not meet standards, we will still estimate the impact on the MIRA outcome using regression-adjusted means.

Main Impact Analysis. We use the same workhorse equation above, but in this case, given the demographic characteristics of the analytic sample, we distinguish three racial-ethnic groups as shown in Table 5. For this analytic sample of English Learners, we also included the English Language Proficiency Assessments for California (ELPAC), rescaled into standard deviation units, as a covariate. Inclusion in the treatment group was not a statistically significant predictor of the outcome variable. Adjusting the means within the regression model yields a Hedges’ g of 0.15.

3.2.3. District B—All Students

Attrition. The analytic sample consisted of 114 students, with 55 in Cohort 1 and 59 in Cohort 2. The attrition across the different phases of the study is shown in Table 6. Total attrition is 45% and differential attrition is only 2.8%, falling within the optimistic but outside of the cautious boundary, as required by the secondary mathematics evidence protocol (WWC, 2015). Therefore, the study cannot meet WWC group design standards without reservations, and we must assess baseline equivalence.

Baseline Equivalence. For the intervention group, Hedges’ g = −0.24 for the MIRA outcome, indicating that at baseline they started substantially lower than the comparison group. However, the two groups were closer to equivalent at baseline on the SBAC measure, with Hedges’ g = 0.06. Therefore, a statistical adjustment will be necessary in order for the study to meet WWC group design standards with reservations.
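The baseline equivalence decisions reported for the three analytic samples follow the WWC thresholds on the absolute baseline effect size: differences of at most 0.05 standard deviations satisfy equivalence, differences above 0.05 and up to 0.25 require statistical adjustment, and differences above 0.25 cannot be remedied by adjustment. The sketch below encodes that rule; the function name and labels are ours, and the inputs are the Hedges’ g values quoted in the text.

```python
def baseline_status(g: float) -> str:
    """Classify a baseline Hedges' g against the WWC equivalence thresholds."""
    size = abs(g)
    if size <= 0.05:
        return "satisfied"
    if size <= 0.25:
        return "requires statistical adjustment"
    return "not satisfied"


# Baseline effect sizes as reported in the text.
reported = {
    "District A, SBAC": -0.24,
    "District B English Learners, MIRA": 0.02,
    "District B English Learners, SBAC": 0.26,
    "District B all students, MIRA": -0.24,
    "District B all students, SBAC": 0.06,
}
for label, g in reported.items():
    print(f"{label}: g = {g:+.2f} -> {baseline_status(g)}")
```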

Main Impact Results. We yet again use our workhorse equation, though in this case there are more ethnic groups because the group of all students is more diverse than the group of English Learners in the district. As shown in Table 7, however, neither cohort assignment nor other covariates have statistically significant coefficients, with the exception of SBAC math score and the Latino demographic variable. A one standard deviation increase in SBAC math was associated with a 0.25 standard deviation increase in the outcome variable. Being Latino was also a statistically significant predictor and associated with lower scores on the final outcome score though the estimate of this coefficient was similar in magnitude (−0.43) to that for Asian students (−0.36).

In Table 8, we summarize the effect sizes across the three experiments that we were able to conduct in our two partner districts, reporting the effect size as Hedges’ g using adjusted means from our regression models. While none of these results are statistically significant, there does appear to be some variation across contexts that might need to be further studied. We note further that the minimum detectable effect sizes (MDES) for experiments of this type with these sample sizes are substantially larger in magnitude, with an analytic sample of 40 students giving an MDES of 0.58 and a sample of 100 students giving an MDES of 0.36, assuming a significance level of 0.05, power of 0.80, and covariates explaining 60% of the variance (Dong & Maynard, 2013). These MDES were two to four times larger than the largest effect size we were able to estimate.
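The order of magnitude of these MDES values can be checked with a short calculation. The sketch below is not the PowerUp! tool itself. It uses a normal approximation for the multiplier and assumes equal allocation to the two groups (an assumption, since the allocation used for the calculation is not stated), so it gives values slightly below the t-based figures quoted above.

```python
from math import sqrt
from statistics import NormalDist


def mdes(n: int, r2: float = 0.60, p_treat: float = 0.5,
         alpha: float = 0.05, power: float = 0.80) -> float:
    """Approximate MDES for an individually randomized trial.

    n       total analytic sample size
    r2      share of outcome variance explained by covariates
    p_treat proportion assigned to treatment (assumed 0.5 here)
    """
    z = NormalDist()
    # Normal approximation to the multiplier (two-tailed test).
    multiplier = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    return multiplier * sqrt((1 - r2) / (p_treat * (1 - p_treat) * n))


for n in (40, 100):
    print(f"n = {n}: MDES is approximately {mdes(n):.2f}")
```

Under these assumptions the function returns about 0.56 for 40 students and about 0.35 for 100 students. The gap relative to the reported 0.58 and 0.36 is consistent with using t-distribution critical values, which are larger in small samples.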

4. Discussion

In this discussion, we consider three main questions. The first is interpreting the null results in terms of potential limitations of implementation or research design. Second, we revisit and interpret the results around implementation as connected with the aims of the project. Finally, we consider implications of this study more broadly for the field of enrichment of mathematics learning for Multilingual Learners.

Limitations of the study stemmed from implementation and design. In terms of implementation, there were constraints on resources to observe more intensely across the summer, including during the second module, but these concerns were largely addressed by the comprehensive implementation logs and teacher reflections. Student attrition was also much higher than expected, which lowered the power of the impact estimates. Such attrition may be unavoidable in summer programs that are not mandated (as many remedial programs or credit recovery may be) or which are not prestigious accelerated courses with high school credit attached, compared to general elective credit. We also were not able to fully gauge the treatment contrast within District B for the summer learning loss group. Given the demographics of that district and the proliferation of tutoring and after-school programs in the communities served by that district, it is possible that some students in the delayed cohort would have participated in mathematics learning during the summer. This possibility is less likely for English Learners, as the two cohorts were equivalent at baseline (Hedges’ g = −0.02 for the MIRA outcome), but when the group of all students was considered, the difference was much larger and right at the boundary of what WWC standards consider correctable by statistical adjustment (Hedges’ g = −0.24 for the MIRA outcome).

In terms of design, there may be further differences between the two groups, as for the English Learner cohorts in District B, Hedges’ g = 0.26 for the intervention group versus the comparison group on the SBAC measure. Although we had requested a survey question about this on the pre-intervention MIRA of Cohort 2 because we had anticipated this issue, unfortunately a miscommunication with our vendor meant that the question was deployed to the post-intervention MIRA for both cohorts. Although we attempted afterward to track down the students in Cohort 2, this retrospective data collection proved logistically impossible given district constraints and limited resources once the new school year had begun.

The two districts were partners chosen for convenience given the difficulty of finding research partners in the lingering aftereffects of the COVID-19 pandemic, and furthermore the demographics of the two districts were quite a bit different from each other. While District A had an analytic sample that was 70% Latino (26 out of 37 students), the analytic sample had very few current English Learners (three students, all of whom were Spanish speakers), meaning that it was not possible to estimate the impact of the intervention on the mathematics achievement of English Learners. On the other hand, the language diversity of District B meant that it was not necessarily representative of English Learners nationally, as only 54% of the English Learners (25 out of 46) were Latino Spanish speakers.

While the effect sizes reported as Hedges’ g were not statistically significant, they were in the vicinity of what a systematic review of summer math programs found, which was an average effect size of +0.10 (Lynch et al., 2023). It is possible that the true effect could be detected if the analytic sample sizes were larger. An analytic sample of 1600 would yield an MDES of 0.09. Future attempts to scale tests should attend further to ameliorating attrition or recruiting a larger sample of English Learners, which may require districts to lean heavily into multilingual liaisons, as District B did more successfully than District A.

As for the implementation, the RAMPUP summer program was implemented at the activity level by teachers with a relatively low degree of modification. Whether or not the implementation achieved the more ambitious vision of mathematics learning is less certain, as there are indications that the more open-ended, problem-driven, and student-centered approach may have been outside the comfort zone of many teachers’ practice. Although individual activities may have been implemented, it is not certain that they were consistently connected for students to make deeper connections and contrast mathematical cases. That is, teachers may have been willing or even eager to enact a new way of learning and teaching, but some evidence from observations suggests that this new approach was not yet regular practice for them. A further related issue is potentially associated with summer instruction that tends to focus on remediation or acceleration in the sense of pre-teaching future standards from the next grade level rather than an ambitious focus on cross-cutting ideas. In this sense, given how many teachers may view their role as aligned to “covering” state standards in mathematics, there may have been a mismatch in terms of the fit of the intervention to the teachers’ aims (Hill et al., 2018). Given the lingering effects of the COVID-19 pandemic, this study informs the complexity of transforming summer learning into a more ambitious space that is not focused on remediation or pre-teaching of future content. While the impacts were not statistically significant, evidence from the professional learning indicates changes in teachers’ knowledge and beliefs about learning mathematics and language for English Learners.

Given the ambition of the program, it may be that teachers need more experience implementing the activities in the curriculum. In exit interviews, two District A teachers (A1 and A2) noted that the second class that they taught each day in the afternoon went more smoothly than the first, a common phenomenon during the school year. In District B, two teachers (B1 and B4) taught the program in both cohorts, and both noted in their exit interviews that the second iteration was smoother. A training period or more concerted practice or rehearsals of facilitation as part of professional learning may benefit future tests of this intervention. Furthermore, on campuses where there were multiple sections being implemented, teachers reported that they collaborated with one another and were better able to make sense of how to facilitate as well as share adjustments that enhanced student experiences. At scale, fostering more professional community around implementation may improve the quality of learning opportunities for students.

Finally, most of the material is not compatible with the standard 8th or 9th grade curriculum, precisely because we chose a stance of non-remediation and also chose not to teach ninth grade content in advance. The first module on patterns, however, is adjacent enough to 9th grade content that we plan to modify it into a week-long introductory unit on ideas about functions and arithmetic and geometric sequences, which may be appropriate to test as a bounded ninth grade intervention at the very beginning of the school year. Such a modified experiment that is more within the rhythm of the regular school year may be able to rigorously detect the impact on student achievement outcomes that we found in our observational data from classroom visits.

Author Contributions

Conceptualization, H.C. and L.H.; methodology, H.C. and J.N.D.; software, H.C.; investigation, H.C.; writing—original draft preparation, H.C.; writing—review and editing, L.H. and J.N.D.; funding acquisition, H.C. and L.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Institute of Education Sciences, United States Department of Education, through grant number R305C200008 to WestEd. The opinions expressed are those of the authors and do not represent views of the Institute or the U.S. Department of Education.

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki and approved by the Institutional Review Board of WestEd (protocol code 2020-10-6, approved 7 May 2024).

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

Data will be made available upon request.

Conflicts of Interest

The authors declare no conflicts of interest. Although the RAMPUP curriculum is licensed for reuse with attribution and no modification under a Creative Commons license, there is no financial benefit that would accrue to the authors as a result of the RAMPUP curriculum being more widely disseminated or used.

Abbreviations

The following abbreviations are used in this manuscript:

ELPAC	English Language Proficiency Assessments for California
MDES	minimum detectable effect size
MIRA	Mathematics Initiative Readiness Assessment
NAEP	National Assessment of Educational Progress
SBAC	Smarter Balanced Assessment Consortium
WWC	What Works Clearinghouse

References

Anderson, T., & Shattuck, J. (2012). Design-based research. A decade of progress in education research? Educational Researcher, 41(1), 16–25.
AVID. (2023). AVID math summer bridge program: Algebra readiness. AVID.
Briggs, D. (2024). Mathematics initiative readiness assessment (MIRA) technical report. Available online: https://sreereg.icpsr.umich.edu/sreereg/subEntry/22203/fileAction?name=MIRA-Technical-Report.pdf (accessed on 1 May 2026).
California Department of Education. (n.d.). California assessment of student performance and progress. California Department of Education.
Choppin, J., Roth-McDuffie, A., Drake, C., & Davis, J. (2022). The role of instructional materials in the relationship between the official and the enacted curriculum. Mathematical Thinking and Learning, 24(2), 123–148.
Chu, H., DePiper, J. N., & Westergard, L. (2025). Multilingual Learners collaborating to connect with networks. Mathematics Teacher: Learning and Teaching PreK-12, 118(8), 610–621.
Chu, H., & Evans, M. (2025). Rethinking language in mathematics for English learners: Connecting theory, research, and practice. National Research and Development Center to Improve Education for Secondary English Learners at WestEd.
Chu, H., & Hamburger, L. (2022). Educative curriculum materials for English learners: Varying the intensity of scaffolding. In L. de Oliveira, & R. Westerlund (Eds.), Scaffolding for multilingual Learners in elementary and secondary schools (pp. 181–196). Routledge.
Chu, H., & Jackson, B. (2025). Engaging English learners with equivalence as a crosscutting concept in mathematics. National Research and Development Center to Improve Education for Secondary English Learners at WestEd.
Chu, H., Tran, T., & Hamburger, L. (2022). Redefining approaches for engaging English learners with mathematical ideas. National Research and Development Center to Improve Education for Secondary English Learners at WestEd.
Cook, J. P., Reed, Z., & Littlewood, E. (2022). An initial framework for analyzing students’ reasoning with equivalence across mathematical domains. Journal of Mathematical Behavior, 66, 100935.
Cooper, H., Charlton, K., Valentine, J. C., Muhlenbruck, L., & Borman, G. (2000). Making the most of summer school: A meta-analytic and narrative review. Monographs of the Society for Research in Child Development, 65(1), i-127.
Crawford, L. (2013). Effects of an online mathematics curriculum for English language learners. Computers in the Schools, 30, 248–270.
Davis, E., Palincsar, A., Smith, P., Arias, A., & Kademian, S. (2017). Educative curriculum materials: Update, impact, and implications for research and design. Educational Researcher, 46(6), 293–304.
de Araujo, Z., Roberts, S., Willey, C., & Zahner, W. (2018). English learners in K-12 mathematics education: A review of the literature. Review of Educational Research, 88(6), 879–919.
de Araujo, Z., & Smith, E. (2022). Examining English language learners’ needs through the lens of algebra curriculum materials. Educational Studies in Mathematics, 109, 65–87.
Derewianka, B., & Jones, P. (2023). Teaching language in context (3rd ed.). Oxford University Press.
Dong, N., & Maynard, R. (2013). PowerUp!: A tool for calculating minimum detectable effect sizes and sample size requirements for experimental and quasi-experimental designs. Journal of Research on Educational Effectiveness, 6, 24–67.
Elmore, R. (Ed.). (2011). I used to think… and now I think: Twenty leading educators reflect on the work of school reform. Harvard Education Press.
English Language Proficiency Assessment, California Education Code § 3.5. (1999). Available online: https://leginfo.legislature.ca.gov/faces/codes_displayText.xhtml?lawCode=EDC&division=1.&title=1.&part=1.&chapter=3.&article=3.5 (accessed on 24 April 2025).
Hamburger, L., & Chu, H. (2019). Making slope a less slippery concept for English learners: Redesigning mathematics instruction with rich interactions. In A. Walqui, & G. Bunch (Eds.), Amplifying the curriculum: Designing quality learning opportunities for English learners (pp. 115–137). Teachers College Press.
Hamburger, L., & Chu, H. (2025). Developing conceptual understandings of mean as point of balance with multilingual learners: Redesigning mathematical learning with quality interactions. In A. Walqui, G. Bunch, & P. Mueller (Eds.), Amplifying the curriculum: Designing quality learning opportunities for multilingual learners (2nd ed., pp. 157–178). Teachers College Press.
Heritage, M., Walqui, A., & Linquanti, R. (2015). English language learners and new standards: Developing language, content knowledge, and analytical practices in the classroom. Harvard Education Press.
Hill, H., Corey, D., & Jacob, R. (2018). Dividing by zero: Exploring null results in a mathematics professional development program. Teachers College Record, 120, 1–42.
Irwin, V., Wang, K., Jung, J., Kessler, E., Tezil, T., Alhassani, S., Filbey, A., Dilig, R., & Bullock Mann, F. (2024). The condition of education 2024 (NCES 2024-144). National Center for Education Statistics. Available online: https://nces.ed.gov/pubs2024/2024144.pdf (accessed on 26 April 2024).
Lauer, P., Akiba, M., Wilkerson, S., Apthorp, H., Snow, D., & Martin-Glenn, M. (2006). Out of school-time programs: A meta-analysis of effects for at-risk students. Review of Educational Research, 76(2), 275–313.
Lynch, K., An, L., & Mancenido, Z. (2023). The impact of summer math programs on student mathematics achievement. Review of Educational Research, 93(2), 275–315.
Moore, J., Schleppegrell, M., & Palincsar, A. (2018). Discovering disciplinary linguistic knowledge with English learners and their teachers: Applying systemic functional linguistics concepts through design-based research. TESOL Quarterly, 52, 1022–1049.
NAEP Data Explorer. (n.d.). The nation’s report card. Available online: https://www.nationsreportcard.gov/ndecore/landing (accessed on 26 April 2024).
Orosco, M. J., Swanson, L. H., O’Connor, R., & Lussier, C. (2013). The effects of dynamic strategic math on English language learners’ word problem solving. Journal of Special Education, 47(2), 96–107.
randomizeR. (2024). Available online: https://github.com/HaiwenChu/siteR/blob/main/randomizeR (accessed on 1 May 2024).
Snipes, J., Huang, C.-W., Jaquet, K., & Finkelstein, N. (2015). The effects of the elevate math summer program on math achievement and algebra readiness (REL 2015–096). U.S. Department of Education, Institute of Education Sciences, National Center for Education Evaluation and Regional Assistance, Regional Educational Laboratory West. Available online: http://ies.ed.gov/ncee/edlabs (accessed on 15 January 2024).
Stein, M., Engle, R., Smith, M., & Hughes, E. (2008). Orchestrating productive mathematical discussions: Five practices for helping teachers move beyond show and tell. Mathematical Thinking and Learning, 10, 313–340.
Stein, M., Smith, M., Henningsen, M., & Silver, E. (2009). Implementing standards-based mathematics instruction: A casebook for professional development (2nd ed.). Teachers College Press.
Thompson, K. (2017). What blocks the gate? Exploring current and former English learners’ math course-taking in secondary school. American Educational Research Journal, 54(4), 757–798.
Umansky, I. (2016). Leveled and exclusionary tracking: English learners’ access to academic content in middle school. American Educational Research Journal, 53(6), 1792–1833.
Vygotsky, L. (2012). Thought and language. MIT Press.
Walqui, A., Bunch, G., & Mueller, P. (Eds.). (2025). Amplifying the curriculum: Designing quality learning opportunities for Multilingual Learners (2nd ed.). Teachers College Press.
Walqui, A., & van Lier, L. (2010). Scaffolding the academic success of adolescent English language learners: A pedagogy of promise. WestEd.
What Works Clearinghouse. (2015). Review protocol for secondary mathematics, version 3.1. Available online: https://ies.ed.gov/ncee/WWC/Docs/ReferenceResources/wwc_sm_protocol_v3.1.pdf (accessed on 26 April 2024).
What Works Clearinghouse. (2022). Procedures and standards handbook version 5.0. Available online: https://ies.ed.gov/ncee/wwc/Docs/referenceresources/Final_WWC-HandbookVer5.0-0-508.pdf (accessed on 26 April 2024).

Figure 1. Teachers reported implementing, as written, a large majority of activities.

Table 1. Baseline characteristics of samples in Districts A and B.

		District A: All Students		District B: English Learners		District B: All Students
		Int.	Comp.	Int.	Comp.	Int.	Comp.
N		25	11	24	22	55	59
Ethnicity
	Latino	76%	64%	64%	50%	45%	50%
	Asian	24%	19%	31%	27%	36%	18%
	White	0%	9%	5%	23%	7%	19%
	Black	0%	9%	0%	0%	5%	2%
	Other	4%	0%	0%	0%	7%	10%
Gender
	Male	56%	55%	54%	64%	65%	41%
SBAC math
	M	2563	2592	2511	2482	2539	2533
	SD	116	122	128	95	112	100
MIRA pre-test
	M	N/A	N/A	−0.83	−0.84	−0.81	−0.64
	SD	N/A	N/A	0.74	0.66	0.74	0.64

Note. Int. = intervention group; Comp. = comparison group. District A did not administer MIRA as a pre-test.

Table 2. Attrition in District A.

Group	Assigned	Enrolled	Post
Intervention	86	43	26
Comparison	89	53	11
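
The counts in Table 2 can be turned into attrition rates with a few lines of arithmetic. The sketch below is illustrative only: it assumes attrition is measured from assignment to post-test (the convention used in the What Works Clearinghouse handbook cited above), and the variable and function names are hypothetical rather than taken from the authors' analysis code.

```python
# Counts copied from Table 2 (District A). Names are illustrative.
assigned = {"intervention": 86, "comparison": 89}
post = {"intervention": 26, "comparison": 11}


def attrition_rate(n_assigned: int, n_retained: int) -> float:
    """Share of assigned students who have no post-test outcome."""
    return 1 - n_retained / n_assigned


# Overall attrition pools both groups; differential attrition is the
# absolute gap between the two group-level rates.
overall = attrition_rate(sum(assigned.values()), sum(post.values()))
by_group = {g: attrition_rate(assigned[g], post[g]) for g in assigned}
differential = abs(by_group["intervention"] - by_group["comparison"])

print(f"overall attrition: {overall:.1%}")
print(f"differential attrition: {differential:.1%}")
```

Under this assumption, most of the loss occurs between assignment and enrollment, so a different baseline (for example, enrolled students) would give noticeably lower rates.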

Table 3. Regression model for all students, District A.

	Estimate	SE	95% CI	t Value	Pr(>|t|)
(Intercept)	−0.26	0.16	[−0.57, 0.05]	−1.642	0.11
Group: treatment	0.20	0.13	[−0.05, 0.45]	1.5	0.14
Gender: male	0.12	0.12	[−0.12, 0.36]	0.97	0.34
Ethnicity: Latino	−0.22	0.16	[−0.53, 0.09]	−1.4	0.17
SBAC math	0.38	0.07	[0.24, 0.52]	5.21	<0.001	***

Note. N = 37. *** = p < 0.001.
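
As a reading aid for Table 3, the sketch below shows how a t value and a 95% interval relate to an estimate and its standard error. It assumes a normal-approximation interval (estimate ± 1.96 × SE), which agrees with the rounded values in the table but is not confirmed by the source; the authors may have used t-distribution critical values on unrounded inputs. The function name is hypothetical.

```python
from statistics import NormalDist


def wald_summary(estimate: float, se: float, level: float = 0.95):
    """Return (t value, (lower, upper)) under a normal approximation."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for level=0.95
    return estimate / se, (estimate - z * se, estimate + z * se)


# "Group: treatment" row of Table 3 (rounded inputs as printed).
t_value, (lower, upper) = wald_summary(0.20, 0.13)
print(f"t = {t_value:.2f}, 95% CI = [{lower:.2f}, {upper:.2f}]")
```

Because the interval includes zero, the treatment coefficient is not statistically significant at the 0.05 level, consistent with the reported p value of 0.14.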

Table 4. English Learner attrition in District B.

Cohort	Assigned	Pre	Post	Matched
1	43	37	28	24
2	44	31	22	22

Table 5. Regression model for English Learners in District B.

	Estimate	SE	95% CI	t Value	Pr(>|t|)
(Intercept)	−0.43	0.31	[−1.04, 0.18]	−1.39	0.17
Cohort	0.04	0.21	[−0.37, 0.45]	0.2	0.84
Gender: male	−0.02	0.21	[−0.43, 0.39]	−0.08	0.94
Ethnicity: Asian	−0.28	0.35	[−0.97, 0.41]	−0.8	0.42
Ethnicity: Latino	−0.37	0.31	[−0.98, 0.24]	−1.19	0.24
SBAC math	0.31	0.11	[0.09, 0.53]	2.91	0.005	**
ELPAC	−2.46	2.1	[−6.58, 1.66]	−1.18	0.25

Note. N = 46. ** = p < 0.01.

Table 6. Attrition among all students in District B.

Cohort	Assigned	Pre	Post	Matched
1	126	82	61	55
2	127	86	61	59

Table 7. Regression model for all students in District B.

	Estimate	SE	95% CI	t Value	Pr(>|t|)
(Intercept)	−0.37	0.19	[−0.74, 0.002]	−1.93	0.06	.
Cohort	0.02	0.13	[−0.23, 0.27]	0.17	0.86
Gender: male	0.08	0.13	[−0.17, 0.33]	0.65	0.52
Ethnicity: Black	−0.06	0.37	[−0.79, 0.67]	−0.17	0.86
Ethnicity: Asian	−0.36	0.21	[−0.77, 0.05]	−1.69	0.09	.
Ethnicity: Latino	−0.43	0.19	[−0.80, −0.06]	−2.23	0.03	*
Ethnicity: Two or more races	−0.19	0.27	[−0.72, 0.34]	−0.7	0.49
SBAC math	0.25	0.06	[0.13, 0.37]	3.95	<0.001	***

Note. N = 114. *** = p < 0.001, * = p < 0.05, . = p < 0.10.

Table 8. Summary of adjusted means and effect sizes across experiments in Districts A and B.

			Intervention		Comparison
Dist.	Group	N	M	SD	M	SD	Pooled SD	Hedges’ g
A	All students	37	−0.19	0.44	−0.28	0.47	0.45	0.19
B	English Learners	46	−0.79	0.37	−0.84	0.38	0.34	0.15
B	All students	114	−0.63	0.29	−0.64	0.29	0.29	0.03
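
The effect sizes in Table 8 can be approximated from the other columns. The sketch below assumes the common form of Hedges’ g: the difference in adjusted means divided by the pooled SD, multiplied by the small-sample correction 1 − 3/(4N − 9). The source does not state the exact formula the authors used, and the table inputs are rounded, so small discrepancies from the printed values are expected. The function name and dictionary layout are hypothetical.

```python
def hedges_g(mean_int: float, mean_comp: float,
             pooled_sd: float, n_total: int) -> float:
    """Standardized mean difference with a small-sample correction."""
    d = (mean_int - mean_comp) / pooled_sd
    correction = 1 - 3 / (4 * n_total - 9)  # shrinks d slightly for small N
    return correction * d


# (intervention M, comparison M, pooled SD, N) as printed in Table 8.
rows = {
    ("A", "All students"): (-0.19, -0.28, 0.45, 37),
    ("B", "English Learners"): (-0.79, -0.84, 0.34, 46),
    ("B", "All students"): (-0.63, -0.64, 0.29, 114),
}

estimates = {key: hedges_g(*values) for key, values in rows.items()}
for (district, group), g in estimates.items():
    print(f"District {district}, {group}: g is about {g:.2f}")
```

With rounded inputs, each recomputed value falls within about 0.01 of the corresponding entry in the Hedges’ g column.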

