Article

How Teaching Practices Relate to Early Mathematics Competencies: A Non-Linear Modeling Perspective

1 Department of Education, University of California Santa Barbara, Santa Barbara, CA 93106, USA
2 Marsico Institute for Early Learning, University of Denver, 1999 E. Evans Ave., Denver, CO 80208, USA
* Author to whom correspondence should be addressed.
Educ. Sci. 2025, 15(9), 1175; https://doi.org/10.3390/educsci15091175
Submission received: 4 July 2025 / Revised: 27 August 2025 / Accepted: 3 September 2025 / Published: 8 September 2025

Abstract

The significance of children’s mathematical competence during the early years is well established; however, the methods for developing such competencies remain less understood. Specifically, there is a need to identify what constitutes high-quality educational environments and effective instruction. Both the study and promotion of high-quality educational environments and teaching, through coaching and other professional development initiatives, necessitate the use of observational instruments that are reliable, efficient, and valid, including content, internal, external, and consequential validity. Moreover, domain-specific measures are essential, as general quality measures often fail to adequately assess curriculum content, scope, or sequence, and they do not reliably predict improvements in children’s learning outcomes. This study employed innovative analytical techniques to evaluate the scoring and interpretation of an existing domain-specific observational measure: the Classroom Observation of Early Mathematics Environment and Teaching (COEMET). We applied non-linear modeling approaches (i.e., Random Forest [RF] and Generalized Additive Models [GAMs]) to investigate and provide a comprehensive overview of the relationships between COEMET’s measures—at both the scale and item levels—of teachers’ practices and children’s mathematical competencies. The study first employed the RF machine learning method to identify the most important COEMET items for prediction, followed by the use of GAMs to depict the non-linear relationships between COEMET predictors and the outcome variable. The analysis revealed that certain teaching practices, as indicated by the COEMET items, exhibited non-linear and even non-monotonic associations with children’s mathematical competencies.

1. Introduction

The significance of children’s proficiency in early mathematics is gaining widespread recognition. The mathematical knowledge that children possess upon their entry into kindergarten serves as the most reliable predictor of high school graduation (McCoy et al., 2017; Watts et al., 2014), and a recent meta-analysis has demonstrated that early mathematics significantly influences mathematical development throughout a student’s educational journey (Liu et al., 2025). Mathematics is increasingly integral to economies and cultures; however, certain countries, including the United States, have seen limited improvements in mathematics education (http://ncee.org/pisa-2018-lessons, accessed on 1 June 2023). Disparities in mathematical understanding manifest as early as the preschool and primary grades (Gerofsky, 2015; OECD, 2014). Global interest and concern are further driven by a specific apprehension regarding children who have not been afforded adequate opportunities for learning (Bachman et al., 2015; Gripton & Williams, 2023; McCoy et al., 2018). Furthermore, early mathematics predicts and supports children’s development in other domains, such as reading (Duncan & Magnuson, 2011) and oral language (Sarama et al., 2012). Mathematics also plays a central role in the sciences throughout the grades (English, 2023), and it further transfers to executive function (Farran et al., 2011; Nesbitt & Farran, 2021) and social skills (Chernyak et al., 2022).
An important caveat: positive effects depend on positive early mathematical experiences. Such experiences require high-quality educational environments and instruction (e.g., Burchinal et al., 2016; Carr et al., 2019; Clements et al., 2011; Cummings et al., 2010; Hill et al., 2005; Konstantopoulos, 2011; National Research Council, 2009; Pohle et al., 2022; Ran et al., 2022; Sanders & Horn, 1998; Sanders & Rivers, 1996; Scalise et al., 2025; S. P. Wright et al., 1997), although some studies show a weak relationship (e.g., Ottmar et al., 2013). Both studying and promoting high-quality educational environments and teaching require observational instruments that are reliable, efficient, and valid, including content, internal, external, and consequential aspects of validity (Dong & Clements, 2025).
When educational systems incorporate general observational classroom quality measures, such as the Early Childhood Environment Rating Scale (ECERS) and the Classroom Assessment Scoring System (CLASS), there is a notable enhancement in classroom quality (Bassok et al., 2019; Bassok et al., 2021; Weiland & Rosada, 2022). Furthermore, in intervention studies that link professional development with these quality measures, teachers exhibit significant improvements in comparison to those within control groups (Hamre et al., 2012; Pianta et al., 2017; Weiland & Rosada, 2022). These measures also facilitate comparisons among various groups or systems that advocate for equity and contribute to ongoing research endeavors (Bos et al., 2022; Weiland & Rosada, 2022).
However, these measures have limitations. They do not predict advancements in learning and development (Brunsek et al., 2017; Finders et al., 2021; National Academies of Sciences, Engineering, and Medicine, 2024; Perlman et al., 2016; Watts et al., 2018). This is particularly evident within the content domains. General measures fail to assess curriculum content, scope, or sequence, and do not predict improvements in children’s learning (McCormick et al., 2021; National Academies of Sciences, Engineering, and Medicine, 2024; Weiland & Rosada, 2022). Furthermore, effective curricula are also domain-specific, following research-validated learning trajectories (Clements & Sarama, 2008, 2011, 2024; National Academies of Sciences, Engineering, and Medicine, 2024; Phillips et al., 2017). Consequently, there is a pressing need for domain-specific data to bolster research and facilitate educational improvement (McCormick et al., 2021; Maier et al., 2022).
The Classroom Observation of Early Mathematics Environment and Teaching (COEMET) is a researcher-developed and evaluated measure. The COEMET has been used in examinations of teacher knowledge and practices in early math (e.g., Whittaker et al., 2020; Orcan-Kacan et al., 2023) and in assessing the effects and sustainability of early interventions (Botvin et al., 2024; Clements et al., 2013; Clements et al., 2020; Engel et al., 2024; Rosenfeld et al., 2019; Sarama & Clements, 2021). For example, COEMET scores partially mediated the effects of an early math intervention (Clements et al., 2011; Sarama et al., 2012). The measure was also utilized to discover that African American students particularly benefited from certain instructional practices, including teacher expectations and sensitivity to students’ developmental needs through the use of learning trajectories (Schenke et al., 2017). On the other hand, without intervention, Black or Hispanic kindergartners spent less time on math, had fewer opportunities to work in small groups and engage in hands-on math learning with manipulatives, and were more hindered by classroom-management strategies (Attaway et al., 2025).
Finally, a critical review of nine observation instruments for mathematics education identified three with desirable characteristics (Kilday & Kinzie, 2009). Two of these addressed the primary grades (K-3); COEMET was the only one suitable for preschool. Further research was recommended (Kilday & Kinzie, 2009). Additionally, the field of research in early mathematics education has expanded considerably beyond the extant research at the time of the COEMET’s creation, circa 2000 (e.g., Bognar et al., 2025; Clements et al., 2023; Clements & Sarama, in press; Elia et al., 2023; Mesiti et al., 2024; Yıldız et al., 2025). Therefore, further evaluating and improving COEMET constitute a valuable research goal.

2. Current Study

This study examines the non-linear relationships between teachers’ practices in early mathematics classrooms and children’s mathematical competencies. Teaching practices were assessed using the COEMET instrument. Unlike typical scale-level score analyses and linear modeling, this research focuses on the non-linear relations between item-level COEMET data and students’ early mathematics performance, as COEMET items provide more detailed and specific insights into teaching practices than composite scores. For example, the classroom culture (CC) scale score in COEMET quantifies the overall classroom culture environment, while individual CC items (e.g., Q1: “Teacher actively interacted”) specifically capture teaching practices that reflect classroom culture. Notably, we are not advocating for discarding scale-level scores or analyses, as they remain important and useful in many research contexts—particularly when a reliable single score is needed to reflect teaching quality.
Moreover, while the current COEMET tool is useful, all measurement tools should be continually critiqued and refined. This study aims to contribute to that process by providing a new empirical basis for future revisions. To achieve this, we employed two innovative non-linear modeling techniques: random forest (RF) and generalized additive models (GAMs). Although both are non-linear approaches, they serve distinct purposes in our research.
We first applied RF—a machine learning-based method (Breiman, 2001)—to identify the most important COEMET items among the 28 available in relation to early mathematics outcomes. Next, we used GAMs (Wood, 2017) to model the precise non-linear relations between the selected COEMET items (each representing a specific teaching practice) and the outcome variable. A more detailed discussion of how each method addresses our research questions can be found in the following Methods section. The overarching research goal is addressed through three specific questions:
What are the most important COEMET indicators for predicting mathematical competencies, while controlling for pretest performance and intervention condition?
How do the refined COEMET items, when analyzed using non-linear modeling, incrementally predict the early mathematics outcome compared to scale-level scores and linear models?
Which COEMET items exhibit non-linear relationships with early mathematics competency, and what do these non-linear relationships look like?

3. Methods

3.1. Participants

The research data used in this study, with permission from the researchers, were derived from the TRIAD study, a longitudinal, large-scale cluster randomized controlled trial of the Building Blocks early mathematics intervention (Clements et al., 2011; Sarama et al., 2012). Specifically, the data from the Pre-K year, including two time points (pretest in the Fall and post-test in the Spring), were used in the current analysis. A total of 1035 children and 106 teachers from two cities on the East Coast of the U.S. participated, with an average classroom size of 12.86 (SD = 2.18). Of these classrooms, 72 (67.9%) received the TRIAD intervention, while 34 (32.1%) were in the control group. The mean age of children in the fall of their Pre-K year was 52 months (SD = 4.09), and 51% were female. In terms of ethnicity, 53% of children identified as African–American, 22% as Hispanic, 19% as White non-Hispanic, 4% as Asian/Pacific Islander, and 2% as Native American.

3.2. Intervention

Technology-enhanced, Research-based, Instruction, Assessment, and Professional Development (TRIAD) is a research-based and validated model of scaling up successful interventions (Sarama et al., 2008). Implementation of the TRIAD intervention in this study involved 106 preschool teachers in 42 schools. Intervention teachers participated in 13 days of professional development over two years, introducing early mathematics learning trajectories and the Building Blocks curriculum (Clements & Sarama, 2013). This involved all three components of the learning trajectory for each major topic: the goal (the mathematics and “big ideas” of mathematics for children to learn), the developmental progressions of levels of thinking relevant to preschool, and the instruction (the environment, activities, and teaching strategies) to promote each level (details in Clements et al., 2011).

3.3. Procedure

Teachers and administrators requested no data collection beyond a Teacher Questionnaire in the first professional development year (a “gentle introduction” to the TRIAD intervention), so both COEMET and child assessment data were collected in the second year, early in the Fall and in the Spring (Clements et al., 2011).

3.4. Measures

3.4.1. Research-Based Early Mathematics Assessment

The outcome variable, children’s mathematical competencies, was assessed using the Research-Based Early Mathematics Assessment (REMA; Clements et al., 2008/2025; Clements et al., 2017). REMA is an interview-based assessment designed to capture the core mathematical abilities of children aged 3 to 8 years. It covers a broad range of topics in early mathematics, with Part A including number and operation competencies (e.g., counting, subitizing, addition, and subtraction) and Part B covering geometry (shape recognition, comparison, and composition), spatial thinking, geometric measurement, and patterning and algebraic thinking. Currently, REMA has two validated and equated forms: a full version (Clements et al., 2008) and a short form (Dong et al., 2021). REMA’s validity for use with students from various demographic groups (e.g., gender, race/ethnicity, and English language learners) is also supported (Dong et al., 2023).
In the TRIAD study, the original full REMA, consisting of 225 items divided into numeracy and geometry sections, was used. Within each section, items were ordered by their difficulty parameter, derived from Rasch modeling. The administration of REMA followed a start-basal-stop rule, with each child beginning the assessment at a predetermined entry point based on their grade level. A basal level was determined when a child correctly answered at least three consecutive items. Regardless of grade level, the assessment was discontinued after the child reached the last item or made three consecutive errors in either Part A or Part B. Item correctness and the students’ observable strategies for solving REMA questions were collected. Strategies were recoded into ordinal data based on their level of strategic sophistication. Both correctness and strategy code were used for scoring through Rasch modeling. The Rasch person reliability for the current TRIAD sample was 0.88, with a separation of 2.67 (a value greater than 2.0 is recommended; B. D. Wright & Masters, 1982), indicating that REMA generated reliable scores and was able to differentiate children with varying levels of early-math ability.
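The start-basal-stop administration described above can be sketched as a small routine. The following is an illustrative reconstruction from the description in this section only (the function name `administer_section` is ours, and routing rules the text does not specify, such as stepping back below the entry point when no basal is reached, are omitted):

```python
def administer_section(responses, entry, run=3):
    """Simulate REMA's start-basal-stop rule for one difficulty-ordered section.

    responses: hypothetical correctness (1/0) for every item in the section.
    entry: grade-based starting index.
    Returns the indices of items administered and the basal item index
    (first item of the earliest run of `run` consecutive correct answers),
    or None if no basal was established.
    """
    administered = []
    consec_right = consec_wrong = 0
    basal = None
    for i in range(entry, len(responses)):
        administered.append(i)
        if responses[i]:
            consec_right += 1
            consec_wrong = 0
            if basal is None and consec_right >= run:
                basal = i - run + 1  # start of the correct run
        else:
            consec_wrong += 1
            consec_right = 0
            if consec_wrong >= run:
                break  # discontinue after three consecutive errors
    return administered, basal
```

For example, a child answering the first three items correctly establishes a basal at the entry point, and the section ends as soon as three straight errors occur or the last item is reached.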

3.4.2. Classroom Observation of Early Mathematics—Environment and Teaching

The COEMET is a 28-item, research-based classroom observation tool (Sarama & Clements, 2019; Sarama et al., 2016), designed to assess the quality of the mathematics environment and teaching in early childhood classrooms (see Table 1 for detailed item content). All COEMET items, except for four (Q1, Q2, Q4, and Q15), are rated on a 5-point Likert scale (1 = strongly disagree, 5 = strongly agree). The four non-Likert items reflect the percentage of time that the “teacher actively interacted with students” (Q1), “other staff interacted with students” (Q2), “students used math software” (Q4) and “teacher was actively involved in the activity” (Q15). These items are rated on five categories (1 = 0%, 2 = 1–25%, 3 = 26–50%, 4 = 51–75%, and 5 = 76–100%). Given that each COEMET item includes five response categories, they were treated as continuous variables in the modeling analyses, following recommendations by Harpe (2015).
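The percentage-of-time coding for the four non-Likert items (Q1, Q2, Q4, Q15) can be expressed as a simple mapping; this sketch (the function name is ours) reproduces the published category bounds:

```python
def code_time_item(pct):
    """Map observed percent of time (0-100) onto the COEMET five-category
    scale used for Q1, Q2, Q4, and Q15:
    1 = 0%, 2 = 1-25%, 3 = 26-50%, 4 = 51-75%, 5 = 76-100%.
    """
    if not 0 <= pct <= 100:
        raise ValueError("pct must be between 0 and 100")
    if pct == 0:
        return 1
    for category, upper_bound in enumerate((25, 50, 75, 100), start=2):
        if pct <= upper_bound:
            return category
```

Because each item, Likert or percentage-based, yields five ordered categories, all 28 items share a common 1-5 metric, which is what licenses treating them as continuous in the modeling analyses.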
The tool is organized into two sections: classroom culture (CC), which includes 9 items, and specific math activity (SMA), which contains 19 items. The CC section evaluates the overall classroom environment, including teacher–child interactions, the teacher’s personal characteristics, the availability and use of math-related materials, and children’s mathematical work. The CC section is completed once per observation, based on the entire observation period. In contrast, an SMA section is completed for each distinct, substantive mathematics activity observed (up to 12 in total). SMA scores are later averaged across the observed activities for each SMA item.
The COEMET is designed to be administered for no less than half of the classroom day during which trained observers rate the quality of each structured math activity with an individual SMA form using a series of Likert-scaled items (e.g., “the teacher began by engaging and focusing children’s mathematical thinking”). We generated an overall instructional quality score by averaging these items for each observed activity and used the total number of observed math activities to construct a measure of instructional quantity.
The current study uses both item-level and subscale COEMET composite scores (generated through principal component analysis) from the Year 2 Spring Pre-K observation. For this time point and sample, the SMA subscale demonstrates a Cronbach’s alpha of 0.94 and a Coefficient H of 0.90, while the CC subscale shows a Cronbach’s alpha of 0.73 and a Coefficient H of 0.97. Inter-rater reliability for COEMET was assessed through simultaneous classroom visits conducted by pairs of observers, covering 10% of all observations with rotating pair memberships, yielding a reliability coefficient of 0.80 in the TRIAD sample (Clements et al., 2011). Notably, 99% of the disagreements between observers were of the same polarity. For example, if one observer selected “agree,” the other chose “strongly agree”, which indicates a high level of consistency in ratings. These coefficients collectively indicate the good reliability of both COEMET subscale scores.

3.5. Overview of Analysis

The main analyses conducted in the current study all fall within the non-linear modeling framework. To address the first research question, we performed a predictive analysis using all 28 COEMET items, along with the treatment condition (0 = control; 1 = TRIAD intervention) and REMA pretest scores, to predict post-REMA scores in the Pre-K year. Given the focus of this research on classroom practices and outcomes, REMA scores for both sections were aggregated to the classroom level. This also helps preserve model parsimony in the subsequent non-linear analyses. Methodologically, a conventional regression approach with 30 predictors could present certain challenges, such as overfitting, reduced estimation efficiency, and potential multicollinearity. Thus, we applied RF regression modeling (Breiman, 2001), a machine learning-based approach that is less sensitive to multicollinearity than conventional linear regression, to estimate the importance of predictors in predicting the outcome variable. This modeling approach captures the non-linear statistical relationships between predictors and the outcome variable (Auret & Aldrich, 2012), and is generally robust in handling many predictors within a single model, as well as handling multicollinearity among them. According to the predictor importance measures calculated by random forest (Archer & Kimes, 2008), we ranked and selected the predictors with the highest positive importance in predicting REMA post-scores.
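The permutation-based logic behind RF importance (the Mean Decrease in Accuracy reported later) can be illustrated independently of any forest: shuffle one predictor at a time and measure how much prediction error grows. The following is a toy sketch of that idea, not the study's model; the data, the fitted function, and all names are hypothetical:

```python
import random

def mse(model, X, y):
    """Mean squared error of a prediction function over rows of X."""
    return sum((model(row) - target) ** 2 for row, target in zip(X, y)) / len(y)

def permutation_importance(model, X, y, n_repeats=30, seed=0):
    """Mean increase in MSE when each column is shuffled in turn --
    the regression analog of random forest's Mean Decrease in Accuracy."""
    rng = random.Random(seed)
    baseline = mse(model, X, y)
    importances = []
    for j in range(len(X[0])):
        total = 0.0
        for _ in range(n_repeats):
            col = [row[j] for row in X]
            rng.shuffle(col)  # break the column's link to the outcome
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, col)]
            total += mse(model, X_perm, y) - baseline
        importances.append(total / n_repeats)
    return importances

# Toy data: the outcome depends on x0 only; x1 is pure noise.
data_rng = random.Random(1)
X = [[data_rng.random(), data_rng.random()] for _ in range(200)]
y = [2 * row[0] for row in X]
imp = permutation_importance(lambda r: 2 * r[0], X, y)
```

Here the informative predictor earns a large importance score while the noise predictor earns roughly zero, mirroring how the study's RF ranking separates predictive COEMET items from items that add nothing beyond the rest of the model.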
With the refined set of COEMET items, we then conducted generalized additive models (Wood, 2017), a statistical method that allows non-linear relationships between predictor variables and outcome variables, to address the final two research questions. For question two, we performed a series of GAMs to calculate and compare the variances explained across models. In predicting the same outcome variable, the baseline model contains only the REMA pretest score and intervention condition variables. These two predictors were included in all the following GAMs. Model 1 includes the two additional scale-level COEMET scores; Model 2 contains the refined list of COEMET items instead of the scale-level scores, but all the relations between predictors and outcomes are constrained to be linear; and Model 3 includes the same predictors as Model 2 but allows for non-linear relations.
To address the final research question and obtain a more precise understanding of the non-linear relationships between teaching practices and children’s early math competencies, we optimized the GAM configuration based on the previous modeling results. Specifically, some predictors were set to have non-linear relationships with the outcome, while the others maintained a linear relationship. Smooth term plots were generated for each of the non-linear relationships between the predictors with non-linear terms and the outcome variables.
All modeling and visualization were performed using R 4.4.2 (R Core Team, 2024), with the three primary packages being random forest (Breiman et al., 2018), mgcv (Wood, 2023), and ggplot2 (Wickham, 2023). Notably, the Restricted Maximum Likelihood estimator was incorporated into the GAMs to improve model stability.

4. Results

4.1. Preliminary Descriptive Analysis of COEMET Item Scores in the Spring of the Pre-K Year

Table 1 summarizes the means and standard deviations of the COEMET item scores across intervention conditions. As shown, teachers in the TRIAD intervention group demonstrated higher average scores than the control group across all 28 COEMET items. Specifically, the largest mean score differences were observed for the two COEMET items: “Q4: Students used math software” (ΔM = 2.5) and “Q27: Observed, listened, and took notes” (ΔM = 1.1). Additionally, the COEMET scores of the intervention group exhibited lower variance than the control group, with the exception of “Q15: Percent teacher involved in activity.” At the scale level, the TRIAD intervention classrooms also showed higher scores in both the classroom culture (ΔM = 0.96, t = 4.81, p < 0.001, Hedges’ g = 1.06) and specific math activity (ΔM = 0.66, t = 2.32, p = 0.024, Hedges’ g = 0.67) subscales. Taken together, the observed differences in the means and variances of the COEMET item and scale scores between the two groups suggest that the intervention had a substantial impact on teachers’ classroom practices.
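The Hedges' g values reported above are standardized mean differences with a small-sample correction. As a reference for readers, a minimal implementation of the standard formula (not the authors' analysis code; all inputs below are placeholders, not study data):

```python
import math

def hedges_g(m1, m0, s1, s0, n1, n0):
    """Hedges' g: standardized mean difference between two groups
    (means m1, m0; SDs s1, s0; sizes n1, n0) with the usual
    small-sample bias correction applied to Cohen's d."""
    pooled_sd = math.sqrt(((n1 - 1) * s1**2 + (n0 - 1) * s0**2) / (n1 + n0 - 2))
    d = (m1 - m0) / pooled_sd
    correction = 1 - 3 / (4 * (n1 + n0) - 9)
    return d * correction
```

With equal SDs of 1 and 50 children per group, a raw difference of 1.0 yields g slightly below 1.0, reflecting the correction factor.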
Statistically, using both the intervention condition and COEMET item scores as predictors may have introduced potential multicollinearity in the subsequent prediction models. Therefore, this finding also supported our decision to use RF modeling to address the following research questions.

4.2. The Importance of COEMET Indicators in Predicting Early Math Competencies

We aimed to identify the most important COEMET indicators in predicting children’s math competencies using random forest regression modeling. As mentioned earlier, a total of 30 predictors (i.e., 28 COEMET items, the intervention condition, and the pretest) were included in the RF model. The model produced an importance score for each predictor. To clarify, the importance measure in RF represents the contribution of each predictor to the predictive performance of the model (Breiman, 2001), which differs from the coefficient strength in a regular regression model. In RF, two typical importance measures are used: Mean Decrease in Accuracy (MDA) and Mean Decrease in Impurity (MDI). In our study, the importance scores presented in Figure 1 are the MDA score, as previous research has shown that MDA is more stable and robust than MDI (Nicodemus, 2011; Strobl et al., 2007). To further enhance reliability and mitigate model overfitting, we also applied 10-fold cross-validation alongside the RF regression model. Additionally, we set the RF model to use 100 trees, based on the sample size (i.e., 106 classrooms), as smaller datasets typically require fewer trees to achieve sufficient estimation accuracy.
The MDA importance scores ranged from −1.42 to 8.10, and all 30 predictors are ranked according to their importance in Figure 1. As shown, the treatment and pretest were the two most important predictors, which is consistent with common findings in educational intervention research. Among the COEMET items, practices such as “Supported listeners’ understanding” (Q23), “Observed, listened, and took notes” (Q27), “Student math work or thinking on display” (Q6), and “Adapted tasks to accommodate a range of abilities” (Q28) were identified as highly important. However, the top 10 most important COEMET indicators (i.e., Q23, Q27, Q6, Q28, Q19, Q18, Q14, Q25, Q21, and Q17) had similar importance scores, suggesting that focusing on only a limited number of COEMET indicators may not be ideal when identifying key practices for predicting children’s mathematical learning.
Among the COEMET items, 11 out of the 28 showed zero or even negative importance. These items were excluded from further modeling analyses to improve model parsimony. The excluded COEMET items (i.e., teaching practice indicators) do not necessarily indicate irrelevance to the early math outcome; instead, they suggest that these items do not meaningfully contribute to predicting early math outcomes beyond the other items retained in the model.

4.3. Incremental Prediction of Early Math Competencies with Non-Linear Modeling and COEMET Items

Using the 17 retained COEMET items, while controlling for intervention condition and children’s pretest scores, we conducted a series of GAMs to calculate and compare the variances explained in children’s post-math competency across these models. Table 2 summarizes the predictors, smooth terms, and prediction performance indices (adjusted R-squared and proportions of deviance explained). The smooth terms represent the non-linear relationships between predictors and the outcome; higher adjusted R-squared and deviance explained indicate better predictive performance. Models with no smooth terms represent purely linear relations, even though their statistics are generated via GAMs.
The baseline model explained over half the variance in the math outcome (adjusted R-squared = 0.51, deviance explained = 52%), demonstrating the effectiveness of the TRIAD intervention (B = 0.48, SE = 0.08, t = 6.10, p < 0.001), as well as the predictive power of the pretest (B = 0.55, SE = 0.09, t = 6.32, p < 0.001). Model 1 added the two COEMET scale scores (i.e., CC and SMA) to the baseline model, resulting in minimal changes in both model performance measures. Both scale-level predictors showed non-significant associations with the outcome (CC: B = 0.03, SE = 0.05, t = 0.65, p = 0.52; SMA: B = 0.03, SE = 0.04, t = 0.80, p = 0.43). These results suggest that when classroom practice indicators are aggregated into scale-level measures, they may have limited predictive power for children’s math competencies. Model 2 replaced the COEMET scale scores with the refined set of items. This model explained 64.4% of the deviance, which was higher than that of the previous models. However, the adjusted R-squared value remained at a similar level, likely due to the penalty for increased model complexity with more predictors.
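The complexity penalty mentioned above is visible in the adjusted R-squared formula itself, 1 − (1 − R²)(n − 1)/(n − p − 1): holding fit constant, adding predictors shrinks the statistic. A one-line sketch (illustrative only; the numbers in the test are hypothetical, not Table 2 values):

```python
def adjusted_r2(r2, n, p):
    """Adjusted R-squared: R-squared penalized for the number of
    predictors p, given sample size n."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)
```

This is why Model 2 can explain more deviance than Model 1 yet show a similar adjusted R-squared: the gain in raw fit is offset by the larger predictor count.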
Thus far, none of the models have incorporated non-linear terms. In Model 3, we added smooth terms for predictors with sufficient variance and unique values, including most SMA items and the pretest. In contrast, the intervention condition had only two levels, and CC was completed once per observation session, so these predictors had limited unique values for modeling non-linear relationships with the outcome. As shown in Table 2, the inclusion of non-linear terms substantially boosted the variance explained in children’s post-test math competencies. This improvement is reflected in both the adjusted R-squared value (0.72), which penalizes model complexity, and the deviance explained (83.4%), demonstrating the incremental prediction of early math competencies after incorporating COEMET items and non-linear relationships.

4.4. Non-Linear Associations Between Classroom Practices and Early Math Competency

To more accurately represent the relations between classroom practices and early mathematical competencies, we further adjusted the model specification in Model 3. Specifically, Model 3 provides the estimated degrees of freedom (EDF) for each smooth term included, with a value over 1 indicating potential non-linearity and a value close to 1 indicating a linear relationship. In Model 3, COEMET indicators 14, 15, 19, 20, 27, and 28 showed an EDF of 1, so these predictors were modeled as linearly related to the outcome variable, yielding Model 4. Model 4 explained the same amount of variance in the outcome as Model 3, but with a more optimal specification regarding the linear versus non-linear relationships between each predictor and the outcome. Specific predictor-level coefficients from Model 4 are summarized in Table 3. As shown, the results are divided into two sections: the top panel summarizes the linear coefficients, and the bottom panel presents the results of the non-linear smooth terms. Figure 2 visualizes the non-linear relationships of these predictors with the outcome variable.
Beyond the intervention and pretest, we found that the practices “Management strategies enhanced quality” (Q14, B = 0.34, SE = 0.11, t = 3.19, p = 0.003) and “Facilitated students’ responding” (Q20, B = 0.29, SE = 0.12, t = 2.47, p = 0.018) significantly contributed to children’s math learning. Additionally, “Teaching strategies developmentally appropriate” (Q16, F [3.71] = 4.29, p = 0.007) and “Encouraged students to listen/evaluate the thinking of others” (Q21, F [3.17] = 3.65, p = 0.018) presented non-linear associations with children’s mathematical learning. From the smooth plots in Figure 2, we found that both Q16 and Q21 exhibited an inverted U-shape, indicating that these predictors have the strongest effects on the math outcome at their moderate values.

5. Discussion and Implications

Educational change depends on what teachers do and think—it is as simple and as complex as that (Fullan, 1982, p. 107).
Studying, promoting, and supporting high-quality educational environments and teaching requires observational instruments that are domain-specific, efficient, nuanced, reliable, and valid. The COEMET has been utilized in examinations of teacher knowledge and practices, as well as in interventions. To contribute to the field and establish a new empirical foundation for potential improvements of the measure, we employed new analytical techniques to explore the relationships between the COEMET’s assessment of teachers’ practices in early mathematics classrooms and children’s mathematical competencies.
Random forest regression modeling confirmed that treatment and REMA pretest scores were the two most significant predictors, with importance defined as their contribution to predictive performance. Figure 1 shows the ranking. Among the COEMET items, “Supported listeners’ understanding” (Q23), “Observed, listened, and took notes” (Q27), “Student math work or thinking on display” (Q6), and “Adapted tasks to accommodate a range of abilities” (Q28) were identified as highly important. However, the items identified among the top ten most important indicators had similar importance scores, suggesting that a comprehensive view of teaching practices predicting child outcomes is warranted.
Replacing the aggregate COEMET scores with a refined selection of individual COEMET items, along with the inclusion of smooth (i.e., non-linear) terms, substantially increased the variance explained in early mathematics outcomes. This finding indicates the incremental contribution of item-level scores and non-linear relationships in predicting children’s early mathematical competencies. While scale-level variables remain useful and important, specific teaching practices also warrant attention to obtain a comprehensive understanding of how teachers influence children’s skill development.
Although the most parsimonious RF model eliminated 11 items due to their low or negative importance, each of these items correlated with children’s achievement and was positively influenced by the TRIAD training (recall that importance refers to an item’s contribution to predicting child learning, not to coefficient strength). These items may exhibit low RF importance for several reasons, including collinearity and potential nonmonotonicity. Removing the 11 items in future iterations of the measure may improve modeling efficiency; however, further investigation is necessary to confirm this and to understand why the items did not perform well in the prediction. In future work, we plan to cross-validate these findings using additional COEMET datasets and to examine whether any items are so highly correlated that they could be consolidated with another item that demonstrates greater predictive power. We will also explore whether these items are relevant only for predicting outcomes for specific subgroups, as previous research has shown that certain instructional practices captured by the COEMET are particularly important for certain subgroups. Furthermore, we aim to investigate the theoretical significance of each item and how effectively the item, as written and explained, captures the theoretical construct. Moreover, critical components of early math instruction, such as positive early experiences with math, may not be immediately apparent in a post-test but can significantly influence children’s long-term math outcomes. Additional analyses and reflections, including focus groups with COEMET assessors, will be needed to determine whether items require revision (in phrasing, examples, or even assessor training), recoding, or deletion.
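One way to operationalize such pruning, sketched here on synthetic data, is to compute permutation importance on held-out observations and refit the model using only predictors above a small threshold. The threshold, the data-generating model, and the scikit-learn implementation are illustrative assumptions, not the study’s procedure.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n, p_signal, p_noise = 400, 3, 5
X_signal = rng.normal(size=(n, p_signal))
X_noise = rng.normal(size=(n, p_noise))   # stand-ins for uninformative items
X = np.hstack([X_signal, X_noise])
y = X_signal @ np.array([0.9, 0.6, 0.4]) + rng.normal(scale=0.5, size=n)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
rf = RandomForestRegressor(n_estimators=300, random_state=0).fit(X_tr, y_tr)

# Permutation importance on held-out data; uninformative items tend to
# score near zero or negative and can be dropped
pi = permutation_importance(rf, X_te, y_te, n_repeats=20, random_state=0)
keep = pi.importances_mean > 0.01   # illustrative threshold
rf_small = RandomForestRegressor(n_estimators=300, random_state=0).fit(X_tr[:, keep], y_tr)
print("kept predictors:", int(keep.sum()), "of", X.shape[1])
```

As the text notes, low importance alone is not sufficient grounds for deletion; items can fail in prediction for reasons (collinearity, nonmonotonicity, subgroup-specific relevance) that this kind of screen cannot distinguish.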

5.1. Non-Linear Relations Between Teaching Practices and Early Mathematics Learning

The current study investigated the relationship between teaching practices and early mathematics competencies using a non-linear modeling approach. While this method provides novel and valuable insights beyond those offered by traditional linear models, it also presents certain limitations, most notably the risk of overfitting to the specific dataset used, which may constrain the generalizability of the findings. Although we applied procedures to mitigate this risk (e.g., cross-validation in the RF models and adjusted R-squared in the GAMs), replication of the current findings with external datasets would further establish the generalizability of these results.
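A minimal sketch of the kind of safeguard described above, assuming synthetic data and a scikit-learn random forest: comparing the in-sample fit with k-fold cross-validated performance makes the optimism of a training-set R-squared visible.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(2)
n = 250
X = rng.normal(size=(n, 10))
# One linear and one non-linear term, plus noise (all values illustrative)
y = 0.8 * X[:, 0] + np.sin(X[:, 1]) + rng.normal(scale=0.7, size=n)

rf = RandomForestRegressor(n_estimators=300, random_state=0)
cv_r2 = cross_val_score(rf, X, y, cv=5, scoring="r2")  # held-out performance
rf.fit(X, y)
train_r2 = rf.score(X, y)                              # in-sample performance
print(f"train R2 = {train_r2:.2f}, 5-fold CV R2 = {cv_r2.mean():.2f}")
```

The gap between the two numbers is a rough index of overfitting; external replication, as discussed above, remains the stronger test.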
The non-linearity and nonmonotonicity of important items, if supported by analyses of other datasets that include the COEMET, may also indicate the need to revise or recode the Likert scale structure. For instance, the inverted-U result for Q21, “Encouraged students to listen/evaluate the thinking of others,” suggests that excessive unguided peer discussion is not ideal; scoring based on an inverted “U,” with moderate frequencies yielding higher post-test REMA scores, may therefore provide more valid and useful information. This is theoretically justified, especially considering the historical context. The COEMET was developed from the extant research of a quarter-century ago and was first applied and evaluated in the TRIAD study soon thereafter. At that time, some of the pedagogical strategies measured, such as peer-to-peer math talk, were not common practice, so their early use was understandably linearly related to teaching effectiveness. The subsequent decades saw an increased focus on early math education, and it is possible that some teachers now apply practices like encouraging student discussion beyond a useful amount, limiting time for other activities, such as summarizing and connecting the discussion to mathematics. That is why a “Goldilocks” coding scheme (too little, just right, too much) may provide more valuable insights. Further quantitative and qualitative analyses could confirm and expand on these hypotheses.
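A hypothetical “Goldilocks” recoding can be sketched directly: map each 1–5 Likert frequency to its closeness to a presumed optimum, so moderate frequencies score highest. The optimum of 3 and the 0–2 output range are invented for illustration; any actual recoding would need to be derived from the data.

```python
def goldilocks_recode(score, optimum=3, max_dev=2):
    """Recode a 1-5 Likert frequency so the 'just right' midpoint scores highest.

    A hypothetical scheme: 3 -> 2 (just right), 2 or 4 -> 1, 1 or 5 -> 0
    (too little / too much). Optimum and range are illustrative choices.
    """
    return max_dev - abs(score - optimum)

# Original monotone coding vs. inverted-U recoding for a Q21-style item
original = [1, 2, 3, 4, 5]
recoded = [goldilocks_recode(s) for s in original]
print(recoded)  # [0, 1, 2, 1, 0]
```

Under such a recoding, a linear model on the recoded item would capture the inverted-U relationship that otherwise requires a smooth term.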
The non-linear result for item Q16, “Teaching strategies developmentally appropriate,” is harder to interpret. We suspect the issue may lie not with the content of the item itself but with a specific phrase we used: “developmentally appropriate.” Despite the explanatory bullets attached to that item (strategies matched instructional goals, provided an appropriate level of support, and maintained all children’s engagement with the mathematical ideas), that phrase might have led observers to interpret “developmentally appropriate” within a Goldilocks framework. The construct was created and promoted by the U.S. National Association for the Education of Young Children (NAEYC) in its 1986 position statement, “Developmentally appropriate practice in early childhood programs” (DAP). For decades, most early childhood educators interpreted it as a dictum to avoid activities requiring levels of thinking beyond preschoolers’ reach and, especially, to refrain from imposing practices from later grades of school, such as arithmetic worksheets. NAEYC has since worked to correct such connotations and has made research-based updates to DAP that emphasize content that is “challenging and achievable.” However, the tendency to avoid challenging content for preschoolers may persist in the perceptions of observers with a long history in early childhood education. As a result, they might believe they cannot agree or strongly agree with what they unconsciously perceive as overly difficult content and activities. One possible solution is to keep the content of Q16 but remove that phrase.

5.2. Potential Revisions to COEMET

In considering potential revisions to the COEMET, we recognize that classroom observation measures serve multiple purposes, spanning both research and practical applications (e.g., professional development, training, and instructional coaching). The COEMET has already contributed to professional development in research projects, and improvements to it would increase its efficacy for pre-service and in-service training and coaching. For example, in previous projects (Clements et al., 2011; Sarama et al., 2012), researchers and coaches usually believed that the training went well, that is, that both teachers and coaches understood the target teaching practices. However, COEMET observations showed that specific practices were not implemented as specified in the training; that is, teachers had not mastered the training content in situ, a gap the COEMET made visible. Similarly, the observers gained a deeper understanding of teachers’ interpretations of the professional development and saw that their implementation of the trained practices was not straightforward. Furthermore, the COEMET, and specifically its individual items, provides an invaluable lens for instructional coaches supporting early childhood mathematics. While training coaches for the aforementioned projects and others, we found that instructional coaches tended to offer broad judgments of whether teachers were “doing it” or not with respect to the desired math instruction. Grounding coach training in the specific practices captured by individual COEMET items, however, gives focus to their observations of classroom math instruction, leading coaches to uncover areas of strength and growth and, ultimately, to hold more productive coaching conversations. Identifying the COEMET items that are most predictive of higher math outcomes may indicate which items to prioritize while coaching.
In summary, use of the COEMET may help interventionists reveal incomplete understandings of professional development sessions, as well as challenges in implementation. It may facilitate communication among all parties in the project about what is important to the researchers, generating a feedback loop regarding what was understood and not. Coaches could use this information as a basis for future visits with their teachers. As a boundary object, then, the COEMET may provide substantial formative assessment for intervention projects.
Praxis generally benefits from both aggregated and individual item scores; however, for a large-scale cluster randomized trial (CRT), researchers may require a single score reflecting the quality of instruction, derived from as brief an instrument as possible, or scores from smaller subscales or even individual items. Thus, any revisions to the COEMET should take into account how best to serve each purpose.

Author Contributions

Conceptualization, Y.D., D.H.C., C.M. and J.S.; methodology, Y.D.; software, Y.D.; validation, Y.D., D.H.C., C.M. and J.S.; formal analysis, Y.D.; investigation, Y.D., D.H.C., C.M. and J.S.; resources, D.H.C. and J.S.; data curation, Y.D.; writing—original draft preparation, Y.D., D.H.C., C.M. and J.S.; writing—review and editing, Y.D., D.H.C., C.M. and J.S.; visualization, Y.D.; supervision, D.H.C. and J.S.; project administration, Y.D., D.H.C., C.M. and J.S.; funding acquisition, D.H.C. and J.S. All authors have read and agreed to the published version of the manuscript.

Funding

This paper was supported in part by the Institute of Education Sciences (U.S. Department of Education) under Grants No. R305A190395, R305A220102, and R305A110188; the National Science Foundation under Grant No. DRK-12 2300606; and the Tools Competition/Georgia State University. Any opinions, findings, conclusions, or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the funding agencies.

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki and approved by the Social and Behavioral Sciences Institutional Review Board (protocol code IRB00003128, 3 April 2013).

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

The data presented in this study are not publicly available. Requests for access may be directed to the corresponding author (Dr. Douglas H. Clements).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Archer, K. J., & Kimes, R. V. (2008). Empirical characterization of random forest variable importance measures. Computational Statistics & Data Analysis, 52(4), 2249–2260. [Google Scholar] [CrossRef]
  2. Attaway, D. S., Engel, M., Jacob, R., Erickson, A., & Claessens, A. (2025). Understanding mathematics instruction in Kindergarten. The Elementary School Journal, 125(3), 518–547. [Google Scholar] [CrossRef]
  3. Auret, L., & Aldrich, C. (2012). Interpretation of non-linear relationships between process variables by use of random forests. Minerals Engineering, 35, 27–42. [Google Scholar] [CrossRef]
  4. Bachman, H. J., Votruba-Drzal, E., El Nokali, N. E., & Castle Heatly, M. (2015). Opportunities for learning math in elementary school: Implications for SES disparities in procedural and conceptual math skills. American Educational Research Journal, 52(5), 894–923. [Google Scholar] [CrossRef]
  5. Bassok, D., Dee, T. S., & Latham, S. (2019). The effects of accountability incentives in early childhood education. Journal of Policy Analysis and Management, 38(4), 838–866. [Google Scholar] [CrossRef]
  6. Bassok, D., Magouirk, P., & Markowitz, A. J. (2021). Systemwide quality improvement in early childhood education: Evidence from Louisiana. AERA Open, 7, 23328584211011610. [Google Scholar] [CrossRef]
  7. Bognar, B., Mužar Horvat, S., & Jukić Matić, L. (2025). Characteristics of effective elementary mathematics instruction: A scoping review of experimental studies. Education Sciences, 15(1), 76. [Google Scholar] [CrossRef]
  8. Bos, S. E., Powell, S. R., Maddox, S. A., & Doabler, C. T. (2022). A synthesis of the conceptualization and measurement of implementation fidelity in mathematics intervention research. Journal of Learning Disabilities, 32(1), 1–21. [Google Scholar] [CrossRef]
  9. Botvin, C. M., Jenkins, J. M., Carr, R. C., Dodge, K. A., Clements, D. H., Sarama, J., & Watts, T. W. (2024). Can peers help sustain the positive effects of an early childhood mathematics intervention? Early Childhood Research Quarterly, 67, 159–169. [Google Scholar] [CrossRef]
  10. Breiman, L. (2001). Random forests. Machine Learning, 45, 5–32. [Google Scholar] [CrossRef]
  11. Breiman, L., Cutler, A., Liaw, A., Wiener, M., & Liaw, M. A. (2018). Package ‘randomForest’ (Version 4.6-14) [R package]. University of California. [Google Scholar]
  12. Brunsek, A., Perlman, M., Falenchuk, O., McMullen, E., Fletcher, B., & Shah, P. S. (2017). The relationship between the Early Childhood Environment Rating Scale and its revised form and child outcomes: A systematic review and meta-analysis. PLoS ONE, 12(6), e0178512. [Google Scholar] [CrossRef]
  13. Burchinal, M. R., Zaslow, M., & Tarullo, L. (2016). Quality thresholds, features, and dosage in early care and education: Secondary data analyses of child outcomes. Monographs of the Society for Research in Child Development, 81(2), 1–126. [Google Scholar]
  14. Carr, R. C., Mokrova, I. L., Vernon-Feagans, L., & Burchinal, M. R. (2019). Cumulative classroom quality during pre-kindergarten and kindergarten and children’s language, literacy, and mathematics skills. Early Childhood Research Quarterly, 47, 218–228. [Google Scholar] [CrossRef]
  15. Chernyak, N., Harris, P. L., & Cordes, S. (2022). A counting intervention promotes fair sharing in preschoolers. Child Development, 93(5), 1365–1379. [Google Scholar] [CrossRef] [PubMed]
  16. Clements, D. H., Lizcano, R., & Sarama, J. (2023). Research and pedagogies for early math. Education Sciences, 13(839), 839. [Google Scholar] [CrossRef]
  17. Clements, D. H., & Sarama, J. (2008). Experimental evaluation of the effects of a research-based preschool mathematics curriculum. American Educational Research Journal, 45(2), 443–494. [Google Scholar] [CrossRef]
  18. Clements, D. H., & Sarama, J. (2011). Early childhood mathematics intervention. Science, 333(6045), 968–970. [Google Scholar] [CrossRef]
  19. Clements, D. H., & Sarama, J. (2013). Building Blocks, Volumes 1 and 2. McGraw-Hill Education. [Google Scholar]
  20. Clements, D. H., & Sarama, J. (2024). Systematic review of learning trajectories in early mathematics. ZDM–Mathematics Education, 57(4), 637–650. [Google Scholar] [CrossRef]
  21. Clements, D. H., & Sarama, J. (in press). Learning and teaching early math: The learning trajectories approach (4th ed.). Routledge.
  22. Clements, D. H., Sarama, J., Layzer, C., Unlu, F., & Fesler, L. (2020). Effects on mathematics and executive function of a mathematics and play intervention versus mathematics alone. Journal for Research in Mathematics Education, 51(3), 301–333. [Google Scholar] [CrossRef]
  23. Clements, D. H., Sarama, J., & Liu, X. H. (2008). Development of a measure of early mathematics achievement using the Rasch model: The Research-Based Early Maths Assessment. Educational Psychology, 28(4), 457–482. [Google Scholar] [CrossRef]
  24. Clements, D. H., Sarama, J., Spitler, M. E., Lange, A. A., & Wolfe, C. B. (2011). Mathematics learned by young children in an intervention based on learning trajectories: A large-scale cluster randomized trial. Journal for Research in Mathematics Education, 42(2), 127–166. [Google Scholar] [CrossRef]
  25. Clements, D. H., Sarama, J., Wolfe, C. B., & Day-Hess, C. A. (2008/2025). REMA—Research-based early mathematics assessment. Kennedy Institute, University of Denver. [Google Scholar]
  26. Clements, D. H., Sarama, J., Wolfe, C. B., & Day-Hess, C. A. (2017). REMA-SF—Research-based early mathematics assessment short form. Kennedy Institute, University of Denver. [Google Scholar]
  27. Clements, D. H., Sarama, J., Wolfe, C. B., & Spitler, M. E. (2013). Longitudinal evaluation of a scale-up model for teaching mathematics with trajectories and technologies: Persistence of effects in the third year. American Educational Research Journal, 50(4), 812–850. [Google Scholar] [CrossRef]
  28. Cummings, T., Farran, D. C., Hofer, K. G., Bilbrey, C., & Lipsey, M. W. (2010, June 28–30). Starting a chain reaction: Encouraging teachers to support children’s talk about mathematics. IES Research Conference, Washington, DC, USA. [Google Scholar]
  29. Dong, Y., & Clements, D. H. (2025). Consequential validity of early childhood assessments. In O. Saracho (Ed.), Research methods for studying young children. Emerald Group Publishing. [Google Scholar]
  30. Dong, Y., Clements, D. H., Day-Hess, C. A., Sarama, J., & Dumas, D. (2021). Measuring early childhood mathematical cognition: Validating and equating two forms of the Research-Based Early Mathematics Assessment. Journal of Psychoeducational Assessment, 39(8), 983–998. [Google Scholar] [CrossRef]
  31. Dong, Y., Dumas, D., Clements, D. H., Day-Hess, C. A., & Sarama, J. (2023). Evaluating the consequential validity of the Research-Based Early Mathematics Assessment. Journal of Psychoeducational Assessment, 41(5), 575–582. [Google Scholar] [CrossRef]
  32. Duncan, G. J., & Magnuson, K. (2011). The nature and impact of early achievement skills, attention skills, and behavior problems. In G. J. Duncan, & R. Murnane (Eds.), Whither opportunity? Rising inequality and the uncertain life chances of low-income children (pp. 47–70). Sage. [Google Scholar]
  33. Elia, I., Baccaglini-Frank, A., Levenson, E., Matsuo, N., Feza, N., & Lisarelli, G. (2023). Early childhood mathematics education research: Overview of latest developments and looking ahead. Annales de Didactique et de Sciences Cognitives. Revue Internationale de Didactique des Mathématiques, 28, 75–129. [Google Scholar] [CrossRef]
  34. Engel, M., Jacob, R., Hart Erickson, A., Mattera, S., Shaw Attaway, D., & Claessens, A. (2024). The alignment of P–3 math instruction. AERA Open, 10, 23328584241281483. [Google Scholar] [CrossRef]
  35. English, L. D. (2023). Ways of thinking in STEM-based problem solving. ZDM—Mathematics Education, 55(7), 1219–1230. [Google Scholar] [CrossRef]
  36. Farran, D. C., Lipsey, M. W., & Wilson, S. J. (2011, November 7–8). Curriculum and pedagogy: Effective math instruction and curricula. Early Childhood Math Conference, Berkeley, CA, USA. [Google Scholar]
  37. Finders, J. K., Budrevich, A., Duncan, R. J., Purpura, D. J., Elicker, J., & Schmitt, S. A. (2021). Variability in preschool CLASS scores and children’s school readiness. AERA Open, 7, 23328584211038938. [Google Scholar] [CrossRef]
  38. Fullan, M. G. (1982). The meaning of educational change. Teachers College Press. [Google Scholar]
  39. Gerofsky, P. R. (2015). Why Asian preschool children mathematically outperform preschool children from other countries. Western Undergraduate Psychology Journal, 3(1), 1–8. Available online: http://ir.lib.uwo.ca/wupj/vol3/iss1/11 (accessed on 1 June 2025).
  40. Gripton, C., & Williams, H. J. (2023). The principles for appropriate pedagogy in early mathematics: Exploration, apprenticeship and sense-making: Part 2. Mathematics Teaching, 286, 5–7. Available online: https://www.atm.org.uk/write/MediaUploads/Journals/MT286/02.pdf (accessed on 1 June 2025).
  41. Hamre, B. K., Pianta, R. C., Burchinal, M. R., Field, S., LoCasale-Crouch, J., Downer, J. T., Howes, C., LaParo, K., & Scott-Little, C. (2012). A course on effective teacher-child interactions: Effects on teacher beliefs, knowledge, and observed practice. American Educational Research Journal, 49(1), 88–123. [Google Scholar] [CrossRef]
  42. Harpe, S. E. (2015). How to analyze Likert and other rating scale data. Currents in Pharmacy Teaching and Learning, 7(6), 836–850. [Google Scholar] [CrossRef]
  43. Hill, H. C., Rowan, B., & Ball, D. L. (2005). Effects of teachers’ mathematical knowledge for teaching on student achievement. American Educational Research Journal, 42(2), 371–406. [Google Scholar] [CrossRef]
  44. Kilday, C. R., & Kinzie, M. B. (2009). An analysis of instruments that measure the quality of mathematics teaching in early childhood. Early Childhood Education Journal, 36(4), 365–372. [Google Scholar] [CrossRef]
  45. Konstantopoulos, S. (2011). Teacher effects in early grades: Evidence from a randomized study. Teachers College Record, 113(7), 1541–1565. [Google Scholar] [CrossRef]
  46. Liu, Y., Peng, P., & Yan, X. (2025). Early numeracy and mathematics development: A longitudinal meta-analysis on the predictive nature of early numeracy. Journal of Educational Psychology, 117(6), 863–883. [Google Scholar] [CrossRef]
  47. Maier, M. F., McCormick, M. P., Xia, S., Hsueh, J., Weiland, C., Morales, A., Boni, M., Tonachel, M., Sachs, J., & Snow, C. (2022). Content-rich instruction and cognitive demand in prek: Using systematic observations to predict child gains. Early Childhood Research Quarterly, 60(1), 96–109. [Google Scholar] [CrossRef]
  48. McCormick, M. P., Mattera, S. K., Maier, M. F., Xia, S., Jacob, R., & Morris, P. A. (2021). Different settings, different patterns of impacts: Effects of a Pre-K math intervention in a mixed-delivery system. Early Childhood Research Quarterly, 58, 136–154. [Google Scholar] [CrossRef]
  49. McCoy, D. C., Salhi, C., Yoshikawa, H., Black, M., Britto, P., & Fink, G. (2018). Home- and center-based learning opportunities for preschoolers in low- and middle-income countries. Children and Youth Services Review, 88, 44–56. [Google Scholar] [CrossRef]
  50. McCoy, D. C., Yoshikawa, H., Ziol-Guest, K. M., Duncan, G. J., Schindler, H. S., Magnuson, K., Yang, R., Koepp, A., & Shonkoff, J. P. (2017). Impacts of early childhood education on medium- and long-term educational outcomes. Educational Researcher, 46(8), 474–487. [Google Scholar] [CrossRef]
  51. Mesiti, C., Seah, W. T., Kaur, B., Pearn, C., Jones, A., Cameron, S., Every, E., & Copping, K. (Eds.). (2024). Research in mathematics education in Australasia 2020–2023. Springer Nature Singapore. [Google Scholar] [CrossRef]
  52. National Academies of Sciences, Engineering, and Medicine. (2024). A new vision for high-quality preschool curriculum. The National Academies Press. [Google Scholar] [CrossRef]
  53. National Research Council. (2009). Mathematics learning in early childhood: Paths toward excellence and equity. National Academy Press. [Google Scholar] [CrossRef]
  54. Nesbitt, K. T., & Farran, D. C. (2021). Effects of prekindergarten curricula: Tools of the Mind as a case study. Monographs of the Society for Research in Child Development, 86(1), 7–119. [Google Scholar] [CrossRef] [PubMed]
  55. Nicodemus, K. K. (2011). Letter to the Editor: On the stability and ranking of predictors from random forest variable importance measures. Briefings in Bioinformatics, 12(4), 369–373. [Google Scholar] [CrossRef] [PubMed]
  56. OECD. (2014). Strong performers and successful reformers in education—Lessons from PISA 2012 for the United States. OECD Publishing. [Google Scholar] [CrossRef]
  57. Orcan-Kacan, M., Dedeoglu-Aktug, N., & Alpaslan, M. M. (2023). Teachers’ mathematics pedagogical content knowledge and quality of early mathematics instruction in Turkey. South African Journal of Education, 43(4), 1–19. [Google Scholar] [CrossRef]
  58. Ottmar, E. R., Decker, L. E., Cameron, C. E., Curby, T. W., & Rimm-Kaufman, S. E. (2013). Classroom instructional quality, exposure to mathematics instruction and mathematics achievement in fifth grade. Learning Environments Research, 17(2), 243–262. [Google Scholar] [CrossRef]
  59. Perlman, M., Falenchuk, O., Fletcher, B., McMullen, E., Beyene, J., & Shah, P. S. (2016). A systematic review and meta-analysis of a measure of staff/child interaction quality (the classroom assessment scoring system) in early childhood education and care settings and child outcomes. PLoS ONE, 11(12), e0167660. [Google Scholar] [CrossRef]
  60. Phillips, D. A., Lipsey, M. W., Dodge, K. A., Haskins, R., Bassok, D., Burchinal, M. R., Duncan, G. J., Dynarski, M., Magnuson, K. A., & Weiland, C. (Eds.). (2017). The current state of scientific knowledge on pre-kindergarten effects. Brookings Institution and Duke University. [Google Scholar]
  61. Pianta, R., Hamre, B., Downer, J., Burchinal, M., Williford, A., Locasale-Crouch, J., Howes, C., La Paro, K., & Scott-Little, C. (2017). Early childhood professional development: Coaching and coursework effects on indicators of children’s school readiness. Early Education and Development, 28(8), 956–975. [Google Scholar] [CrossRef]
  62. Pohle, L., Hosoya, G., Pohle, J., & Jenßen, L. (2022). The relationship between early childhood teachers’ instructional quality and children’s mathematics development. Learning and Instruction, 82(1), 1–12. [Google Scholar] [CrossRef]
  63. Ran, H., Secada, W. G., Rhoads, C. H., Schoen, R., Tazaz, A., & Liu, X. (2022, April 22–26). The long-term effects of cognitively guided instruction on elementary students’ mathematics achievement. American Educational Research Association, San Diego, CA, USA. [Google Scholar]
  64. R Core Team. (2024). R: A language and environment for statistical computing (Version 4.4.2) [Computer software]. R Foundation for Statistical Computing. Available online: https://www.R-project.org/ (accessed on 1 June 2025).
  65. Rosenfeld, D., Dominguez, X., Llorente, C., Pasnik, S., Moorthy, S., Hupert, N., Gerard, S., & Vidiksis, R. (2019). A curriculum supplement that integrates transmedia to promote early math learning: A randomized controlled trial of a PBS KIDS intervention. Early Childhood Research Quarterly, 49, 241–253. [Google Scholar] [CrossRef]
  66. Sanders, W. L., & Horn, S. P. (1998). Research findings from the Tennessee Value-Added Assessment System (TVAAS) database: Implications for educational evaluation and research. Journal of Personnel Evaluation in Education, 12(3), 247–256. [Google Scholar] [CrossRef]
  67. Sanders, W. L., & Rivers, J. C. (1996). Cumulative and residual effects of teachers on future student academic achievement (Research Progress Report) [not online]. University of Tennessee Value-Added Research and Assessment Center. [Google Scholar]
  68. Sarama, J., & Clements, D. H. (2019). COEMET: The classroom observation of early mathematics environment and teaching instrument. University of Denver. [Google Scholar]
  69. Sarama, J., & Clements, D. H. (2021). Long-range impact of a scale-up model on mathematics teaching and learning: Persistence, sustainability, and diffusion. Journal of Cognitive Education and Psychology, 20(2), 112–122. [Google Scholar] [CrossRef]
  70. Sarama, J., Clements, D. H., Starkey, P., Klein, A., & Wakeley, A. (2008). Scaling up the implementation of a pre-kindergarten mathematics curriculum: Teaching for understanding with trajectories and technologies. Journal of Research on Educational Effectiveness, 1(1), 89–119. [Google Scholar] [CrossRef]
  71. Sarama, J., Clements, D. H., Wolfe, C. B., & Spitler, M. E. (2016). Professional development in early mathematics: Effects of an intervention based on learning trajectories on teachers’ practices. NOMAD Nordic Studies in Mathematics Education, 21(4), 29–55. [Google Scholar] [CrossRef]
  72. Sarama, J., Lange, A., Clements, D. H., & Wolfe, C. B. (2012). The impacts of an early mathematics curriculum on emerging literacy and language. Early Childhood Research Quarterly, 27(3), 489–502. [Google Scholar] [CrossRef]
  73. Scalise, N. R., Gladstone, J. R., & Miller-Cotto, D. (2025). Maximizing math achievement: Strategies from the science of learning. Journal of Experimental Child Psychology, 257, 106281. [Google Scholar] [CrossRef] [PubMed]
  74. Schenke, K., Nguyen, T., Watts, T. W., Sarama, J., & Clements, D. H. (2017). Differential effects of the classroom on African American and non-African American’s mathematics achievement. Journal of Educational Psychology, 109(6), 794–811. [Google Scholar] [CrossRef] [PubMed]
  75. Strobl, C., Boulesteix, A.-L., Zeileis, A., & Hothorn, T. (2007). Bias in random forest variable importance measures: Illustrations, sources and a solution. BMC Bioinformatics, 8(1), 25. [Google Scholar] [CrossRef]
  76. Watts, T. W., Duncan, G. J., Siegler, R. S., & Davis-Kean, P. E. (2014). What’s past is prologue: Relations between early mathematics knowledge and high school achievement. Educational Researcher, 43(7), 352–360. [Google Scholar] [CrossRef]
  77. Watts, T. W., Gandhi, J., Ibrahim, D. A., Masucci, M. D., & Raver, C. C. (2018). The Chicago School Readiness Project: Examining the long-term impacts of an early childhood intervention. PLoS ONE, 13, e0200144. [Google Scholar] [CrossRef]
  78. Weiland, C., & Rosada, P. G. (2022). Widely used measures of pre-k classroom quality: What we know, gaps in the field, and promising new directions. MDRC. Available online: https://www.mdrc.org/sites/default/files/Widely_Used_Measures.pdf (accessed on 1 June 2025).
  79. Whittaker, J. V., Kinzie, M. B., Vitiello, V., DeCoster, J., Mulcahy, C., & Barton, E. A. (2020). Impacts of an early childhood mathematics and science intervention on teaching practices and child outcomes. Journal of Research on Educational Effectiveness, 13(2), 177–212. [Google Scholar] [CrossRef]
  80. Wickham, H. (2023). ggplot2: Create elegant data visualisations using the grammar of graphics (Version 3.4.4) [R package]. Available online: https://CRAN.R-project.org/package=ggplot2 (accessed on 1 June 2025).
  81. Wood, S. N. (2017). Generalized additive models: An introduction with R. Chapman and Hall/CRC. [Google Scholar]
  82. Wood, S. N. (2023). mgcv: Mixed GAM computation vehicle with automatic smoothness estimation (Version 1.9-0) [R package]. Available online: https://CRAN.R-project.org/package=mgcv (accessed on 1 June 2025).
  83. Wright, B. D., & Masters, G. N. (1982). Rating scale analysis. MESA Press. [Google Scholar]
  84. Wright, S. P., Horn, S. P., & Sanders, W. L. (1997). Teacher and classroom context effects on student achievement: Implications for teacher evaluation. Journal of Personnel Evaluation in Education, 11(1), 57–67. [Google Scholar] [CrossRef]
  85. Yıldız, E., Koca, Ö., & Elaldı, Ş. (2025). Effectiveness of early intervention programs in developing early mathematical skills: A meta-analysis. Journal of Theoretical Educational Science, 18(1), 54–80. [Google Scholar] [CrossRef]
Figure 1. Ordered random forest importance scores for predictors.
Figure 2. Smooth term plots depicting non-linear relationships between predictors and the mathematical outcome.
Table 1. Summary of COEMET item means and standard deviations across intervention conditions.
| Item | Control M | Control SD | Intervention M | Intervention SD |
|---|---|---|---|---|
| Classroom Culture | | | | |
| 1. Teacher actively interacted | 4.79 | 0.48 | 4.99 | 0.12 |
| 2. Other staff interacted | 4.13 | 1.07 | 4.67 | 0.67 |
| 3. Used teachable moments | 3.29 | 1.14 | 3.79 | 0.84 |
| 4. Students used math software | 1.88 | 1.56 | 4.36 | 1.19 |
| 5. Environment showed signs of math | 3.39 | 1.03 | 3.94 | 0.71 |
| 6. Student math work or thinking on display | 2.97 | 1.29 | 3.43 | 0.98 |
| 7. Teacher knowledgeable about math | 3.79 | 0.74 | 3.99 | 0.59 |
| 8. Teacher showed she believed math learning can be enjoyable | 3.71 | 0.87 | 3.99 | 0.64 |
| 9. Teacher showed curiosity/enthusiasm for math | 3.53 | 0.99 | 3.85 | 0.87 |
| Specific Math Activity | | | | |
| 10. Teacher understanding | 3.91 | 0.61 | 4.00 | 0.43 |
| 11. Content developmentally appropriate | 3.87 | 0.63 | 3.96 | 0.47 |
| 12. Engage mathematical thinking | 3.57 | 0.72 | 3.90 | 0.46 |
| 13. Pace appropriate for developmental level | 3.77 | 0.68 | 3.96 | 0.37 |
| 14. Management strategies enhanced quality | 3.56 | 0.85 | 3.94 | 0.45 |
| 15. Percent teacher involved in activity | 4.82 | 0.51 | 4.68 | 0.65 |
| 16. Teaching strategies developmentally appropriate | 3.80 | 0.72 | 3.95 | 0.48 |
| 17. High but realistic expectations of students | 3.65 | 0.82 | 3.94 | 0.51 |
| 18. Acknowledged or reinforced effort of students | 3.86 | 0.66 | 4.01 | 0.34 |
| 19. Asked students to share ideas | 3.43 | 0.94 | 3.75 | 0.62 |
| 20. Facilitated students’ responding | 3.58 | 0.83 | 3.90 | 0.50 |
| 21. Encouraged students to listen/evaluate thinking of others | 3.23 | 0.89 | 3.53 | 0.81 |
| 22. Supported describers’ thinking | 3.56 | 0.80 | 3.80 | 0.58 |
| 23. Supported listeners’ understanding | 2.93 | 1.08 | 3.43 | 0.84 |
| 24. Just enough support provided | 3.61 | 0.88 | 3.93 | 0.42 |
| 25. Elaborated math ideas of students | 3.19 | 1.00 | 3.62 | 0.70 |
| 26. Encouraged mathematical reflection | 3.27 | 0.91 | 3.46 | 0.70 |
| 27. Observed, listened, and took notes | 2.34 | 0.82 | 3.49 | 0.74 |
| 28. Adapted tasks to accommodate range of abilities | 3.43 | 0.88 | 3.64 | 0.69 |
Table 2. Variance in early mathematics outcomes explained by generalized additive models.

| Model | Predictors | Smooth Terms | Adjusted R-Squared | Deviance Explained |
|---|---|---|---|---|
| Baseline Model | Pretest + Intervention | None | 0.51 | 52.00% |
| Model 1 | Pretest + Intervention + CC + SMA | None | 0.51 | 53.90% |
| Model 2 | Pretest + Intervention + 17 COEMET Items | None | 0.50 | 64.40% |
| Model 3 | Pretest + Intervention + 17 COEMET Items | Q14–Q17, Q19–Q23, Q25, Q27, and Pretest | 0.72 | 83.40% |
| Model 4 | Pretest + Intervention + 17 COEMET Items | Q16–Q17, Q21–Q23, and Pretest | 0.72 | 83.40% |
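The jump in explained deviance from Model 2 to Model 3 in Table 2 comes from replacing linear terms with smooth terms, which let a predictor's effect bend. The study fitted these with R's mgcv package; purely as an illustrative sketch of the underlying mechanics (simulated data and all names are invented, and a simple ridge penalty stands in for mgcv's smoothness penalty), a smooth term can be approximated by expanding a predictor into a spline basis and fitting by penalized least squares:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 400
x = rng.uniform(1, 5, n)                                # hypothetical item rating, 1-5 scale
y = np.sin((x - 1) * 1.2) + rng.normal(0, 0.2, n)       # non-monotonic true relationship

# Truncated-power cubic spline basis: global cubic plus three interior knots.
knots = np.array([2.0, 3.0, 4.0])
B = np.column_stack([np.ones(n), x, x**2, x**3] +
                    [np.clip(x - k, 0, None) ** 3 for k in knots])

# Penalized least squares; lam controls wiggliness, analogous to
# mgcv's automatic smoothness selection (here fixed by hand).
lam = 1.0
beta = np.linalg.solve(B.T @ B + lam * np.eye(B.shape[1]), B.T @ y)
fitted = B @ beta

r2 = 1 - np.var(y - fitted) / np.var(y)
print(f"R^2 = {r2:.3f}")
```

A linear term would capture almost none of this inverted-U pattern, which mirrors why the smooth-term models in Table 2 explain substantially more deviance than their linear counterparts.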
Table 3. Summary of linear and non-linear coefficients for Model 4.

| Predictor | B | SE | t | p |
|---|---|---|---|---|
| COEMET Q4 | −0.04 | 0.03 | −1.42 | 0.163 |
| COEMET Q5 | −0.06 | 0.06 | −0.90 | 0.374 |
| COEMET Q6 | −0.01 | 0.04 | −0.24 | 0.811 |
| COEMET Q7 | 0.10 | 0.08 | 1.30 | 0.200 |
| COEMET Q14 | 0.34 | 0.11 | 3.19 | 0.003 |
| COEMET Q15 | −0.04 | 0.05 | −0.84 | 0.407 |
| COEMET Q18 | 0.07 | 0.13 | 0.55 | 0.585 |
| COEMET Q19 | −0.14 | 0.08 | −1.76 | 0.087 |
| COEMET Q20 | 0.29 | 0.12 | 2.47 | 0.018 |
| COEMET Q25 | 0.07 | 0.07 | 0.98 | 0.331 |
| COEMET Q27 | −0.05 | 0.05 | −0.96 | 0.346 |
| COEMET Q28 | −0.03 | 0.06 | −0.54 | 0.590 |
| Intervention | 0.56 | 0.11 | 5.00 | <0.001 |

Approximate significance of smooth terms

| Smooth Term | EDF | Ref. DF | F | p |
|---|---|---|---|---|
| s(COEMET Q16) | 3.08 | 3.71 | 4.29 | 0.007 |
| s(COEMET Q17) | 2.18 | 2.67 | 1.46 | 0.284 |
| s(COEMET Q21) | 2.64 | 3.17 | 3.65 | 0.018 |
| s(COEMET Q22) | 1.48 | 1.78 | 0.63 | 0.608 |
| s(COEMET Q23) | 1.79 | 2.19 | 0.79 | 0.456 |
| s(Pretest) | 2.05 | 2.60 | 8.75 | <0.001 |

Notes. EDF = estimated degrees of freedom; Ref. DF = reference degrees of freedom for the F-tests.
Dong, Y.; Clements, D.H.; Mulcahy, C.; Sarama, J. How Teaching Practices Relate to Early Mathematics Competencies: A Non-Linear Modeling Perspective. Educ. Sci. 2025, 15, 1175. https://doi.org/10.3390/educsci15091175