1. Introduction
In the economics literature, the ‘education production function’ has served as a general framework for analyzing various factors affecting the teaching and learning process for students (
Hanushek, 2020). In the context of this theory, the production of education is understood as the result of academic variables (e.g., academic achievement, graduation rate, grades, school attendance) and various inputs (e.g., the school, the family, the teachers, the environment) (
Hanushek, 2020). Academic achievement has been studied on the basis of family characteristics, school traits and social class or socioeconomic status (
De Moll et al., 2024;
Ma et al., 2022;
Sangsawang & Yang, 2025); however, traditional student achievement studies mainly concentrate on individual characteristics and have not paid full attention to the effects of geographical context (
Sajjad et al., 2022;
Wei et al., 2018).
In this regard, recent studies have shown that the spatial environment or spatial neighborhood, mainly moderated by social context, produces effects on performance (
Berkowitz et al., 2017;
Burdick-Will, 2018;
Gimenez et al., 2018). It is important to note that in spatial econometrics, the spatial environment is considered as the set of neighborhood relationships and dependencies that exist between different geographic locations (
Zangger, 2019). We follow a spatial econometrics approach to analyze and quantify these interactions, considering that observations (schools) in one location are influenced by their neighbors and that this influence decreases with distance.
While there is extensive literature on the determinants of school achievement, there has been relatively little research about spatial or neighborhood effects in this regard, although it has been argued that there are important variations in academic achievement between geographical zones (
Lounkaew, 2013;
Zhao et al., 2017). In fact,
Zangger (
2015) indicated that there are not yet any clear, robust research results about the spatial pattern of educational results associated with school and residential socioeconomic segregation; thus, the discussion remains open. Notwithstanding recent contributions in the field of higher education, such as the study by
Benz (
2025), the body of literature concerning primary schooling remains notably limited.
Authors such as
Van Ham et al. (
2012) have also indicated that academic performance may be subject to environmental influence—that is, effects within the same neighborhood on students—above individual or family traits. We thus consider it relevant to include the spatial dimension in the analysis of student achievement, considering environmental, socioeconomic and demographic factors, as well as taking the spatial spillover effects between educational establishments in different geographic areas into account. This is particularly crucial if we consider the school as an important element of the neighborhood and the territory, reflecting the differences in socioeconomic, racial and/or resource distributions between neighborhoods (
Wei et al., 2018).
This point is relevant in the Chilean context, since students’ results on the standardized test for the System to Measure Quality in Education (SIMCE) show recurring inequalities in scores at the geographic level, which are mainly attributable to the high level of segregation of schools by income (
Contreras & Macías, 2002;
Fernandez & Hauri, 2016;
Puga, 2011). There is also little empirical evidence relating the spatial component with students’ performance and educational outcome inequalities, taking the concentration of groups such as indigenous peoples and immigrants into consideration (
McEwan, 2004;
Webb et al., 2018). In fact, studies in Chile have typically focused on performance determinants associated with socioeconomic or establishment factors, without incorporating environmental factors (
Bravo et al., 2015;
Canales & Maldonado, 2018;
McEwan, 2003). Authors such as
Frankenberg (
2013) have indicated that residential segregation and its contribution to school segregation are tied together with the neighborhood effect on educational performance, leading to educational inequalities. Thus, given that geography and space are relevant factors explaining various problems in the social sciences such as educational inequalities (
Boterman & Walraven, 2022;
Nijman & Wei, 2020), this study contributes to this line of research from a geographical–spatial perspective.
This setting is of particular interest because most studies have been conducted in North American or European contexts, where students are legally assigned to schools based on their place of residence (
Levy, 2021;
Nieuwenhuis & Hooimeijer, 2016). However, the situation is different in Chile due to the implementation of the School Admission System (SAE), which allows students to apply to any school, regardless of the distance from their home. Therefore, our study is conducted in the context of student spatial mobility flows.
We thus aim to determine the influence of geographical context on academic achievement in primary education establishments in Chile, considering the relevant variables in line with the literature. For this purpose, we use a spatial econometrics methodology that involves estimating a spatial panel data model, a recent approach for the study of educational phenomena. We use a dynamic spatial auto-regressive (SAR) model, allowing us to model the spatial effect on the educational performance of the establishments and their neighbors via the average scores obtained in the reading and mathematics sections of the SIMCE, considering environmental variables such as academic climate and sociodemographic traits. In turn, this enables us to determine the spatial differences across the territory, i.e., to identify patterns of inequality. Studying spatial inequality is crucial, as it influences different childhood developmental stages and future trajectories (
Cazzuffi et al., 2020).
The results demonstrate the existence of spatial effects between schools regarding both reading and math. The analysis of indirect effects allows us to show that the transmission of these effects arises mainly from establishments’ grouping in neighborhoods by socioeconomic group. This finding is relevant, considering the residential and academic segregation of the educational system. Other important variables producing diffusion effects are academic climate associated with neighborhood climate, school size linked to population density and the segregation of schools receiving migrant students.
2. Background
Regarding student characteristics, it has been shown that there are differences in results between indigenous and non-indigenous students, one of the main causes of which is the socioeconomic composition and segregation of these groups (
Dean, 2018;
McEwan, 2004). Similarly, regarding immigrant students, systematic academic achievement gaps have been found in primary education (
Makarova et al., 2021;
Volante et al., 2019). Therefore, we consider how the average scores of standardized tests by educational establishment are the product of sociodemographic factors of their students (including differences by ethnicity and country of origin), as well as geographical, environmental and economic factors.
Traditional studies have typically focused on individual characteristics, and have thus been critiqued for not paying sufficient attention to the effects of geographical and social contexts on academic achievement (
Thrupp, 2001;
Zangger, 2015,
2019).
Cervini (
2009) suggested that ignoring these factors leads to erroneous interpretations of school system functions. Educational institutions also reflect segregated neighborhoods and, in turn, the education received by children reinforces segregation (
Owens, 2020;
Taylor et al., 2023). Over the past three decades, a growing body of literature has examined the impacts of neighborhood effects on educational outcomes, particularly through mechanisms such as social integration, peer influence and spatial segregation (
Jencks & Mayer, 1990;
Galster, 2012;
Sampson et al., 2002;
Zangger, 2015).
For
Ainsworth (
2002), neighborhood characteristics influence educational achievement, and the mechanisms that influence these outcomes are related to collective socialization, social control, social capital, perceptions of opportunity and institutional characteristics. In a meta-analysis,
Nieuwenhuis and Hooimeijer (
2016) showed that the relationship between neighborhood and educational achievement is a function of neighborhood poverty, the educational climate, the proportion of ethnic or migrant groups and the social structure in the neighborhood. However, in Chile, parents are not required to enroll their children in schools located in their neighborhoods; this creates a differentiating effect and could potentially generate externalities, which we estimate in this study.
These factors may deepen patterns of academic inequality. This makes it relevant to include the spatial dimension, as it helps to understand the impact of the social environment on educational inequality (
Burger, 2019;
Zangger, 2015).
Gordon and Monastiriotis (
2016a,
2016b) considered that residential segregation is reflected in school composition through the disposition of resources and preferences. The effect of urban segregation and its impact on school choice also results in the socioeconomic segregation of educational establishments (
Kosunen et al., 2020). As the effects of these spatial aspects on Chilean school performance have not been assessed in existing studies on academic achievement, we focus on studying the spatial or neighborhood effect on standardized test scores.
Spatial econometrics is a methodological approach which allows phenomena to be modeled without differentiating between horizontal and vertical dependencies, which are closely tied to geography (
Dong & Harris, 2015). Therefore, we follow a spatial econometrics approach in this study. In this regard, we highlight the studies by
Gordon and Monastiriotis (
2016a,
2016b) in the UK, who aimed to explain academic performance from the spatial dimension and found significant school-level effects related to ethnicity and social class. Unlike them, we also consider other relevant variables such as academic climate and origin (immigrants vs. native-born).
Recent studies have shown that residential and school structures are not independent but rather mutually and recursively shape each other. This implies that family decisions about where to live and which school to choose are associated with the perceived quality of a neighborhood’s schools, while neighborhood characteristics influence a school’s composition (
Rich & Owens, 2023). For example,
Yin et al. (
2025) analyzed the existence of a phenomenon they call educational gentrification in the case of China. The authors showed that educational gentrification can be understood as a mechanism that operationalizes the neighborhood effect on academic performance, where families with greater economic capital strategically relocate to areas with high educational quality, creating unequal school environments. This can influence spatial outcomes, given that spillovers come not only from nearby neighborhoods but also from aggregated residential mobility dynamics.
Other studies, such as that conducted by
Gresch et al. (
2023) in Germany, have analyzed how the socio-spatial structure plays a role in educational trajectories, highlighting two levels of analysis: the socioeconomic composition of the neighborhood and regional structures. According to the authors, these conditions influence educational opportunities through mechanisms such as social interaction, access to institutions and school segregation. For
Leemann et al. (
2022), residential segregation is a structural antecedent of school inequality, where the spatial dimension allows us to understand how unequal access to schools reproduces and legitimizes social inequalities. Along these same lines, authors such as
Bayard et al. (
2022) have analyzed how different spatial contexts within Switzerland affect educational inequalities, incorporating spatial analysis to better capture how environments influence educational outcomes. The authors showed that the residential environment has differentiated effects on educational performance.
From another perspective,
Angioloni and Ames (
2015) investigated the relationships between racial diversity, performance and school location in the state of Georgia, USA, finding negative effects regarding the proportion of socio-economically disadvantaged students and positive effects for racial diversity. In a similar line,
Matlock et al. (
2014) analyzed educational performance in the state of Arkansas, USA, finding negative effects for ethnic group concentration. Meanwhile,
Laliberté (
2021) found that belonging to a better zone led to positive educational performance benefits in Canada. However, there is no evidence for developing countries based on spatial econometric models, which would enable the identification of important spillover effects affecting academic achievement.
Although some recent studies, such as that of
Otero et al. (
2023), have provided evidence of neighborhood effects in Chile using administrative data, the methodologies used do not explicitly involve spatial methods. This article builds on this line, applying spatial econometric models to identify indirect and direct effects on academic performance. Our work is also differentiated from others not only because it captures spatial spillovers, but also because it decreases bias from omitted relevant variables and includes an educational establishment-level analysis.
3. Materials and Methods
In this study, we perform an analysis at the level of educational establishments, allowing us to capture the spatial effect of academic performance at the territory or neighborhood level. We used a spatial econometrics approach to model interdependence and the multiplier effect of the geographical context, given that it allows us to establish a spatial relationship and relax the assumption of independence between observations (
Anselin, 2016;
Elhorst, 2014;
LeSage & Pace, 2009).
Specifically, in line with
Anselin et al. (
2008) and
Gibbons and Overman (
2012), we propose the use of a spatial econometric model to explain schools’ academic performance and their spatial interactions. In particular, we use a dynamic spatial auto-regressive (SAR) panel model with a spatio-temporal effect, as defined in Equation (1):
$$y_{t} = \tau y_{t-1} + \psi W y_{t-1} + \rho W y_{t} + X_{t}\beta + \mu + \varepsilon_{t} \qquad (1)$$
where $y_{t}$ is the vector whose element $y_{it}$ is the standardized score in reading or mathematics for establishment $i$ at time $t$; $\tau$ captures the temporal effect of $y_{t-1}$; $\psi$ captures the spatio-temporal relation; $\rho$ captures the spatial dependence effect between schools, with $W$ denoting the spatial weighting matrix; $X_{t}$ represents the independent variables applied, which are associated with the educational production function; $\mu$ is the fixed non-observable effect per educational establishment; and $\varepsilon_{t}$ is the normally distributed error term. The quasi-likelihood method was used for estimation, as well as the procedure presented by
Belotti et al. (
2017).
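To make the structure of Equation (1) more concrete, the following minimal sketch simulates a dynamic SAR panel by solving for the scores of all schools in each period. The parameter names (tau, psi, rho, beta), the ring-shaped weight matrix, the zero initial condition and all numerical values are assumptions chosen for illustration; they are not the estimates or the data of this study.

```python
import numpy as np

def ring_weights(n):
    """Row-normalized weights: each unit's neighbors are the two adjacent units on a ring."""
    W = np.zeros((n, n))
    for i in range(n):
        W[i, (i - 1) % n] = 0.5
        W[i, (i + 1) % n] = 0.5
    return W

def simulate_dynamic_sar(W, X, beta, mu, tau, psi, rho, sigma=1.0, seed=0):
    """Simulate y_t = tau*y_{t-1} + psi*W y_{t-1} + rho*W y_t + X_t beta + mu + eps_t.

    X has shape (T, n, k); the result has shape (T, n).
    """
    rng = np.random.default_rng(seed)
    T, n, _ = X.shape
    A = np.eye(n) - rho * W            # (I - rho W); invertible for |rho| < 1 with row-normalized W
    y_prev = np.zeros(n)               # assumed initial condition
    out = np.empty((T, n))
    for t in range(T):
        eps = rng.normal(0.0, sigma, size=n)
        rhs = tau * y_prev + psi * (W @ y_prev) + X[t] @ beta + mu + eps
        y_prev = np.linalg.solve(A, rhs)   # reduced form: y_t = (I - rho W)^{-1} rhs
        out[t] = y_prev
    return out

# Hypothetical panel: 6 schools, 4 periods, 2 covariates
n, T, k = 6, 4, 2
W = ring_weights(n)
X = np.random.default_rng(1).normal(size=(T, n, k))
y = simulate_dynamic_sar(W, X, beta=np.array([0.5, -0.3]), mu=np.zeros(n),
                         tau=0.3, psi=0.1, rho=0.4)
print(y.shape)
```

Because the spatial lag of the current period appears on the right-hand side, the scores of all schools are determined jointly, which is why each period requires solving a linear system rather than evaluating the equation school by school.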
One relevant aspect of the application is the spatial weighting matrix $W$, which allows us to model the spatial dependences between schools, thus being closely tied to the geographic context (
Dong & Harris, 2015;
Owen et al., 2016). The matrix is defined via geographical and distance criteria, as exogenously proposed by
Harris et al. (
2011). We built the matrix by applying the k-nearest neighbors criterion because, given the schools’ dispersion and distribution, it was important to ensure a minimum number of neighbors and avoid irrelevant spatial relations (
Haining, 2003;
Harris et al., 2011). The main model used the criterion k = 4, based on the existing literature, in which values between 3 and 5 neighbors are commonly used in educational contexts (
Ghosh, 2010;
Jaya et al., 2018). To increase the robustness of the results, we analyzed the number of links, estimating the results with different numbers of neighbors and varying distances (in kilometers). Tests for serial autocorrelation, heteroskedasticity and endogeneity were also performed.
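The k-nearest neighbors criterion described above can be sketched as follows. This is an illustrative construction under stated assumptions: the coordinates are hypothetical planar points, distances are Euclidean and ties are broken by index, none of which is taken from the study's data or software.

```python
import math

def knn_weights(coords, k=4):
    """Return a row-normalized k-nearest-neighbors weight matrix as a list of lists."""
    n = len(coords)
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < number of schools")
    W = [[0.0] * n for _ in range(n)]
    for i, p in enumerate(coords):
        # Rank the other schools by Euclidean distance to school i (ties broken by index)
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: (math.dist(p, coords[j]), j))
        for j in others[:k]:
            W[i][j] = 1.0 / k          # equal weight for each of the k neighbors
    return W

# Hypothetical planar coordinates (e.g., projected km) for six schools
coords = [(0, 0), (1, 0), (0, 1), (1, 1), (5, 5), (6, 5)]
W = knn_weights(coords, k=4)
print([round(sum(row), 6) for row in W])
```

Every school receives exactly k neighbors regardless of how isolated it is, which is the property that motivates this criterion when schools are unevenly dispersed. Note that the resulting matrix is generally not symmetric: an isolated school counts distant schools as neighbors without being counted by them.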
Data
For our principal data source, we considered the Education Quality Measurement System (SIMCE) tests applied by the Education Ministry. We collected the applications from the 2014–2017 period for reading and mathematics in 4th grade (students between 9 and 10 years old), considering that early-level studies help to promote measures with greater public education policy impact (
Baulos & Heckman, 2022). For administrative-type data and school geo-referencing, we used data from the Official Directory of Establishments, which was used to create a balanced panel (as detailed in
Table 1).
The variables used in the study are described in
Table 2; they were selected based on the literature reviewed in the Introduction and Background Sections and on their impacts on segregation within and between schools. It should be noted that this selection is based on the Chilean context, where students may enroll in schools other than those in their home neighborhood.
The variables and main descriptive statistics are provided in
Table 3, revealing interesting data regarding school composition. For instance, 51% of the schools are public and municipally operated, 43% are charter schools (private schools with vouchers) and 6% are fully private schools. Regarding the socioeconomic composition of schools, we can see that 62% have students from the first two income quintiles, 22% have students from the middle (i.e., third) income quintile, and 15% have students from the fourth and fifth income quintiles. The establishments also have almost 19% indigenous student enrollment and 2.5% immigrant student enrollment on average. It should also be mentioned that, regarding the composition of the different groups, the high maximum values (Max) indicate the existence of groups concentrated in specific establishments.
4. Results
First, we analyzed the number of links generated between educational establishments and their distances to develop the matrix (
Table 4). We can note that for the case of
km, the average number of neighbors for schools was
; therefore, in the analysis, we used
and
for robustness. We also carried out estimation using the inverse distance to ensure consistency and identification of the spatial dependence parameter (
Appendix A,
Table A1). We observed that there was a high volume of links between schools when broadly increasing the coverage radius, which was considered inappropriate as the distance matrix may incorrectly capture the spatial relationships (
Harris et al., 2011). In all cases, the spatial weights in the matrix were non-negative and non-stochastic; furthermore, the matrix was row-normalized, so that each spatially lagged term functions as a weighted average of the neighbors’ values. This allowed us to measure the average impact of neighbors on schools (
Elhorst, 2014).
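The link analysis by distance can be illustrated with a short sketch that counts, for each school, the neighbors falling within a given radius. The coordinates below are hypothetical latitude and longitude pairs, and the haversine formula and the radii are assumptions made for the example, not the procedure or the values used in this study.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def link_summary(points, radius_km):
    """Count the neighbors within radius_km of each point; return (counts, mean count)."""
    n = len(points)
    counts = []
    for i in range(n):
        c = sum(1 for j in range(n)
                if j != i and haversine_km(*points[i], *points[j]) <= radius_km)
        counts.append(c)
    return counts, sum(counts) / n

# Hypothetical (lat, lon) pairs, not taken from the Official Directory of Establishments
points = [(-33.45, -70.66), (-33.46, -70.65), (-33.44, -70.67), (-33.05, -71.62)]
for radius in (2, 5, 150):
    counts, mean_links = link_summary(points, radius)
    print(radius, counts, round(mean_links, 2))
```

With a small radius, the isolated point has no neighbors at all, whereas a very large radius links every point to every other one. This is the trade-off behind preferring a fixed number of neighbors over a broad distance band.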
To explore potential geographical effects on standardized test results, we calculated the Global Moran’s index (
Chen, 2013).
Table 5 shows the presence of positive, significant spatial dependence between schools for the SIMCE reading and math tests. This result indicates that the spatial distribution of scores is not random; rather, school results are significantly associated between neighbors, and the coefficient rises as the distance between the schools decreases.
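As a minimal sketch of how the Global Moran's index behaves, the following example computes the statistic for two hypothetical arrangements of scores along a line of six schools. The weight matrix and the scores are invented for illustration, and no inference (permutation or normal approximation) is included.

```python
def morans_i(values, W):
    """Global Moran's I for a list of values and a weight matrix W (list of lists)."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]                      # deviations from the mean
    s0 = sum(sum(row) for row in W)                     # sum of all weights
    num = sum(W[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    den = sum(zi * zi for zi in z)
    return (n / s0) * num / den

# Hypothetical line of six schools; neighbors are the adjacent schools (row-normalized)
W = [[0, 1, 0, 0, 0, 0],
     [0.5, 0, 0.5, 0, 0, 0],
     [0, 0.5, 0, 0.5, 0, 0],
     [0, 0, 0.5, 0, 0.5, 0],
     [0, 0, 0, 0.5, 0, 0.5],
     [0, 0, 0, 0, 1, 0]]
clustered = [1, 1, 1, 9, 9, 9]      # similar scores sit next to each other
alternating = [1, 9, 1, 9, 1, 9]    # neighbors always differ
print(morans_i(clustered, W), morans_i(alternating, W))
```

The clustered arrangement yields a positive index and the alternating arrangement a negative one, which matches the interpretation of a positive, significant index as evidence that similar results are located near each other.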
The problem with the previous test is that it does not allow us to identify the spatial structure. We thus performed the Lagrange Multiplier (LM) test (
Table 6), which ultimately enabled us to reject the hypothesis that there is no spatial auto-correlation in the error for both reading and math scores in the SIMCE Test. This was the case for both the residual spatial dependence and the lag.
We also analyzed the presence of significant accumulations, in line with
LeSage and Pace (
2009), via the Getis-Ord New-Gi* test.
Figure 1 shows the results in graphic form, allowing us to identify school clusters, based on the correlation of each school’s performance with that of its neighbors, that persist over time in the SIMCE reading and math results. Notably, high segregation in the results by national region can be observed.
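The logic of a hot-spot statistic of the Getis-Ord Gi* type can be sketched as follows. This is an illustrative implementation assuming binary weights and the standardized form of the statistic in which each unit counts itself as a neighbor; the scores and the neighbor lists are hypothetical, and the variant used to produce Figure 1 may differ.

```python
import math

def getis_ord_gi_star(values, neighbors):
    """Gi* z-scores with binary weights; each unit is included in its own neighborhood."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - mean ** 2)
    scores = []
    for i in range(n):
        idx = set(neighbors[i]) | {i}          # the star variant includes unit i itself
        w_sum = len(idx)                       # sum of binary weights
        w_sq = len(idx)                        # sum of squared binary weights
        local_sum = sum(values[j] for j in idx)
        num = local_sum - mean * w_sum
        # The denominator is zero if a neighborhood covers every unit; not handled in this sketch
        den = s * math.sqrt((n * w_sq - w_sum ** 2) / (n - 1))
        scores.append(num / den)
    return scores

# Hypothetical line of eight schools with high scores concentrated at one end
values = [1, 2, 1, 2, 1, 2, 9, 10]
neighbors = [[j for j in (i - 1, i + 1) if 0 <= j < 8] for i in range(8)]
scores = getis_ord_gi_star(values, neighbors)
print([round(z, 2) for z in scores])
```

Large positive values flag neighborhoods where high scores accumulate (hot spots), and large negative values flag accumulations of low scores (cold spots).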
Table 7 presents the estimated models’ results for both reading and math tests. We compare the results obtained with the non-spatial fixed effect (FE) models, FE SAR and FE dynamic SAR models. Our principal model for estimation is the FE dynamic SAR, as the SAR model is nested within dynamic SAR according to the LR test (reading:
;
; math:
;
). Furthermore, assuming that
, the used dynamic SAR model produced consistent results, similar to those obtained with a traditional SAR model (
Belotti et al., 2017).
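The comparison of nested models through a likelihood-ratio (LR) test can be sketched as follows. The log-likelihood values are hypothetical, and the choice of two degrees of freedom is an assumption: it corresponds to treating the SAR model as the dynamic SAR model with the temporal and spatio-temporal coefficients restricted to zero.

```python
import math

def lr_test(loglik_restricted, loglik_unrestricted, df=2):
    """Likelihood-ratio statistic and chi-square p-value (closed form, even df only)."""
    if df <= 0 or df % 2:
        raise ValueError("this sketch only handles positive even degrees of freedom")
    stat = 2.0 * (loglik_unrestricted - loglik_restricted)
    half = max(stat, 0.0) / 2.0
    # Chi-square survival function for df = 2m: exp(-x/2) * sum_{i<m} (x/2)^i / i!
    p_value = math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))
    return stat, p_value

# Hypothetical log-likelihoods, not taken from Table 7
stat, p = lr_test(loglik_restricted=-1250.0, loglik_unrestricted=-1235.0)
print(round(stat, 2), p)
```

A small p-value leads to rejecting the restricted model in favor of the more general one. For the comparison to be meaningful, both likelihoods should be computed on the same estimation sample.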
We should mention that for the model estimation, we used robust standard errors, given their robustness to problems such as serial autocorrelation and heteroskedasticity (
Hoechle, 2007). This was carried out considering that in the non-spatial panel, we could identify the existence of problems such as heteroskedasticity via the Wald test (reading:
1.1 × 10^7, p = 0.000; math: .9 × 10^7,
p = 0.000), and serial autocorrelation via the Durbin–Watson test (reading
;
; math
).
In the estimates, the significant controls show the signs expected in line with the traditional educational performance literature. As this study aimed to analyze the effects of the neighborhood, we focus on details regarding the direct effects. Direct and indirect effects are summarized in
Table 8. We identified that the estimated ρ coefficient is positive and significant in all cases, demonstrating the existence of a spatial association of academic performance for schools, in both the reading and math tests. Regarding consistency, we checked the robustness of the results via various spatial weighting matrices, and noted that there were no significant changes in the spatial association parameter and the controls when using the closest k-neighbors or inverse distance (
Appendix A,
Table A1).
We also determined that the variables used were not endogenous for the case of reading or mathematics scores via the Durbin–Wu–Hausman test (
Appendix A,
Table A2). Intuitively, and in accordance with the literature, variables related to the choice of neighborhood and school, such as socioeconomic condition, can be considered potentially endogenous. However, as mentioned by
Hedman et al. (
2011), these problems tend to be more likely in scenarios where individuals are free to choose where to live, and not in the context of socioeconomic segregation. On the other hand, in line with the study on peer effects by
Boucher et al. (
2014), we identified class size to be potentially endogenous. However, we verified the non-existence of endogeneity in all cases.
We should mention that the interpretation of impacts was carried out in direct and indirect terms. In the case of Equation (1), the effects were obtained based on partial derivation of the expected value for
(
Elhorst, 2014). We also broke down the short- and long-term effects (
Belotti et al., 2017). Therefore, assuming that
, we obtained short-term effects from
as in a typical SAR model. In the case of long-term effects, by assuming
we obtained effects from
. We looked to
LeSage and Pace (
2009) and
Golgher and Voss (
2016) as references when interpreting these effects. Thus, a direct effect refers to the impact of an explanatory variable on the school, while an indirect effect refers to the impact produced by the control variable on the neighboring schools due to being within the same neighborhood or territory. For the estimation of effects, we obtained robust standard errors through a Monte Carlo simulation (
LeSage & Pace, 2009).
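The decomposition into direct and indirect effects can be illustrated with a small sketch. It assumes the standard expressions for a dynamic SAR model without spatially lagged regressors, namely (I - rho W)^{-1} beta_k for short-term effects and ((1 - tau) I - (rho + psi) W)^{-1} beta_k for long-term effects. The parameter names and values are hypothetical, and the exact expressions and the Monte Carlo standard errors used in this study are not reproduced here.

```python
import numpy as np

def sar_effects(W, beta_k, rho, tau=0.0, psi=0.0):
    """Average direct, indirect and total effects of one regressor.

    Short-term effects: leave tau = psi = 0. Long-term effects: pass tau and psi.
    """
    n = W.shape[0]
    # Matrix of partial derivatives of E[y] with respect to regressor k
    M = np.linalg.inv((1.0 - tau) * np.eye(n) - (rho + psi) * W) * beta_k
    direct = np.trace(M) / n          # mean effect of a school's regressor on its own score
    total = M.sum() / n               # mean row sum
    indirect = total - direct         # mean spillover received from the other schools
    return direct, indirect, total

# Two hypothetical schools that are each other's only neighbor
W = np.array([[0.0, 1.0], [1.0, 0.0]])
short_run = sar_effects(W, beta_k=2.0, rho=0.5)
long_run = sar_effects(W, beta_k=2.0, rho=0.5, tau=0.2, psi=0.1)
print(short_run, long_run)
```

In this two-school example the short-term total effect equals beta_k / (1 - rho), so a coefficient of 2 is amplified to 4 through feedback between neighbors. The long-term effects are larger still because the temporal and spatio-temporal terms accumulate over periods.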
5. Discussion
For analysis of the control variable estimates, we initially observed a transmission effect between municipal public schools which shared neighborhoods. In the case of reading, we can see a direct negative effect, implying the formation of low–low clusters for municipal public schools. This situation did not arise for mathematics, with no significant indirect effects appearing by school type. These differences have been documented, and are mainly attributable to the neighborhood income composition. However, some authors have indicated that the varying results between public and private schools are usually present only at a weak level (
Smith et al., 2019).
Second, we noted important transmission effects between socio-economic income groups, relating our results to an empirical approximation of residential segregation and its effects on school composition. These results appeared in both reading and math. We also noted that the long-term transmission effect for socioeconomic groups 1 and 2 (low income) is negative while, for socioeconomic groups 4 and 5 (high income), it is positive. This fact is attributable to the creation of low–low and high–high clusters due to income levels in the population. In fact, our findings also affirm the existence of indirect short-term effects for the lower-level socioeconomic groups, which are reflected in the reading and math scores, and which are more precise due to the assumption that in the estimation. This result reveals the importance of the effects of income segregation on school and neighborhood composition in the Chilean case, especially among those with low-income levels.
Authors such as
Zangger (
2015) and
Naidoo et al. (
2014) have suggested that geographical context effects seem to be mediated by variables related to social integration or income segregation, as evidenced in the Chilean case for both reading and math. Studies such as those by
Gordon and Monastiriotis (
2016a,
2016b) and
Frankenberg (
2013) have concluded that there exist relationships between residential segregation, school segregation and academic achievement, mainly due to the unequal distribution of children with varying socioeconomic traits between schools, which translates into dissimilar educational outcomes. This finding is relevant as a moderator for school-level educational results in Chile, considering the fact that income segregation is a relevant characteristic of the school system. We can also note that school size indirectly affects students’ results, as was observed for both reading and math in this study. Furthermore, population density has been documented to act as a moderator for students’ results by some authors (
Wei et al., 2018).
Third, regarding the concentration of vulnerable indigenous and immigrant populations, we noted the existence of indirect negative effects (for both reading and math) only in the case of the proportion of immigrant students in schools; in contrast, we found no evidence of indirect effects from indigenous population concentrations. The indirect effects of racial composition and its connections, for instance, with poverty, have been documented in the literature as weakening academic performance (
Garo et al., 2018). This finding is relevant given the high levels of immigration to Chile in the last 15 years, with the integration of these students creating migratory policy challenges for the country.
Another relevant finding is that the academic climate from the perspectives of students and teachers alike has indirect effects on performance. The significance of this variable is consistent with the prior literature (
Maxwell, 2016;
Parhiala et al., 2018;
Ruiz et al., 2018), demonstrating the relevance of academic climate to student achievement, specifically regarding the results of standardized SIMCE tests in Chile. We could consider that the academic climate is associated with the territory and is specific to the neighborhood where a school is located (
Ruiz et al., 2018); however, this approximation may be imprecise if we consider that students and teachers do not necessarily live in the neighborhoods around the schools. It would therefore be interesting to assess the relationships between the neighborhood climate and student performance components in the Chilean case in future research, especially given that there is no extant empirical evidence along these lines.
Finally, as illustrated in
Figure 1 and through the estimation results, we affirm that the geographical dimension is linked with educational inequality in Chile, such that the distribution of geographical welfare affects access to schools with better resources, thus leading to the reproduction of inequalities in student performance, as previously indicated by
Frankenberg (
2013) and
Hamnett and Butler (
2011).
6. Conclusions
The presented results confirm the existence of associations and spatial dependence in students’ academic performance at the school level, illustrating inequalities in performance on standardized tests—namely, the Education Quality Measurement System (SIMCE) tests—across the Chilean territory. We also demonstrated that there is a spatial diffusion effect between schools, which is mainly explained by concentrations of socioeconomic groups. These results align with those reported by
Lei (
2018) in the case of China, who also found that there is a positive externality between educational institutions in the neighborhood and socioeconomic level with respect to reading and math scores.
We also provided evidence for the impact of geographical segregation on schools with higher concentrations of indigenous and immigrant students, as these generate direct effects on reading tests, as well as indirect effects in the case of math and a larger proportion of immigrant students. In this regard,
Wei et al. (
2018) indicated that there is a direct relationship between academic achievement and the proportions of students by ethnicity according to geographical area; in turn, there exist relationships between the neighborhood, socioeconomic conditions and academic results. In our case, this is explained by the method for assigning spots and enrollment in public schools by the authorities, as local education departments tend to assign immigrant children to schools which already have immigrant students.
These outcomes demonstrate the importance of formulating public policies that take geography into consideration as a key aspect, particularly regarding academic supply and allocation. The results presented herein underscore how geographical and sociodemographic characteristics, marked by economic segregation, ultimately impact student results on standardized tests and lead to spillover effects between educational establishments. As such, there should be more heterogeneous school contexts with less segregated schools, which would significantly benefit more vulnerable or lower-income students.