Soy Expansion, Environment, and Human Development: An Analysis across Brazilian Municipalities

Abstract: In the last decades, Brazil has become one of the largest soybean producers and exporters in the world. Although dedicated policies have been implemented since the 1960s, the recent rapid transition towards an agricultural system largely based on soy has had a strong impact on the country's socio-economic structure, not only in terms of land and labour markets but also on its diverse ecosystems. According to the extant literature, soy has had a beneficial impact on local human development, measured by the human development index (HDI) of the municipalities. However, there is a lack of empirical studies assessing the impact of soy expansion on the single dimensions of the HDI (longevity, education, and income) to disentangle the indirect effects of socio-environmental change while controlling for other local dynamics. To fill this gap, we applied econometric methods to a novel dataset combining municipal-level data on soy production with socio-economic and environmental data for the period 1991–2010. Our findings confirm the positive relation between soy expansion and the HDI at local level, but this relation differs between different HDI dimensions. The marginal benefits of soy expansion are increasing for the income dimension but decreasing for education and longevity. On the other hand, changes in soy productivity (a proxy for agricultural intensification) have a more complex impact on the HDI and its dimensions, but in general its marginal benefits are decreasing over time. Further research could expand the time series once more up-to-date information becomes available.


Introduction
The expansion of agricultural commodities production in developing and emerging countries has had a strong impact on their socio-economic and environmental systems [1][2][3]. In Latin America, the commodity boom over the last decades has produced radical changes in the organization of agri-food supply chains, the establishment of global trade partnerships, and the living conditions of rural communities [4][5][6][7][8]. Soy production has become the main farming activity in terms of both planted surface and economic value. In this context, Brazil has been leading the rush to the exploitation of this activity by implementing dedicated policies already at the beginning of the 1960s [9,10], which have radically changed the agri-food sector in the subsequent decades.
These changes have had an influence not only on the Brazilian economy but also on the social and human development of Brazilian communities, as well as on the environment [11]. There is an open discussion on the influence of soy expansion on Brazilian

Soy Expansion and the HDI in Brazil
With an area of 8.5 million km², Brazil is the fifth largest country in the world, and in 2018, it was the fourth country by agricultural production after China, India, and the US, accounting for around 5% of the world's total [24]. Overall, 36.1% of its land is publicly owned (with 6.4% officially "undesignated"), 44.2% is private, and 16.6% is unregistered or with unknown tenure [25]. According to [26], based on satellite images from 2018, 7.8% of the national territory is used for agriculture, 12.9% for pasture, and 1.0% for planted forest.
Soy began to be cultivated in southern Brazil already in 1900, but it was only in the 1960s that it became an economically important crop [10]. The dramatic expansion of soy can be appreciated by observing the total production, which in the last 50 years (1969-2019) has grown from 1.0 to 114.2 million tons per year, while the cultivated area jumped from 0.9 to 35.8 million hectares [27]. In spatial terms, soybean cultivation began in the South of the country and then expanded towards the North (see Figure 1), also reaching the Amazon biome [28]. Today, a large portion of the country's production is from the Brazilian savannah (Cerrado) [29]. Brazil includes five Federal macro-regions, of which the South has a high HDI and a longer tradition of soy growing; the Northeast and the North are the poorest ones, have a lower HDI, and have been the main areas for soy expansion in the past two decades; the Southeast is the most industrialized region and has a high HDI; and the Centre-West has seen an improvement of its HDI and is currently the main region for soybean cultivation in Brazil [27,30]. The five macro-regions are identified by means of thicker borders in the maps in Figure 2. Currently, the soybean complex (grain, oil, and bran) is considered the main agricultural activity of the country, both for its territorial incidence (35.8 million hectares in 2019, equivalent to 56.1% of the area used for temporary crops) [27] and for its economic-commercial importance (14.5% of the total Brazilian exports and one third of the agribusiness exports of 2019) [31]. In addition, CEPEA [32] estimates that the soybean chain accounted for 2.2% of the national GDP in 2017. The area cultivated with soybeans in Brazil exceeds the area of Germany in size, and the country has solidified itself as the largest producer and exporter of soybeans in the world [33].
If soybean is considered strategic at national level, it is even more significant for the municipalities where it is grown. There are currently 5570 municipalities in Brazil, which are classified as either urban or rural following administrative criteria. Every municipality necessarily includes an urban area: in some cases, like capitals and metropolitan municipalities, 100% of their area is classified as urban, while there are no fully "rural" municipalities [34]. Based on the 2017 Agricultural Census, soybeans are cultivated in 2427 municipalities (43.7%), and in 337 of them, the soybean area accounts for more than 30% of the total area of the municipality [27]. In addition, soy production accounts for more than 30% of the municipality's GDP in 310 municipalities [35]. Soy also has a relevant political role in the Brazilian regions, being a symbol of prestige and providing political power to the largest rural producers [10,36,37]. While official statistics provide a clear overview of the economic impact of this commodity, the literature on soy expansion in Brazil has highlighted that its effects are not limited to the economic sphere. Many studies have analysed the social and environmental changes produced by soybean expansion in Brazil, but their findings are not consensual. Garrett and Rausch [13] suggest that this phenomenon has had a negative effect on natural capital stocks and rural equity, producing radical changes in the labour market, but has also improved the rural population's well-being by generating revenues that have been reinvested in health and education programmes. VanWey et al. [38] argue that the economic growth generated by soy expansion has mainly favoured local elites or foreign investors, but at the same time, they highlight the positive spill-over effects in terms of education and healthcare. Weinhold et al. [39] confirm the positive impact of soy production on household income in certain Brazilian regions (e.g., the Amazon region) but also shed light on the inequality problem and the association of this phenomenon with environmental issues such as deforestation, which is particularly damaging for specific social and ethnic groups.
Some studies have specifically focused on the HDI for measuring the effects of soy expansion on Brazilian society. As briefly illustrated in the introduction, the HDI was created by the United Nations Development Programme (UNDP) in 1990 [17,40] to measure the performance of different countries in terms of human development. This index focuses on three dimensions (longevity, education, and income) which are assessed using four indicators: life expectancy at birth (longevity dimension); expected years of schooling and mean years of schooling (education dimension); and income per capita (income dimension). The index, as well as each of its dimensions, ranges from zero to one, where lower values indicate lower levels of human development.
In Brazil, the HDI is calculated down to the municipal level. The HDI of Brazilian municipalities, as well as its three dimensions, improved between 1991 and 2010 (the 2020 Demographic Census would have provided more up-to-date HDI values, but the data collection process was postponed to 2022 due to the COVID-19 pandemic and a lack of financial resources). Figure 2 illustrates this evolution by means of maps. At the national level, the Brazilian HDI was equal to 0.493 in 1991, ranking Brazil below the average for Latin America and the Caribbean [30]. Most Brazilian municipalities (85.8%) had a very low HDI score, and none had a "high" or "very high" HDI, using the UNDP classification. In 2000, the national HDI increased to 0.612 ("medium level"), but 70% of the municipalities were still in the "low" and "very low" range, and only 2.4% presented a "high" HDI. Most of the municipalities that improved their scores between 1991 and 2000 were in the South, Southeast, and Centre-West regions of the country, while the municipalities in the North and Northeast regions continued to show a "very low" HDI. In 2010, most of the municipalities with a "very low" HDI in 2000 had improved their scores, although regional differences remained significant (the HDI of the Southern municipalities is much higher than that of the Northern municipalities) [30].
Amongst the studies focusing on the relationship between soy expansion and human development in Brazil, Arvor et al. [16] found a positive relationship between soy production and the HDI at municipal level. However, the magnitude of this relationship is decreasing through the years, while the environmental conflicts are increasing because of the "schizophrenic Brazilian domestic policies" [16], where efforts to improve environmental governance are undermined by interventions threatening the country's ecosystems (new dams, roads, agricultural expansion, amnesty for illegal deforestation, etc.).
Lima-de-Oliveira and Alonso [20] suggest that regional inequalities are not related to the soybean boom and highlight the importance of local and national policies in redistributing resources derived from soy production. Martinelli et al. [19] complement this discourse by pointing out that the debate between positive and negative impacts of soy expansion should concentrate on regional differences and on the main economic and institutional structures characterizing Brazilian regions. According to these authors, as well as other authors in the literature [13,41], soy is an engine for local development, but this is mostly true if we focus predominantly on economic aspects, while its impact on social and environmental aspects is less straightforward. Indeed, soy expansion boosts local income because it is an important economic activity [31,32] but may also intensify income inequality [19]. For example, the expansion of this commodity is related to the level of education and life expectancy, but its effects on these dimensions are ambiguous since the resulting investments in health and education are not always enough to provide basic services to the rural population [1,35,42,43]. Moreover, the environmental impact of intensified agricultural production and the concentration of land and other resources deriving from soy expansion can generate negative spill-over effects on life expectancy, as well as education. Hence, we might observe decreasing and even negative marginal returns in terms of human development after a certain level of production and productivity is achieved.

Data
Our analysis relies on three datasets, including:
It is worth mentioning that the HDI in our dataset is not calculated according to the most recent protocol, which was only developed by the UNDP in 2010 [21]. In particular, in more recent human development reports, the simple average of the three HDI dimensions (longevity, education, and income) was replaced by their geometric mean; the minimum life expectancy used to standardise the longevity indicator was reduced from 25 to 20 years; the education dimension, previously represented by the weighted mean between the adult literacy rate (2/3) and the gross enrolment index (1/3), was replaced by a simple average between the mean years of schooling and the expected years of schooling indexes; the GDP per capita was replaced by the GNI per capita as a proxy for income; and its maximum, used for standardising the income indicator, was increased from $40,000 to $75,000 [21,45]. Therefore, our HDI is a simple average of its three dimensions and, most importantly, the education dimension is based on adult literacy and gross enrolment. However, the calculation method is constant over the period considered; therefore, yearly values can be directly compared, and changes can be calculated.
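The difference between the two protocols described above can be illustrated with a short Python sketch. It is not the UNDP's or the authors' code: the upper goalpost for life expectancy (`MAX_LIFE`), the income index value, and the municipality figures are hypothetical values chosen only to show how the minimum life expectancy and the aggregation rule change the result.

```python
def dimension_index(value, minimum, maximum):
    """Min-max standardisation of an indicator onto the 0-1 scale."""
    return (value - minimum) / (maximum - minimum)


def education_index_old(adult_literacy, gross_enrolment):
    """Older protocol: adult literacy weighted 2/3, gross enrolment 1/3."""
    return (2 * adult_literacy + gross_enrolment) / 3


def hdi_arithmetic(longevity, education, income):
    """Older protocol: simple average of the three dimension indexes."""
    return (longevity + education + income) / 3


def hdi_geometric(longevity, education, income):
    """Newer protocol: geometric mean of the three dimension indexes."""
    return (longevity * education * income) ** (1 / 3)


# Hypothetical municipality; MAX_LIFE is an assumed upper goalpost.
MAX_LIFE = 85.0
life_old = dimension_index(70.0, 25.0, MAX_LIFE)  # minimum of 25 years
life_new = dimension_index(70.0, 20.0, MAX_LIFE)  # minimum of 20 years
education = education_index_old(0.90, 0.60)
income = 0.5  # hypothetical, already standardised income index

hdi_simple = hdi_arithmetic(life_old, education, income)
hdi_geom = hdi_geometric(life_old, education, income)
print(round(hdi_simple, 3), round(hdi_geom, 3))
```

With the same dimension indexes, the geometric mean is never larger than the simple average, so unbalanced profiles are penalised under the newer protocol. This is one reason why values computed under the two protocols should not be compared directly.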
The three above datasets were merged into a single panel dataset where the observations are represented by the municipalities. Only a limited number of observations (12 out of 5570) were lost because of administrative changes in municipalities through the years. From the dataset on soy-related variables, only the values relative to the census years 1991, 2000, and 2010 were retained, in line with the data available for the HDI and its dimensions. Although this is a panel dataset, it only includes three observations per unit (two for the variables measured by the Agricultural Census) which become two (or one) when the variation between years is calculated.
We do not include the variables from the Agricultural Census in the main analysis for three reasons: (1) the measurement years (2006 and 2017) did not correspond to the years when the HDI was measured (1991, 2000, and 2010); (2) their inclusion would reduce the sample size further by eliminating one more year; and (3) they are endogenous: they are likely to be impacted by soy expansion and to impact human development (longevity) in turn; therefore, they cannot be included jointly with soy-related variables. Nevertheless, to contextualise our main model, as a first step, we will describe the relationship between soy and environmental variables using pairwise correlation indexes, associating the year 2000 with 2006, and the year 2010 with 2017. If more observations were available for the same years, a multi-stage modelling approach (e.g., two-stage least squares regression) could be adopted.
Before estimating our models, we calculated new soy-related and socio-environmental indicators at the municipality level. We did not use the economic value of the soy harvest because we could not isolate how much of its variation was due to changes in soy prices in space and time. Instead, we retained the total area of land used for soy in the municipality, and to account for the huge variability in population size between municipalities, we complemented this information by calculating the soy area per capita (using the census population). We also calculated soy productivity in tons per hectare. Then, we obtained the change in these variables, as well as in socio-environmental variables, the HDI, and its three dimensions between years. Changes were calculated as absolute rather than relative changes to avoid losing observations when the initial value was zero. Our emphasis on soy area per capita and on productivity, as well as the use of econometric models, differs from the strategy adopted in previous studies. Indeed, it provides an overview of the complex shape of the relationship between key variables and does not rely on significant though arbitrary thresholds for testing differences (e.g., Arvor et al. [16] compare "soybean areas" with "non-soybean areas" based on null production; Martinelli et al. [19] adopt a threshold of 300 hectares, the 1st quartile of the median soy area in 1991). For the municipalities with zero soy area, productivity cannot be calculated; in this case, productivity was assigned a value of zero, which is equivalent to estimating the effect for "soybean municipalities" only.
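The indicator construction described above can be sketched in a few lines of Python. The function names and the municipality figures are hypothetical and serve only to show why absolute changes and a zero-productivity convention keep municipalities with no initial soy area in the sample.

```python
def soy_area_per_capita(soy_area_ha, population):
    """Hectares of soy per inhabitant (census population)."""
    return soy_area_ha / population


def soy_productivity(production_tons, soy_area_ha):
    """Tons per hectare; set to zero when no soy is grown."""
    if soy_area_ha == 0:
        return 0.0
    return production_tons / soy_area_ha


def absolute_change(initial, final):
    """Absolute change, defined even when the initial value is zero."""
    return final - initial


# Hypothetical municipality with no soy in 2000 and some soy in 2010.
year_2000 = {"area": 0.0, "tons": 0.0, "population": 10_000}
year_2010 = {"area": 5_000.0, "tons": 15_000.0, "population": 12_500}

per_capita_2000 = soy_area_per_capita(year_2000["area"], year_2000["population"])
per_capita_2010 = soy_area_per_capita(year_2010["area"], year_2010["population"])
productivity_2000 = soy_productivity(year_2000["tons"], year_2000["area"])
productivity_2010 = soy_productivity(year_2010["tons"], year_2010["area"])

change_per_capita = absolute_change(per_capita_2000, per_capita_2010)
change_productivity = absolute_change(productivity_2000, productivity_2010)
print(change_per_capita, change_productivity)
```

A relative change for this municipality would require dividing by an initial value of zero, so the observation would be lost; the absolute change remains defined.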
Due to different municipality sizes and levels of soy expansion, many soy-related variables (including those identifying farming practices) were very skewed and required transformation before being used in a regression model [46]. These are: the total area used (median 2500 hectares, average 10,602 hectares, range from 1 to 608,000 hectares considering only "soybean municipalities"); the area per capita (0.25 hectares, 1.03 hectares, 0-54.20 hectares); the area of soil corrector application (6318 hectares, 19,860 hectares, 0-939,700 hectares); and the area of pesticide use (1792 hectares, 5963 hectares, 0-248,251 hectares). For these variables, as well as for the size of population, a logarithmic transformation was used. Instead, the change in the total area used (median of 0 hectares, average of 1230 hectares, range from −121,485 to 265,002 hectares) and the change in the area per capita (0 hectares, 0.13 hectares, −26.01 to 54.20 hectares) were transformed using the square root. Synthetic statistics for all the variables appearing in the analysis are provided in Table 1 below. Synthetic statistics by year are available in Table A1 in Appendix A.
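The two transformations can be sketched in Python as follows. Because the change variables take negative values, the sketch assumes a sign-preserving square root; this is an assumption about how the transformation could be applied, not a formula taken from the text. The treatment of zeros under the logarithm is likewise not stated, so the helper below simply rejects non-positive inputs.

```python
import math


def log_transform(x):
    """Natural logarithm for strictly positive, right-skewed variables.

    How zero values are handled is not stated in the text, so this
    sketch raises an error instead of guessing an offset.
    """
    if x <= 0:
        raise ValueError("log transform requires a strictly positive value")
    return math.log(x)


def signed_sqrt(x):
    """Square root that keeps the sign of x (assumed variant)."""
    return math.copysign(math.sqrt(abs(x)), x)


# Extremes of the change in total soy area reported above (hectares).
largest_decrease = signed_sqrt(-121_485)
largest_increase = signed_sqrt(265_002)
print(round(largest_decrease, 1), round(largest_increase, 1))
```

Both transformations compress the long tails of the distributions while preserving the ordering of municipalities, and the signed variant also preserves the direction of change.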

Theoretical Model and Modelling Strategy
Our theoretical model assumes that the change in the HDI is a function of its baseline value, of the baseline values of soy-related variables, of the baseline value of other geographical and socio-economic variables, and of the change in soy-related variables between periods. Including the baseline HDI value allows us to test if there is convergence or divergence across municipalities, extending the idea of economic convergence [47][48][49] to human development. The inclusion of variables describing the sectoral structure of the municipal economy allows us to control for the impact on human development of changes in sectors other than agriculture and is a relevant addition compared to the previous literature. Unfortunately, these variables are only available for 2000 and 2010; therefore, their value at the end rather than at the beginning of the period is used to avoid losing information. Preliminary analyses show that the results do not change significantly. The theoretical model has, thus, the following form: where $i$ identifies a municipality, $t$ a year of observation, and $\Delta$ a variation; $\mathit{reg}$ identifies time-invariant macro-regional dummies, $\mathit{Soy}_{i,t}$ time- and municipality-specific soy variables, $\mathit{geo}_{i,t}$ geographical characteristics (population and rurality), and $\mathit{eco}_{i,t}$ economic characteristics (inequality and economic sectors). This theoretical model is first used to explain the changes in the overall HDI and then replicated for each of its dimensions (longevity, education, and income). The only explanatory variable that changes across models is the initial value of the HDI, replaced by each of its dimensions in turn.
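As a rough sketch, one functional form consistent with the verbal description above could be written as follows. The generic function $f$ and the $t-1$ subscript used for baseline values are notational assumptions and should not be read as the authors' exact specification.

```latex
\Delta HDI_{i,t} = f\left( HDI_{i,t-1},\ \mathit{Soy}_{i,t-1},\ \Delta \mathit{Soy}_{i,t},\ \mathit{geo}_{i,t-1},\ \mathit{eco}_{i,t},\ \mathit{reg}_{i} \right)
```

Under this reading, a negative effect of $HDI_{i,t-1}$ on $\Delta HDI_{i,t}$ would indicate convergence across municipalities, and $\mathit{eco}_{i,t}$ is dated at the end of the period because the sectoral variables are not available for 1991.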
In terms of estimation strategy, we considered three model typologies [23]: (1) ordinary least squares (OLS) models; (2) random-effects panel models; and (3) fixed-effects panel models, assuming fixed effects at the municipality level [50]. Concerning panel models, the Hausman test rejected the hypothesis that individual-level effects are adequately modelled by random effects (p = 0.000); therefore, in the following, we only present the fixed-effects ones. To appreciate the impact of time-invariant characteristics, and since the two remaining models measure different aspects of the same process, the OLS models are also maintained and are presented in Section 5.
As mentioned above, a two-stage model [37] could also be estimated where soy expansion affects socio-environmental variables and this impacts human development (or, at least, some of its dimensions) in turn. Since we had just two observations for environmental variables, and the years do not correspond with the rest of the dataset, we decided not to pursue this strategy. The empirical OLS model takes the following form: where $\beta$s are the coefficients, the sums indicate that more than one variable is included in each category, $I_t$ are dummy variables for time, and $\varepsilon_{i,t}$ is the additive error term.
The fixed-effects model assumes a simplified form compared to the theoretical model (1), more precisely: This is because this model assesses the effects of a deviation of the covariates from their averages over the period within the same municipality on a deviation of the HDI for the same municipality in the same period. As a result, the variations are replaced by levels, and time-invariant covariates are omitted. Estimating a fixed-effects panel model is equivalent to using OLS to estimate the following empirical model: where again, $\beta$s are the coefficients, the sums indicate that more than one variable is included in each category, overlines indicate averages over the three years considered, and $\varepsilon_{i,t}$ is the additive error term. $G_{i,t}$ is the Gini coefficient (the only economic variable maintained: the other ones are omitted to avoid losing observations).
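The equivalence between the fixed-effects estimator and OLS on deviations from municipality averages can be illustrated with a minimal Python sketch. It uses a single covariate and a made-up panel of two municipalities observed in three years; the data and function names are hypothetical, and the sketch is not the authors' estimation code.

```python
def ols_slope(x, y):
    """OLS slope of y on x (with intercept) for a single covariate."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def demean_by_unit(units, values):
    """Subtract from each value the average of its own unit."""
    totals, counts = {}, {}
    for unit, value in zip(units, values):
        totals[unit] = totals.get(unit, 0.0) + value
        counts[unit] = counts.get(unit, 0) + 1
    return [v - totals[u] / counts[u] for u, v in zip(units, values)]


# Hypothetical panel: y = municipality effect + 2 * x, with no noise.
units = ["A", "A", "A", "B", "B", "B"]
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [10 + 2 * v for v in x[:3]] + [-20 + 2 * v for v in x[3:]]

within_slope = ols_slope(demean_by_unit(units, x), demean_by_unit(units, y))
pooled_slope = ols_slope(x, y)
print(within_slope, round(pooled_slope, 3))
```

In this toy panel the within estimator recovers the slope of 2 used to generate the data, while the pooled OLS slope is negative because it also reflects the difference between the two municipality-specific levels. This is why the two model types are described as measuring different aspects of the same process.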
Interaction terms between time and the changes in soy variables are included to account for changes in the coefficient during different periods (e.g., decreasing marginal impact of soy on human development between the 1990s and the 2000s). A joint test on the coefficients for time shows that time-fixed effects must be included in the fixed-effects model, and this is done as shown in (4).
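How a period-specific effect is read from a model with time interactions can be shown with a small Python sketch. It assumes the usual convention that the effect in a given period equals the baseline coefficient plus the interaction coefficient for that period; the coefficient values are hypothetical and are not taken from the estimates.

```python
def period_effect(base_coefficient, interaction_coefficients, period):
    """Marginal effect of a soy variable in a given period.

    The baseline period has no interaction term, so its effect is the
    baseline coefficient alone.
    """
    return base_coefficient + interaction_coefficients.get(period, 0.0)


# Hypothetical coefficients for a soy variable and its time interaction.
base = 0.5
interactions = {"2000-2010": -0.75}

effect_first_decade = period_effect(base, interactions, "1991-2000")
effect_second_decade = period_effect(base, interactions, "2000-2010")
print(effect_first_decade, effect_second_decade)
```

In this example the effect is positive in the baseline decade and negative in the following one, which is the kind of sign reversal a net coefficient can reveal.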
Before running the actual estimates, all the variables were checked for multicollinearity, and those presenting this problem were excluded from the models, namely: the GDP per capita (which is a dimension of the HDI); the share of employees in the agricultural sector; and the total area used for soy in the municipality (only its variation between years was maintained in the OLS model). Other specifications, such as interactions between covariates different from soy and time, or interactions between income and inequality, were also tested, but they either generated multicollinearity issues or caused the model to become less parsimonious, while the results did not change significantly compared to the specification presented. Income and agricultural employment are not shown in Table 1.
The analyses were implemented with Stata 15, using the command xtreg for the fixed-effects models [51].

Results
This section presents the model results. As a preliminary step, we provided evidence on the relationship between soy expansion and the socio-environmental variables detected from the Agricultural Census in 2006 and 2017, namely, the use of pesticides, limestone and other correctors of soil pH, and the prevalence of family farms in the municipality.
Correlation coefficients between these variables are shown in Table 2. Most of them were positive and significant, meaning that a higher initial value of soy variables (area per capita, productivity per hectare, and total area), or larger changes in these variables, were related to higher use, or larger increase in the use, of the above products. The only exception was the correlation between the "change in the share of farms that applied soil amendments" and the changes in soy variables. The correlation between all the soy-related variables and the incidence of family farms (number and area) was also negative, suggesting that soy expansion is associated with the growth of non-family farms.
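For readers less familiar with pairwise correlation indexes, the following Python sketch computes a Pearson coefficient from scratch. The choice of the Pearson coefficient is an assumption, and the four-municipality data are invented for illustration; they only mimic the signs described above.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical values for four municipalities.
soy_area_per_capita = [0.0, 1.0, 2.0, 3.0]
pesticide_area_share = [0.1, 0.3, 0.5, 0.7]
family_farm_share = [0.9, 0.7, 0.6, 0.2]

r_pesticide = pearson(soy_area_per_capita, pesticide_area_share)
r_family = pearson(soy_area_per_capita, family_farm_share)
print(round(r_pesticide, 3), round(r_family, 3))
```

A coefficient close to 1 indicates that the two variables rise together, while a negative coefficient indicates that one tends to fall as the other rises. Neither case implies causation on its own.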
Overall, correlation was stronger between the levels of soy-related and socio-environmental variables rather than between variations or between levels and variations, meaning that there was a decoupling in the rapidity of change. Exceptions are represented by the changes in the areas where pesticides and limestone and other correctors of soil pH were applied at the level of municipality, which were strongly positively correlated not only with the initial incidence of soy but also with the change in the area used for soy (0.420 and 0.572). The largest absolute correlation coefficients were observed between the area used for soy and the share of area where pesticides were applied (0.595); between the former and the absolute area where pesticides were applied (0.589); and between soy productivity and the share of area where pesticides were applied (0.585).
In the following, we present the relationship between human development and soy expansion. The distributions of key variables measuring these phenomena among Brazilian municipalities are plotted separately for each year, in Figure 3 below. There is a clear upward trend in the average HDI at municipal level, from 0.381 in 1991, to 0.523 in 2000, to 0.659 in 2010. These values differ from those presented in Section 2 because they represent an average of the HDI in the single Brazilian municipalities in the years mentioned, while those in Section 2 refer to the HDI at national level. The same trend is observed for each of its dimensions, although with different nuances: longevity starts from the highest position and maintains it despite increasing the least; education starts from a very low value and increases significantly but remains the lowest. The upward trend is also observable in the soy variables, primarily productivity, although the skewness of the distributions (even after transformation) makes the box plots more difficult to interpret.
The jump in the total area of soy in the municipalities with lower values is particularly evident in 2010.
We first present the OLS models, whose estimates are provided in Table 3. The model for longevity has the strongest explanatory power (R² = 0.504), while the one for education performs relatively worse (R² = 0.293).
Compared to the North Federal region (baseline), the Northeast was the only region to show a slower increase in the HDI, as well as its longevity and income dimensions. In turn, the South performed relatively better, especially in terms of education, less for longevity. The coefficient for time showed that, compared to 1991-2000 (baseline), the improvement in human development in 2000-2010 was larger ceteris paribus, especially in the education dimension.
Concerning soy-related variables, a larger increase in the area of soy per capita in the municipality was related to a larger increase in the overall HDI and its education dimension. In turn, an increase in the total agricultural area used for growing soy resulted in a marginal decrease in the HDI and its education dimension. In both cases, the relationship with other HDI dimensions was non-significant. An increase in productivity was positively related to an increase in the HDI and all of its dimensions, particularly for education and less for longevity. Introducing interaction effects between soy variables and time allowed us to assess if there were changes in the size and direction of impact in the decade 2000-2010 compared to the previous one. The impact of the change in total soy area did not vary significantly depending on the decade. Instead, the impact of a change in the area per capita in 2000-2010 (calculated as the difference between the coefficients for this variable with and without interactions) was smaller compared to 1991-2000 for the education dimension, larger for the income dimension, and still non-significant for longevity. Notably, in 2000-2010, the net coefficient for productivity was negative for both the overall HDI and its education and longevity dimensions, meaning that an increase in productivity was related to a reduction in the HDI and its education and longevity dimensions but to an increase in income, although smaller than in the first decade.
An initially higher value of the municipal HDI resulted in smaller improvements in this indicator. The same dynamic could be observed for each of the three dimensions.
The initial values of the soy area per capita and of soy productivity (total area omitted for multicollinearity) had a positive and significant impact only on the income dimension of the HDI, while productivity had a negative and significant impact on longevity, meaning that this dimension tended to increase less in municipalities starting with higher soy productivity.
We included variables in the model to control for the structure of the population and of the economy of the municipality. The initial population size was negatively related to changes in human development and its education and income dimensions (the coefficient for longevity was non-significant); the share of rural population was positively related to changes in the overall HDI and its income dimension but negatively related to changes in longevity and especially education. Larger income inequality (measured by the Gini index) had a positive impact on education and longevity and negative on income, but its impact on the overall HDI was non-significant.
Finally, the relationship between the shares of employment and the change in the HDI was positive and significant for all the economic sectors with the exception of public employment (non-significant), and it was only marginally significant for mining. The primary sector was omitted for multicollinearity; thus, we can hypothesise a negative correlation between its size and human development. Interestingly, the share of employed people had the largest positive effect among economic variables, while the share of self-employed people had a marginally negative effect. This could be due to the presence of precarious forms of self-employment.
Notes: All coefficients are presented multiplied by 10³ to allow the readers to observe small differences between them. Number of observations in each model: 11,116. Significance levels: * 0.10, ** 0.05, *** 0.01.
As a final step, we discuss the estimates of the fixed-effects panel models, presented in Table 4. While OLS models illustrate the effects of variation between municipalities, fixed-effects models show the impact of a deviation from the average of a covariate within the same municipality over the period considered. This is the reason why the levels of the HDI and of soy-related variables were included, not their changes (otherwise, we would be measuring the effects of changes in the changes). The shares of employees by sector were omitted to increase the length of the panel to three periods and because municipal characteristics were implicitly considered by the fixed effects. Regional dummies were omitted as they are time-invariant. These models fit the data better than the OLS models, except the one for the income dimension. Differently from the OLS models, the best performance, indicated by the within R², was achieved by the model for the overall HDI, followed by the model for education.
The coefficients for time-fixed effects confirmed the increase in human development over time; the improvement was largest for the education dimension and smallest for income; ceteris paribus, the improvements in 2010 were about twice those in 2000 (compared to the 1991 baseline). Interaction effects between time and soy variables were also included. Other control variables were the population size, the share of rural population, and inequality. An increase in the size of municipal population resulted in a reduction in human development (in line with the OLS models), especially its income dimension, while there was no significant effect on education. Equally, an increase in the share of rural population had a clearly negative effect on the HDI and all its dimensions, primarily education. The OLS models showed instead that an initially larger share of rural population results in a larger increase in the HDI and its income component but did not say anything on the impact of a change in this share. By contrast, an increase in inequality resulted in increased human development, longevity, and especially income; however, the impact of inequality on education was negative.
Concerning soy variables, an increase in the area of soy per capita was related to an increase in the overall HDI, education, and to a lesser extent, longevity but to a decrease in income. This is in line with the above models. However, when interaction effects with time were considered, we observed that the impact on education and longevity of an increase in the soy area per capita became progressively smaller, especially in 2010 and for longevity. Instead, the effect on income turned from being negative to being positive in 2000 and increased further in 2010. Therefore, the marginal benefit of soy expansion was decreasing for education and longevity, in line with the previous literature [16,19], but increasing for income. The impact of changes in productivity was more complex and varied with time: initially positive and then negative for the overall HDI; initially negative and then largely positive for education; initially small but positive and then increasingly negative for longevity; and always positive but decreasing for income.

Discussion
By considering the 1991-2010 time period, we investigated the effects of soy expansion on the HDI of Brazilian municipalities, focusing on the single dimensions of this index and on the indirect impact of environmental change. Soy expansion has had a strong influence on the Brazilian economy; moreover, it has radically modified the socio-economic structure of rural areas and their environment. While its economic effects have been acknowledged in the literature, there is a lack of consensus on its social and environmental impact. We tried to address this gap by applying panel data econometric models to municipal-level data for understanding the evolution of the HDI while looking at its single dimensions and controlling for location-specific factors.
Our correlation analysis showed that there is a strong link between soy expansion and the use of agrochemicals and soil correctors. The only exception, i.e., the decrease in the share of farms applying soil correctors, is probably because the demand for such treatment is not present in all cultivation regions, being more intense in the early years of soybean cultivation in the Brazilian Cerrado [10]. In addition to environmental concerns related to the use of agrochemicals, the negative relationship between soy expansion and family farms suggests that this commodity can have negative social impacts. Indeed, family farms are, on average, smaller than corporate farms: according to the 2017 Agricultural Census, in Brazil the average size of family farms (identified based on Federal Law n. 11,326) is 20.8 hectares, compared to 69.2 hectares for all types of farms [27]. Moreover, their presence improves food security and allows a fairer distribution of agricultural income [52,53].
The relationship between soy expansion and the use of large amounts of agrochemicals, and the potential negative effect of the latter on human health, have been discussed at aggregate level by Arvor et al. [16]. In line with their arguments, we expected the increased use of agrochemicals and the reduction in the number of family farms to have a negative effect on the HDI, primarily on its longevity dimension.
Our models confirmed the positive relation between soy expansion and human development identified in previous studies [16,20]. This relationship was observed both in Brazil overall (OLS model), and within the same municipality (fixed-effect model). However, we found that education was less influenced by soy compared to the other dimensions of the HDI. Soy expansion led to an increase in education and longevity but with decreasing marginal benefits over time.
The difference in performance between the Northeast and the South of Brazil remains relevant, confirming that overcoming this regional gap requires time. The Northeast is characterised by less fertile soils and higher poverty and represents, along with the North, the latest frontier of soy expansion [29]; instead, the South presents the most developed agriculture and the longest tradition of soy growing [54].
As expected from the theory on economic convergence [47][48][49], an initially higher value of the municipal HDI results in smaller improvements in this indicator, suggesting that there is a "catch-up" effect in terms of human development, as already observed by Lima-de-Oliveira and Alonso [20], and that municipalities with initially higher HDI benefit relatively less from investments in soy expansion. Moreover, we found evidence of a progressive decoupling (in the 2000s compared to the 1990s) between the HDI and the soy variables, as also observed at a macro level by Arvor et al. [16].
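The catch-up argument can be summarised with a generic convergence regression. This is only a sketch of the standard textbook form, not the exact specification estimated in this study; the symbols $\alpha$, $\beta$, $\gamma$, and the variable $\mathrm{Soy}_i$ are assumptions introduced for illustration.

```latex
\Delta \mathrm{HDI}_{i} \;=\; \alpha \;+\; \beta\, \mathrm{HDI}_{i,1991} \;+\; \gamma\, \mathrm{Soy}_{i} \;+\; \varepsilon_{i},
\qquad \beta < 0
```

Here $\Delta \mathrm{HDI}_{i}$ is the change in the HDI of municipality $i$ over the period and $\mathrm{HDI}_{i,1991}$ is its initial level. A negative $\beta$ means that municipalities starting from a higher HDI improve less, which is the "catch-up" pattern described above.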
We identified a significant relationship between rurality, a municipality's population size, and its human development trajectory. Small and rural municipalities experienced a larger improvement in the HDI, driven by income. However, if the population, and the rural population in particular, grew faster than the economy, human development was negatively affected, especially its education dimension. The GDP per capita was omitted from the models because of multicollinearity, but the Gini index, closely linked to the latter, was found to be positively correlated with the income dimension of the HDI, confirming the convergence hypothesis and suggesting the validity at local level of the Kuznets curve in emerging countries such as Brazil [55,56]. Inequality is likely to increase during phases of economic development [55], and education and longevity are likely to expand in parallel.
Previous studies [16,20] reached contrasting findings about the impact of soy expansion on inequality; unfortunately, our model does not allow us to test this specific relationship.
We found that increases in soy productivity had a more complex impact on the municipal HDI than other soy variables. Notably, the marginal benefits for human development of productivity improvements were decreasing, and even became negative in the 2000s. This was probably due to the intensive use of agrochemicals to boost productivity, with negative spill-over effects on the environment, and to the increasing concentration of farm income at local level. Our analysis showed that higher productivity is likely to be achieved through larger use of pesticides, limestone, and other correctors of soil pH, which could negatively impact health and, thus, longevity. Such dynamics suggest that, because the marginal benefits of productivity increases are decreasing and even negative, soy expansion benefits only income in the longer term.
In short, our analysis confirms that the marginal benefits of soy expansion for local human development are decreasing, except for the income dimension of the HDI. The impact on longevity of increasing productivity has already been negative since 2000, probably due to the harmful health effects of intensification [19]. The publication of the results of the 2022 Census will allow us to check if this trend has continued and if the marginal benefits of soy expansion have reduced further or have even become negative.

Conclusions
Our findings corroborate the idea that soy expansion is a multifaceted phenomenon that could have a relevant impact on the economy of a country and the livelihoods of rural households. Both the literature and this work indicate that intensification of production can be detrimental to soil quality, increasing the need for agrochemicals and creating additional environmental problems. There is a need for better planning of soy expansion that accounts for the quality of soil to avoid depletion and dramatic drops in yields (and abandonment of unproductive land) in future decades. Furthermore, soy expansion generates land concentration through a reduction in the number of small family farms and an increase in the number of large farm corporations focused on exporting. The negative effects on some dimensions of the HDI, as well as our preliminary analysis based on the data of the Agricultural Census, suggest that food security is threatened too.
Limitations of this work include the short length of the time series. Further research could build on these results and expand the timeframe of the analysis when more recent HDI values are published. Two-stage models could also be estimated, assessing the effects of soy expansion on a number of environmental variables at the local level (not only the use of agrochemicals and soil correctors but also wildfires and deforestation, among others) and the effect of these environmental variables on human development, in turn. Finally, spill-over effects between municipalities could be included.

Conflicts of Interest:
The authors declare that they have no known competing financial interests or personal relationships that could have influenced the work reported in this paper.