Modeling Yield of Irrigated and Rainfed Bean in Central and Southern Sinaloa State, Mexico, Based on Essential Climate Variables

: The goal was to model irrigated (IBY) and rainfed (RBY) bean yields in central (Culiacán) and southern (Rosario) Sinaloa state as a function of the essential climate variables soil moisture, temperature, reference evapotranspiration, and precipitation. For Sinaloa, for the period 1982–2013 (October–March), the following were calculated: (a) temperatures, (b) average degree days for the bean, (c) cumulative reference evapotranspiration, and (d) cumulative effective precipitation. For essential climate variables, (e) daily soil moisture obtained from the European Space Agency and (f) IBY and RBY from the Agrifood and Fisheries Information Service were used. Multiple linear re-gressions were significant for predicting IBY–RBY (dependent variables) as a function of essential climate variables (independent variables). The four models obtained were significantly predictive: IBY–Culiacán (Pearson correlation (PC) = 0.590 > Pearson critical correlation (CPC) = |0.349|), RBY– Culiacán (PC = 0.734 > CPC = |0.349|), IBY–Rosario (PC = 0.621 > CPC = |0.355|), and RBY–Rosario (PC = 0.532 > CPC = |0.349|). Due to the lack of irrigation depth data, many studies only focus on modeling RBY; this study is the first in Sinaloa to predict IBY and RBY based on essential climate variables, contributing to the production of sustainable food.


Introduction
By 2050, the world population will have increased to approximately 9725 million inhabitants [1], so food production will also have to increase by about 70% [2]; cited by [3].Due to these projections, the United Nations General Assembly [4] adopted the 2030 sustainable development agenda [5].This agenda proposes the topic "food production" with five priority goals: (1) ending poverty, (2) responsible production and consumption, (3) decent work and economic growth, (4) sustainable cities and communities, and (5) zero hunger.Specifically for this last goal, beans are a crop that can contribute to reducing world hunger, lowering the total of 2795 million people who will suffer from hunger by the year 2050 [4].According to [6][7][8][9], to achieve these five goals, considering the effects of climate change, the prediction of agricultural crop yield through multiple linear regressions (MLR) and essential climate variables is an efficient tool.There are 55 essential climate variables [10], of which certain variables stand out for being easy to obtain and important in agriculture: (1) average surface soil moisture (ASM), (2) cumulative effective precipitation (CEP), and air temperature [11,12].According to [13,14], from the air temperature it is possible to calculate (3) average maximum temperature (AMT), (4) maximum maximorum temperature (MMT), (5) average minimum temperature (AmT), (6) minimum minimorum temperature (mmT), (7) average mean temperature (AMeT), (8) maximorum mean temperature (MMeT), (9) degree days, and (10) cumulative reference evapotranspiration (CET), these nine essential climate variables being the main ones that affect crop yields.According to [15], the crops most sensitive to extreme essential climate variables conditions in Latin America are corn, wheat, and bean.In Mexico, both irrigated (IBY) and rainfed (RBY) beans are crops that require special conditions of ASM and air temperature [11]; therefore, strategies must be developed to guarantee a source of protein, especially for people in extreme poverty [16].Most bean yield modeling studies focus on only RBY; for example, [3,17].The scarcity of IBY modeling studies is due fundamentally to the complications of obtaining daily data on irrigation depths and exact precipitation that the crop uses for proper growth [3,18].The incomplete IBY-RBY and climate series [8,18], which usually do not exceed the statistical threshold for modeling [7], represent another limitation for the creation of sensitive models to predict IBY-RBY.Specifically in Sinaloa, the sulfur bean is the most cultivated bean [19], occupying a large proportion of the annual demand (120,000 tons), obtained with a harvest of 74,000 hectares (average yield of 1.66 t ha −1 ) [20].Although this autumn-winter bean is the most widely cultivated in Sinaloa, there are still no predictive models of its IBY-RBY.This condition exacerbates the vulnerability of this crop to extreme essential climate variables events, such as frost [21], hot extremes [22], and CEP irregularity, which are common meteorological phenomena in this state [11,22].
In this study, the following data were collected for two meteorological stations in Sinaloa (Culiacán-central and Rosario-south) and for the period 1982-2013, for the autumn-winter cycle (October-March): (a) For essential climate variables, using the National Meteorological Service (SMN)-National Water Commission (CONAGUA) database [23], daily series of precipitation and maximum-minimum temperatures were obtained.To obtain reliable, long-term, good quality results [24,25], the SMN-CONAGUA daily series were homogenized using the standard normal homogeneity test method [26].With the homogenized series, the mean daily temperature (meanT) was determined.The annual series of AMT, MMT, AmT, mmT, AMeT, MMeT, average bean degree days (ABDD) [14], CET, and CEP were then calculated.(b) From the European Space Agency (ESA) experimental break-adjusted COMBINED Product (version 07.1) [27] with a spatial resolution of 0.25° × 0.25°, daily soil moisture was obtained.These data were obtained for two points near the Culiacán and Rosario stations, respectively.ASM was calculated for each year.(c) From the Agrifood and Fisheries Information Service (SIAP) [28], the annual series of IBY and RBY were obtained.
Standardized Z normalization was applied to the annual series of all variables.Pearson (PC) or Spearman (SC) correlations were calculated, depending on the normality of the series.To reveal which correlations were significantly different from zero [29], a hypothesis test was applied between PC and SC, on the one hand, and. the Pearson (CPC) and Spearman critical correlations (CSC) on the other [7].Using MLR, four predictive models of IBY-RBY (dependent variables) were obtained as a function of essential climate variables (independent variables).The four models were significantly predictive (PC > CPC).All MLRs met the assumptions of no autocorrelation [30], linearity and severe nonmulticollinearity [31][32][33], homogeneity, and normality.
The scope of this study focuses not only on providing an explanation of the interactive process between IBY-RBY and essential climate variables (correlation) but also on generating equations of the models, which provide an easy diagnostic tool (prediction) for IBY-RBY.This methodology can be used for other crops from different parts of the world.
Specifically in this study, the goal was to model IBY-RBY as a function of essential climate variables in central and southern Sinaloa.This goal has not been addressed by any previous study in Sinaloa, which is why it may stimulate other investigations to expand on filling this research gap.
This study provides models with predictive sensitivity for IBY-RBY [34] based on extreme essential climate variable events [35][36][37][38].The results contribute to the prevention of negative effects due to possible decreases in IBY-RBY, helping in the production of sustainable foods [5] in a purely agricultural state, considered to this day as "the breadbasket of Mexico" [20].

Study Area
The state of Sinaloa is located in the northwest of Mexico (Figure 1) and has a predominantly semi-arid aridity index [7].This state, called "the breadbasket of Mexico", has historically had high rates of agricultural production.Specifically, bean cultivation occupies a significant position, mainly due to the large planted area and economic income generated.For example, in 2013, beans were harvested from 69,727 hectares of irrigated land and 4723 hectares of rainfed land [20].

Quality Control and Homogenization of Meteorological Series
From SMN-CONAGUA "https://smn.conagua.gob.mx/es/climatologia/informacionclimatologica/informacion-estadistica-climatologica(accessed on 15 September 2023)" [23], daily series of maximum and minimum temperature and precipitation were obtained from 70 meteorological stations in Sinaloa.Initially, it was decided to include the stations that presented < 5% daily missing data.This condition was met by only five stations: Culiacán, Santa Cruz de Alaya, Las Tortugas, Rosario, and La Concha.The data from these five stations were previously obtained by [7].Because it has been stated [25] that to detect long-term climate change reliably via good quality research, one must work with homogenized series, in this study, the meteorological series were homogenized with the Climatol library "https://climatol.eu(accessed on 12 January 2024)" [24,39].This software is based on the standard normal homogeneity test method [26].This library also calculated the missing data, using Expression (1): where y ̂ is the estimated meteorological data, through the corresponding nearby data points x 1 , x 2 , … , x n , available at each time step j and with weight w j assigned to each of them.Statistically, the expression y ̂= x j is a linear regression, with a reduced major axis (orthogonal regression), in which the line is fit to minimize the distances of the points perpendicular to it [24].By means of the semi-sum of the homogenized maximumT and minimumT, meanT was obtained.In accordance with what was stated by [13,40,41], the monthly averages of AMT, MMT, AmT, mmT, AmeT, and MMeT were calculated.

Average Bean Degree Days (ABDD), Cumulative Reference Evapotranspiration (CET), and Cumulative Effective Precipitation (CEP)
ABDD was calculated with a daily scale and monthly average (Expression ( 2)) [14]: where ABDD = average bean degree days (°C day −1 ), x i = mean daily temperature, and T b = base temperature = 8.3 °C (for Mexican bean varieties) [14].In this study, the base temperature of the bean was considered to be 8.3 °C, because the models were for Mexican bean varieties only.
Due to the absence of data to calculate bean evapotranspiration, in this study, it was decided to obtain monthly CET using the Hargreaves method [42], expressed by Equation (3).Specifically, this method was applied due to the absence of crop coefficient for bean, relative humidity, and wind speed data (only maximumT and minimumT data were available).Once the meanT, maximumT, and Ra data were obtained, Formula (3) was applied: where CET = cumulative reference evapotranspiration (mm month −1 ), maximumT = maximum temperature (°C month −1 ), minimumT = minimum temperature (°C month −1 ), meanT = mean temperature (°C month −1 ), and R a = solar radiation (mm day −1 ).As recommended by [43], the values of Ra obtained initially (MJ m −2 day −1 ) were multiplied by 0.408 to obtain the units (mm day −1 ).CET in mm month −1 was then calculated.Although the CET value was from the reference crop (for all crops), it represents a good approximation of the effect caused on bean yield, because bean evapotranspiration is a function of CET [44].
Finally, for the monthly determination of CEP, it was decided to apply Expressions (4) and ( 5) [44], since according to [45], when the study area is mostly flat, as in Culiacán and Rosario (all crop fields with slope < 5%), this method is ideal.
CEP values for the periods Oct-Dec and Jan-Mar for the period 1982-2013 were used.
In this study, the sowing-harvest cycle ranged from 1 October to 31 March.Only in the period 2003-2013 was the recorded information divided by municipalities; that is, for the period 1982-2002, IBY-RBY was only recorded at the state level.
The data collected for essential climate variables and IBY-RBY were for the period October-March 1982-2013.
Due to the availability of complete series of essential climate variables and IBY-RBY, in this study, it was decided to work with only two municipalities: Culiacán and Rosario.Culiacán and Rosario were the only two weather stations that presented 100% data availability for IBY-RBY, and the corresponding points where ASM was measured (red squares in Figure 1) were the closest points to the Culiacán and Rosario stations.

Normalization
Standardized Z normalization [51] was applied to all series (Equation ( 6)): where x = value of the variable of the treated series, μ = arithmetic mean of the series, and σ = standard deviation of the series.

Shapiro-Wilk Method
Four normality tests were applied to all series (PC and SC and residuals).The first test was the Shapiro--Wilk test [52], which was calculated using Expression (7): where Z i:n < ⋯ < Z n:n is the ordered sample of the standardized data a i with n constants.

Anderson-Darling Method
The second normality test was the Anderson-Darling test [53], which is defined by Expressions ( 8) and ( 9): where n is the number of observations, and S is defined by Expression ( 9): where F(Y) is the normal cumulative probability distribution function, with mean and variance specified from the sample, and Y i is the data obtained in the sample, ordered from highest to lowest.

Lilliefors Method
The Lilliefors test [54] was also applied, where the average and variance of each series were first estimated (Expression (10)): where F n (x) is the sampling distribution function, F θ ̂(x) is the theoretical distribution function, and x ∈ ℝ.

Jarque-Bera Method
The Jarque-Bera test [55] (Expression ( 11)) was applied, in which the skewness and kurtosis were assessed (Expressions ( 12) and ( 13)): where n is the sample size, and S and K are respectively the skewness and kurtosis coefficients.

Correlations Pearson (PC)
When the series presented normality, the PC was calculated [56], based on Expression (12): where PC = Pearson correlation coefficient, cov = covariance of x and y, and σ x and σ y = standard deviations of x and y.

Spearman (SC)
SC was used when the series did not follow a normal distribution [56] (Expression (13)): where SC is the Spearman correlation coefficient, η is the number of data points in the series, and d i is the rank difference of element η.

H o ∶ PC ≥ |CPC| and SC ≥ |CSC|
∴ PC and SC  0 (null hypothesis), Due to the lack of irrigation depth data for IBY, many studies, for example [3,18], focus only on modeling RBY; however, in this study, both type of bean yields (IBY and RBY) were modeled.
The estimated response was characterized with sampling regression using Equation ( 17): where each regression coefficient β i was estimated with b i from the sample data, using the least squares method.The least squares estimators of the parameters β 0 , β 1 , … , β k , were obtained by fitting the MLR model (Equation ( 16)) to the data of Expression (18): {(x 1i , x 2i , … , x ki , y i ); i = 1, 2, … , n, and n > k}, where y i is the observed response to the values x 1i , x 2i , … , x ki of the k independent variables x 1 , x 2 , … , x k .Each observation (x 1i , x 2i , … , x ki , y i ) satisfied Equation (19): or each observation (x 1i , x 2i , … , x ki , y i ) satisfied Equation ( 14): where ϵ i and e i are the random error and the residual, respectively, associated with response y i and with the fitted value y ̂i [57].

Validation of Mathematical Models
In the four MLRs, the following was carried out: (1) In the residuals, autocorrelation contrasts [58,59] were applied.
(2) Goodness-of-fit statistics were calculated: R 2 , PC, mean error (ME), root mean square error (RMSE), mean error absolute (MEA), percentage of error mean (PEM), percentage of error absolute mean (PEAM), and Theil's U2 statistic (U2).To comply with the linearity hypothesis, in each MLR, the condition PC ≥ CCP ∴ CP ≠ 0 was met [7].(3) For the analysis of severe non-multicollinearity, the variance inflation factor (VIF) and tolerance (To) were initially calculated.For severe non-multicollinearity, it was verified that R 2 ≤ 0.800, VIF ≤ 5.000 ∴ To > 0.200 [60], as cited by [61,62].In the models, the variables that presented severe multicollinearity were eliminated, and each MLR was subsequently recalculated.(4) For the homogeneity, it was verified that the average of each residual series was zero [63].(5) A normality analysis was applied to the residuals of each MLR.Normality methods were the same as for PC and SC.
Only one case with severe multicollinearity was recorded (AMeT vs. ABDD; SC = 0.999, R 2 = 0.999).According to [7], the correlations in bold (Table 1) can be identified as significantly different from zero, since for n = 32 (period 1982-2013), CSC = |0.350|;that is, SC > CSC.In Table 1, the upper triangular portions are the two-tailed probabilities of non-correlation, and the lower triangular portions are the PC and SC correlations.The PC coefficient that exceeded CPC (AmT vs. MmeT) was the only correlation that presented linearity.
Table 1.Pearson (PC) and Spearman (SC) correlation coefficients between irrigated (IBY) and rainfed (RBY) bean yields, and essential climate variables in Sinaloa.Coefficients significantly different from zero (significant correlations) Coefficients with severe multicollinearity

Discussion
The results of CEP and CET (Figure 2a) agree with [64], who noted that in northern Mexico for the period 1961-2000, the lowest CEP (from 100 to 900 mm yr −1 ) was recorded in the 1990s.The results of IBY-RBY can attributed to the high CEP associated with Hurricane Paul (category 2, occurred in 1982), which elevated RBY-Culiacán and RBY-Rosario [65].From the above, it can be established that CEP anomalies are determinants for IBY-RBY in Sinaloa.Also, in the periods 1985-1987 and 1996-2003, extreme droughts were recorded in northern Mexico [66,67], which significantly altered IBY-Culiacán.Furthermore, the results of this study are similar to those reported by [68], since, as those authors pointed out, IBY-RBY in Sinaloa for the autumn-winter cycle of 2021-2022 was 1.93 t ha −1 , being fourth place nationally in sown area (85,657 hectares).The results of ASM can be attributed to, for example, the fact that in 1986 (July-September), minimal anomalies occurred: negative anomalies from the Atlantic multidecadal oscillation and positive anomalies from the Pacific decadal oscillation, which were generators of La Niña events [69].Also, the period 1982-1984 recorded ASM < 0.05 m 3 m −3 , which agrees with [70], who for same period found anomalies with +phase Atlantic multidecadal oscillation and -Pacific decadal oscillation, which are generators of La Niña events (absence of humidity).Furthermore, according to data from [71] cited by [67], 2003-2005 was the period that presented the most extreme wet events in northwest Mexico.The results for temperature coincide with those reported by [21], who pointed out that in Culiacán, maximumT increased from 31.600 °C to 34.600 °C in the period 1982-2008.Figure 2a,b agree with what was stated by [21], who pointed out that 2011 was the most catastrophic year in Sinaloa, due to the lowest historical value of minimumT recorded in the last fifty years.Those values of minimumT in the autumn-winter cycle caused economic losses of 2353 million dollars, corresponding to a 95% loss of vegetables and annual crops [72]; cited by [73].
According to [27] and [74], the correlations in bold (Table 1) can be defined as significantly different from zero, since for n = 32 (1982-2013), CSC = |0.350|;that is, SC > CSC.The significant SC can be attributed to the fact that beans are one of the most sensitive crops to maximumT, minimumT, meant, and wet and dry events (ASM and CEP) [14,20,75].
The simultaneous repetition of essential climate variables (CEP, MMT, and AMT) can be attributed to the fact that in IBY-RBY, CEP is essential, generating good availability of ASM [76].Also, it has been noted [77] that the growth of crops in Sinaloa depends largely on the absence of high maximumT because, for example, this could damage the cells and tissues of the bean [14].
According to [63], all four models meet the assumption of homogeneity (average residuals = 0).

Conclusions
The lower magnitudes of ASM were caused in large part by the seasonal occurrence (July-September) of La Niña events (shortage of CEP), minimum negative anomalies of the Atlantic multidecadal oscillation, and minimum positive anomalies of the Pacific decadal oscillation.These results are different from those reported by [66], who stated that meteorological droughts in northern Mexico were stimulated by anomalies with +phase Atlantic multidecadal oscillation and -Pacific decadal oscillation.
The highest IBY values were associated with tropical cyclones that have made landfall in Sinaloa, mainly due to their high humidity contributions.In general, essential climate variables had a greater correlation with IBY than with RBY.IBY and RBY in Culiacán and Rosario are more sensitive to extreme values of the maximumT indicators than to minimumT, meanT, ASM, CET, and CEP indicators.AMT in Culiacán and Rosario (2009-2013) has not stopped increasing, which suggests that IBY and RBY are highly influenced by the occurrence of intense meteorological droughts.This differs from what was reported by [80], who stated that RBY was more influenced by droughts in the spring-summer period than in the autumn-winter period.
The only event with severe multicollinearity was recorded in Culiacán (AMeT vs. ABDD).
The four models met the hypotheses for MLR: no autocorrelation, linearity and severe non-multicollinearity, homogeneity, and normality.RBY-Culiacán was the model with the best fit.IBY-Rosario was the only model for which a delay was applied to its series.
This provides sensitive tools to prevent damage from a decrease in IBY-RBY, making use of essential climate variables, which are relatively easy to obtain in Sinaloa, a state that ranks fourth nationally in area planted with beans.
Extreme temperature and humidity events are substantial for IBY-RBY, which establishes the need to generate adaptation plans in response to droughts (meteorological, agricultural, and hydrological), which can guarantee the balance between environmental and food sustainability.For future research, it is recommended to use the values for bean evapotranspiration instead of the reference crop evapotranspiration.
This study is a pioneer in Sinaloa in the prediction of IBY-RBY using essential climate variables, and this methodology can be applied to any agricultural region in the world.This study has applicability for the agrometeorological sector, in which more research should be carried out, to reduce the research gap for this very important crop, not only for Sinaloa but for Mexico and the world.

Figure 1 .
Figure 1.Study area.Location of the meteorological stations in the center (Culiacán) and south (Rosario) of the state of Sinaloa, Mexico.The spatial resolution is approximately 0.25° × 0.25° for the two weather stations (Culiacán and Rosario), and 0.25° × 0.25° for the ASM measurement.

Table 3 .
p-values of the normality tests of the residuals of the models.