An Artificial Intelligence Approach to Prediction of Corn Yields under Extreme Weather Conditions Using Satellite and Meteorological Data

Abstract: This paper describes the development of an optimized corn yield prediction model under extreme weather conditions for the Midwestern United States (US). We tested six different artificial intelligence (AI) models using satellite images and meteorological data for the dominant growth period. To examine the effects of extreme weather events, we defined drought and heatwave by considering the characteristics of corn growth and selected the cases for sensitivity tests from a historical database. In particular, we optimized the hyperparameters of the deep neural network (DNN) model to ensure the best configuration for accuracy improvement. The result for drought cases showed that our DNN model was approximately 51%–98% more accurate than the other five AI models in terms of root mean square error (RMSE). For the heatwave cases, our DNN model showed approximately 30%–77% better accuracy in terms of RMSE. The correlation coefficient was 0.954 for drought cases and 0.887–0.914 for heatwave cases. Moreover, the accuracy of our DNN model was very stable despite increases in the duration of the heatwave. This indicates that the optimized DNN model can provide robust predictions for corn yield under conditions of extreme weather and can be extended to other prediction models for various crops in future work.


Introduction
Since the 18th century, greenhouse gas emissions have been increasing as a result of industrialization and other human activity, leading to global warming and climate change. According to the Intergovernmental Panel on Climate Change (IPCC) Fifth Assessment Report (AR5), global average temperatures would rise by approximately 3.7 °C around the end of the 21st century because of global warming. Also, the frequency of high temperatures will increase further [1].
Global climate change causes a variety of extreme weather conditions, such as drought, heat and cold waves, and heavy rain, which have broad impacts on ecosystems and human society. Extreme weather phenomena have become increasingly frequent; however, they are difficult to predict because the patterns of climate change vary on a global scale. Moreover, climate change is very closely related to water supply and food security. In 2010, more than 20% of Russian agricultural areas were affected by drought and heatwave; in response, the Russian government issued a wheat embargo in 2010–2011, leading to increases of up to 50% in the international wheat price [2,3]. Thus, extreme weather can trigger spikes in crop prices, leading to instability in the global food supply [4]. In the season when grain filling occurs, drought and heatwaves can have acute effects on crop growth and production [5,6]. Therefore, developing a reliable crop yield model is essential for deciding agricultural policy and ensuring food security.
Lobell et al. [7] reported that global corn and soybean yields could drop by approximately 5.5% and 3.8%, respectively, because of climate change. Teixeira et al. [5] showed that substantial crop yield losses due to extreme temperatures can seriously affect the global food supply. Lesk et al. [4] showed that drought and heatwave reduced national cereal production by 9%–10%; the effects of drought lasted longer than those of heatwave, reducing both crop yield and harvested area. Qian et al. [8] analyzed climate change impacts on the yields of spring wheat, canola, and maize in Canada using multiple crop models under four different scenarios for global warming levels: 1.5 °C, 2.0 °C, 2.5 °C, and 3.0 °C. The results show that the yield changes would vary by crop type: small increases for canola and spring wheat and some decrease for maize would be expected under the warming scenarios. These previous studies demonstrated the impacts of extreme weather on actual crop yields, but they did not sufficiently assess the prediction capability of the models under extreme weather conditions. Although challenging, crop yield prediction can provide useful information for adapting to changing Earth environments. A robust model for predicting crop yields under extreme weather conditions is needed for future agricultural management.
As efficient ways to predict crop yield, artificial intelligence (AI) models such as random forest (RF), support vector machine (SVM), and deep neural network (DNN) have been examined in many studies. Jeong et al. [9] used climate and soil data for the prediction of wheat and corn yields and showed that the accuracy of the AI model was better than that of classic linear equations. Kuwata and Shibasaki [10,11] used vegetation indices and climate data for corn yield prediction in the United States (US); they mentioned that the result from a DNN model was best among multiple AI models. Kim et al. [12] examined six AI models for predicting crop yield in the Midwestern US, which showed that an optimized DNN model outperformed the other five models. However, the performance of AI models for crop yield prediction may depend on the data used in the experiments. Particularly under extreme weather conditions, the dataset can have somewhat different characteristics from the usual pattern. So, the crop yield predictability of various AI models under unexpected weather conditions should be examined objectively, but previous AI studies did not focus on a comparative evaluation of AI models concerning rapid climate change.
In this study, we conducted comprehensive performance tests for multiple AI models to predict corn yield under extreme weather conditions in the Midwestern US. The six AI models were multivariate adaptive regression spline (MARS), SVM, RF, extremely randomized tree (ERT), artificial neural network (ANN), and DNN. We configured each model through the optimization of hyperparameters and compared the validation statistics for drought and heatwave cases. For this, we prepared the input data, including satellite-based vegetation indices and meteorological data for July and August (the dominant growth period). Then, we constructed a match-up database using high-resolution maps for crop types and selected cases of extreme weather conditions such as drought and heatwave. Two types of blind tests, leave-one-year-out and 10-fold cross-validation, were conducted to evaluate the accuracy of these models under conditions of drought and heatwave. This study concentrated on corn yields in the Midwestern US in 2006–2015.

Study Area
Corn, which is among the world's top three grain crops, has substantial importance for the global economy due to its various uses. The US is the largest grain producer and exporter, supplying approximately 40% of global annual corn production [13,14]. Hence, we focused on the five US states (Illinois with 102 counties, Iowa with 99 counties, Minnesota with 87 counties, North Dakota with 53 counties, and South Dakota with 66 counties) that are collectively known as the Corn Belt (Figure 1).

Data
The National Aeronautics and Space Administration (NASA) provides Moderate Resolution Imaging Spectroradiometer (MODIS) data for Earth environmental monitoring. We used multiple vegetation products from the MODIS database as indicators of vegetation vitality, biomass, and photosynthesis [15]. The enhanced vegetation index (EVI) can mitigate the signal saturation issue of the normalized difference vegetation index (NDVI). The EVI is widely used for cropland studies because it is more appropriate for the expression of vegetation vitality in highly vegetated areas [16,17]. The leaf area index (LAI), which is calculated using a crop-specific growth coefficient and maximum primary production, can be used as an alternative for leaf biomass [18]. The gross primary production (GPP) represents the chemical energy (kg C m⁻²) for photosynthetic activity [19]. We used the EVI products with 16-day temporal and 250 m spatial resolutions and the LAI and GPP products with 8-day temporal and 500 m spatial resolutions.
The Parameter-elevation Regressions on Independent Slopes Model (PRISM) Climate Group provides daily and monthly reanalysis data for seven climate elements for the US [20]. We used monthly data for precipitation (PPT) and maximum temperature (TMAX) at a spatial resolution of 4 km, considering extreme weather conditions such as drought and heatwave. From the Global Land Data Assimilation System (GLDAS) [21,22], we used monthly soil moisture (SM) data [23] [24]. The cropland data layer (CDL) maps are updated yearly using the data from the cropland census, Landsat satellite images, and Advanced Wide Field Sensor (AWiFS) [25,26]. In this study, the pixels recorded as the corn-sown area were extracted from the CDL maps. The National Agricultural Statistics Service (NASS) operated by the US Department of Agriculture (USDA) provides annual corn yield statistics at the county level [27]. For convenience, we converted the yield data from bushels per acre to tons per hectare.
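The unit conversion mentioned above can be sketched as follows. This is a minimal illustration rather than the authors' code: it assumes the US standard weight of 56 lb per bushel of shelled corn and metric tons, neither of which is stated in the text, and the function name is hypothetical.

```python
# Assumed constants (not given in the paper).
LB_PER_BUSHEL_CORN = 56.0      # US standard weight for shelled corn
KG_PER_LB = 0.45359237         # exact definition of the pound
HA_PER_ACRE = 0.40468564224    # exact definition of the acre


def bushels_per_acre_to_tons_per_hectare(yield_bu_ac: float) -> float:
    """Convert a corn yield from bushel/acre to metric ton/hectare."""
    kg_per_acre = yield_bu_ac * LB_PER_BUSHEL_CORN * KG_PER_LB
    tons_per_acre = kg_per_acre / 1000.0
    return tons_per_acre / HA_PER_ACRE


# Illustrative value only: 150 bu/ac is not a figure from the paper.
print(round(bushels_per_acre_to_tons_per_hectare(150.0), 3))
```

Under these assumptions, one bushel per acre corresponds to roughly 0.0628 ton/ha.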
All the data for the period of July and August 2006–2015 are summarized in Table 1. We used cropland data layer (CDL), enhanced vegetation index (EVI), leaf area index (LAI), gross primary production (GPP), precipitation (PPT), maximum temperature (TMAX), and soil moisture (SM) for corn yield prediction. We constructed a spatiotemporal match-up database for corn-sown areas of the Midwestern US to combine the input variables and the yield statistics according to year and county (Figure 2). The temporal aggregation for July–August was conducted for EVI, LAI, and GPP in terms of the day-of-year (DOY), and for PPT, TMAX, and SM in terms of the month (Figure 3a).
For spatial aggregation, we first extracted the corn-sown area from the CDL maps at a 56 m resolution for 2006-2009 and at a 30 m resolution for 2010-2015 on the World Geodetic System 1984 (WGS84) coordinate reference system. Then, we overlaid the EVI at 250 m resolution, LAI and GPP at 500 m resolution, PPT and TMAX at 4 km resolution, and SM at a 25 km resolution on the CDL maps. We derived the intersections between the CDL layer and each of the variable layers, which produced the data layers at 56 m or 30 m resolution according to year. The data layers were aggregated by the boundary of the counties in the study area using a zonal mean operation (Figure 3b).
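The zonal mean step can be illustrated with a small sketch. It assumes that every input layer has already been resampled to the CDL grid and flattened into parallel sequences, one entry per cell; the function name and the data layout are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def zonal_mean(values, county_ids, corn_mask):
    """Average `values` per county, using only corn-sown cells.

    values     : per-cell variable (e.g., EVI); None marks missing data
    county_ids : per-cell county identifier
    corn_mask  : per-cell flag, True where the CDL reports corn
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value, county, is_corn in zip(values, county_ids, corn_mask):
        if is_corn and value is not None:
            sums[county] += value
            counts[county] += 1
    # Counties without any valid corn cell are simply absent from the result.
    return {county: sums[county] / counts[county] for county in sums}


# Toy example with made-up numbers.
evi = [0.5, 0.7, 0.9, 0.3]
county = ["A", "A", "A", "B"]
corn = [True, True, False, True]
print(zonal_mean(evi, county, corn))
```

Masking before averaging keeps non-corn pixels inside a county from diluting the county-level value.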

Definition of Extreme Weather Events
Drought is a natural disaster that causes water shortage and damages agriculture, leading to adverse economic and societal effects [28–30]. It also severely affects crop growth and causes considerable reductions in yield. For example, the 2012 US drought influenced at least 60% of the farms and substantially decreased the yields of corn and soybean nationwide [31].
We referred to the National Oceanic and Atmospheric Administration (NOAA) and the National Center for Atmospheric Research (NCAR) for the criteria of drought conditions. The Vegetation Health Index (VHI) [32] provided by NOAA and the Palmer Drought Severity Index (PDSI) [33] provided by NCAR are the most commonly used drought indices. A VHI value of 0 represents the worst condition, and a VHI value of 100 indicates the best condition in terms of vegetation health. The PDSI typically ranges from −4 to +4, with negative values for a dry state and positive values for a wet state. Table 2 shows that the study area suffered a severe drought in July and August (the dominant growth period for corn) in 2012. The mean VHI value for the study area in 2012 was lower than that of the other nine years, and it corresponded to a drought case according to the criteria of NOAA (VHI < 35). The mean PDSI value also satisfied the drought condition of NCAR (PDSI < −2). Hence, we considered the year 2012 as a drought case using both indicators from NOAA and NCAR. Corn yields, PPT, minimum temperature (TMIN), TMAX, mean temperature (TMEAN), VHI, and PDSI in July–August for the period between 2006 and 2015 are summarized in Table 2. Counties with a smaller corn yield accompanied by lower precipitation and higher temperature were observed much more frequently in 2012 than in the other years.
High air temperature also significantly affects crop yield, as grains require certain temperatures during the growing season for healthy development. If high temperatures continue for long periods, the crop harvest may be reduced considerably because of weak growth and fruition. Lee et al. [34] used crop growth models with climate data to examine the effects of high-temperature stress on crops; they found that more than 20% of the cultivated area of North America can experience extremely high-temperature stress due to climate change. Climate change would increase thermal stress and damage to crop yields, ultimately affecting the global food supply [5].
For the criteria of heatwave conditions from the viewpoint of agriculture, we referred to two frequently cited studies by Challinor et al. [35] and Teixeira et al. [5]. Challinor et al. [35] suggested the Heat Stress Intensity (HSI) index, which can express the heat stress during the reproductive phase and the resulting yield damage. The HSI index has four categories: low (< 0.05), medium (< 0.15), high (< 0.30), and very high (> 0.30). From our match-up database, only four out of the 3585 records (approximately 0.1%) corresponded to the high or very high category. Because these were too few cases to work with, we instead adopted the critical temperature suggested by Teixeira et al. [5]. Different crops have different thermal stress tolerance levels, and a heatwave limit temperature of 35 °C has been reported for corn and soybeans [5]. In this study, we defined a heatwave as a period when temperatures exceed 35 °C for at least three consecutive days during July and August. We conducted sensitivity analyses for the corn yield predictions according to the count of successive heatwave days.
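The two event definitions used in this section (VHI < 35 together with PDSI < −2 for drought, and at least three consecutive days above 35 °C for heatwave) can be written as simple predicates. The sketch below is illustrative only: it assumes a daily maximum temperature series per county is available, and all function names are hypothetical.

```python
def is_drought(mean_vhi, mean_pdsi):
    """Drought flag requiring both thresholds quoted in the text."""
    return mean_vhi < 35 and mean_pdsi < -2


def longest_hot_spell(daily_tmax_c, threshold_c=35.0):
    """Length of the longest run of consecutive days above the threshold."""
    longest = current = 0
    for tmax in daily_tmax_c:
        current = current + 1 if tmax > threshold_c else 0
        longest = max(longest, current)
    return longest


def is_heatwave(daily_tmax_c, min_days=3, threshold_c=35.0):
    """Heatwave flag: at least `min_days` consecutive days above 35 degC."""
    return longest_hot_spell(daily_tmax_c, threshold_c) >= min_days


# Made-up July sample (degC), not data from the paper.
sample = [33.0, 36.0, 36.5, 37.0, 34.0, 36.0]
print(longest_hot_spell(sample), is_heatwave(sample))
```

Raising `min_days` from 3 up to 15 gives the kind of duration-based case selection used for the heatwave sensitivity analysis.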

Artificial Intelligence Models
In this paper, six major AI models were employed for crop yield modeling. As a nonparametric regression method, MARS combines multiple linear regression models to handle the nonlinear relationships between explanatory and response variables. The training data set is grouped by adaptive splines having different slopes, which are smoothly connected using a polynomial function. One can evaluate the suitability of a MARS model using the generalized cross-validation (GCV) score [36].
The SVM performs an optimal grouping of data and builds appropriate regression models for each group [37–39]. A kernel function such as a polynomial, Gaussian, or sigmoid function can be adopted to derive a maximum marginal hyperplane (MMH), which maximizes the margin between data group boundaries [40–42]. We used the Gaussian kernel function in this experiment.
Different from the old classification and regression tree (CART), RF is a kind of ensemble method using bootstrap and bagging approaches [43]. The RF builds many decision trees with slightly different characteristics by extracting random samples from training data. The bootstrap evaluates the suitability of the sample distribution and performs resampling if required. The bagging, which is short for bootstrap aggregating, carries out an aggregation of the decision trees created by the bootstrap to produce a final ensemble model using the average or majority vote [44,45]. We set the number of trees to 500 for the tree-pruning process. The number of variables for use in dividing nodes was set to n/3, where n denotes the input feature number. The out-of-bag error was used for measuring the model suitability in our experiment.
The ERT is similar to the RF but differs in that it adopts unpruned decision trees. The ERT divides the nodes at random cutpoints and uses the whole training data without bootstrap [46]. We set the parameters for the ERT to the same values as in the RF model: (1) the attribute number for each node and (2) the minimum sample size for dividing nodes.
The ANN emulates a biological neural network. It has input, hidden, and output layers [47]; the hidden layer consists of multiple neurons for computation in a node-link system [48,49]. The model performance depends on the weighting scheme of the network. The training tolerance is an error size for convergence of the model training; the learning rate is an increment/decrement of the weight during an iteration [49]. We set up three nodes in a hidden layer through the performance tests.
The generic ANN has the problem of local minima, such that an optimization process may stop at a locally optimized state, not reaching a global optimization. Also, the overfitting problem is often found in the classic machine learning methods, such that they tend to fail to handle outliers because of excessive training from the given set of data. The DNN overcame these problems in a more intensive network, where backward and forward optimizations are combined. The vanishing gradient issue in the loss function, which may occur in the back-propagation system, can be resolved by adopting activation functions like a rectified linear unit (ReLU). The L1/L2 regularization for outlier management can make the DNN model have sparsity (L1) and simplicity (L2) by detailed scaling of the weighting scheme of the network. Also, the dropout method capably handles extreme cases through iterative training under handicapped conditions where hidden units are randomly deleted [50]. The previously built weighting scheme in a DNN model can be imported for use in the initial values for another DNN model, which is called pretraining and transfer learning. Fine-tuning enables a more precise optimization of a DNN model by adding extra data for training [51].
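Three of the DNN ingredients named above (ReLU activation, L1/L2 regularization, and dropout) are simple enough to write out directly. The following sketch is purely illustrative and is not the authors' implementation: it uses plain Python lists, and it assumes the common "inverted" dropout variant in which surviving units are rescaled, a detail the text does not specify.

```python
import random


def relu(activations):
    """Rectified linear unit: negative inputs become zero."""
    return [max(0.0, a) for a in activations]


def dropout(activations, rate, rng):
    """Zero each hidden unit with probability `rate` during training.

    Survivors are divided by (1 - rate) so the expected sum is unchanged
    (inverted dropout; an assumption made for this sketch). `rate` must be
    smaller than 1.
    """
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


def l1_l2_penalty(weights, l1, l2):
    """Regularization term added to the loss: L1 favors sparsity, L2 small weights."""
    return l1 * sum(abs(w) for w in weights) + l2 * sum(w * w for w in weights)


rng = random.Random(0)            # fixed seed only for reproducibility
hidden = relu([-1.2, 0.4, 2.0, -0.1])
print(dropout(hidden, rate=0.5, rng=rng))
print(l1_l2_penalty([1.0, -2.0], l1=0.1, l2=0.01))
```

In a full model, the penalty is added to the training loss and dropout is switched off at prediction time.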
In this paper, we configured the six AI models through hyperparameter optimization, and the settings are shown in Table 3. In particular, the optimization procedure for the DNN model is illustrated in Figure 4.
Table 3. Hyperparameters and libraries used for the six AI models in this study.

Training and Validation
We conducted 10 rounds of leave-one-year-out blind tests by dividing the data set into a nine-year training group and a one-year validation group from the entire match-up database (2006–2015). For example, the blind test for round #1 consisted of the validation data from 2006 and the training data from the remaining nine years; the blind test for round #2 consisted of the validation data from 2007 and the training data from the remaining nine years, and so on.
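The leave-one-year-out scheme described above can be sketched as a small generator. This is a minimal illustration that assumes each match-up record is a dictionary with a `year` field; the record layout and the function name are hypothetical.

```python
def leave_one_year_out(records, year_key="year"):
    """Yield (held_out_year, training_records, validation_records) per round."""
    years = sorted({record[year_key] for record in records})
    for held_out in years:
        train = [r for r in records if r[year_key] != held_out]
        valid = [r for r in records if r[year_key] == held_out]
        yield held_out, train, valid


# Toy match-up database: two made-up counties for each year 2006-2015.
records = [
    {"year": year, "county": county}
    for year in range(2006, 2016)
    for county in ("A", "B")
]
for held_out, train, valid in leave_one_year_out(records):
    print(held_out, len(train), len(valid))
```

Because a whole year is held out at once, the validation year never contributes to training, which is what makes the 2012 round a genuine test of an unseen drought.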

Experiments under Drought Conditions
The nationwide drought of the US in 2012 caused severe crop damage, reducing corn yields by about 22.5%. We tested corn yield predictions using the six AI models and the spatiotemporal match-up database to explore how this drought affected the accuracy of yield predictions. Table 4 summarizes the result of the blind test for 2012, in which the nine-year data set (excluding 2012) was used for training, and the data set of 2012 was used for validation. The MARS, SVM, RF, ERT, and ANN models showed mean absolute errors (MAE) between 0.975 and 1.153 ton/ha and root mean square errors (RMSE) between 1.248 and 1.643 ton/ha. In contrast, the DNN model had much lower prediction errors: an MAE of 0.666 ton/ha and an RMSE of 0.828 ton/ha, with 51% (i.e., [1.248 − 0.828]/0.828) to 98% (i.e., [1.643 − 0.828]/0.828) better performance than the other five models in terms of RMSE. The mean absolute percentage error (MAPE) of the DNN model was 12.9%, and the correlation coefficient was 0.954, which can be interpreted as excellent accuracy even under a natural disaster situation.
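For reference, the validation statistics used here (MAE, RMSE, MAPE, and the correlation coefficient) and the relative RMSE comparison can be computed as in the sketch below. The formulas are the standard definitions, assumed rather than quoted from the paper, and the function names are hypothetical; only the RMSE values 0.828, 1.248, and 1.643 come from the text.

```python
import math


def mae(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error in percent; actual values must be nonzero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def pearson_r(actual, predicted):
    n = len(actual)
    mean_a, mean_p = sum(actual) / n, sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


def relative_rmse_gap(other_rmse, dnn_rmse):
    """(other - DNN) / DNN, the ratio used for the 51%-98% figures."""
    return (other_rmse - dnn_rmse) / dnn_rmse


# RMSE values quoted in the text; prints 51 and 98.
print(round(100 * relative_rmse_gap(1.248, 0.828)))
print(round(100 * relative_rmse_gap(1.643, 0.828)))
```

Note that the gap is expressed relative to the DNN error, so 98% means the weakest competing model had almost twice the RMSE of the DNN.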
Actual and predicted corn yields for the study period are compared in Figure 5, in which red dots indicate 2012 data (severe drought) and black dots indicate all other years. The DNN model resulted in smaller errors and showed more concentration around the 1:1 line, whereas the other five models showed more scatter around the 1:1 line and slight overestimation. This overestimation arose because the other models failed to cope with the unexpected drought conditions of 2012. Despite the sudden drought, the predicted corn yields of the DNN model were very close to the yield statistics, which indicates that our DNN model successfully overcame the overfitting problem and dealt with outliers appropriately. Maps for actual and predicted corn yields and the prediction errors from the leave-one-year-out blind tests for 2006–2015 are illustrated in Figure 6. The drought year 2012 showed lower corn yields than the other years, but the prediction accuracy was similar to that of the other years.
Indeed, a larger database can increase the performance and reliability of DNN models. Our database was limited to 10 years because the official maps for crop type have been provided only since 2006 for our study area. Although the data size was not sufficient, we conducted a sensitivity test according to the size of the training set. Excluding the drought year 2012, nine years remained from the period between 2006 and 2015. So, we extracted different combinations of three years (9C3), five years (9C5), and seven years (9C7) from the nine years. As presented in Table 5, 20 experiments for each group were carried out. With the increase in the number of years for training (i.e., 3, 5, 7, and 9), the validation statistics of the DNN model gradually improved. The RMSE and correlation coefficient showed a slight sensitivity to the volume of training data. However, the accuracy might not improve dramatically even if the number of years for training surpasses that of our experiment, which should be examined in future work with larger datasets.
In addition to the size of the training dataset, the kinds of input variables can also affect the performance of DNN models. The VHI and the PDSI were used to identify a drought year in our experiment. Here, we incorporated the VHI and the PDSI as additional input variables for the DNN model. The current model included soil moisture to represent the drought state of cropland. Hence, we examined how the three drought variables (soil moisture, VHI, and PDSI) would influence the prediction performance. The weights for the three variables connected to the 300-300 hidden units were optimized by the Adaptive Gradient (AdaGrad) algorithm generally used in DNN models. Table 6 shows the validation statistics from the leave-one-year-out blind tests for the year 2012 as a drought case. The model with soil moisture and the model with the addition of VHI and PDSI were compared. The VHI seemed to be appropriate as an additional variable in that the validation statistics showed a slight improvement. On the other hand, the addition of PDSI did not have a critical impact on this experiment. This is partly because the PDSI is an index that represents climatological drought from the viewpoint of long-term changes, whereas the VHI is a general indicator of agricultural drought conditions. In addition to the VHI, more input variables that can explain the interactions between climate and vegetation, including evapotranspiration, diurnal temperature change, irrigation capability, and soil properties, will also be necessary for a more reliable prediction of crop yield. Moreover, the appropriate use of drought indices will be important for crop yield prediction because a variety of drought indices can represent different aspects of drought conditions.
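The training-size sensitivity test described above draws subsets of three, five, and seven years from the nine non-drought years. The sketch below enumerates those combinations and picks 20 per group. How the 20 experiments were actually chosen is not stated in the text, so the random sampling, the seed, and the function name are assumptions.

```python
import itertools
import math
import random

# The nine training years: 2006-2015 without the drought year 2012.
TRAINING_YEARS = [y for y in range(2006, 2016) if y != 2012]


def sample_training_year_sets(years, k, n_experiments=20, seed=0):
    """Randomly pick `n_experiments` distinct k-year subsets of `years`."""
    combos = list(itertools.combinations(years, k))
    rng = random.Random(seed)
    return rng.sample(combos, min(n_experiments, len(combos)))


for k in (3, 5, 7):
    subsets = sample_training_year_sets(TRAINING_YEARS, k)
    # math.comb gives the total number of possible subsets (9Ck).
    print(k, math.comb(len(TRAINING_YEARS), k), len(subsets))
```

Each sampled subset would then be used to train a model that is validated on 2012, giving 20 sets of validation statistics per training size.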

Experiments under Heatwave Conditions
Next, we evaluated the accuracy of the AI models for predicting crop yield under heatwave conditions by performing sensitivity analyses for heatwaves of different durations during the study period. We extracted cases in which the counties experienced a heatwave for longer than a given period of days (3 to 15 days). Then, we calculated validation statistics for these heatwave cases according to the duration of the heatwave using the six AI models for corn yield. As shown in Figure 7, prediction errors tended to increase as the duration of the heatwave increased. However, the accuracy of the DNN model remained stable despite the increase in heatwave days. In contrast, the errors of the other five models increased rather sharply with heatwave duration. This suggests that the DNN model was very robust to unexpected heatwave disasters.
For the heatwaves longer than five and seven days, the average corn yields were 6.628 and 5.938 ton/ha, respectively. Considering that the average corn yield of all the cases was 9.606 ton/ha during the 10 years, the heatwaves appear to have substantially reduced the corn yields. Despite the unexpected heatwave disasters, the DNN model showed much lower prediction errors, with an MAE of 0.781 ton/ha for heatwaves exceeding five days, than the other five models, which had MAE values between 1.068 and 1.352 ton/ha (Table 7).
The correlation coefficients of the DNN model were 0.914 and 0.887 for heatwaves longer than five and seven days, respectively, which indicates an excellent agreement with the actual yields under unexpected extreme weather conditions (Figures 8 and 9). The remaining five models showed more deviation from the 1:1 line and significant overestimation. Similar to the drought prediction results, the other five models were unable to deal with the unexpected heatwave cases. However, the corn yields predicted by our DNN model were very close to the actual corn yields, which indicates that this model was constructed appropriately for such extreme weather events. In Figure 10, the Illinois counties outlined in black correspond to the case of heatwaves for more than five consecutive days. The map shows that the counties under heatwave conditions had lower corn yields than the other counties, but the predicted yields closely matched the actual yields.

Another Type of Blind Test
An additional experiment was conducted to test the reliability of our approach more objectively. In addition to the leave-one-year-out blind tests in Sections 4.1 and 4.2, we carried out another type of blind test using 10-fold cross-validation with random sampling (Figure 11). We rebuilt a DNN model and calibrated the hyperparameters for the new testing environment. As a result, the 10-fold blind tests showed validation statistics similar to those of the leave-one-year-out blind tests, which means that the DNN model was appropriately optimized for corn yield prediction and performed well in both experiments (Table 8).
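A 10-fold split with random sampling can be sketched as follows. This is a generic illustration under assumed details (shuffling by a seeded generator, near-equal fold sizes) and not the authors' procedure; the function name is hypothetical, and the record count 3585 is the size of the match-up database quoted earlier.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, validation_indices) for each of k random folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Striding through the shuffled list gives k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i, valid in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, valid


for train, valid in k_fold_indices(3585, k=10):
    print(len(train), len(valid))
```

Unlike the leave-one-year-out test, each validation fold here mixes records from all years, so records from the same year can appear in both training and validation sets.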

Conclusions
In this study, we tested six AI models for corn yield under extreme weather conditions and analyzed their performances for a 10-year period that included severe drought and heatwave. Given its tall canopy and broad leaves, corn is relatively sensitive to heat stress [52,53]. In 2012, the Midwestern US experienced severe drought, leading to a 22% reduction in average corn yield. Our DNN model predicted the reduced corn yields in 2012 under drought conditions with a correlation coefficient of 0.954 in the blind tests, which indicates the suitability of our approach for crop yield prediction under extreme weather conditions. Corn yield decreased markedly as the heatwave duration increased. Despite the unexpected heatwave, our DNN model produced stable estimates as a result of the effective optimization of hyperparameters, including hidden units, epoch number, activation function, optimizer, L1/L2 regularization, and dropout ratio. Future work using a larger database covering about 20 years will be necessary to ensure more reliability of the DNN model for crop yield prediction. Also, more appropriate use of drought indices will be required for the prediction of crop yields because various drought indices can represent different aspects of drought conditions.
Losses in crop yield depend on the type of extreme weather and the type of crop. For example, the soybean yield is significantly reduced by low temperatures during the growing season [54,55]. Hence, it is also necessary to examine how crops such as soybean, wheat, and rice are affected by extreme weather conditions, including drought, heat and cold waves, and heavy rainfall. Future studies should use AI modeling to predict the yields of various crops under extreme weather conditions.