3.1. CART Model Trees for Drought Predictions
Using the CART approach, a number of successful model trees were constructed that are easily interpretable and can be used by decision makers in their drought management decisions. Conceptually, the model trees are produced as upside-down trees (Figure 1b) with branches that lead to the leaves of the tree. Where a branch has no further splits, the terminal node is represented by a rule in the form of an if-condition paired with a linear-regression equation.
At each node, the model trees were expressed as collections of rules, where each rule has an associated multivariate linear equation. Whenever a situation matches a rule’s conditions, the associated equation is used to calculate the predicted value of the drought outlook. A sample rule (Rule 1 of the June model) for predicting July drought conditions from June data (June one-month outlook) is presented in Figure 3. In this case, July drought values range from −3.90 to 1.40, with an average value of −0.96; the drought values are in units of standard deviations. The regression tree model finds that the target value of these or other cases satisfying the rule’s conditions can be modeled by the following formula:
SDNDVI_July = −123 + 4.14 AMO − 0.85 PDO + 1.16 SDNDVI_June + 0.45 NAO + 0.3 PNA − 0.09 SPI_3month + 0.009 DEM
During this modeling, the estimated error is 0.414. For a case covered by this rule, AMO has the greatest effect on the drought estimate and DEM has the least effect on determining this drought outlook value. Each rule generated during the modeling was interpreted this way, and whenever a case satisfied all the conditions, the linear model was used for predicting the value of the target attribute. If two or more rules applied to a case, then the calculated values from the respective multiple linear regression models were averaged to arrive at the final prediction of drought conditions.
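To illustrate how such rule collections produce a prediction, the Python sketch below uses two invented toy rules (the conditions and coefficients are placeholders, not the actual Rule 1 from Figure 3) and averages the linear-model outputs of every rule whose if-condition a case satisfies, as described above:

```python
def predict_with_rules(case, rules):
    """Average the linear-model outputs of every rule whose
    if-condition the case satisfies; return None if no rule fires."""
    values = [model(case) for condition, model in rules if condition(case)]
    return sum(values) / len(values) if values else None

# Two toy rules with overlapping conditions (coefficients are invented).
rules = [
    (lambda c: c["SDNDVI_June"] < 0.5, lambda c: 1.16 * c["SDNDVI_June"] - 0.96),
    (lambda c: c["AMO"] > 0.0,         lambda c: 4.14 * c["AMO"] - 1.0),
]

case = {"SDNDVI_June": 0.0, "AMO": 0.1}
print(predict_with_rules(case, rules))  # both rules fire; outputs are averaged
```

When only one rule fires, its equation is used directly; the averaging only comes into play for overlapping conditions.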
The models produced have two components: the if-condition and the associated multiple linear regression model. Eleven attributes were assessed in the model construction (Table 2). The last column indicates the rank of each attribute, determined by summing the attribute’s usage in the if-conditions and in the regression models and dividing by 2. In our drought modeling experiment, the PDO was the most important attribute for modeling July drought, followed by AMO and SDNDVI. This means that these three attributes have strong relationships with drought status in the study area. The model performances and the selection of the parameters are explored in subsequent sections.
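The rank just described (attribute usage in the if-conditions plus usage in the regression models, divided by 2) can be computed directly; the helper below is a small illustrative sketch with invented usage percentages:

```python
def attribute_rank(condition_usage, model_usage):
    """Rank score from the text: average of an attribute's usage (%)
    in the rule if-conditions and in the regression models."""
    return (condition_usage + model_usage) / 2

# e.g., a hypothetical attribute used in 80% of conditions and 100% of models
print(attribute_rank(80, 100))  # 90.0
```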
3.2. Model Performance Evaluations
A 24-year historical record of data (1983–2006) was used for developing the time-lag prediction models and the performance evaluation metrics. In cross-validating the drought models in the regression tree, the data were randomly split into training and testing sets. The model parameters learned from the training set were applied to the test dataset, and the quality of the predictions on the test set was evaluated. This approach gives an idea of how well the models generalize to unseen data points, with better-generalized and more robust models having a higher predictive accuracy [69] for future drought conditions.
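A minimal sketch of this random train/test splitting, assuming one record per year of the 1983–2006 series and a 90/10 split (the fraction and record structure are illustrative assumptions, not the paper's exact setup):

```python
import random

def train_test_split(records, train_fraction=0.9, seed=42):
    """Randomly partition the records into training and testing sets."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

years = list(range(1983, 2007))              # 24 years of records
train_set, test_set = train_test_split(years)
print(len(train_set), len(test_set))         # 21 3
```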
presents the performance of the “rules alone” model on test data. The MAD, which measures the deviation of the predicted value from the actual known value, ranged from 0.22 to 1.9 for the 10 models assessed. These error values are in terms of standard deviations. The highest error values were observed for the July three-month prediction, followed by the August two-month prediction. All of the predictions targeting October (June four-month outlook, July three-month outlook, August two-month outlook, and September one-month outlook) were found to have errors of about two standard deviations. A possible explanation for these high values is that October is dry and at the end of the growing season [45] compared to the predictor months, and this high variability is expected for the study area.
The ratio of the average error magnitude to the error magnitude that would result from always predicting the mean value (RE) ranged from 0.29 to 0.67. The lowest RE was recorded for the August one-month outlook and the highest for the June three-month outlook. In all 10 models, the RE is <1, which indicates that the models’ average error magnitude is lower than that of simply predicting the mean of the observations. RuleQuest [69] also indicates that the relative errors of useful models should be less than one.
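The three metrics used in this evaluation (MAD, RE, and CC) can be sketched as below; the observation and prediction vectors are invented purely for illustration:

```python
from statistics import mean

def mad(actual, predicted):
    """Mean absolute deviation of predictions from observations."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))

def relative_error(actual, predicted):
    """Ratio of the model's average error magnitude to that of always
    predicting the mean of the observations (RE < 1 => useful model)."""
    baseline = mad(actual, [mean(actual)] * len(actual))
    return mad(actual, predicted) / baseline

def cc(actual, predicted):
    """Pearson correlation coefficient between observed and predicted."""
    ma, mp = mean(actual), mean(predicted)
    cov = sum((a - ma) * (p - mp) for a, p in zip(actual, predicted))
    sa = sum((a - ma) ** 2 for a in actual) ** 0.5
    sp = sum((p - mp) ** 2 for p in predicted) ** 0.5
    return cov / (sa * sp)

obs  = [-1.0, 0.0, 1.0, 2.0]    # toy drought values (standard deviations)
pred = [-0.8, 0.1, 0.9, 1.8]
print(mad(obs, pred), relative_error(obs, pred), cc(obs, pred))
```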
The CC ranged from 0.71 to 0.95. The highest CC was found for the August one-month prediction, which predicts September drought conditions using August data. This agrees with our expectation in that both of these months are vigorous plant-growing months [45] and their vegetation conditions are similar over this period of time. The lowest correlation value was for the June three-month prediction. This is in line with our expectation that predictive accuracy would likely be lower during June, which is early in the growing season, when vegetation conditions (e.g., amount and vigor) are more variable because of several early-season factors (e.g., moisture and air temperature affecting the initial plant growth rates), resulting in vegetation-condition variations produced by environmental factors beyond drought. In comparison, the August period, with higher predictive accuracy, is a vigorous growing month [45] with considerable accumulated plant biomass, leading to more consistent conditions from year to year, with major deviations more likely to be related to drought stress during the early to mid-growing season.
presents the performance of the 10 “rules alone” models and the “instances and rules” CART Cubist models. In the 10 assessed models, the MAD of the “instances and rules” model was found to be significantly lower than that of the “rules alone” model (Figure 4i). The reason for these performance improvements across all of the models is that the “instances and rules” approach uses bagging (bootstrap aggregating, which is designed to improve the accuracy of the regression tree model) [90]. Witten et al. [82] support this result, noting that the predictive accuracy of a rule-based model can be improved by combining it with a nearest-neighbor or instance-based model, which is achieved through the use of bagging in machine learning.
Bagging and other resampling techniques can be used to reduce the variance in model predictions: numerous replicates of the original dataset are created by random selection with replacement, each derivative dataset is used to construct a new model, and the models are gathered together into an ensemble. To make a prediction, all of the models in the ensemble are polled and their results are averaged. Using this approach, the MAD can be reduced and the overall modeling performance improved.
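A minimal sketch of bootstrap aggregating as just described (the base learner, toy dataset, and ensemble size are illustrative placeholders, not Cubist's internals):

```python
import random
from statistics import mean

def bagged_predict(training_data, x, fit, n_models=10, seed=0):
    """Build each ensemble member on a resample drawn with replacement
    from the training data, then average the members' predictions."""
    rng = random.Random(seed)
    predictions = []
    for _ in range(n_models):
        resample = [rng.choice(training_data) for _ in training_data]
        model = fit(resample)
        predictions.append(model(x))
    return mean(predictions)

# Toy base learner: predict the mean target value of its resample.
fit_mean = lambda data: (lambda x: mean(target for _, target in data))
training_data = [(1, -1.0), (2, 0.0), (3, 1.0)]
print(bagged_predict(training_data, 2, fit_mean))
```

Averaging over resampled models is what smooths out the variance of any single tree.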
Moreover, the MAD consistently increased as the prediction period lengthened (Figure 4i). It was also observed that the June four-month outlook, July three-month outlook, August two-month outlook, and September one-month outlook had the highest MAD for both the “instances and rules” and “rules alone” models (Figure 4i). The highest MAD values were found for October (the predicted month), which is outside the growing season [45], and these much larger errors are expected for this period, as reported earlier. The relative-error comparison between the “instances and rules” model and the “rules alone” model showed a pattern similar to that of the MAD (Figure 4).
The CC for the 10 monthly outlook models showed that the “instances and rules” model had consistently higher correlations than the “rules alone” model (Figure 4iii). Our assessment also showed that, for all 10 models, the “instances and rules” version had higher accuracy than the “rules alone” model (Figure 4iii). The comparison of the average CC between the two model types showed the “instances and rules” CC to be significantly higher than that of the “rules alone” model.
In addition to the instances and rules options, there is a “let Cubist decide” option in the Cubist regression tree modeling tool [69]. For the 10 models assessed, the “let Cubist decide” option was found to have the same accuracy (in terms of MAD, RE, and CC values) as the composite models (which combine instances and rules), showing equivalent performance of the two modeling options in the Cubist modeling tool. Similar results were found by Ruefenacht et al. [93], in that the “let Cubist decide” model has the same accuracy as the composite models. The challenge in using the “let Cubist decide” option was the increased time required to build the model, as the optimal decision is determined by Cubist.
In addition to the composite rule-based nearest-neighbor models, RuleQuest [69] can also generate committee models made up of several rule-based models. Each member of the committee predicts the target value for a case and the members’ predictions are averaged to give a final prediction. The first member of a committee model is always exactly the same as the model generated without the committee option. The second member is a rule-based model designed to correct the predictions of the first member; if the first member’s prediction is too low for a case, the second member will attempt to compensate by predicting a higher value. The third member tries to correct the predictions of the second member, and so on. The default number of members is five, a value that balances the benefits of the committee approach against the cost of generating extra models [69].
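The averaging step of a committee model can be sketched as below; the step of training each member to offset the previous member's errors is not shown, and the two toy members are invented for illustration:

```python
from statistics import mean

def committee_predict(x, members):
    """Each committee member predicts the target value for a case,
    and the members' predictions are averaged."""
    return mean(member(x) for member in members)

# Toy committee: member 2 compensates where member 1 predicts too low.
members = [lambda x: x - 0.4, lambda x: x + 0.4]
print(round(committee_predict(1.0, members), 6))  # 1.0
```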
Before combining the committee and neighbor models, the performances of these approaches were analyzed separately (Figure 5). To decide the actual threshold values for the number of committees and neighbors to be used in Cubist drought modeling, the r-squared values and RMSE were assessed.
presents the r-squared values and RMSE for the different numbers of committees and neighbors. In the committee models, the highest average r-squared value was obtained with 30 committees, and no performance improvement was gained by adding more committees. The RMSE decreased as the number of committees increased from 1 to 30, but remained the same as further committees were added. Therefore, when using committee models, 30 committees should be used in future drought modeling experiments.
For the number of neighbors, the r-squared value increased up to seven neighbors and remained unchanged as additional neighbors were included. The RMSE likewise decreased up to seven neighbors and remained the same thereafter. Therefore, the optimum neighbor threshold for achieving the highest predictive accuracy was found to be seven.
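The threshold choice used here and for the committees above (take the smallest setting beyond which the score stops improving) can be sketched as follows, with hypothetical r-squared values that plateau at seven neighbors:

```python
def smallest_plateau_setting(settings, scores, tol=1e-6):
    """Return the smallest setting whose score already matches the best
    score, i.e., the point where adding more stops helping."""
    best = max(scores)
    for setting, score in zip(settings, scores):
        if best - score <= tol:
            return setting
    return settings[-1]

neighbors = list(range(1, 10))  # 1..9 neighbors
r_squared = [0.60, 0.70, 0.78, 0.83, 0.86, 0.88, 0.90, 0.90, 0.90]  # invented
print(smallest_plateau_setting(neighbors, r_squared))  # 7
```

For RMSE, the same helper applies with the scores negated, since lower is better there.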
The comprehensive set of prediction performances for the three monthly models (June, July, and August) using 0–9 neighbors and committees ranging from 1 to 100 is presented in Figure 6. This analysis was done in line with the GUI-based implementation of RuleQuest [69] for modeling. The GUI has different checkbox options, as well as options for specifying the number of nearest instances and committees. Therefore, Figure 6 shows the benefits that may be achieved by specifying these parameters in the GUI options, as well as through parameter specification in the command-line, batch-processing options.
For the 0-neighbor and 1-committee options, the highest r-squared value and lowest RMSE were achieved for the August one-month outlook (see the * symbol in Figure 6), with the lowest r-squared and the highest RMSE observed for the June three-month outlook (see the black square symbol in Figure 6). These model performances are in line with our previous explanations. Among the assessed combinations of 0–9 neighbors and 1–100 committees for the June–August outlooks, the same pattern was observed for all except the 1-neighbor combinations with the 1-committee, 10-committee, 50-committee, and 100-committee options (Figure 6). In all four combinations with one neighbor, the RMSE was the highest and the r-squared the lowest. A possible explanation is that building the model on only one instance produces a biased model, so the model performance is strongly affected. The 0-neighbor option does not use instance modeling at all, which performed better than the 1-neighbor instance modeling option (Figure 6).
3.3. Percentage Splits for Training and Testing
The percentage splits for training–testing were 50/50%, 60/40%, 70/30%, 80/20%, 90/10%, and 99/1%. The relative performances of six models covering the core of the growing season (the June one-month, June two-month, June three-month, July one-month, July two-month, and August one-month outlooks) were compared (Figure 7). The MAD ranged from 0.2 to 0.5. In all of the assessed models, the minimum MAD was found for the August one-month outlook and the maximum for the June three-month outlook (Figure 7a). This agrees with our expectation that as the prediction length increases from one month to three months, the error increases and the performance of the models decreases.
The descriptive statistics for the MAD of the six models are presented in Table 4. The average MAD for the six models ranged from 0.35 to 0.36. The minimum error was recorded for the 99/1% split and the maximum for the 50/50% split. The RE (not presented here) showed the same pattern as the MAD. A possible explanation is that as more data are used for training the model and less data are assigned to the test set, the MAD decreases. Therefore, in terms of the MAD, the 99/1% split performs best.
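This comparison across splits can be sketched as below; the per-split MAD values are invented, chosen only to be consistent with the 0.35–0.36 range and ordering reported above:

```python
# Hypothetical average MAD per training/testing split (invented values).
mad_by_split = {
    "50/50": 0.360, "60/40": 0.357, "70/30": 0.355,
    "80/20": 0.353, "90/10": 0.351, "99/1": 0.350,
}
best_split = min(mad_by_split, key=mad_by_split.get)
print(best_split)  # 99/1
```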
The relative performances of the six models described above were also compared in terms of the CC (Figure 7b). The CC ranged from 0.74 to 0.96. In all models assessed, the maximum CC was found for the August one-month outlook and the minimum for the June three-month outlook. This agrees with our expectation in that the August one-month outlook predicts September conditions, and these two months have similar vegetation conditions for the study area near the end of the core growing season [45]. In general, the CC decreased as the prediction length increased (Figure 7a). There are no significant differences between the percentage splits in terms of either MAD or RE (Figure 7b). The CC increased consistently as the split percentage changed from 50/50% to 99/1% (Figure 7b). For the practical implementation of drought modeling using CART, it seems advisable to use the 90/10% split (compared with both 80/20% and 99/1%) for two reasons: (1) the accuracy is higher than with 80/20% and only slightly lower than with 99/1%; and (2) evaluating the model’s performance on unseen data is more representative and robust when using 10% of the data for testing rather than 1% of the total dataset.