Global Optimization of Cultivar Trait Parameters in the Simulation of Sugarcane Phenology Using Gaussian Process Emulation

Bandara, W. B. M. A. C.; Sakai, Kazuhito; Nakandakari, Tamotsu; Kapetch, Preecha; Anan, Mitsumasa; Nakamura, Shinya; Setouchi, Hideki; Rathnappriya, R. H. K.

doi:10.3390/agronomy11071379

by W. B. M. A. C. Bandara ^1,2,*, Kazuhito Sakai ^1,3,*, Tamotsu Nakandakari ^1,3, Preecha Kapetch ^4, Mitsumasa Anan ^1,5, Shinya Nakamura ^1,3, Hideki Setouchi ^1,3 and R. H. K. Rathnappriya

^1 United Graduate School of Agricultural Sciences, Kagoshima University, 1-21-24 Korimoto, Kagoshima-shi, Kagoshima 890-0065, Japan

^2 Department of Agricultural Engineering, Faculty of Agriculture, University of Ruhuna, Kamburupitiya 81100, Sri Lanka

^3 Faculty of Agriculture, University of the Ryukyus, 1 Senbaru, Nishihara-cho, Okinawa 903-0213, Japan

^4 Nakhon Sawan Agricultural Research and Development Center, Moo 2, Udomthanya, Takfa 60190, Thailand

^5 Faculty of Agriculture, Saga University, 1 Honjo-machi, Saga 840-8502, Japan

^* Authors to whom correspondence should be addressed.

Agronomy 2021, 11(7), 1379; https://doi.org/10.3390/agronomy11071379

Submission received: 27 May 2021 / Revised: 3 July 2021 / Accepted: 5 July 2021 / Published: 7 July 2021

(This article belongs to the Section Crop Breeding and Genetics)

Abstract

The global optimization of parameters in process-based crop models is often considered computationally expensive. Gaussian process (GP) emulation is a widely used method for reducing the computational burden of the optimization process. Total above-ground biomass and cane dry weight (CDW) of three Thai sugarcane cultivars (KK3, LK92-11 and 02-2-058) collected under rainfed and irrigated conditions were used to optimize cultivar-specific parameters in the Agricultural Production Systems sIMulator (APSIM)-Sugarcane crop model through GP emulation. GP emulators were trained and validated to approximate the APSIM-Sugarcane model and were then used to optimize the cultivar-specific parameters with the differential evolution algorithm. Simulations run with the optimized parameters closely approximated the observed biomass and CDW (validation results between simulated and observed yields: R² 0.93–0.98; normalized root mean squared error: 5–22%; Willmott’s agreement index: 0.87–0.99). The best parameterization was obtained under the least water-stressed conditions. Based on these results, we suggest that GP emulation can be efficiently implemented for the parameterization of computationally expensive simulators.

Keywords: APSIM; differential evolution algorithm; Gaussian process emulation; global optimization; sugarcane

1. Introduction

Sugarcane is an important crop for sugar and bioenergy worldwide. Thailand is the fourth-largest sugar producer and the second-largest sugar exporter in the world, and sugarcane has become the most important crop in Thailand’s agriculture [1].

However, recent evidence indicates that Thailand’s sugarcane production is highly affected by climate change. For instance, sugarcane output in the crop year 2020/2021 was recorded as 66.7 million tonnes, down from 74.9 million tonnes in the previous crop year, primarily due to drought [2]. Thus, identifying suitable management strategies to cope with such temporal and spatial variability of sugarcane production has become important for the Thai sugarcane industry.

In this respect, process-based crop models are advantageous because they can be used to assess climate impacts on sugarcane [3,4,5], evaluate cultivar responses under various environments and management strategies, predict yield [6,7] and provide information for economic and policy decision-making [8]. Many crop models are available for sugarcane, including Agricultural Production Systems sIMulator (APSIM)-Sugarcane [9], DSSAT-Canegro [10], MOSICAS [11], STICS-Sugarcane [12] and QCane [13]. Among these, APSIM-Sugarcane is one of the most widely used platforms for the modelling and simulation of sugarcane production systems [14,15].

Obtaining accurate model predictions in process-based crop modelling requires cultivar trait parameterization. Parameter-optimizing techniques are frequently used in the parameterization process [6,16,17]. Seidel et al. [18] have reported that almost 50% of the 211 respondents to their survey on parametrization practices searched for the best-fit parameters using trial and error. However, Harrison et al. [19] indicated that such manual parameterization techniques allow only a small number of parameters to be calibrated. As a result, many of the parameter combinations remain uninvestigated.

Holzworth et al. [20] suggested that a more objective and reproducible parameterization and validation methodology is of great importance when formulating models of agricultural crop growth. Studies have recently focused on the use of complex statistical approaches for global optimization of parameters in crop models. Formal and informal Bayesian methods are common approaches that have been used for parameterization of crop models. For instance, Sheng et al. [21] used the Generalized Likelihood Uncertainty Estimation (GLUE) method and the Differential Evolution Adaptive Metropolis (DREAM) method for an APSIM model of maize production. They concluded that GLUE and DREAM perform similarly, but they recommend GLUE because it is easier to use. Sexton et al. [22] used GLUE and a Markov Chain Monte Carlo (MCMC) method in an APSIM model of sugarcane production. Although both methods produced acceptable results, Sexton et al. [22] recommend MCMC because its statistical background is well documented. However, because parameterization methods differ in their relative advantages and crop models differ in internal structure, evaluating the performance of various parameterization methods remains important for providing comprehensive guidance on method selection.

The current study focused on using a differential evolution (DE) algorithm for global optimization of cultivar trait parameters implemented in the APSIM-Sugarcane model. According to Georgioudakis and Plevris [23], DE has become one of the most popular optimization algorithms used in continuous optimization problems. For instance, the extensive survey conducted by Bilal et al. [24] reviewed nearly 283 research articles focused on the use of DE during the last 25 years.

Although there are a number of promising methods that could be used in optimization, Harrison et al. [19] explored several parameterization techniques and reported that the length of time required for convergence limits the accuracy of the results that can be obtained. Moreover, according to Saltelli et al. [25], global optimization of parameters in process-based crop models is often considered computationally expensive. Therefore, carrying out the number of simulations required for optimization may be extremely time-consuming or even infeasible. A versatile way to reduce the computational time is to use a Gaussian process (GP) [26] for DE optimization. GP emulators are surrogate models, and they are computationally inexpensive. An emulator of sufficient accuracy can be used as a substitute for the original simulator (APSIM), and parameterization can be based on the emulator. The feasibility of substituting GP emulators for APSIM-Sugarcane in sensitivity analysis has previously been evaluated in several studies [27,28,29,30]. However, no study has yet used GP emulators to optimize parameters in the APSIM-Sugarcane model or examined the accuracy of this optimization method.

The aim of the present study was to use a GP-based DE algorithm for global optimization of cultivar trait parameters implemented in the APSIM-Sugarcane model. Total above-ground biomass (Biomass) and cane dry weight (CDW) of three sugarcane cultivars from Thailand, measured at 30-day intervals from 90 to 390 days after planting under rainfed (Rf) and irrigated (Ir) conditions, were used for the optimization of cultivar trait parameters. The influence of field experimental characteristics on the parameter optimization of APSIM-Sugarcane was also examined.

2. Materials and Methods

The current study used observations of sugarcane biomass and CDW of three Thai sugarcane cultivars (Section 2.1) for global optimization of cultivar trait parameters implemented in the APSIM-Sugarcane model. APSIM-Sugarcane simulations were initially prepared (Section 2.2) and used for training and validating several emulators (Section 2.3), one for each experimental condition. Emulator accuracy was evaluated. Emulators were then used instead of APSIM-Sugarcane for DE-based global optimization (Section 2.4). Several optimized parameter ensembles for each cultivar were obtained from the above process and used in APSIM-Sugarcane to obtain simulated outputs (sugarcane biomass and CDW). Simulated outputs were then validated with observations for the selection of an optimal parameter ensemble for each Thai sugarcane cultivar (Section 2.5).

2.1. Study Field

Data from the three field experiments (namely A, B1 and B2) conducted by Preecha et al. [6] in Khon Kaen (KK), northeast Thailand (16.48° N 102.82° E; 181 m elevation), were used for the study. The observed data of each experiment comprised sugarcane biomass and CDW of three sugarcane cultivars (namely, 02-2-058, KK3 and LK92-11) at specific dates (days after planting, DAP). Details of the experiments are given in Table 1. Mean values of the observed data from the field experiments are given in Supplementary Materials Table S1. More details about the experimental arrangement of the field experiments can be found in Preecha et al. [6].

According to the Köppen–Geiger system, KK has a tropical wet-dry climate (Aw) [31]. Figure 1 shows mean values of monthly rainfall, daily minimum and maximum temperatures, and daily solar radiation during the study period (2010/12–2012/12). At first view, the weather in 2011 seems wetter, although there was no precipitation in the first three months and the last two months (Figure 1). Temperatures do not differ much between 2011 and 2012. In 2012, the weather seems to have been drier, with higher solar radiation. However, experiment A experienced the highest water stress for sugarcane growth compared to B1 and B2, because it was conducted under Rf conditions and a prolonged dry period was recorded in the early and late stages of sugarcane growth (Figure 1: year 2011). Water stress was least likely in experiment B1 because irrigation was applied and rainfall was better distributed throughout the year (Figure 1: year 2012) than in experiment A. B2 was conducted under Rf conditions in the same year as B1, so its water stress was lower than in experiment A and higher than in B1. Soil properties in the experimental field are shown in Table 2 [32].

2.2. APSIM Simulation

APSIM [33] is a comprehensive model developed to simulate biophysical processes in agricultural systems. APSIM contains interconnected biophysical and management models to simulate the processes that occur in systems comprising soil, crops, trees, pasture and livestock, and it has the flexibility to integrate non-biological farm resources. A more detailed description of the APSIM platform can be found in Holzworth et al. [33] and The APSIM Initiative [34].

The next generation of the APSIM [33] Sugarcane model [34] was used for this study. In the APSIM-Sugarcane model, crop dry weight accumulation is determined by the conversion of intercepted radiation into biomass, which is based on radiation-use efficiency (rue). The area of the crop leaf canopy that intercepts the radiation expands as a function of temperature. Biomass is partitioned among the various components of the plant (sucrose, structural stem, leaf, cabbage and roots) according to the phenological stage of the crop:

sowing: from sowing to sprouting;
sprouting: from sprouting to emergence;
emergence: from emergence to the beginning of cane growth;
begin cane: from the beginning of cane growth to flowering;
flowering: from flowering to the end of the crop;
end of the crop: crop is not currently in the simulated system.

Several conditions, such as extremes of temperature, plant nitrogen deficits, or soil water shortage or excess, can limit growth during these stages.

The APSIM-Sugarcane model controls the above-mentioned biophysical processes of sugarcane growth via several parameters. These parameters can be categorized into four groups, namely plant-crop parameters, ratoon-crop parameters, cultivar-specific parameters, and soil-and-climate parameters. Plant-crop parameters govern the processes of sugarcane growth and partitioning, water usage, water and temperature stresses, frosting, nitrogen contents and nitrogen stresses. Ratoon-crop parameters govern the same processes in the ratoon stage of the sugarcane. When parameterizing the model for a new cultivar, cultivar-specific parameters are of special interest because they represent key traits of a particular cultivar. Table 3 provides a description of the cultivar trait parameters used in the APSIM-Sugarcane model.

Although the APSIM-Sugarcane model does not contain rue and transp_eff_cf among the cultivar-specific parameters, they were included in the current study because Sexton et al. [29] have identified rue as a highly sensitive parameter for biomass yield estimation and transp_eff_cf as a highly sensitive parameter for biomass yield estimation under water stress conditions. Bandara et al. [27] have obtained similar results related to rue and transp_eff_cf for estimation of CDW. If the supply of soil water is not a growth-limiting factor, the APSIM-Sugarcane model controls dry matter assimilation via radiation interception and rue. However, water supply, vapor pressure deficit, and transp_eff_cf govern dry matter assimilation under conditions in which transpiration demands cannot be met by the soil water supply.

APSIM-Sugarcane was used to prepare a simulation for each experiment using the environmental conditions indicated in Figure 1 and Table 2, the default cultivar-specific parameters and the management conditions indicated in Table 1. The simulations prepared for experiments A, B1 and B2 were denoted as A_SIM, B1_SIM and B2_SIM, respectively. Biomass and CDW at different DAP (indicated in Table 1: Observed data) were selected as the outputs of each simulation. These simulations were used in building emulators (described in Section 2.3.2), evaluating emulator accuracy (described in Section 2.3.3) and checking the goodness of the optimization process (described in Section 2.5).

When preparing each simulation, default cultivar specific parameter values of APSIM-Sugarcane were selected. However, these parameters were replaced with random parameter ensembles to build emulators (Section 2.3.2) and also were replaced with optimized parameter ensembles in validation (Section 2.5). Parameters of APSIM-Sugarcane other than those listed in Table 3 remained at default at each simulation.

2.3. Emulation

The emulation concept has drawn increasing attention [35,36] as a way to reduce the computational burden of dynamical simulator runs (e.g., APSIM-Sugarcane). An emulator $\hat{m}$, which is either a metamodel or a surrogate model, can be used to approximate in a statistical sense the underlying simulator model (m) to reduce the cost of training runs. Whenever a simulation run is needed at a point (e.g., q), a fast prediction from a metamodel $\hat{m}(q)$ can be used to replace the costly value m(q).

2.3.1. Gaussian Process-Based Emulation

A GP is one of the statistical processes commonly used for emulation. Here we provide a brief overview of how to use GP for emulator generation. More comprehensive details of the GP can be found in Williams and Rasmussen [26] and Kennedy and O’Hagan [37].

If we consider the crop simulator (APSIM-Sugarcane) as an unknown function Y = f(X), where Y is the model output and X is a vector of (p) parameters (X = [x₁, x₂, …, x_p]), then Y is a random function in a Bayesian framework. According to Kennedy and O’Hagan [37], GP can be considered to be a flexible and convenient class of distributions that can be used to represent prior understanding about f(X). A prior distribution can therefore be assumed to be a multi-variate normal function that is characterized by a linear additive mean (Equation (1)) and a covariance function (Equation (2)). The covariance function is specified to characterize the smoothness of the output.

$$m(X) = \beta_0 + \beta_1 x_1 + \dots + \beta_p x_p \tag{1}$$

$$\mathrm{cov}\left(f(x), f(x')\right) = \sigma^2 \prod_{i=1}^{p} \exp\left\{-r_i \left(x_i - x_i'\right)^2\right\} \tag{2}$$

where $m(X)$ indicates the linear additive mean function of X, X is a vector of p parameters (X = [x₁, x₂, …, x_p]) and $\beta_0, \beta_1, \dots, \beta_p$ are unknown coefficients. $\mathrm{cov}(f(x), f(x'))$ is the covariance function of any pair $(f(x), f(x'))$ under their joint probability distribution, $r_i$ is a scaling parameter determining how rough the function is with respect to the ith input and $\sigma^2$ is the overall variance of the mean function.

The training data used to build the posterior distribution were used to estimate the $\beta$ hyper-parameters, the $\sigma^2$ parameter, and the roughness parameters ($r_i$).

The emulator’s posterior distribution was generated by using training data that could be obtained from the runs of the simulator. The training data consisted of the model outputs (Y) generated from each simulator run and the corresponding vectors of the p-dimensional parameters (X = [x₁, x₂, …, x_p]).
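As an illustration of Equations (1) and (2), the following is a minimal sketch of a GP emulator with a linear mean and the product-of-exponentials covariance, written in plain Python. It is not the GEM-SA implementation: the roughness values `r`, the variance `sigma2`, the small `nugget` added for numerical stability, the toy simulator and the grid design are all assumptions chosen for illustration, whereas GEM-SA estimates its hyper-parameters from the training data. The coefficients β are estimated here by generalized least squares.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda row: abs(M[row][c]))
        M[c], M[piv] = M[piv], M[c]
        for row in range(c + 1, n):
            f = M[row][c] / M[c][c]
            for k in range(c, n + 1):
                M[row][k] -= f * M[c][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))) / M[i][i]
    return x

def cov(a, b, r, sigma2):
    """Equation (2): sigma^2 * prod_i exp(-r_i * (a_i - b_i)^2)."""
    return sigma2 * math.exp(-sum(ri * (ai - bi) ** 2 for ri, ai, bi in zip(r, a, b)))

def fit_emulator(X, y, r, sigma2, nugget=1e-8):
    """Fit the posterior mean for fixed (assumed) r and sigma2."""
    n = len(X)
    H = [[1.0] + list(x) for x in X]  # basis of the linear mean, Equation (1)
    q = len(H[0])
    K = [[cov(X[i], X[j], r, sigma2) + (nugget if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    Kinv_y = solve(K, y)
    Kinv_H = [solve(K, [H[i][c] for i in range(n)]) for c in range(q)]  # columns of K^-1 H
    A = [[sum(H[i][a] * Kinv_H[b][i] for i in range(n)) for b in range(q)] for a in range(q)]
    rhs = [sum(H[i][a] * Kinv_y[i] for i in range(n)) for a in range(q)]
    beta = solve(A, rhs)  # generalized least squares estimate of beta
    resid = [y[i] - sum(H[i][c] * beta[c] for c in range(q)) for i in range(n)]
    alpha = solve(K, resid)
    return {"X": X, "beta": beta, "alpha": alpha, "r": r, "sigma2": sigma2}

def predict(model, x):
    """Posterior mean at a new point: linear mean plus covariance-weighted residuals."""
    mean = model["beta"][0] + sum(b * xi for b, xi in zip(model["beta"][1:], x))
    return mean + sum(a * cov(x, xt, model["r"], model["sigma2"])
                      for a, xt in zip(model["alpha"], model["X"]))

def toy_simulator(x):
    """Hypothetical stand-in for an expensive simulator run."""
    return 2.0 + 3.0 * x[0] - x[1] + math.sin(3.0 * x[0])

# Training design: a 6 x 6 grid on the unit square (illustrative only).
X = [(i / 5, j / 5) for i in range(6) for j in range(6)]
y = [toy_simulator(x) for x in X]
model = fit_emulator(X, y, r=(20.0, 20.0), sigma2=1.0)
print(predict(model, (0.33, 0.71)), toy_simulator((0.33, 0.71)))
```

Once fitted, each call to `predict` costs only a handful of arithmetic operations, which is what makes replacing the simulator inside an optimization loop attractive.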

2.3.2. Building Emulators

First, 500 parameter ensembles were prepared to train and validate the emulators. Each ensemble was composed of 14 parameters as listed in Table 3. The ensembles were prepared by drawing 500 random numbers distributed uniformly between the minimum and maximum values of each parameter space (Table 3: Parameter space) using functions in R [38] software. Reasonable values for the parameter spaces were selected based on the descriptions in previous studies by Bandara et al. [27] and Sexton et al. [22,29].

Then, the parameter ensembles were used to run APSIM-Sugarcane under single experiments A_SIM, B1_SIM and B2_SIM (described in Section 2.2) producing 500 simulations per single experiment. Required scripts were created using the jsonlite [39], DBI [40] and RSQLite [41] packages of R.

Three hundred outputs of biomass and CDW at the specified DAP were used to train the GP emulators, and the remaining 200 outputs were used for testing the accuracy of the emulators (Section 2.3.3).
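The sampling and splitting described above can be sketched as follows. This is only an illustration in Python (the study used R), and the bounds are hypothetical placeholders because the actual parameter spaces are those of Table 3.

```python
import random

def sample_ensembles(bounds, n_ensembles=500, seed=1):
    """Draw each parameter uniformly between its lower and upper bound."""
    rng = random.Random(seed)
    return [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_ensembles)]

# Hypothetical placeholder bounds for 14 parameters (NOT the values of Table 3).
bounds = [(0.0, 1.0)] * 14

ensembles = sample_ensembles(bounds)
train, test = ensembles[:300], ensembles[300:]  # 300 for training, 200 for testing
print(len(train), len(test))
```

Each row of `train` would be passed to the simulator to produce one training output per DAP, and the rows of `test` would be held back for the accuracy evaluation.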

The GP implemented in the “Gaussian Emulation Machine for Sensitivity Analysis (GEM-SA)” [42] software was used to build the emulators. Fifty emulators were built to approximate sugarcane biomass and CDW under each experiment (A, B1 and B2) at each DAP: 7 for biomass and 7 for CDW in experiment A (7 specific DAPs), 9 for biomass and 9 for CDW in experiment B1 (9 specific DAPs), and 9 for biomass and 9 for CDW in experiment B2 (9 specific DAPs).

All the emulators were evaluated (Section 2.3.3) and used in the global optimization (Section 2.4) as a surrogate for the simulator.

2.3.3. Emulator Accuracy Evaluation

The accuracy of the built emulators was evaluated by using coefficient of determination (R²), leave-one-out cross-validated root-mean-squared standardized error (CV_RMSSE) and sigma-squared value (σ²).

R² was calculated between the APSIM-simulated values and the emulator-predicted values. For each emulator, the 200 parameter ensembles that were not used for training (described in Section 2.3.2) were supplied to GEM-SA to obtain the emulator-predicted outputs, and the corresponding APSIM-simulated outputs of the same 200 parameter ensembles were used as the simulator-predicted outputs. Both sets of predictions were then graphed to determine the R² between them. The emulators with R² values close to 1.0 were considered to be highly accurate. R² of emulator accuracy is hereafter denoted by $R_{emu}^{2}$.
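A minimal sketch of this accuracy check is given below. It assumes that the R² read from the scatter plot is the coefficient of determination of a simple linear regression between the two sets of predictions, which equals the squared Pearson correlation; the input values are made up for illustration.

```python
def r_squared(simulated, emulated):
    """Squared Pearson correlation between simulator and emulator outputs."""
    n = len(simulated)
    mean_s = sum(simulated) / n
    mean_e = sum(emulated) / n
    sxy = sum((s - mean_s) * (e - mean_e) for s, e in zip(simulated, emulated))
    sxx = sum((s - mean_s) ** 2 for s in simulated)
    syy = sum((e - mean_e) ** 2 for e in emulated)
    return sxy ** 2 / (sxx * syy)

# Made-up values standing in for APSIM outputs and emulator predictions.
apsim_outputs = [10.2, 14.8, 21.5, 30.1, 35.7]
emulator_outputs = [10.0, 15.3, 21.0, 29.6, 36.4]
print(r_squared(apsim_outputs, emulator_outputs))
```

Note that a squared correlation measures linear association only, so a biased emulator could still score close to 1; this is one reason the additional diagnostics described next are useful.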

Both CV_RMSSE and σ² were computed internally by GEM-SA while building the emulators with 300 training data points. To calculate CV_RMSSE, GEM-SA uses leave-one-out cross-validation. In brief, the emulator is fitted with one data point left out of the training data, and the omitted point is predicted from the fitted emulator. This is repeated for every training point to summarize the overall effectiveness of the emulator as CV_RMSSE. According to Qin et al. [43] and Sexton [44], the emulator variance is considered to have accurately estimated the actual error variance if the CV_RMSSE is close to 1; overestimation and underestimation are indicated by lower and higher values, respectively.

Equation (3) defines CV_RMSSE as follows:

$$CV_{RMSSE} = \sqrt{\frac{\sum_{i=1}^{n} \left( (y_i - \hat{y}) / s_i \right)^2}{n}} \tag{3}$$

where “y_i is the true output for the ith training run, $\hat{y}$ is the corresponding emulator approximation, s_i is the standard deviation calculated with the ith training point removed and n is the number of runs” [45].
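Equation (3) can be evaluated directly once the leave-one-out predictions and their standard deviations are available. The sketch below assumes those quantities have already been produced by the emulator software; the function name and the numbers are illustrative only.

```python
import math

def cv_rmsse(y_true, y_loo, s_loo):
    """Equation (3): root mean squared standardized leave-one-out error."""
    n = len(y_true)
    total = sum(((y - yh) / s) ** 2 for y, yh, s in zip(y_true, y_loo, s_loo))
    return math.sqrt(total / n)

# Illustrative values: true outputs, leave-one-out predictions, predictive std. dev.
y_true = [1.0, 2.0, 3.0]
y_loo = [1.5, 2.0, 2.0]
s_loo = [0.5, 1.0, 1.0]
print(cv_rmsse(y_true, y_loo, s_loo))
```

A value near 1 means the prediction errors are, on average, about as large as the emulator's own standard deviations say they should be.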

According to Petropoulos et al. [46], the σ² value effectively measures the nonlinearity of an emulator by indicating the emulator variance after standardization of the output. For a linear model, the σ² value is close to 0, whereas moderately to highly nonlinear models have greater σ² values (without a defined cutoff value).

2.4. Global Optimization

Global optimization seeks the best parameter ensemble which provides the best agreement between observed values and model-simulated outputs (global optima) in the presence of multiple local optima. Global optimization was conducted using the DE function implemented in the DEoptim [47] package of the R platform. Here, we provide a brief overview of the DE function; a detailed description can be found in Ardia et al. [47].

DE is among a class of genetic algorithms that have been inspired by biological processes. More details about genetic algorithms can be found in Mitchell [48]. Processes such as selection on a population, mutation, and crossover are used by genetic algorithms during optimization to minimize an objective function over the course of successive generations.

Like other evolutionary algorithms, DE evolves a population of parameter vectors to solve optimization problems. A single population is composed of a number of parameter vectors (NP). In many cases, NP should be set to at least 10 times the number of parameters used for optimization to avoid possible misconvergence. The initial population is generated with values given by the user or with random values between lower and upper bounds that are defined by the user. The next generation of the population is then created from the parameter vectors of the current population by differential mutation. To generate the first mutant parameter vector ($v_i$) of the next generation, three random parameter vectors of the existing population ($x_{r0}$, $x_{r1}$ and $x_{r2}$) are selected, and $v_i$ is generated by using $v_i = x_{r0} + F \cdot (x_{r1} - x_{r2})$, where F is a differential weighting factor, effective values of which are typically between 0 and 1. After the first mutation, the remaining parameter vectors of successive generations are created by continuing the mutation with a crossover probability $CR \in [0, 1]$. The CR controls the fraction of the parameter values that are copied from a mutant. During the process, all elements of the vector are created with respect to the defined lower and upper bounds of the parameters. The corresponding values of the objective function with respect to each trial vector are then determined. The previous vector in the population is replaced if the objective function value of a particular trial vector is equal to or lower than that of the previous vector; otherwise, the previous vector is retained.
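The mutation, crossover and selection steps described above can be sketched in a few lines of Python. This is a simplified illustration and not the DEoptim implementation: the values of `NP`, `F`, `CR` and the number of generations are arbitrary, out-of-bound values are simply clipped to the bounds (an assumption), and the test function is a hypothetical stand-in for the real objective.

```python
import random

def de_minimize(objective, lower, upper, NP=40, F=0.8, CR=0.9, generations=200, seed=0):
    """Minimal differential evolution: random base vector, one difference, binomial crossover."""
    rng = random.Random(seed)
    dim = len(lower)
    # Initial population: random values between the lower and upper bounds.
    pop = [[rng.uniform(lower[d], upper[d]) for d in range(dim)] for _ in range(NP)]
    cost = [objective(x) for x in pop]
    for _ in range(generations):
        new_pop, new_cost = list(pop), list(cost)
        for i in range(NP):
            # Three distinct members of the current population, all different from i.
            r0, r1, r2 = rng.sample([j for j in range(NP) if j != i], 3)
            forced = rng.randrange(dim)  # at least one element comes from the mutant
            trial = []
            for d in range(dim):
                if rng.random() < CR or d == forced:
                    v = pop[r0][d] + F * (pop[r1][d] - pop[r2][d])  # differential mutation
                    v = min(max(v, lower[d]), upper[d])             # keep within bounds
                else:
                    v = pop[i][d]
                trial.append(v)
            trial_cost = objective(trial)
            # Selection: the trial replaces the old vector if it is equal or better.
            if trial_cost <= cost[i]:
                new_pop[i], new_cost[i] = trial, trial_cost
        pop, cost = new_pop, new_cost
    best = min(range(NP), key=cost.__getitem__)
    return pop[best], cost[best]

def shifted_sphere(x):
    """Hypothetical test objective with its minimum (zero) at x = (1, 1, 1)."""
    return sum((v - 1.0) ** 2 for v in x)

best_x, best_cost = de_minimize(shifted_sphere, [-5.0] * 3, [5.0] * 3)
print(best_x, best_cost)
```

In the emulator-based setting, `objective` would call the emulators rather than the simulator, so each of the `NP * generations` evaluations stays cheap.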

The “DEoptim” function in the DEoptim package of R requires that several arguments be defined before the optimization process, including the objective function, the lower and upper bounds of the parameters and several control parameters (defined in Ardia et al. [47]). The root mean square error (RMSE) between the observed data and the corresponding emulator predictions was defined as the objective function to be minimized (Equation (4)). To obtain emulator predictions, we ran in R the set of files that GEM-SA produced when each emulator was generated. The R script, which can be modified to accomplish this task, can be found in Kennedy and Petropoulos [45]. The same lower and upper bounds used to generate the emulators (Table 3: Parameter space) were defined as the lower and upper bounds of the DEoptim function. The default control parameters described in Ardia et al. [47] were used for the optimization.

$$RMSE = \sqrt{\frac{\sum_{i=1}^{n} (O_i - Ep_i)^2}{n}} \tag{4}$$

where O_i is the value of the ith observation, Ep_i is the emulator-predicted value of the ith observation, and n is the number of observations.
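The objective function of Equation (4) can be sketched as a closure that wraps one emulator per observation date. The emulators below are hypothetical toy functions standing in for the GEM-SA emulators, and the observations are made up.

```python
import math

def make_rmse_objective(emulators, observed):
    """Build an objective returning the RMSE (Equation (4)) for a parameter vector."""
    def objective(params):
        predicted = [emulate(params) for emulate in emulators]  # Ep_i, one per DAP
        squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
        return math.sqrt(squared / len(observed))
    return objective

# Hypothetical emulators for two observation dates and made-up observations.
toy_emulators = [lambda p: p[0] + p[1], lambda p: 2.0 * p[0]]
toy_observed = [3.0, 2.0]

rmse_objective = make_rmse_objective(toy_emulators, toy_observed)
print(rmse_objective([1.0, 2.0]), rmse_objective([0.0, 0.0]))
```

A function of this shape, taking a parameter vector and returning a single number, is what a DE routine minimizes within the given bounds.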

The parameter vector that gave the lowest RMSE for the objective function after 1000 iterations of the DEoptim function with a population size (NP) of 500 was obtained as the return result (a vector composed of 27 elements including sub-levels of parameters [see Table 3: Code]). For a single cultivar, six optimized parameter ensembles (or vectors) were obtained from the above process. Each of these parameter ensembles was optimized using the biomass or CDW yield of the particular cultivar in one of the three field experiments (A, B1 and B2), giving six different sets of observed data. The emulator predictions given by those parameter ensembles were also recorded (Ep_i).

2.5. Validation of Optimized Parameters

2.5.1. Validation—Step One

In the first step, optimized parameter ensembles were used to estimate the sugarcane yield under the same experimental condition (which was used to optimize the particular parameter ensemble) by using APSIM-Sugarcane. Then accuracy of each parameter ensemble was evaluated by comparing simulation predictions with observed data.

By running APSIM-Sugarcane using the parameter ensembles (described in Section 2.4), APSIM predictions (Sp_i) were obtained. Then, Ep_i, Sp_i, and O_i were graphed together, and the normalized root mean square error (NRMSE), R², and Willmott’s agreement index (AI) were calculated to evaluate the accuracy of the optimized results. The R² was calculated from the linear regression between the observed values (O_i) and the simulated values (Ep_i and Sp_i). The NRMSE (Equation (5)) was calculated as the RMSE divided by the range of the observed values and was reported as a percentage. The AI is a measure of non-parametric goodness of fit (Equation (6)), and the desired value is close to one.

$$NRMSE = \frac{RMSE}{O_{max} - O_{min}} \tag{5}$$

$$AI = 1 - \frac{\sum_{i=1}^{n} (S_i - O_i)^2}{\sum_{i=1}^{n} \left( |S_i - \bar{O}| + |O_i - \bar{O}| \right)^2} \tag{6}$$

where S_i indicates the ith simulated value, O_i indicates the value of the ith observation, and $O_{max} - O_{min}$ is the range of observed values.
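Equations (5) and (6) can be computed as in the following sketch. The function names and the sample values are illustrative; NRMSE is multiplied by 100 here because the text reports it as a percentage.

```python
import math

def nrmse_percent(observed, simulated):
    """Equation (5): RMSE divided by the range of observations, as a percentage."""
    n = len(observed)
    rmse = math.sqrt(sum((s - o) ** 2 for o, s in zip(observed, simulated)) / n)
    return 100.0 * rmse / (max(observed) - min(observed))

def agreement_index(observed, simulated):
    """Equation (6): Willmott's agreement index."""
    mean_o = sum(observed) / len(observed)
    numerator = sum((s - o) ** 2 for o, s in zip(observed, simulated))
    denominator = sum((abs(s - mean_o) + abs(o - mean_o)) ** 2
                      for o, s in zip(observed, simulated))
    return 1.0 - numerator / denominator

# Made-up observations and simulations with a constant offset of one unit.
obs = [1.0, 2.0, 3.0, 4.0]
sim = [2.0, 3.0, 4.0, 5.0]
print(nrmse_percent(obs, sim), agreement_index(obs, sim))
```

With a perfect simulation, NRMSE is 0% and AI is 1; the constant offset in the example lowers AI even though the two series are perfectly correlated, which is why AI complements R².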

The NRMSE, R² and AI calculated in the first validation step are hereafter denoted by NRMSE_opt, $R_{opt}^{2}$ and AI_opt, respectively. The results of validation step one are presented in Section 3.2.1.

2.5.2. Validation—Step Two

In the second step, optimized parameter ensembles were used to estimate the sugarcane yield under other experimental conditions (which were not used to optimize the particular parameter ensemble) by using APSIM-Sugarcane. The accuracy of each parameter ensemble was evaluated by calculating R², NRMSE, and AI between the observed values and simulator predictions. By comparing all optimized parameter ensembles based on the validation results of steps one and two, an optimal parameter ensemble was selected for each cultivar to parameterize APSIM-Sugarcane for estimating biomass and CDW. The validation process of step two is shown in Figure 2.

As indicated in Figure 2, the parameter ensembles of each cultivar are hereafter denoted according to the observed data used to optimize them: P1 (experiment A biomass), P2 (experiment A CDW), P3 (experiment B1 biomass), P4 (experiment B1 CDW), P5 (experiment B2 biomass) and P6 (experiment B2 CDW). Further, the NRMSE, R² and AI calculated in the second validation step are hereafter denoted by NRMSE_val, $R_{val}^{2}$ and AI_val, respectively. The results are depicted using box diagrams and dot plots in Section 3.2.2. Finally, the selected optimal parameter ensembles were compared.

3. Results and Discussion

3.1. Emulator Accuracy

This section discusses the performances of the built emulators in terms of the calculated $R_{emu}^{2}$ (Figure 3), the GEM-SA internally generated σ² (Figure 4), and the CV_RMSSE values (Figure 5).

The linear relationships between the APSIM-simulated values and the emulator-predicted values are indicated as scatter plots in Figure 3. Because it was difficult to show all the graphs, Figure 3 shows only the emulator performances of experiment B1. Results related to experiment A and B2 are indicated in Supplementary Materials: Figure S1 (results of experiment A), Figure S2 (results of experiment B2). Each scatter plot represents the results of two emulator simulations of biomass and CDW and the corresponding APSIM simulations.

With the exception of a few conditions, the calculated $R_{emu}^{2}$ values all ranged between 0.9 and 1, indicating that nearly all the emulators could approximate the APSIM simulator successfully. However, the $R_{emu}^{2}$ values of CDW at 96_DAP of experiment A and 99_DAP of experiments B1 and B2 were only 0.43, 0.47 (Figure 3) and 0.3, respectively. In these cases, results were noisy and highly variable. This noise reflects the fact that around 96_DAP and 99_DAP, the sugarcane plant in the APSIM model is in the “emergence” stage, which lasts until the beginning of cane growth. Although the CDW (sucrose weight and structural stem weight) is very low at the emergence stage, it is highly variable (high coefficient of variation). As a result, the smoothness assumed for the GP output, which is characterized by the covariance function (Equation (2)), can be too high, so predictions made by the emulator will have understated uncertainty. New output values will then be further from the emulator approximations than the emulator expects. We therefore further analyzed σ² and CV_RMSSE values to assess the performances of the emulators.

The σ² values ranged between 0.06 and 1.64 for all emulators (Figure 4). We observed relatively high σ² values for the early stages of CDW (e.g., 96_DAP of experiment A and 99_DAP of experiment B1). This pattern was consistent with the observations of low $R_{emu}^{2}$ values under those conditions. The computed CV_RMSSE values of the emulators ranged between 0.74 and 1.01 (Figure 5).

Petropoulos et al. [46] have reported that the σ² values for their emulators ranged between 0.13 and 1.6; Qin et al. [43] have reported that the σ² values for their emulators ranged between 0.6 and 2.1 and that the parameters they used showed only moderate deviation from linearity. Gunarathna et al. [30] have indicated that their emulators showed good to moderate linearity with σ² values that ranged from 0.10 to 1.43. It can hence be concluded that our emulators showed good linearity under all the environmental and management conditions considered.

In comparison with the CV_RMSSE values obtained by Gunarathna et al. [30], Kennedy and Petropoulos [45] and Petropoulos et al. [46], the values we obtained were lower, and the fact that they were close to 1.0 in all the experiments suggested that the built emulators could represent the true model well.
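To make the interpretation of CV_RMSSE concrete, the following Python sketch computes a root mean squared standardized error from held-out simulator outputs, emulator predictions and the emulator's predictive standard deviations. The function name and the toy numbers are illustrative assumptions, not values from this study, and the exact definition used by the emulation software may differ in detail.

```python
import math

def rmsse(simulated, predicted_mean, predicted_sd):
    """Root mean squared standardized error of emulator predictions.

    Each prediction error is divided by the emulator's own predictive
    standard deviation, so a value near 1 means the stated uncertainty
    matches the actual error size.
    """
    z2 = [((y - m) / s) ** 2
          for y, m, s in zip(simulated, predicted_mean, predicted_sd)]
    return math.sqrt(sum(z2) / len(z2))

# Toy held-out data (hypothetical numbers, not from the study).
sim = [10.0, 12.0, 15.0, 11.0]    # simulator outputs
mean = [11.0, 11.0, 14.0, 12.0]   # emulator predictive means

well_calibrated = rmsse(sim, mean, [1.0] * 4)  # errors match the stated sd
overconfident = rmsse(sim, mean, [0.5] * 4)    # stated sd is too small

print(well_calibrated, overconfident)
```

In this toy case, halving the stated standard deviation while keeping the same errors doubles the statistic, which is the sense in which values well above 1 point to understated uncertainty.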

3.2. Validation of Optimized Parameters

3.2.1. Validation—Step One

This section evaluates the accuracy of the optimized results with respect to the observed data that were used to optimize each parameter ensemble. The parameter ensembles that gave the lowest RMSE values for the objective function (described in Section 2.4) are listed in Supplementary Materials: Table S2 (results for cultivar KK3), Table S3 (results for cultivar LK92-11), and Table S4 (results for cultivar 02-2-058). Figure 6 compares the sugarcane yields (biomass and CDW) simulated with those parameter ensembles (by both the emulator and APSIM) with the observed sugarcane yields for cultivar KK3. Because of the difficulty of showing all the graphs, this manuscript presents only a few examples to represent all conditions. We evaluated the parameter ensembles (Figure 6) by comparing the R²_opt values, NRMSE_opt percentages and AI_opt values between the sugarcane yields simulated with the optimized parameter ensembles and the observed sugarcane yields.

The calculated R²_opt and AI_opt values and the NRMSE_opt percentages fell in the ranges 0.95–1, 0.97–1 and 1–11.32%, respectively. The indication was that all parameter ensembles obtained from the optimization could probably be used to approximate observed values using APSIM.
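The three agreement statistics can be illustrated with a short Python sketch. It assumes that R² is the squared Pearson correlation, that the RMSE is normalized by the mean of the observations, and that the agreement index is Willmott's original index of agreement; the definitions actually used in this study are those given in the methods section. The function names and yield values are hypothetical.

```python
import math

def r_squared(obs, sim):
    """Squared Pearson correlation between observed and simulated values."""
    mo, ms = sum(obs) / len(obs), sum(sim) / len(sim)
    cov = sum((o - mo) * (s - ms) for o, s in zip(obs, sim))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_s = sum((s - ms) ** 2 for s in sim)
    return cov ** 2 / (var_o * var_s)

def nrmse_percent(obs, sim):
    """RMSE expressed as a percentage of the observed mean (assumed normalization)."""
    rmse = math.sqrt(sum((s - o) ** 2 for o, s in zip(obs, sim)) / len(obs))
    return 100.0 * rmse / (sum(obs) / len(obs))

def willmott_ai(obs, sim):
    """Willmott's index of agreement; 1 indicates a perfect match."""
    mo = sum(obs) / len(obs)
    num = sum((s - o) ** 2 for o, s in zip(obs, sim))
    den = sum((abs(s - mo) + abs(o - mo)) ** 2 for o, s in zip(obs, sim))
    return 1.0 - num / den

# Hypothetical yields (g m^-2), for illustration only.
observed = [100.0, 200.0, 300.0, 400.0]
simulated = [110.0, 190.0, 310.0, 390.0]

r2 = r_squared(observed, simulated)
nrmse = nrmse_percent(observed, simulated)
ai = willmott_ai(observed, simulated)
print(round(r2, 4), round(nrmse, 2), round(ai, 4))
```

A high R² alone does not rule out systematic bias, which is why the NRMSE and agreement index are read alongside it.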

As expected, the results of APSIM using the optimized parameters were less accurate than the emulator results (Figure 6). This is because the parameters were first obtained from the global optimization based on the emulators and were then used in APSIM-Sugarcane simulations to obtain predictions for the accuracy evaluation. However, as indicated by the above results, APSIM with the optimized parameters could still simulate the observed values with acceptable accuracy in all cases. The reason is that the built emulators approximated the simulator with high accuracy (Section 3.1).
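The two-step logic described above (optimize on the emulator, then re-evaluate with the simulator) can be sketched in Python with a minimal differential evolution routine. Everything here is an illustrative assumption: the DE/rand/1/bin variant, the control settings, and the two toy cost functions, in which the "emulator" objective is a slightly shifted copy of the "simulator" objective. It is not the optimization code used in this study.

```python
import random

def de_minimize(cost, bounds, np_size=20, itermax=200, f=0.8, cr=0.9, seed=0):
    """Minimal DE/rand/1/bin minimizer (illustrative sketch only)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(np_size)]
    costs = [cost(x) for x in pop]
    for _ in range(itermax):
        for i in range(np_size):
            # Three distinct population members other than the target i.
            a, b, c = rng.sample([j for j in range(np_size) if j != i], 3)
            forced = rng.randrange(dim)  # at least one gene comes from the mutant
            trial = []
            for j, (lo, hi) in enumerate(bounds):
                if j == forced or rng.random() < cr:
                    gene = pop[a][j] + f * (pop[b][j] - pop[c][j])
                    gene = min(max(gene, lo), hi)  # clip to the parameter range
                else:
                    gene = pop[i][j]
                trial.append(gene)
            trial_cost = cost(trial)
            if trial_cost <= costs[i]:  # greedy selection
                pop[i], costs[i] = trial, trial_cost
    best = min(range(np_size), key=lambda k: costs[k])
    return pop[best], costs[best]

# Toy stand-ins for the expensive simulator objective and its cheap emulator.
def simulator_cost(x):
    return (x[0] - 1.50) ** 2 + (x[1] - 0.60) ** 2

def emulator_cost(x):
    return (x[0] - 1.52) ** 2 + (x[1] - 0.59) ** 2

bounds = [(0.0, 3.0), (0.0, 1.0)]
best_x, emu_best = de_minimize(emulator_cost, bounds)  # step 1: optimize on the emulator
sim_at_best = simulator_cost(best_x)                   # step 2: re-evaluate with the simulator
print(best_x, emu_best, sim_at_best)
```

Because the optimum is located on the emulator's surface, the simulator's cost at that point is slightly higher than the emulator's, which mirrors the small loss of accuracy seen when the optimized parameters are passed back to APSIM.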

Moreover, a GP-based optimization method is very efficient in terms of computational time. In our study, a single simulation of APSIM-Sugarcane required a CPU time of 1.46 ± 0.006 s (quad-core CPU with a 1.60 GHz clock speed, 6 MB of L3 cache and 7.86 GB of usable RAM), whereas the emulators required 0.02 ± 0.005 s to provide similar output (measured with the proc.time() function of R). Therefore, emulators with sufficient accuracy can be used to reduce the computational burden of optimization, which often requires a large number of simulator runs. For instance, Sexton et al. [22] used 30,000 simulator runs to calibrate varietal parameters in APSIM-Sugarcane, and Sheng et al. [21] used 50,000 simulator runs to calibrate varietal parameters in APSIM-Maize. Some simulator runs are still required to build the emulators and to evaluate their accuracy. However, the number of simulator runs required to build an accurate emulator is smaller than the number required when only the simulator is used for optimization. In our study, 300 simulator runs were enough to build sufficiently accurate emulators (an additional 200 simulator runs were used for the accuracy evaluation). The emulators were then used for the optimizations, each with 1000 iterations and a population size (NP) of 500 emulator runs. More details about the relative advantages of GP emulation can be found in Oakley and O’Hagan [49] and Kennedy et al. [50]. Our results therefore confirmed that the optimization method with the emulator could be used for improving the efficiency of model development.
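The size of the saving can be estimated with a back-of-the-envelope calculation. The Python sketch below uses the mean run times and run counts reported above and assumes, as a simplification, that one optimization needs roughly iterations × NP objective evaluations and that the time spent fitting the emulator itself is negligible; neither assumption was measured in this study.

```python
SIM_SECONDS = 1.46      # mean CPU time of one APSIM-Sugarcane run
EMU_SECONDS = 0.02      # mean CPU time of one emulator run
ITERATIONS = 1000       # optimization iterations
NP = 500                # population size per iteration
TRAIN_RUNS = 300        # simulator runs used to build the emulator
VALIDATION_RUNS = 200   # simulator runs used for accuracy evaluation

# Assumed number of objective evaluations in one optimization.
evaluations = ITERATIONS * NP

# Emulator route: simulator runs for training and validation, then emulator runs.
emulator_route = (TRAIN_RUNS + VALIDATION_RUNS) * SIM_SECONDS + evaluations * EMU_SECONDS

# Simulator-only route: every evaluation is a full simulator run.
simulator_route = evaluations * SIM_SECONDS

speedup_per_run = SIM_SECONDS / EMU_SECONDS

print(f"emulator route:  {emulator_route / 3600:.1f} h")
print(f"simulator route: {simulator_route / 86400:.1f} days")
print(f"per-run speed-up: about {speedup_per_run:.0f}x")
```

Under these assumptions, a single optimization takes about 3 h through the emulator against more than 8 days with the simulator alone, so the 500 simulator runs spent on building and checking the emulator are a small fraction of the total.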

3.2.2. Validation—Step Two

This section discusses the accuracy of the parameter ensembles obtained from optimization for use in cultivar trait parameterization of the APSIM-Sugarcane model under different environment and management conditions (see Figure 2). Comparisons between the observed sugarcane yields and the APSIM-predicted sugarcane yields are shown in Figure 7 and Figure 8.

The calculated R²_val values ranged between 0.925 and 1.0 for all conditions (Figure 7). Based on the R²_val values, all parameter ensembles performed well in estimating the observed yields. However, we observed large variations of the NRMSE_val and AI_val values among cultivars and parameter ensembles (Figure 8).

The NRMSE_val percentages and AI_val values of parameter ensemble P3 of cultivar KK3 (NRMSE_val%: 8–14%, AI_val: 0.94–0.99), P5 of cultivar LK92-11 (NRMSE_val%: 6–22%, AI_val: 0.87–0.99) and P3 of cultivar 02-2-058 (NRMSE_val%: 5–19%, AI_val: 0.95–0.99) indicated relatively high performance (Table 4). They could therefore be selected as the best parameter ensembles to parametrize the APSIM-Sugarcane model for cultivars KK3, LK92-11 and 02-2-058, respectively. Their NRMSE_val percentages were dispersed closer to 0 than those of the other parameter ensembles (Figure 8). Moreover, their AI_val values were dispersed closer to 1 than those of the other parameter ensembles of the respective cultivars (Figure 8).

Parameters optimized under high water stress conditions performed worst when estimating sugarcane yields under the lowest water stress conditions. This pattern can be clearly observed in Figure 8 (NRMSE_val%), where the parameter ensembles optimized for experiment A (P1 and P2) resulted in comparatively high NRMSE_val percentages when estimating the observed sugarcane yield of experiment B1. To explain this pattern, we simulated the “soil water deficit factor for photosynthesis (swdef_photo)” in APSIM for each experiment. For instance, the swdef_photo and the observed sugarcane biomass yields of experiments A, B1 and B2 of cultivar KK3 are shown in Figure 9. Water stress was more apparent in experiment A (swdef_photo near 0) than in experiments B1 and B2. As a result, the lowest sugarcane yield was also observed in experiment A. Therefore, when the parameters were optimized for experiment A, they were estimated to result in lower yields than in experiments B1 and B2. The use of parameters estimated to result in low yields under severe water stress conditions may lead to poor performance under low water stress conditions.

However, parameters optimized under the lowest water stress conditions were found to provide better estimates of sugarcane yields under the highest water stress conditions. This pattern is apparent in the NRMSE_val% in Figure 8. The parameter ensembles optimized for experiment B1 (P3 and P4) resulted in comparatively low NRMSE_val values in the estimates of the observed sugarcane yields for experiment A. The explanation is that, in APSIM-Sugarcane, the parameter rue has previously been identified as highly sensitive for the estimation of biomass and CDW of sugarcane [27,28,29,30]. Therefore, when parameters are optimized under low water stress conditions (B1), the estimated value of rue can be increased to estimate higher yields. When those optimized parameter ensembles are used for simulations under severe water stress conditions (A), APSIM-Sugarcane will reduce the rue, because in APSIM-Sugarcane rue tends to be reduced whenever a soil water shortage occurs [9,51].

3.3. Comparison of Optimized Parameters

This section discusses the differences between the three selected parameter ensembles (listed in Table 4) for varietal parameterization (Cultivar KK3, LK92-11 and 02-2-058) of the APSIM-Sugarcane.

Cultivars KK3 and 02-2-058 evidenced a higher rue parameter than cultivar LK92-11. This difference is apparent in the best estimated parameter ensembles listed in Table 4. The RUE3 and RUE4 values were lower for cultivar LK92-11 than for cultivars KK3 and 02-2-058. This difference reflects the fact that rue is a highly sensitive parameter for the estimation of sugarcane biomass [29]. However, as explained previously, the value of rue may be lower when there is a soil water shortage. The lower values of rue (RUE3: rue of growth stage 3 [from emergence to the beginning of cane growth] and RUE4: rue of growth stage 4 [from the beginning of cane growth to flowering]) for cultivar LK92-11 therefore reflected the fact that the selected parameter ensemble for this cultivar was optimized under Rf (water stressed) conditions (Experiment B2). However, we observed a higher value for the RUE5 of cultivar LK92-11 (Table 4) compared to the other cultivars. This difference could reflect the fact that in APSIM, growth stage 5 (from flowering to the end of the crop) is currently deactivated because of a lack of good physiological information on which to base predictions [9]. The rue of growth stage 5 (RUE5) thus has no influence on the determination of biomass yield.

Cultivars KK3 and 02-2-058 evidenced higher transp_eff_cf values for growth stage 4 (TEC4) than cultivar LK92-11. To explain this difference, we simulated the growth stages of APSIM-Sugarcane using the selected parameter ensembles (Table 4) under the corresponding experimental conditions. Figure 10 shows the results of the simulated and observed yields of each cultivar under the same conditions. Growth stages 1, 2, 5 and 6 were not included in the simulation in Figure 10 because the observed values were reported for DAP 99–390. The transp_eff_cf values of growth stages 1 (TEC1), 2 (TEC2), 5 (TEC5) and 6 (TEC6) would therefore have less influence on the estimation of sugarcane biomass. However, transp_eff_cf has previously been identified as a highly sensitive parameter for estimating biomass yields under water stress conditions [29]. Even though the parameter ensemble of LK92-11 was estimated under Rf conditions, growth stage 3 (Figure 10: DAP 99–128) was not affected by water stress (Figure 9: swdef_photo of DAP 99–128). In this case, dry matter assimilation was governed mainly by RUE3. Similar values for TEC3 were thus observed for each cultivar. However, the TEC4 of cultivar LK92-11 was lower because water stress conditions were reported during growth stage 4 (Figure 9: swdef_photo of DAP 128–390). In this case, the assimilation of dry matter in APSIM is governed by transp_eff_cf, water supply, and the vapor pressure deficit. As a result, a lower value for TEC4 was reported, in accord with the lower yield observed for LK92-11 compared to the other cultivars (Figure 10).
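The switch between radiation-limited and water-limited growth can be illustrated with a simplified Python sketch. It assumes the common APSIM-style formulation in which daily dry matter production is the smaller of a radiation-limited estimate (rue × intercepted radiation) and a water-limited estimate (water supply × transp_eff_cf / vapor pressure deficit). The function name, units and input values are hypothetical, and other stress factors handled by APSIM-Sugarcane are ignored.

```python
def daily_dry_matter(radn_int, rue, sw_supply, transp_eff_cf, vpd):
    """Return (dry matter, limiting factor) for one day; simplified sketch.

    radn_int      intercepted radiation (MJ m^-2), toy value
    rue           radiation use efficiency (g MJ^-1)
    sw_supply     soil water available for transpiration (mm)
    transp_eff_cf transpiration efficiency coefficient (toy units)
    vpd           vapor pressure deficit (kPa)
    """
    dm_radiation = rue * radn_int        # radiation-limited production
    transp_eff = transp_eff_cf / vpd     # dry matter per mm of water transpired
    dm_water = sw_supply * transp_eff    # water-limited production
    if dm_water < dm_radiation:
        return dm_water, "water"
    return dm_radiation, "radiation"

# Hypothetical day without water stress: rue sets the growth rate.
unstressed = daily_dry_matter(radn_int=15.0, rue=1.8, sw_supply=6.0,
                              transp_eff_cf=8.7, vpd=1.5)

# Hypothetical day with limited soil water: transp_eff_cf sets the growth rate.
stressed = daily_dry_matter(radn_int=15.0, rue=1.8, sw_supply=2.0,
                            transp_eff_cf=8.7, vpd=1.5)

print(unstressed, stressed)
```

In this simplified picture, changing rue has no effect on a water-limited day and changing transp_eff_cf has no effect on a radiation-limited day, which is consistent with TEC3 being similar across cultivars while TEC4 differs.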

However, cultivar LK92-11 showed better drought resistance than the other cultivars, because the parameter ensemble obtained for cultivar LK92-11 (listed in Table 4) underestimated the observed yields under high water stress conditions (Exp. A biomass: NRMSE_val = 18.14%; Exp. A CDW: NRMSE_val = 21.61% [Figure 8: cultivar LK92-11, NRMSE_val% of P5]), whereas the parameter ensembles of the remaining cultivars gave satisfactory results under the same conditions (Figure 8, cultivars KK3 and 02-2-058: P3). Our results were similar to the results obtained by Preecha et al. [6] using the same field experiments and are consistent with Peerasak [52], who reported that LK92-11 was less sensitive to water shortage than KK3. Cha-um et al. [53] have also indicated that LK92-11 is tolerant to water deficit.

Cultivar KK3 evidenced rapid sugar accumulation compared to the other cultivars under the same environmental and management conditions. This pattern was observed when simulating each cultivar under each experimental condition (A, B1 and B2) in APSIM-Sugarcane to obtain sucrose weight. This pattern is caused by the parameters that govern sucrose accumulation (cane_fraction, stress_factor_stalk, sucrose_fraction_stalk, sucrose_delay, min_sstem_sucrose_redn, and min_sstem_sucrose). The optimized KK3 parameters facilitated early sucrose accumulation compared to other cultivars (Table 4). For instance, the highest value for the sucrose_fraction_stalk and the lowest value for the min_sstem_sucrose were observed for KK3 (Table 4). Moreover, Khonghinta et al. [54] have conducted field experiments to compare KK3 with several other sugarcane cultivars in KK and have also indicated that KK3 accumulates sugar rapidly.

All cultivars evidenced similar values (p < 0.05) for leaf area index (LAI) and phenological stages when simulated under similar environmental conditions in APSIM-Sugarcane. Although we observed higher values for leaf development parameters (green_leaf_no, average leaf_size, average tiller leaf size) of cultivar LK92-11 than of cultivars KK3 and 02-2-058 (Table 4), we observed similar LAI values because cultivar LK92-11 evidenced a lower rue than the other cultivars. In APSIM-Sugarcane, leaf area growth can be limited by the amount of biomass partitioned to the leaf. The amount of biomass that is produced from the conversion of intercepted radiation is largely governed by the rue.

We did not observe much difference between the parameters that control phenological development based on thermal time (tt_emerge_to_begcane, tt_begcane_to_flowering, and tt_flowering_to_crop_end) in APSIM-Sugarcane. Moreover, in APSIM-Sugarcane tt_flowering_to_crop_end is currently deactivated because of the absence of a good physiological basis for prediction.

4. Conclusions

This study aimed to use GP-based emulators to optimize cultivar trait parameters in the APSIM-Sugarcane model. According to the R², σ² and CV_RMSSE values obtained, the emulators we built for the optimization showed satisfactory results. The indication is that these emulators can approximate the original simulator (APSIM-Sugarcane) successfully. Via the GP-based emulator optimization, we obtained acceptable parameter ensembles for the parametrization of the Thai cultivars KK3, LK92-11 and 02-2-058 in the APSIM-Sugarcane model. The optimized parameters gave satisfactory results during validation under the environmental and management conditions found in KK, Thailand. Based on our validation results, we suggest that GP emulation can be efficiently implemented for the parameterization of computationally expensive simulators. Future studies will be needed to reach more robust conclusions concerning the use of emulation for parameter optimization with APSIM-Sugarcane.

Supplementary Materials

The following are available online at https://www.mdpi.com/article/10.3390/agronomy11071379/s1, Figure S1: Relationship between APSIM-simulated values and emulator-predicted values of biomass and CDW in experiment A at each reporting frequency at days after planting (DAP): 96, 117, 147, 173, 244, 293 and 388; Figure S2: Relationship between APSIM-simulated values and emulator-predicted values of biomass and CDW in experiment B2 at each reporting frequency at days after planting (DAP): 99, 128, 185, 238, 267, 299, 329, 360 and 390; Table S1: Summary of the data collected during field experiments; Table S2: Best estimated parameter ensembles obtained for biomass and CDW of cultivar KK3 based on the emulators of experiments A, B1 and B2; Table S3: Best estimated parameter ensembles obtained for the biomass and CDW of cultivar LK92-11 from the emulators of experiments A, B1 and B2; Table S4: Best estimated parameter ensembles obtained for the biomass and CDW of cultivar 02-2-058 from the emulators of experiments A, B1 and B2.

Author Contributions

Conceptualization, methodology and formal analysis, W.B.M.A.C.B. and K.S.; investigation and writing—original draft preparation, W.B.M.A.C.B.; writing—review and editing, W.B.M.A.C.B. and R.H.K.R.; supervision, K.S., T.N., P.K., M.A., S.N. and H.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Data is contained within the article.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Walton, J. The 5 Countries That Produce the Most Sugar. Available online: https://www.investopedia.com/articles/investing/101615/5-countries-produce-most-sugar.asp (accessed on 2 November 2020).
2. Bangkok, P. Sugar Output to Fall Short of Target Yield. Available online: https://www.bangkokpost.com/business/2136727/sugar-output-to-fall-short-of-target-yield (accessed on 27 June 2020).
3. Ruan, H.; Feng, P.; Wang, B.; Xing, H.; O’Leary, G.J.; Huang, Z.; Guo, H.; Liu, D.L. Future Climate Change Projects Positive Impacts on Sugarcane Productivity in Southern China. Eur. J. Agron. 2018, 96, 108–119.
4. Biggs, J.S.; Thorburn, P.J.; Crimp, S.; Masters, B.; Attard, S.J. Interactions between Climate Change and Sugarcane Management Systems for Improving Water Quality Leaving Farms in the Mackay Whitsunday Region, Australia. Agric. Ecosyst. Environ. 2013, 180, 79–89.
5. Everingham, Y.; Inman-Bamber, G.; Sexton, J.; Stokes, C. A Dual Ensemble Agroclimate Modelling Procedure to Assess Climate Change Impacts on Sugarcane Production in Australia. Agric. Sci. 2015, 6, 870–888.
6. Preecha, K.; Sakai, K.; Pisanjaroen, K.; Sansayawichai, T.; Cho, T.; Nakamura, S.; Nakandakari, T. Calibration and Validation of Two Crop Models for Estimating Sugarcane Yield in Northeast Thailand. Trop. Agric. Dev. 2016, 60, 31–39.
7. Sexton, J.; Inman-Bamber, N.G.; Everingham, Y.; Basnayake, J.; Lakshmanan, P.; Jackson, P. Detailed Trait Characterisation Is Needed for Simulation of Cultivar Responses to Water Stress. In Proceedings of the 36th Conference of the Australian Society of Sugar Cane Technologists, Gold Coast, Qld, Australia, 29 April–1 May 2014; pp. 82–92.
8. Van Den Berg, M.; Smith, M.T. Crop Growth Models for Decision Support in the South African Sugarcane Industry. In Proceedings of the 79th Annual Congress of South African Sugar Technologists’ Association, Kwa-Shukela, Mount Edgecombe, South Africa, 19–22 July 2005; pp. 495–509.
9. Keating, B.A.; Robertson, M.J.; Muchow, R.C.; Huth, N.I. Modelling Sugarcane Production Systems I. Development and Performance of the Sugarcane Module. Field Crop. Res. 1999, 61, 253–271.
10. Jones, M.R.; Singels, A. Refining the Canegro Model for Improved Simulation of Climate Change Impacts on Sugarcane. Eur. J. Agron. 2018, 100, 76–86.
11. Martiné, J.F.; Todoroff, P. Le Modèle de Croissance Mosicas et Sa Plateforme de Simulation Simulex: État Des Lieux et Perspectives. Rev. Agric. Sucrière Maurice 2002, 80, 133–147.
12. Brisson, N.; Gary, C.; Justes, E.; Roche, R.; Mary, B.; Ripoche, D.; Zimmer, D.; Sierra, J.; Bertuzzi, P.; Burger, P.; et al. An Overview of the Crop Model STICS. Eur. J. Agron. 2003, 18, 309–332.
13. Liu, D.L.; Bull, T.A. Simulation of Biomass and Sugar Accumulation in Sugarcane Using a Process-Based Model. Ecol. Modell. 2001, 144, 181–211.
14. Dias, H.B.; Inman-Bamber, G.; Everingham, Y.; Sentelhas, P.C.; Bermejo, R.; Christodoulou, D. Traits for Canopy Development and Light Interception by Twenty-Seven Brazilian Sugarcane Varieties. Field Crop. Res. 2020, 249, 107716.
15. Peng, T.; Fu, J.; Jiang, D.; Du, J. Simulation of the Growth Potential of Sugarcane as an Energy Crop Based on the APSIM Model. Energies 2020, 13, 2173.
16. Mthandi, J.; Kahimba, F.C.; Tarimo, A.K.P.R.; Salim, B.A.; Lowole, M.W. Modification, Calibration and Validation of APSIM to Suit Maize (Zea mays L.) Production System: A Case of Nkango Irrigation Scheme in Malawi. Am. J. Agric. For. 2014, 2, 1–11.
17. Sun, H.; Zhang, X.; Wang, E.; Chen, S.; Shao, L.; Qin, W. Assessing the Contribution of Weather and Management to the Annual Yield Variation of Summer Maize Using APSIM in the North China Plain. Field Crop. Res. 2016, 194, 94–102.
18. Seidel, S.J.; Palosuo, T.; Thorburn, P.; Wallach, D. Towards Improved Calibration of Crop Models—Where Are We Now and Where Should We Go? Eur. J. Agron. 2018, 94, 25–35.
19. Harrison, M.T.; Roggero, P.P.; Zavattaro, L. Simple, Efficient and Robust Techniques for Automatic Multi-Objective Function Parameterisation: Case Studies of Local and Global Optimisation Using APSIM. Environ. Model. Softw. 2019, 117, 109–133.
20. Holzworth, D.P.; Snow, V.; Janssen, S.; Athanasiadis, I.N.; Donatelli, M.; Hoogenboom, G.; White, J.W.; Thorburn, P. Agricultural Production Systems Modelling and Software: Current Status and Future Prospects. Environ. Model. Softw. 2015, 72, 276–286.
21. Sheng, M.; Liu, J.; Zhu, A.X.; Rossiter, D.G.; Liu, H.; Liu, Z.; Zhu, L. Comparison of GLUE and DREAM for the Estimation of Cultivar Parameters in the APSIM-Maize Model. Agric. For. Meteorol. 2019, 278, 107659.
22. Sexton, J.; Everingham, Y.; Inman-Bamber, G. A Theoretical and Real-World Evaluation of Two Bayesian Techniques for the Calibration of Variety Parameters in a Sugarcane Crop Model. Environ. Model. Softw. 2016, 83, 126–142.
23. Georgioudakis, M.; Plevris, V. A Comparative Study of Differential Evolution Variants in Constrained Structural Optimization. Front. Built Environ. 2020, 6, 102.
24. Bilal; Pant, M.; Zaheer, H.; Garcia-Hernandez, L.; Abraham, A. Differential Evolution: A Review of More than Two Decades of Research. Eng. Appl. Artif. Intell. 2020, 90, 103479.
25. Saltelli, A.; Chan, K.; Scott, M. Sensitivity Analysis. In Probability and Statistics Series; John Wiley & Sons: Chichester, UK, 2000.
26. Rasmussen, C.E.; Williams, C.K.I. Gaussian Processes for Machine Learning; Dietterich, T., Bishop, C., Heckerman, D., Jordan, M., Kearns, M., Eds.; MIT Press: Cambridge, MA, USA, 2006; Volume 2, ISBN 026218253X.
27. Bandara, W.B.M.A.C.; Sakai, K.; Nakandakari, T.; Kapetch, P.; Rathnappriya, R.H.K.A. Gaussian-Process-Based Global Sensitivity Analysis of Cultivar Trait Parameters in APSIM-Sugar Model: Special Reference to Environmental and Management Conditions in Thailand. Agronomy 2020, 10, 984.
28. Sexton, J.; Everingham, Y. Global Sensitivity Analysis of Key Parameters in A Process-Based Sugarcane Growth Model—A Bayesian Approach. In Proceedings of the 7th International Congress on Environmental Modelling and Software, San Diego, CA, USA, 15–19 June 2014.
29. Sexton, J.; Everingham, Y.L.; Inman-Bamber, G. A Global Sensitivity Analysis of Cultivar Trait Parameters in a Sugarcane Growth Model for Contrasting Production Environments in Queensland, Australia. Eur. J. Agron. 2017, 88, 96–105.
30. Gunarathna, M.H.J.P.; Sakai, K.; Nakandakari, T.; Momii, K.; Kumari, M.K.N. Sensitivity Analysis of Plant and Cultivar-Specific Parameters of APSIM-Sugar Model: Variation between Climates and Management Conditions. Agronomy 2019, 9, 242.
31. Khon Kaen Climate. Available online: https://en.climate-data.org/asia/thailand/khon-kaen-province/khon-kaen-4291/ (accessed on 21 December 2020).
32. USDA. Soil Texture Calculator. Available online: https://www.nrcs.usda.gov (accessed on 12 December 2020).
33. Holzworth, D.; Huth, N.I.; Fainges, J.; Brown, H.; Zurcher, E.; Cichota, R.; Verrall, S.; Herrmann, N.I.; Zheng, B.; Snow, V. APSIM Next Generation: Overcoming Challenges in Modernising a Farming Systems Model. Environ. Model. Softw. 2018, 103, 43–51.
34. The APSIM Initiative. APSIM: The Leading Software Framework for Agricultural Systems Modelling and Simulation. Available online: https://www.apsim.info/ (accessed on 20 November 2020).
35. Rohmer, J.; Foerster, E. Global Sensitivity Analysis of Large-Scale Numerical Landslide Models Based on Gaussian-Process Meta-Modeling. Comput. Geosci. 2011, 37, 917–927.
36. Mohammadi, H.; Challenor, P.; Goodfellow, M. Emulating Dynamic Non-Linear Simulators Using Gaussian Processes. Comput. Stat. Data Anal. 2019, 139, 178–196.
37. Kennedy, M.C.; O’Hagan, A. Bayesian Calibration of Computer Models. J. R. Stat. Soc. Ser. B (Stat. Methodol.) 2001, 63, 425–464.
38. R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2013; Available online: https://www.R-project.org/ (accessed on 25 November 2020).
39. Ooms, J. The Jsonlite Package: A Practical and Consistent Mapping between JSON Data and R Objects. Available online: https://arxiv.org/abs/1403.2805 (accessed on 28 December 2020).
40. Wickham, H.; Müller, K. Package ‘DBI.’ 2019. Available online: https://cran.r-project.org/web/packages/DBI/index.html (accessed on 24 June 2021).
41. Müller, K.; Wickham, H.; James, D.A.; Falcon, S. ‘SQLite’ Interface for R. Available online: https://cran.r-project.org/web/packages/RSQLite/RSQLite.pdf (accessed on 10 December 2020).
42. Kennedy, M. The GEM Software. Available online: http://www.tonyohagan.co.uk/academic/GEM/ (accessed on 23 June 2021).
43. Qin, X.; Wang, H.; Li, Y.; Li, Y.; McConkey, B.; Lemke, R.; Li, C.; Brandt, K.; Gao, Q.; Wan, Y.; et al. A Long-Term Sensitivity Analysis of the Denitrification and Decomposition Model. Environ. Model. Softw. 2013, 43, 26–36.
44. Sexton, J. Bayesian Statistical Calibration of Variety Parameters in a Sugarcane Crop Model. Master’s Thesis, James Cook University, Townsville, Australia, April 2015.
45. Kennedy, M.C.; Petropoulos, G.P. GEM-SA: The Gaussian Emulation Machine for Sensitivity Analysis. In Sensitivity Analysis in Earth Observation Modelling; George, P.P., Prashant, K.S., Eds.; Elsevier: Amsterdam, The Netherlands, 2017; pp. 341–361. ISBN 978-0-12-803011-0.
46. Petropoulos, G.; Wooster, M.J.; Carlson, T.N.; Kennedy, M.C.; Scholze, M. A Global Bayesian Sensitivity Analysis of the 1d SimSphere Soil-Vegetation-Atmospheric Transfer (SVAT) Model Using Gaussian Model Emulation. Ecol. Modell. 2009, 220, 2427–2440.
47. Ardia, D.; Mullen, K.; Peterson, B.; Ulrich, J.; Boudt, K. Global Optimization by Differential Evolution. Available online: https://cran.r-project.org/web/packages/DEoptim/DEoptim.pdf (accessed on 28 December 2020).
48. Mitchell, M. An Introduction to Genetic Algorithms, 1st ed.; MIT Press: Cambridge, MA, USA, 1998; ISBN 0-262-13316-4.
49. Oakley, J.E.; O’Hagan, A. Probabilistic Sensitivity Analysis of Complex Models: A Bayesian Approach. J. R. Stat. Soc. Ser. B Stat. Methodol. 2004, 66, 751–769.
50. Kennedy, M.C.; O’Hagan, A.; Higgins, N. Bayesian Analysis of Computer Code Outputs. In Quantitative Methods for Current Environmental Issues; Anderson, C.W., Barnett, V., Chatwin, P.C., El-Shaarawi, A.H., Eds.; Springer: London, UK, 2004; pp. 227–243. ISBN 978-1-4471-0657-9.
51. Singels, A.; Bezuidenhout, C.N. A New Method of Simulating Dry Matter Partitioning in the CANEGRO Sugarcane Model. Field Crop. Res. 2002, 78, 151–164.
52. Peerasak, S. Evaluation of Elite Sugarcane Clones Suitable for Growing Areas. In Progress Report under Project Develop and Engineering No. 7; J-STAGE: Japan, 2013; p. 174.
53. Cha-um, S.; Wangmoon, S.; Mongkolsiriwatana, C.; Ashraf, M.; Kirdmanee, C. Evaluating Sugarcane (Saccharum sp.) Cultivars for Water Deficit Tolerance Using Some Key Physiological Markers. Plant Biotechnol. 2012, 29, 431–439.
54. Khonghinta, J.; Khruengpat, J.; Songsri, P.; Gonkhamdee, S.; Jongrungkl, N. Classification of the Sugar Accumulation Patterns in Diverse Sugarcane Cultivars under Rain-Fed Conditions in a Tropical Area. J. Agron. 2020, 19, 94–105.

Figure 1. Monthly weather data for Khon Kaen (KK) during 2010/12–2012/12; Rain: mean monthly rainfall (mm); Radn: daily solar radiation (MJ/m²); Maxt: daily maximum temperature (°C); Mint: daily minimum temperature (°C).

Figure 2. Process flow diagram of validation—step two. P1, P2, P3, P4, P5 and P6 represent the optimized parameter ensembles of each cultivar. A_SIM, B1_SIM and B2_SIM represent the APSIM simulations described in Section 2.2.

Figure 3. Relationship between APSIM-simulated values and emulator-predicted values of biomass and CDW in experiment B1 at each reporting frequency at days after planting (DAP): 99, 128, 185, 238, 267, 299, 329, 360 and 390. Solid lines indicate linear fit to the APSIM-simulated values and emulator-predicted values of biomass and CDW. Separate marker symbols denote biomass and CDW. R²_emu_Biomass and R²_emu_CDW were calculated by using 200 data points.

Figure 4. Heat maps of σ² values of the emulators built for biomass and CDW at each reporting frequency of three experiments: Experiment A, Experiment B1 and Experiment B2. Color ranges from dark blue to white represent values from 0 to higher values.

Figure 5. Heat maps of CV_RMSSE values of the emulators built for biomass and CDW at each reporting frequency of three experiments: Experiment A, Experiment B1 and Experiment B2. Color ranges from dark blue to light blue represent values from 1 to lower values and 1 to higher values.

Figure 6. Comparison between simulated (both from emulator and APSIM) and observed biomass and CDW for cultivar KK3 in experiments A, B1 and B2. NRMSE_opt%: Normalized root mean square error percentage, R²_opt: Coefficient of determination and AI_opt: Agreement index.

Figure 7. Box plots of R²_val values obtained during the validation of the parameter ensembles (P1, P2, P3, P4, P5 and P6) under each experimental condition corresponding to cultivars (a) KK3, (b) LK92-11, and (c) 02-2-058. For each parameter ensemble, the box plot shows the R²_val values calculated between observed and simulated yields for six different cases (A_Biomass, A_CDW, B1_Biomass, B1_CDW, B2_Biomass, and B2_CDW) of a single cultivar (see Figure 2 for more details). The median is indicated by thick black lines, the interquartile range (IQR) is indicated by the boxes, 1.5 times the IQR is indicated by the whiskers, and the outliers beyond 1.5 times the IQR are indicated by black points.

Figure 8. Dot plots of NRMSE_val percentages and AI_val values obtained during the validation of the parameter ensembles (P1, P2, P3, P4, P5 and P6) under each experimental condition corresponding to cultivars KK3, LK92-11, or 02-2-058. For each parameter ensemble, the dot plots show the NRMSE_val percentages and AI_val values calculated between observed and simulated yields for six different cases (A_Biomass, A_CDW, B1_Biomass, B1_CDW, B2_Biomass and B2_CDW) of a single cultivar (see Figure 2).
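The three agreement statistics reported in Figures 6 to 8 (NRMSE, R² and AI) can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes NRMSE is the RMSE divided by the observed mean, R² is the squared Pearson correlation, and AI is Willmott's index of agreement in its original form. The function name `validation_metrics` and the numbers in the usage lines are hypothetical.

```python
import math


def validation_metrics(observed, simulated):
    """Return (NRMSE in %, R^2, Willmott agreement index) for paired values."""
    n = len(observed)
    if n != len(simulated) or n < 2:
        raise ValueError("need two equally long series with at least two points")

    mean_obs = sum(observed) / n
    mean_sim = sum(simulated) / n
    pairs = list(zip(observed, simulated))
    sq_err = sum((s - o) ** 2 for o, s in pairs)

    # NRMSE: RMSE normalised by the observed mean, in percent (assumed form).
    nrmse = 100.0 * math.sqrt(sq_err / n) / mean_obs

    # R^2: squared Pearson correlation of observed vs. simulated (assumed form).
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in pairs)
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_sim = sum((s - mean_sim) ** 2 for s in simulated)
    r2 = cov ** 2 / (var_obs * var_sim)

    # Willmott's index of agreement: 1 minus squared error over "potential error".
    potential = sum((abs(s - mean_obs) + abs(o - mean_obs)) ** 2 for o, s in pairs)
    ai = 1.0 - sq_err / potential

    return nrmse, r2, ai


# Hypothetical biomass values (g m^-2), for illustration only.
obs = [1.0, 2.0, 3.0]
sim = [2.0, 3.0, 4.0]
nrmse, r2, ai = validation_metrics(obs, sim)
```

A constant offset between the two series, as in the usage lines, leaves R² at 1 while NRMSE and AI both register the bias, which is one reason to read the three statistics together.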

Figure 9. Observed biomass weight (g m⁻²), APSIM-simulated soil water deficit factor (0 = full stress and 1 = no stress) for photosynthesis (swdef_photo) and nitrogen deficit factor (0 = full stress and 1 = no stress) for photosynthesis (nfact_photo) of cultivar KK3 in experiments A, B1 and B2 from planting to harvesting.

Figure 10. Observed biomass weight (g m⁻²) of selected parameter ensembles of each cultivar under optimized experimental conditions. Cultivars KK3 and 02-2-058: observed biomass yield in experiment B1; cultivar LK92-11: observed biomass yield in experiment B2. Growth stages were obtained from the APSIM-Sugarcane simulations using the corresponding parameter ensemble and APSIM simulation files.

Table 1. Details of the field experiments.

	Experiment A	Experiments B1 and B2
Planting date	1/12/2010	28/11/2011
Harvesting date	20/12/2011	22/12/2012
Cultivars	02-2-058, KK3, LK92-11
Water supply	Rainfed ^a (A)	Irrigated ^b (B1) and Rainfed ^a (B2)
Fertilizer	93.5:40.80:77.62 kg of N:P:K per hectare
Observed data	Biomass and CDW were recorded in experiment A at 96, 117, 147, 173, 244, 29 and 388 days after planting (DAP), and in experiments B1 and B2 at 99, 128, 185, 238, 267, 299, 329, 360 and 390 DAP

^a Amount of water applied: 24 mm per week from planting to 45 DAP, ^b Amount of water applied: 24 mm per week from planting to harvest.

Table 2. Properties of soil in the experimental field in KK.

Soil Depth (cm)	Texture Class *	Wilting Point (mm/mm)	Field Capacity (mm/mm)	Saturation (mm/mm)	Hydraulic Conductivity (mm/day)	Bulk Density (g/cm³)
0–20	Loamy soil	0.075	0.206	0.357	3336	1.52
20–50	Sandy loam	0.116	0.236	0.395	2232	1.61
50–100	Sandy clay loam	0.124	0.238	0.410	2232	1.57

* Soil texture classes according to the USDA soil textural triangle.

Table 3. Description of trait parameters used in the APSIM-Sugarcane model and parameter spaces.

Function of Parameters	Parameter Name	Description	Level	Code	Units	Parameter Space *
Canopy development	leaf_size	Area of the respective leaf	leaf_size_no = 1	LS1	mm²	500–2000
			leaf_size_no = 14	LS2	mm²	20,000–70,000
			leaf_size_no = 20	LS3	mm²	20,000–70,000
	green_leaf_no	Maximum number of fully expanded green leaves		GLN	No.	9–15
	tillerf_leaf_size	Tillering factors according to the leaf numbers	Tiller_leaf_size_no = 1	TLS1	mm²/mm²	1–6
			Tiller_leaf_size_no = 4	TLS2	mm²/mm²	1–6
			Tiller_leaf_size_no = 10	TLS3	mm²/mm²	1–6
			Tiller_leaf_size_no = 16	TLS4	mm²/mm²	1–6
			Tiller_leaf_size_no = 26	TLS5	mm²/mm²	1–6
Partitioning of assimilates	cane_fraction	Fraction of accumulated biomass partitioned to cane		CF	g/g	0.65–0.80
	sucrose_fraction_stalk	Fraction of accumulated biomass partitioned to sucrose		SF1	g/g	0.40–0.70
	stress_factor_stalk	Stress factor for sucrose accumulation		SF2	n/a	0.2–1.0
	sucrose_delay	Sucrose accumulation delay		SD	g/m²	0–600
	min_sstem_sucrose	Minimum stem biomass before partitioning to sucrose commences		MSS	g/m²	400–1500
	min_sstem_sucrose_redn	Reduction to minimum stem sucrose under stress		MSSR	g/m²	0–20
Phenological development based on thermal time	tt_emerg_to_begcane	Accumulated thermal time from emergence to beginning of cane		EB	°C day	1200–1900
	tt_begcane_to_flowering	Accumulated thermal time from beginning of cane to flowering		BF	°C day	5400–6600
	tt_flowering_to_crop_end	Accumulated thermal time from flowering to end of the crop		FC	°C day	1750–2250
Dry matter assimilation	transp_eff_cf	Transpiration efficiency coefficient	From sowing to sprouting	TEC1	kg kPa/kg	0.006–0.014
			From sprouting to emergence	TEC2
			From emergence to the beginning of cane growth	TEC3
			From the beginning of cane growth to flowering	TEC4
			From flowering to the end of the crop	TEC5
			At the end of the crop	TEC6
	rue	Radiation use efficiency	From emergence to the beginning of cane growth	RUE3	g/MJ	0.74–2.5
			From the beginning of cane growth to flowering	RUE4
			From flowering to the end of the crop	RUE5

* Selected parameter spaces used for emulator generation and optimization.
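As an illustration of how the parameter spaces in Table 3 can serve as bounds when drawing training points for an emulator, the sketch below takes a stratified (Latin hypercube style) sample over a few of the listed ranges. The sampling design is an assumption made here for illustration, since this section does not state which design was used; `PARAMETER_SPACE` and `latin_hypercube` are hypothetical names, and only five of the parameters are included.

```python
import random

# Lower and upper bounds copied from Table 3 for a subset of parameter codes.
PARAMETER_SPACE = {
    "LS1": (500.0, 2000.0),   # leaf_size, leaf 1 (mm^2)
    "GLN": (9.0, 15.0),       # green_leaf_no (a count; round before use)
    "CF": (0.65, 0.80),       # cane_fraction (g/g)
    "EB": (1200.0, 1900.0),   # tt_emerg_to_begcane (deg C day)
    "RUE3": (0.74, 2.5),      # rue, emergence to beginning of cane (g/MJ)
}


def latin_hypercube(space, n_samples, seed=0):
    """Draw n_samples points with one value per equal-width stratum per parameter."""
    rng = random.Random(seed)
    columns = {}
    for code, (low, high) in space.items():
        # One uniform draw inside each of the n_samples strata of [low, high].
        points = [
            low + (high - low) * (i + rng.random()) / n_samples
            for i in range(n_samples)
        ]
        # Shuffle so strata are paired at random across parameters.
        rng.shuffle(points)
        columns[code] = points
    return [{code: columns[code][i] for code in space} for i in range(n_samples)]


samples = latin_hypercube(PARAMETER_SPACE, n_samples=10)
```

Each element of `samples` is one candidate parameter set that could be passed to the simulator; the same bounds could also be handed to a global optimizer as its search box.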

Table 4. A selection of estimated best parameters from optimization.

Parameter Name	Code	Unit	Cultivar KK3 (P3)	Cultivar LK92-11 (P5)	Cultivar 02-2-058 (P3)
leaf_size	LS1	mm²	1566	1792	1790
	LS2		62,686	56,809	20,252
	LS3		47,681	68,364	61,664
cane_fraction	CF	g/g	0.65	0.66	0.68
sucrose_fraction_stalk	SF1	g/g	0.7	0.6	0.5
stress_factor_stalk	SF2	n/a	0.9	0.9	0.9
sucrose_delay	SD	g/m²	582	563	137
min_sstem_sucrose	MSS	g/m²	432	1097	1420
min_sstem_sucrose_redn	MSSR	g/m²	19	0.26	2
tt_emerg_to_begcane	EB	°C day	1537	1874	1397
tt_begcane_to_flowering	BF	°C day	5404	5748	6523
tt_flowering_to_crop_end	FC	°C day	2138	2153	1794
green_leaf_no	GLN	No.	14	15	14
tillerf_leaf_size	TLS1	mm²/mm²	5	4	3
	TLS2		3	4	3
	TLS3		1	1	1
	TLS4		4	5	3
	TLS5		3	3	5
transp_eff_cf	TEC1	kg kPa/kg	0.008	0.014	0.010
	TEC2		0.007	0.014	0.011
	TEC3		0.013	0.013	0.012
	TEC4		0.014	0.009	0.014
	TEC5		0.014	0.013	0.013
	TEC6		0.011	0.014	0.010
rue	RUE3	g/MJ	2.50	2.24	2.49
	RUE4		2.46	2.34	2.48
	RUE5		1.14	2.40	1.84

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
