Rice (Oryza sativa L.) Growth Modeling Based on Growth Degree Day (GDD) and Artificial Intelligence Algorithms

Abstract: Rice (Oryza sativa L.) growth prediction is key for precise rice production. However, the traditional linear rice growth forecasting model is ineffective under rapidly changing climate conditions. Here we show that growth rate (Gr) can be well predicted by artificial intelligence (AI)-based artificial neural networks (ANN) and gene-expression programming (GEP), with accumulated air temperatures based on growth degree day (GDD). In total, 10,246 Gr values from 95 cultivations were obtained with three cultivars, TK9, TNG71, and KH147, in Central and Southern Taiwan. The model performance was evaluated by the Pearson correlation coefficient (r), root mean square error (RMSE), and relative RMSE (r-RMSE) over the whole growth period (lifecycle), as well as for the stage average and specific key stages (transplanting, 50% initial tillering, panicle initiation, 50% heading, and physiological maturity). The lifecycle Gr modeling results showed that the ANN and GEP models had comparable r (0.9893), but the GEP model had the lowest RMSE (3.83 days) and r-RMSE (7.24%). For the stage average and specific key stages, each model has its own best-fit growth period. Overall, the GEP model is recommended for rice growth prediction considering the model performance, applicability, and routine farming work. This study may lead to smart rice production due to the enhanced capacity to predict rice growth in the field.


Introduction
Rice is the second-highest produced cereal in terms of yield and is a staple food for approximately four billion people globally [1]; therefore, knowing the critical requirements for rice growth and the best timing for rice planting and harvest is crucial for understanding the effects of policy and optimizing agricultural practice to achieve higher food security [2]. A precise method is needed so that the rice growth stages can be accurately predicted under varying environmental conditions [3], so as to effectively implement field cultivation management [4], rationalize irrigation and fertilization [5], and increase yield and profits for farmers [6]. With the development of digital agriculture modeling, automatic field operation may be achieved through accurate growth period prediction, leading to precision agriculture [6]. Conversely, mismatched fertilization timing may lower fertilizer utilization efficiency [7,8], especially in traditional rice production under the critical threat of climate change. For example, the raised carbon dioxide concentration causes existing rice growth patterns to change in Taiwan [9]. The increase in extreme rainfall events causes variation in rice growth patterns [10] and yield loss [11].
Typical rice growth is modeled based on the accumulated thermal time, called growth degree days (GDD), needed to reach a specific growth stage regardless of the year or location [12,13]. The first rice GDD model, initially referred to as the Degree Day 50, was established in the 1970s and has been widely used in the Southern USA [14]. Many modern rice growth prediction models have been developed based on the GDD or similar linear-based principles, such as ORYZA [15] and DSSAT [16]. The GDD model has been found to be much more accurate for fieldwork timing than the traditional calendar day method [17,18]. However, several difficulties hinder the application of the GDD model. Firstly, the base growth temperature (Tb) must be calibrated by a long-term experiment before application [19-21]. Secondly, the GDD model is limited by the nonlinear relationship between the rice growth rate, the rice cultivar, and growth conditions such as temperature, humidity, and soil composition. The traditional GDD correlates developmental rate linearly with temperature in some growth periods; however, linearization is often criticized for its oversimplification despite being widely used [22].
Due to the limitations of the above-mentioned linear-based GDD model, artificial intelligence (AI)-based nonlinear algorithms have been developed and applied in rice research in recent years. Examples include yield prediction by artificial neural networks (ANN) [23] and gene-expression programming (GEP) [24], and short-term rice blast forecasting by ANN [25]. These AI-based studies showcase the ability of AI algorithms to solve complex problems. However, although there are many separate studies on developing and using traditional GDD and AI algorithms in rice studies, no research has aimed to integrate GDD into AI-based modeling to predict rice growth. Given the importance of accurate rice growth prediction for smart agriculture and the increasing demand on rice production [26,27], this study aimed to apply AI algorithms to develop nonlinear rice growth models based on GDD. The model performances were assessed by the Pearson correlation coefficient (r), root mean squared error (RMSE), and relative RMSE (r-RMSE) over the whole rice growth period (lifecycle), the stage average, and specific key stages, to evaluate the applicability of the AI-enabled models against the traditional GDD model in different rice growth stages.

Materials and Methods
Three rice (O. sativa subsp. japonica) cultivars, Taiken 9 (TK9), Tainung 71 (TNG71), and Kaohsiung 147 (KH147), were used in this study and were grown in 2006-2008, 2006-2009, and 2019 in two crop seasons, respectively. The growing days for TK9, TNG71, and KH147 in the first and second crop season were approximately 123 and 114 days (mid to late maturing), 118 and 104 days (early to mid-maturing), and 128 and 115 days (mid to late maturing), respectively. The cultivars TK9 and TNG71 had the same transplanting dates from 2006 to 2008. In 2009, TNG71 was planted in two different lots in the same crop season with a 14-day difference in the transplanting dates. Similar cultivation was performed for KH147 in 2019, with the plants grown in four separate lots and the transplanting dates separated by 14 days in the same crop season. In total, 95 cultivations were conducted using a completely randomized design.
In each cultivation, growth data were obtained by observing the key growth stages of rice classified by Counce et al. [28]: the transplanting stage (Stage 0, V3), the 50% initial tillering stage (Stage 1, V5, the date when the tiller number exceeds 50%), the panicle initiation stage (Stage 2, R0, the date when more than five randomly sampled plants show a panicle length of 2 mm), the 50% heading stage (Stage 3, R3, the date when the heading number of 50 plants reaches 50%), and the physiological maturity stage (Stage 4, R7, the date when most of the grains in the panicle are golden yellow and 2 to 3 grains at the base of the panicle are still yellow-green). Based on these key stages, the rice growth rate (Gr) was calculated on a daily basis (day⁻¹) and accumulated continuously from Gr = 0 to Gr = 4 (i.e., V3 to R7). For example, if Stage 1 lasts 10 days, the Gr of the first day is 1/10 = 0.10, that of the second day is 2/10 = 0.20, and so on until the end of the stage (day 10, Gr = 1.00). Gr for the following stages was calculated similarly: the number of days elapsed within the stage was divided by the stage duration and added to the Gr value reached at the end of the previous stage (e.g., 1.00 for Stage 1 and 2.00 for Stage 2).
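The Gr bookkeeping described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name `growth_rate_series` and the stage durations are hypothetical and are not taken from the study's data.

```python
def growth_rate_series(stage_durations):
    """Return daily Gr values for consecutive growth stages.

    stage_durations: stage lengths in days for Stages 1-4
    (the values used below are hypothetical).
    """
    gr = [0.0]  # Gr = 0 at transplanting (Stage 0, V3)
    for stage_index, duration in enumerate(stage_durations):
        for day in range(1, duration + 1):
            # Fraction of the current stage completed, plus the Gr
            # reached at the end of the previous stage (0, 1, 2, 3).
            gr.append(stage_index + day / duration)
    return gr


# Hypothetical durations: Stage 1 = 10 days, as in the worked example.
series = growth_rate_series([10, 30, 25, 35])
print(series[1], series[2], series[10], series[-1])
```

With a 10-day Stage 1, the first entries follow the 0.10, 0.20, ... pattern from the text, and the series ends at Gr = 4 on the last day of Stage 4.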

Experiment Sites
The phenological data of rice were collected by the Taiwan Agriculture Research Institute (TARI) in Taichung (TK9 and TNG71) and the National Pingtung University of Science and Technology (NPUST) in Pingtung (KH147). The experimental sites of this study are marked in Figure 1. Two crop seasons were cultivated in both places following the typical continuous flood irrigation (Figure 2). The soil at TARI is Holocene alluvium with fine-textured and well-drained red soil; at NPUST, it is Quaternary terrace deposits with non-calcareous, well-drained shallow alluvial soil [29].
Agriculture 2022, 12, x FOR PEER REVIEW 3 of 12

Growth Degree Day
For temperature-based phenology models, the growth rate is generally modeled as a function of the effective temperature accumulation [30]. The prediction results can be used in field operations to guide agricultural management, such as the timing of fertilization and irrigation. Several studies have been performed to improve the procedures for calculating the GDD [3]. In this study, a conventional GDD was calculated. It was assumed that a certain amount of effective temperature (°C d⁻¹) is needed to complete a given developmental stage [31]. GDD is generally computed as the daily average temperature minus the base temperature (Tb). Traditionally, 10 °C is used as the Tb of rice cultivars. However, it has been found that the base temperatures of TK9, TNG71, and KH147 are 6.4, 6.6, and 7.6 °C, respectively [32]. A negative GDD indicates that the crop growth has stopped [33]. Finally, the daily GDD is calculated and summed, which constitutes the cumulative heating time required for the rice to reach the corresponding Gr [34]. The GDD in this study was calculated using Equation (1) below.

$$\mathrm{GDD} = \frac{T_{\max} + T_{\min}}{2} - T_b \qquad (1)$$

where $T_{\min}$ is the daily minimum temperature (°C), $T_{\max}$ is the daily maximum temperature (°C), and $T_b$ is the base temperature (°C).
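As a rough illustration of the accumulation step, the sketch below sums the daily average temperature minus Tb over a run of days. The function name `cumulative_gdd` and the temperature values are hypothetical; only the base temperature of 6.4 °C (TK9) comes from the text. The sketch keeps negative daily values as computed, since the text does not state whether they were clamped to zero.

```python
def cumulative_gdd(t_min, t_max, t_base):
    """Accumulate daily GDD = (Tmax + Tmin) / 2 - Tb over consecutive days.

    Returns the running total after each day. Negative daily values are
    kept as computed (an assumption of this sketch, not of the study).
    """
    total = 0.0
    running = []
    for low, high in zip(t_min, t_max):
        total += (high + low) / 2.0 - t_base
        running.append(total)
    return running


# Hypothetical three-day temperature record (degrees C), Tb for TK9.
t_min = [18.0, 20.0, 19.0]
t_max = [26.0, 30.0, 27.0]
gdd_tk9 = cumulative_gdd(t_min, t_max, t_base=6.4)
print(gdd_tk9)
```

The last element of the returned list is the accumulated GDD that would be paired with the Gr observed on that day.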

Model Development
An ANN is a network of processing elements (PEs) assembled in layers and connected through several links or weights. At each epoch, the ANN calculates its output from the input and compares it with the expected output for each input vector to compute the error [35]. The classic backpropagation neural network (BPNN) was used for Gr modeling in this study. NeuroSolution 7.1 from NeuroDimension, Inc. (Gainesville, FL, USA) was used for model establishment. The model framework is shown in Figure 3. All data, including input and output, were pre-processed by min-max normalization to the range 0 to 1, and the data queue was re-sorted in random order to avoid errors caused by outliers and systematic cumulative errors. After pre-processing, the dataset was divided into three subsets, training, cross-validation (CV), and testing, at a ratio of 70%, 20%, and 10%, following the typical practice of AI-based modeling. In general, the training dataset was used for initial modeling. However, the initially developed model may overfit during model optimization, i.e., the adjustment of the model's hyper-parameters. The candidate hyper-parameters, such as the number of training iterations, the learning rate, the layer parameters, or the number of layers, were adjusted using the CV dataset to increase the modeling accuracy. Finally, the testing dataset was applied to the adjusted model for model performance evaluation. The initial setting of the model development included a maximum of 1000 epochs with a learning rate of 0.01. These parameters were selected based on a previous study [25] because the goal of this study was to establish rice growth prediction models rather than to compare model performance under varying parameters. The Levenberg-Marquardt gradient search method was applied with an early stopping callback to prevent overfitting. A single hidden layer with 10 PEs was used.
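The pre-processing steps described above (min-max normalization, random re-sorting, and the 70/20/10 split) can be sketched as follows. This is a minimal standard-library illustration, not the NeuroSolution workflow; the helper names, the fixed seed, and the synthetic GDD and Gr values are all assumptions made for the example.

```python
import random


def min_max_normalize(values):
    """Scale values linearly to [0, 1]; assumes max(values) > min(values)."""
    low, high = min(values), max(values)
    return [(v - low) / (high - low) for v in values]


def shuffle_and_split(pairs, seed=0, ratios=(0.7, 0.2, 0.1)):
    """Shuffle (input, output) pairs and split into train / CV / test."""
    rng = random.Random(seed)      # fixed seed only for reproducibility
    shuffled = pairs[:]            # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    n_train = int(round(ratios[0] * len(shuffled)))
    n_cv = int(round(ratios[1] * len(shuffled)))
    train = shuffled[:n_train]
    cv = shuffled[n_train:n_train + n_cv]
    test = shuffled[n_train + n_cv:]   # remainder, about 10%
    return train, cv, test


# Synthetic stand-in data: 100 (GDD, Gr) records.
gdd = [20.0 * i for i in range(100)]
gr = [4.0 * i / 99 for i in range(100)]
gdd_norm = min_max_normalize(gdd)
gr_norm = min_max_normalize(gr)
pairs = list(zip(gdd_norm, gr_norm))
train, cv, test = shuffle_and_split(pairs)
print(len(train), len(cv), len(test))
```

Normalizing before shuffling keeps each input paired with its own output, so the split only changes the order of records, not their content.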

Gene-Expression Programming (GEP)
A classic GEP begins with a major contest and undergoes a continuous evolutionary process, including selection, replication, mating, mutation, adaptation, reversal, and transformation, to evolve toward a predetermined objective [36]. It overcomes the chief shortcomings of the genetic algorithm (GA), such as premature convergence and combinatorial explosion, and its evolution is significantly faster than that of GA and genetic programming (GP) [37]. The modeling flowchart and the structure are shown in Figure 4. The GEP model was trained, cross-validated, and tested on the same dataset as the ANN model using the GeneXproTools 5.0 program (Gepsoft Ltd., Bristol, UK), following the same data splitting ratio (70%/20%/10%). The model adopted the same calculation elements, fitness function, and parameter settings as [38], namely five genes with fifty chromosomes, evolved over ten thousand generations. The function set consisted of the following calculation elements: +, −, *, /, ln, ceiling, floor, absolute value, tangent, exponential, 10^x, x^2, x^(1/3), min, and avg. Afterward, a tree structure was established by linking the sub-expression trees (ET_sub). More details on the GEP theory can be found in [36,37].
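To make the tree structure more concrete, the sketch below evaluates a model built from sub-expression trees over the function set listed above. Everything specific in it is hypothetical: the nested-tuple tree encoding, the two example sub-trees, the two-argument form of `avg`, and the choice of addition as the linking function are assumptions for illustration and do not reproduce the fitted GEP model.

```python
import math

# Function set named in the text, written as plain Python callables.
FUNCS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a / b,
    "ln": math.log,
    "ceil": math.ceil,
    "floor": math.floor,
    "abs": abs,
    "tan": math.tan,
    "exp": math.exp,
    "pow10": lambda a: 10.0 ** a,
    "sq": lambda a: a * a,
    "cbrt": lambda a: math.copysign(abs(a) ** (1.0 / 3.0), a),
    "min": min,
    "avg": lambda a, b: (a + b) / 2.0,  # two-argument average (assumed)
}


def evaluate(node, gdd):
    """Evaluate a nested-tuple expression tree for one accumulated GDD."""
    if isinstance(node, str):
        return gdd                      # the only terminal name is "GDD"
    if isinstance(node, (int, float)):
        return float(node)              # numeric constant
    name, *args = node
    return FUNCS[name](*(evaluate(arg, gdd) for arg in args))


def gep_predict(sub_trees, gdd):
    """Link sub-expression trees by addition (assumed linking function)."""
    return sum(evaluate(tree, gdd) for tree in sub_trees)


# Two hypothetical sub-expression trees (the study used five genes).
HYPOTHETICAL_SUB_TREES = [
    ("*", 0.001, "GDD"),
    ("min", 0.5, ("/", "GDD", 1000.0)),
]
print(gep_predict(HYPOTHETICAL_SUB_TREES, 400.0))
```

An evolved model of this form reduces to a closed-form expression in GDD, so it can be evaluated without the training software.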

Simple Regression Model (REG)
The simple regression model is a conventional GDD modeling algorithm based on a linear relationship. The dataset used for the REG model was the same as that for the ANN and GEP models. However, the REG algorithm does not allow hyper-parameter adjustment by a CV process; therefore, the 20% CV data were added to the training dataset, so that 90% of the data were used for training and 10% for testing in REG. The model can be described by Equation (2).

$$Gr_i = \alpha x_i + \beta + \varepsilon \qquad (2)$$

where $Gr_i$ is the growth rate corresponding to $x_i$, the ith GDD; $\alpha$ is the slope of the linear equation; $\beta$ is the intercept of the regression; and $\varepsilon$ is the error term.
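For reference, a linear model of this form can be fitted by ordinary least squares with a few lines of standard-library Python. The function name `fit_linear` and the five (GDD, Gr) points are hypothetical; they are chosen to lie exactly on a line so the result is easy to check by hand.

```python
def fit_linear(x, y):
    """Ordinary least-squares estimates of slope (alpha) and intercept (beta)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    alpha = sxy / sxx
    beta = mean_y - alpha * mean_x
    return alpha, beta


# Hypothetical accumulated GDD (degrees C) and Gr observations.
gdd_obs = [0.0, 500.0, 1000.0, 1500.0, 2000.0]
gr_obs = [0.4, 1.3, 2.2, 3.1, 4.0]
alpha, beta = fit_linear(gdd_obs, gr_obs)
print(alpha, beta)
```

A non-zero intercept in such a fit means the line predicts Gr > 0 at GDD = 0, which is the kind of offset at transplanting that a purely linear model cannot avoid.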

Model Assessment
The performances of the ANN, GEP, and REG models were assessed by r, RMSE, and relative root mean squared error (r-RMSE, Equation (3)). The significance of the Pearson correlation between the observed and predicted Gr of each model was calculated. The whole growth period data, including the model training, CV, and testing datasets, were used to retrieve the entire rice lifecycle prediction results from the ANN, GEP, and REG models. The errors were calculated in days (1/Gr) at Gr intervals of 0.1 to compute the RMSE, based on the average Gr increment of each stage: 0.0632 for Stage 1, 0.0334 for Stage 2, 0.0409 for Stage 3, and 0.0269 for Stage 4. The model performance was compared over the lifecycle, the stage average (Stages 1~4), and the specific key stages, i.e., Gr = 1 (V5), Gr = 2 (R0), Gr = 3 (R3), and Gr = 4 (R7).

$$\text{r-RMSE}(\%) = \frac{\sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2}}{\bar{y}} \times 100 \qquad (3)$$

where $\hat{y}_i$ is the predicted value, $y_i$ is the observed value, $\bar{y}$ is the average of the observations, and $n$ is the number of observations.
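The three assessment metrics can be computed as in the sketch below. The function names and the four-point sample are hypothetical. The `gr_error_to_days` helper reflects one reading of the day conversion described above (dividing a Gr error by the average Gr increment of the stage) and should be treated as an assumption rather than the study's exact procedure.

```python
import math

# Average Gr increment of each stage, as quoted in the text.
STAGE_INCREMENT = {1: 0.0632, 2: 0.0334, 3: 0.0409, 4: 0.0269}


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


def rmse(observed, predicted):
    """Root mean squared error."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n)


def relative_rmse(observed, predicted):
    """r-RMSE in percent: RMSE divided by the mean observation, times 100."""
    mean_o = sum(observed) / len(observed)
    return rmse(observed, predicted) / mean_o * 100.0


def gr_error_to_days(gr_error, stage):
    """Convert a Gr error to days using the stage's average Gr increment
    (an assumed reading of the conversion described in the text)."""
    return gr_error / STAGE_INCREMENT[stage]


# Hypothetical observed and predicted Gr values.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
print(pearson_r(observed, predicted))
print(rmse(observed, predicted), relative_rmse(observed, predicted))
print(gr_error_to_days(0.1, stage=1))
```

Because the stage increments differ, the same Gr error corresponds to more days in a slow stage such as Stage 4 than in Stage 1.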

Rice Growth Predicted by Growth Degree Day
The GDD-Gr relation was established for the whole dataset (Figure 5). The RMSE of Gr ranged from 0.1234 to 0.3199, as shown in Table 1. In Stage 1, the GDD model has a non-negligible prediction bias resulting from the offset at Gr = 0: the modeled rice growth may start from Gr ≈ 0.45, possibly leading to a wrong fieldwork decision. Higher prediction errors are also found in Stage 3 and Stage 4.

Model Results
The r, RMSE, and r-RMSE of the ANN, GEP, and REG models in model testing are shown in Table 2. ANN has the highest r and the lowest RMSE and r-RMSE in the model testing stage. The established REG and GEP models are presented in Equations (4)-(10).

$$Gr_{REG} = \left(1.6807 \times 10^{\ldots}\right) \times GDD + 4.3594 \times 10^{\ldots} \qquad (4)$$

where $Gr_{REG}$ is the Gr predicted by the REG model, $Gr_{GEP}$ is the Gr predicted by the GEP model, GDD is the accumulated growth degree day (°C), $ET_{sub1\sim5}$ are the sub-expression trees from 1 to 5, min is the minimum function, Ce is the ceiling function, Fl is the floor function, tan is the tangent function, and abs is the absolute value function.

Model Performance Evaluation
The model testing was conducted over the entire growth period for each model. The correlation between observed and predicted Gr is shown in Figure 6. The retrieved prediction results can be divided into three periods, the whole lifecycle, the stage average, and the key stages, as shown in Table 3. The ANN and GEP models have an equivalent r value (0.9893) in the whole lifecycle prediction, but the GEP model has a relatively lower RMSE and r-RMSE, followed by the ANN and REG models. Based on r, the best performing models at Stages 1 to 4 are GEP, REG, ANN, and REG, respectively. Furthermore, the performance of the AI-based ANN and GEP models can be assessed by the RMSE (days) ratio between the AI and REG models (Table 4). The results show that ANN performs better than the REG model in the lifecycle, in the stage average from 1 to 4, and at key stages Gr = 2, 3, and 4. Only at key stage Gr = 1 did ANN perform slightly worse than the REG model. The GEP model performed better than the REG model in the whole lifecycle, in stage averages 1, 3, and 4, and at key stages Gr = 3 and 4, but its prediction accuracy in stage average 2 and at key stages Gr = 1 and 2 was lower than that of the REG model.

Discussion
This study used rice growth data from three rice cultivars grown in tropical and subtropical climates and calculated the most common rice growth prediction parameter, GDD, from an adjacent weather station for each cultivation. The developed models were applied to the whole growth period dataset to evaluate model performance and applicability in the lifecycle, stage average, and specific key stages.
We compared the modeled results with a previous study [39] that predicted rice growth dates with five conventional models. The input factors of those models included critical day length, the minimum number of days required for heading, a photoperiod sensitivity parameter, etc., as well as in-situ experimental parameters. The RMSEs of those models ranged from 6.09 to 8.60 days at heading and from 4.23 to 9.12 days at maturity [39]. Those results are similar to the REG model in this study, i.e., 6.86 days at heading and 11.96 days at maturity. In comparison, the AI-based ANN and GEP models showed slightly better predictive ability at the heading stage (Gr = 3). However, at the maturity stage (Gr = 4), the ANN and GEP models do not present a significant improvement in modeling ability. It should be noted that the modeling result of rice growth is influenced by many factors, such as the environment, cultivar, and tillage method. In general, the comparison indicated that the AI-based ANN and GEP rice growth models have higher accuracy than the REG model.

Lifecycle vs. Stage Average vs. Specific Key Stage
These three scenarios can be used for different purposes. The lifecycle scenario was developed from a continuous dataset, which has an integral prior probability and is not affected by the transitions between models. Therefore, the lifecycle scenario is more suitable for smart agriculture, which operates routine fieldwork entirely based on modeling results by considering the prior probability, and can guide farmers who lack rice farming experience. The stage average scenario is more suitable for semi-automatic rice production, i.e., modern paddy rice farming. In the stage average scenario, farmers need to adjust the model based on their experience and need some background knowledge to select a suitable algorithm. For example, farmers can decide when Stage 2 starts according to the rice seedling development and then use the ANN model. In specific key stage prediction, the physiological transformation and growth need to be adjusted at the stage. For farmers, management measures may not be implemented exactly at those specific key stages, because agricultural management may require "early" or "late" applications to supply the necessary nutrition and water for the rice. Therefore, the specific key stage scenario is more relevant to research, whereas the lifecycle and stage average scenarios are more useful for farmers in their fieldwork.

Model Applicability
Irrigation is one of the most critical factors in rice production, even more so than fertilization. Therefore, the essential stages of rice production are those requiring a change in water control strategy, i.e., panicle initiation (Gr ≈ 1.9 to 2.1), booting to late heading (Gr ≈ 2.7 to 3.3), and around 10 days before harvest (Gr ≈ 3.8). Moreover, the dry land method can reduce non-effective tillering in the late vegetative phase, starting from around Gr ≈ 1.5. For fertilization, the critical timing is panicle initiation and the period from tillering initiation to 10~15 days after tillering initiation (Gr ≈ 1.0 to 1.3), when an appropriate application leads to a significant yield increase. For the first of these fieldwork conditions, the model performance indicated by RMSE (days) for Gr from 1.9 to 2.1 ranks as REG > ANN > GEP. REG appears to have a slightly better modeling ability: its RMSE for the predicted Gr = 2 is 4.4199 days, followed by ANN (4.4288 days) and GEP (4.8465 days). The RMSEs of all three models are comparable; therefore, the three models have similar accuracy in Gr prediction at this stage.
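The Gr windows listed above could be turned into a simple lookup that maps a predicted Gr to the fieldwork that is due. The sketch below is illustrative only: the labels, the function name `due_fieldwork`, and the ±0.05 tolerance placed around Gr ≈ 3.8 are assumptions, while the other window bounds are the ones quoted in the text.

```python
# (lower Gr, upper Gr, fieldwork label); labels are illustrative.
FIELDWORK_WINDOWS = [
    (1.0, 1.3, "fertilization: tillering initiation"),
    (1.9, 2.1, "water control: panicle initiation"),
    (2.7, 3.3, "water control: booting to late heading"),
    # The text gives a single value (Gr of about 3.8); the +/-0.05
    # tolerance used here is an assumption of this sketch.
    (3.75, 3.85, "water control: about 10 days before harvest"),
]


def due_fieldwork(predicted_gr):
    """Return the labels of all windows that contain the predicted Gr."""
    return [
        label
        for low, high, label in FIELDWORK_WINDOWS
        if low <= predicted_gr <= high
    ]


print(due_fieldwork(2.0))
print(due_fieldwork(1.6))
```

In practice the window bounds would need to be widened or shifted by the model's RMSE in days at the relevant stage, since management may call for "early" or "late" applications.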
The second critical stage is booting to late heading (Gr ≈ 2.7 to 3.3) together with 10 days before harvest (Gr ≈ 3.8), in which appropriate water control is required to increase rice quality and yield [40,41]. The model performance assessed by RMSE (days) for Gr = 2.7 to 3.3 and Gr ≈ 3.8 ranks as GEP > ANN > REG. The error was higher when Gr was greater than 3 and became even larger as Gr approached 4. This is likely because rice grain is not sensitive to temperature in the last growth stage [8,42]. Therefore, a traditional linear algorithm based on GDD cannot generate good predictions for this stage.
The third critical stage is tillering initiation, at which appropriate fertilization increases the number of panicles per plant [43]. The fertilization timing is relatively flexible depending on the rice cultivar and crop season; typically, only one application is needed before effective tillering (Gr ≈ 1.0 to 1.3). However, some cultivars require two applications, at Gr of around 1.0 and 1.3. In this case, the REG model has the lowest RMSE at Gr = 1 (1.9651 days), followed by ANN and GEP (2.4451 and 2.6644 days). The differences in error between REG and the AI-based ANN and GEP models are 0.48 days (≈11.5 h) and 0.6993 days (≈16.8 h), respectively. Considering the relatively flexible schedule for rice fertilization, all three models can be used for Gr modeling at this stage.
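The day and hour differences quoted above follow directly from the three RMSE values at Gr = 1, as the short check below shows. The dictionary and function names are hypothetical; the ratio is included only as an illustration of an AI-to-REG comparison, and its direction (AI divided by REG) is an assumption.

```python
# RMSE (days) at key stage Gr = 1, as quoted in the text.
RMSE_DAYS_GR1 = {"REG": 1.9651, "ANN": 2.4451, "GEP": 2.6644}


def gap_vs_reg(rmse_days):
    """Difference from REG in days and hours, plus the AI/REG ratio."""
    reg = rmse_days["REG"]
    return {
        model: {
            "days": value - reg,
            "hours": (value - reg) * 24.0,
            "ratio": value / reg,   # ratio direction is an assumption
        }
        for model, value in rmse_days.items()
        if model != "REG"
    }


gaps = gap_vs_reg(RMSE_DAYS_GR1)
for model, gap in gaps.items():
    print(model, round(gap["days"], 4), round(gap["hours"], 1), round(gap["ratio"], 3))
```

Both gaps are well under one day, which is why the flexibility of the fertilization schedule makes the choice of model less critical at this stage.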

Conclusions
GDD is the most convenient index for agronomists to construct a rice growth prediction model. However, the GDD model displayed limited accuracy for rice growth in the late reproductive and ripening phases, resulting from notable organ development in these stages. Because of the importance of rice production and smart agriculture, GDD was incorporated into two AI algorithms, ANN and GEP, to predict rice growth variations. The results indicate that each algorithm can be adapted for specific purposes using the whole dataset combining different rice cultivars and crop seasons. Based on RMSE, both the AI-based ANN and GEP models are suitable for lifecycle prediction, stage average prediction, and specific key stage prediction at Gr of 3 and 4. The traditional REG method has a good modeling ability at the specific key stages Gr = 1 and 2. Overall, the GEP model is recommended to farmers for conducting routine fieldwork, considering that the GEP model can be converted to Python-based code for further development of field environment monitoring systems, which may have lower hardware (e.g., CPU, RAM, etc.) requirements than the ANN model. The accurate rice growth model indicates that it is possible to provide guidance for machines to operate fieldwork automatically based on the prediction

Figure 1. Experimental sites of this study.

Figure 2. Common practices of rice cultivation in Taiwan.

Figure 3. The working architecture of the backpropagation neural network (BPNN) model.

Figure 5. The relationship between accumulated GDD and Gr. Each data point represents the observed rice growth rate (Gr) and its related cumulative GDD. The red dashed line is the least-squares regression line.

Figure 6. Modeling results for the whole growth period by three models: (a) ANN; (b) GEP; (c) REG. Each data point represents the observed rice growth rate (Gr) and its related predicted Gr. The dashed line is the least-squares regression line.

Table 1. GDD in each specific key stage.

Table 2. Model performances on model testing phase show the similarity between two AI-based ANN and GEP models.

Table 3. Modeling results for the whole growth period by three models in the lifecycle, stage average, and specific key stages. All of the correlation coefficients (r) were significant at p < 0.0001. The boldface numbers represent the best performance of the model in each period comparison.

Table 4. RMSE comparison between the AI-based ANN and GEP models and the REG model.