Predicting the Photosynthetic Rate of Chinese Brassica Using Deep Learning Methods

Abstract: Water stress is a significant factor affecting photosynthesis, which is one of the major physiological activities governing crop growth and development. In this study, the photosynthetic rate of Brassica chinensis L. var. parachinensis (Bailey) (referred to as Chinese Brassica hereafter) was predicted using a deep learning method. Five sets of Chinese Brassica were created, each with a different water stress gradient. Air temperature (Ta), relative humidity (RH), canopy temperature (Tc), transpiration rate (Tr), photosynthetic rate (Pn), and photosynthetically active radiation (PAR) were measured in different growth stages. The upper limit and lower limit equations were built using the non-water-stress baseline (NWSB) and hierarchical density-based spatial clustering of applications with noise (HDBSCAN) methods. The crop water stress index (CWSI) was then calculated using these equations. The multivariate long short-term memory (MLSTM) model was proposed to predict Pn based on CWSI and other parameters. At the same time, the support vector regression (SVR) method was applied to provide a comparison to the MLSTM model. The results show that water stress had an important effect on the growth of Chinese Brassica. The more serious the water stress, the lower the growth range (GR). The HDBSCAN method had a lower root mean square error (RMSE) in calculating CWSI. Furthermore, the CWSI had a significant effect on predicting Pn. The regression fitting between measured Pn and predicted Pn showed that the determination coefficient (R²) and RMSE were 0.899 and 0.108 µmol·m⁻²·s⁻¹, respectively. In this study, we successfully developed a method for the reliable prediction of Pn in Chinese Brassica, which can serve as a useful reference for application in water saving.

Author Contributions: Conceptualization, P.G., J.X. and W.W.; methodology, P.G.; software, P.G., P.Z. and M.Y.; funding acquisition, W.W.; formal analysis, P.G.; investigation, P.G., M.Y. and P.Z.; data curation, P.G., G.L. and Y.C.; writing (original draft preparation), P.G.; writing (review and editing), P.G., W.W., X.H., J.X., G.L. and D.S.; visualization, P.G.; supervision, W.W. and X.H.


Introduction
Chinese Brassica is a widely planted green vegetable grown in China and many other countries [1]. It has a two-month growth cycle and is rich in vitamin C and other nutrients, making it a popular vegetable [2]. Chinese Brassica adapts well to its environment but is sensitive to water and temperature during growth. Drought and extreme temperatures reduce the photosynthetic rate (Pn), which may cease completely in certain situations [3]. This leads to stomatal closure and the cessation of transpiration, affecting the growth of Chinese Brassica [4]. As the global population grows, the demand for agricultural irrigation water is increasing as well.
In recent years, machine learning methods have been widely used in modeling and forecasting crop environmental information. Wang et al. [17] realized the dynamic inversion of dissolved oxygen in aquaculture waters by employing unmanned aerial vehicles (UAVs) with a wireless sensor network (WSN) and deep learning methods. Gao et al. [18] proposed the bidirectional long short-term memory model to predict the soil moisture and soil electrical conductivity of citrus orchards. These studies achieved good accuracy and precision in their inversion or prediction results. In terms of modeling crop growth parameters, Kmet et al. [19] proposed a stomatal conductance model based on the Hopfield neural network and back propagation (BP) to model the Pn of tomato leaves in a greenhouse. Their results indicated that predicting Pn using a machine learning method is feasible. Wang et al. [20] established a prediction model for the Pn of tomatoes at the fruiting stage by using a WSN and a BP neural network. The correlation coefficient between predicted Pn and measured Pn was 0.99, and the root mean square error (RMSE) was 0.28, which was a good result in terms of prediction accuracy. However, the model did not predict Pn according to different crop growth stages, so its generalization ability could potentially be improved in a further study. Ru et al. [21] chose the crop water stress index (CWSI) as an indicator to evaluate grape water stress. Using a traditional regression algorithm, the study calculated the correlation between CWSI and the common growth index, soil moisture, and other parameters from the measured Tc, Ta, RH, and chlorophyll content of leaves. However, research describing the relationship between CWSI and Pn remains scarce. Therefore, the main contributions of this study are as follows:
(1) Tc, Pn, and environmental parameters including Ta and RH were measured in the chosen research object, Chinese Brassica, to study the canopy-air temperature difference (∆T, i.e., Tc − Ta) under different water stress gradient conditions.
(2) CWSI was calculated using the upper and lower limit equations of ∆T and used to estimate the influence of different water stresses on Chinese Brassica.
(3) A deep learning method named multivariate long short-term memory (MLSTM), which uses CWSI combined with ∆T and environmental parameters, was proposed to predict the Pn of Chinese Brassica under different water conditions.
(4) The performance of the established prediction model was evaluated to study how CWSI influenced the prediction performance of Pn.

Study Area and Description of Materials
The study was carried out in an open-air laboratory located in the College of Engineering, South China Agricultural University, as shown in Figure 1. The study site had separate water control pipes and vegetable cultivation tools. The experiment ran from November to December 2020. The maximum Ta and RH were 34.8 °C and 59.7%, respectively. The minimum Ta and RH were 16.5 °C and 31.2%, respectively. Relevant measurement equipment is described in Section 2.3. Five pots were used to cultivate Chinese Brassica, with two seedlings planted in every pot at a spacing of 10 cm. Each pot had a height of 15 cm and a diameter of 22 cm, had the same water supply, and received the same organic fertilizer for growth. The diameter of the outfall at the bottom of the pot was 4 cm. Before conducting the measurements for the test, the same batch of Chinese Brassica seeds was sown into the prepared pot with sufficient water and organic fertilizer. The soil in the pots was taken from the same field. When the seedlings had developed their first true leaf, they were transplanted into the five pots, as shown in Figure 2. In Figure 2, two seedlings of Chinese Brassica were grown in each pot with independent irrigation pipes to implement the water stress treatment described in Section 2.2.

The water stress treatment was started when the seedlings had developed four leaves and one Brassica core. The water stress treatment was based on the soil field capacity of the pot (SFCP). The method for calculating SFCP was as follows [22]:
Step 1: Fully irrigate the five pots and allow them to stand stationary for 6 h.
Step 2: Take a sample of soil at 10 cm below the surface and determine its weight (m1).
Step 3: Dry the sample soil in the drying box for 24 h at 105 °C.
Step 5: Calculate the SFCP based on the weight of the sample soil before and after drying using Equation (1).
Step 6: Repeat the above steps three times and calculate the average as the final SFCP.
Following the procedure outlined above, the SFCP was calculated as 32.2% for the five pots. The subsequent water stress experiment was designed according to the calculated SFCP. Irrigation was conducted according to the decrease in the amount of water, based on the original soil mass in each pot during the experiment. The decrease in the amount of water was determined with the same method as the SFCP and was measured every day at 9:00 a.m.
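The SFCP procedure above can be sketched in code. Equation (1) is not reproduced in this text, so the sketch assumes the common gravimetric (dry-mass basis) definition of soil water content; the function names, the dry-mass symbol, and the idea of expressing each treatment as a fraction of SFCP times the dry soil mass are illustrative assumptions, not the authors' stated implementation.

```python
def gravimetric_water_content(wet_mass_g, dry_mass_g):
    """Water content in percent on a dry-mass basis (assumed form of Equation (1))."""
    if dry_mass_g <= 0:
        raise ValueError("dry mass must be positive")
    if wet_mass_g < dry_mass_g:
        raise ValueError("wet mass cannot be smaller than dry mass")
    return (wet_mass_g - dry_mass_g) / dry_mass_g * 100.0


def mean_sfcp(samples):
    """Average the water content over repeated (wet_mass, dry_mass) samples (Step 6)."""
    values = [gravimetric_water_content(wet, dry) for wet, dry in samples]
    return sum(values) / len(values)


def target_water_mass_g(dry_soil_mass_g, sfcp_percent, treatment_fraction):
    """Hypothetical helper: water mass that holds a pot at a fraction of SFCP."""
    return dry_soil_mass_g * sfcp_percent / 100.0 * treatment_fraction


if __name__ == "__main__":
    # Made-up sample masses in grams, for illustration only.
    samples = [(132.0, 100.0), (132.5, 100.0), (132.1, 100.0)]
    sfcp = mean_sfcp(samples)
    print(f"SFCP = {sfcp:.1f}%")
    print(f"T3 target water mass = {target_water_mass_g(1000.0, sfcp, 0.70):.1f} g")
```

Under these assumptions, the daily irrigation amount for a pot would be the difference between the target water mass and the water mass currently in the pot.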

Water Stress Treatment and CWSI Calculating Methods
The water stress gradient was set as five groups based on the calculated SFCP in this experiment [13], i.e., T1, T2, T3, T4, and T5 with 100, 85, 70, 55, and 40% SFCP, respectively. Following work on the stability of non-water-stressed baselines [23], two different methods were applied to calculate CWSI, i.e., a non-transpiration baseline and fixed ∆T methods for the upper limit of ∆T and a non-water-stress baseline for the lower limit of ∆T [24]. The upper and lower limits of ∆T were the base parameters of CWSI and were based on empirical equations [25]. The equations are defined as follows:

$$ (T_c - T_a)_{ul} = A + B \cdot \mathrm{VPG} $$

$$ (T_c - T_a)_{ll} = A + B \cdot \mathrm{VPD} $$

where (Tc − Ta)ul is the upper limit of ∆T, °C; (Tc − Ta)ll is the lower limit of ∆T, °C; A and B represent the regression coefficients of the ∆T equation; VPD and VPG represent the vapor pressure deficit and vapor pressure gradient, respectively, in kPa. VPD and VPG can be estimated from Ta and RH [26], where RH is the relative air humidity and Ta is the air temperature, °C. On the basis of the upper and lower limit equations of ∆T, the CWSI is calculated as follows:

$$ \mathrm{CWSI} = \frac{(T_c - T_a) - (T_c - T_a)_{ll}}{(T_c - T_a)_{ul} - (T_c - T_a)_{ll}} $$

where Tc is the canopy temperature, °C.
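The CWSI calculation described above can be illustrated with a short sketch. The VPD and VPG equations are not reproduced in this text, so the sketch assumes the Tetens approximation for saturation vapor pressure and the conventional empirical-baseline forms (lower limit A + B·VPD, upper limit A + B·VPG, with VPG taken as the saturation vapor pressure at Ta minus that at Ta + A). All function names and the example readings are hypothetical.

```python
import math


def saturation_vapor_pressure_kpa(temp_c):
    """Saturation vapor pressure in kPa (Tetens approximation, an assumption here)."""
    return 0.6108 * math.exp(17.27 * temp_c / (temp_c + 237.3))


def vpd_kpa(ta_c, rh_percent):
    """Vapor pressure deficit estimated from air temperature and relative humidity."""
    return saturation_vapor_pressure_kpa(ta_c) * (1.0 - rh_percent / 100.0)


def vpg_kpa(ta_c, intercept_a):
    """Vapor pressure gradient: e_s(Ta) - e_s(Ta + A) (assumed conventional form)."""
    return saturation_vapor_pressure_kpa(ta_c) - saturation_vapor_pressure_kpa(ta_c + intercept_a)


def delta_t_lower(intercept_a, slope_b, vpd):
    """Lower limit of Tc - Ta from the non-water-stress baseline."""
    return intercept_a + slope_b * vpd


def delta_t_upper(intercept_a, slope_b, vpg):
    """Upper limit of Tc - Ta from the non-transpiration baseline."""
    return intercept_a + slope_b * vpg


def cwsi(tc_c, ta_c, lower, upper):
    """Crop water stress index from the measured and limiting temperature differences."""
    if upper == lower:
        raise ValueError("upper and lower limits must differ")
    return ((tc_c - ta_c) - lower) / (upper - lower)


if __name__ == "__main__":
    a, b = 1.99, -2.0          # illustrative coefficients with the shape y = -2x + 1.99
    ta, rh, tc = 25.0, 50.0, 25.5  # hypothetical readings
    lower = delta_t_lower(a, b, vpd_kpa(ta, rh))
    upper = delta_t_upper(a, b, vpg_kpa(ta, a))
    print(f"lower={lower:.2f} C, upper={upper:.2f} C, CWSI={cwsi(tc, ta, lower, upper):.2f}")
```

A CWSI near 0 corresponds to a canopy transpiring at the non-water-stress baseline, and a value near 1 to a canopy close to the non-transpiring limit.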

Methods of Data Collection
Data were collected for three main parameter categories: Tc, air environmental parameters, and photosynthesis parameters. The fully expanded canopy leaves at the top of the Chinese Brassica plants were chosen as materials for measurement. The data were collected at 30 min intervals between 10 a.m. and 4 p.m. under sunny conditions. The equipment used in the experiment and for the data collection processes is shown in Figure 3. The Tc was measured using a Raytek ST18 (Raytek Co. Ltd., Wilmington, NC, USA) handheld infrared thermometer, and the main operating parameters were as follows: the optical resolution was 12:1, emissivity was 0.95, the spectral response range was 8-14 µm, the response time was 500 ms, the temperature measurement error was ±1%, and the temperature measurement range was −20 to 500 °C. During the process of Tc measurement, the distance between the probe of the infrared thermometer and the target was about 12 times the size of the leaves. The measurement method was to grip the button with the ST18 pointed at the target canopy to acquire Tc. Every Tc measurement was taken three times, and the final Tc was defined as the average of the three values.
Air environmental parameters, i.e., Ta and RH, were measured using an Aicevoos W8 (Aicevoos Co. Ltd., Beijing, China) handheld hygrograph. The temperature measurement range of the hygrograph is −20 to 60 °C, with an associated error of ±0.3 °C. The RH measurement range is 0-100%, with an associated error of ±3%. During the measurement process, the distance between the probe and the measurement target was 10 cm, and the average value of three separate measurements was used as the final Ta and RH. The photosynthesis parameters, i.e., Pn, Tr, and photosynthetically active radiation (PAR), were measured using a SYS-GH30D (SYS-Tech Co. Ltd., Liaoning, China). The Pn measurement was based on the infrared CO2 analysis method. The Pn was calculated by measuring the change of CO2 in a certain period of time, and the temperature and PAR were measured at the same time. The measurement range and error were 0-3000 ppm and 3 ppm, respectively. Pn was measured carefully by holding the leaf and inserting it into the air chamber of the probe. The measurements were taken after preheating. The measurement period was 30 s, and the average of five values was calculated and used as the final Pn in the experiment.

Method of Studying the Upper Limit of ∆T Based on Clustering Algorithm
The upper limit of ∆T is critical in CWSI calculation, according to Equation (6). It is a dynamic parameter, so its characteristics can be effectively studied with a large number of unlabeled ∆T data. Unsupervised learning is an important branch of machine learning, and the clustering algorithm is an effective tool for data mining [27]. The category labels can be autonomously learned through the clustering algorithm with unlabeled data. The classical k-means clustering algorithm [28] carries out classification according to the distance between points and cluster centers, so it requires little computation and has low complexity. However, the disadvantage of the k-means algorithm is that the k value, which represents the number of categories, must be manually specified. Thus, the k-means algorithm cannot conduct classification independently. Mean shift clustering is an improved clustering algorithm based on learning the centroid [29] within a sliding window. This method updates the candidate center point of the window to the mean value of the points inside the sliding window so as to find the center point of the corresponding category. Repeated points are eliminated, and the aggregated categories are generated by sliding window filtering. The advantage of the mean shift clustering algorithm is that it can perform automatic classification, while the disadvantage is that the sliding window size must be manually determined. Hierarchical density-based spatial clustering of applications with noise (HDBSCAN) is an improved hierarchical clustering method based on nonparametric density estimation [30]; it extends the DBSCAN algorithm first proposed by Ester et al. in 1996. HDBSCAN relies on density-based clustering, combining a hierarchical structure and a mutual reachable distance to distinguish high-density regions from low-density regions [31]. HDBSCAN uses the core distance to estimate the density around target points and, from it, the mutual reachable distance.
The definition formula is as follows:

$$ d_{\mathrm{mreach}\text{-}k}(a, b) = \max\{\mathrm{core}_k(a),\ \mathrm{core}_k(b),\ d(a, b)\} $$

where d_mreach-k(a, b) is the mutual reachable distance between a and b; core_k(a) and core_k(b) are the core distances of a and b; d(a, b) is the Euclidean distance between a and b; k is the smoothing factor. The mutual reachable distance enlarges the gap between high-density and low-density regions, which in theory gives a better classification. The method does not rely on manually determining the number of clusters and is helpful for discarding noise points. It is thus effective in separating noise from clustered points, especially in the case of an unbalanced sample density; the clustering index can provide a reliable reference for classification results. Therefore, the HDBSCAN algorithm was used to investigate ∆T in this study.
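As a minimal illustration of the mutual reachable distance defined above, the following sketch computes it for a handful of points. It assumes that the core distance of a point is the distance to its k-th nearest neighbour excluding the point itself (implementations differ on whether the point counts as its own neighbour); the function names and sample points are hypothetical and are not the ∆T data of this study.

```python
import math


def core_distance(points, index, k):
    """Distance from points[index] to its k-th nearest neighbour (self excluded)."""
    distances = sorted(
        math.dist(points[index], other)
        for j, other in enumerate(points)
        if j != index
    )
    return distances[k - 1]


def mutual_reachability(points, i, j, k):
    """max(core_k(a), core_k(b), d(a, b)) for the points at indices i and j."""
    return max(
        core_distance(points, i, k),
        core_distance(points, j, k),
        math.dist(points[i], points[j]),
    )


if __name__ == "__main__":
    # Three nearby points and one isolated point on a line (illustrative values).
    pts = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
    print(mutual_reachability(pts, 0, 1, k=2))  # dense pair
    print(mutual_reachability(pts, 2, 3, k=2))  # pair involving the isolated point
```

For the isolated point, the core distance dominates the plain Euclidean distance, which is how sparse points are pushed away from dense regions before the hierarchy is built.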

Multivariate Long Short-Term Memory (MLSTM) Model
Pn is an important physiological crop index. The water stress treatment used in this study is expected to have a significant effect on the Pn of Chinese Brassica. Based on the data of photosynthesis parameters and Tc during the whole growth stage, the study of the relationship between CWSI and Pn is of great significance for the evaluation of Pn. LSTM is a special recurrent neural network (RNN) model [32] that has been applied in many data mining fields, such as financial [33] and crop yield [34] predictions. The classic LSTM cell structure [35] is shown in Figure 4. A characteristic of LSTM is that it contains a forget gate $f_t$, input gate $i_t$, and output gate $o_t$. The update equations of each gate are as follows:

$$ f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f) \tag{8} $$

$$ i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i) \tag{9} $$

$$ \tilde{c}_t = \tanh(W_c \cdot [h_{t-1}, x_t] + b_c) \tag{10} $$

$$ C_t = f_t * C_{t-1} + i_t * \tilde{c}_t \tag{11} $$

$$ o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o) $$

Equation (8) is the update formula of the forget gate, which determines how much information should be discarded. $\sigma$ is the sigmoid function, whose value is in the range (0, 1). With the $\sigma$ function, the output of the forget gate is transformed to be nonlinear. $W_f$ and $b_f$ represent the weight and bias of the forget gate, respectively. $h_{t-1}$ and $x_t$ represent the former state of the hidden layer and the input value, respectively.
Equation (9) is the update formula of the input gate, which controls the input information. $W_i$ and $b_i$ are the weight and bias of the input gate, respectively. When the value of $i_t$ is 1, all of the input data are allowed through the input gate. Equation (10) is the temporary cell state formula that combines the former state of the hidden layer and the input data. $W_c$ and $b_c$ are the weight and bias of the temporary cell state, respectively. $C_t$ represents the cell state determined by the forget gate, input gate, temporary cell state, and former cell state. In Equation (11), $f_t * C_{t-1}$ represents the part of the former cell state that remains after the forget gate has discarded information, and $i_t * \tilde{c}_t$ represents the information updated in the current cell state. $o_t$ is the output gate, while $W_o$ and $b_o$ represent its weight and bias. The output is based on the former state of the hidden layer and the current input data.
The MLSTM model was designed based on LSTM cells, as shown in Figure 5, and contains an input layer, an LSTM layer, a fully connected (FC) layer, a merged MLSTM layer, and an output layer. The MLSTM model was proposed to transform time-series data to the supervised learning mode, i.e., multi-feature and single-target. The target of this study was Pn. The input features of MLSTM comprise four categories, i.e., Tc, air environmental parameters, photosynthesis parameters, and CWSI. The LSTM layer was used to collect features of the four categories and pass them to the next layer. The FC layer was used to transform the LSTM layer data to high-dimensional features. In addition, the merged MLSTM layer was added to combine the abstract features to predict Pn. The predicted Pn is given by the output layer.
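The gate updates of a single LSTM cell, as described in this section, can be sketched in plain Python. This is a generic textbook LSTM step, not the authors' MLSTM network: the parameter dictionary, the list-based vectors, and the final hidden-state update h_t = o_t * tanh(C_t) (the standard form, not stated in the text above) are assumptions made for illustration.

```python
import math


def sigmoid(z):
    """Logistic function with values in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def affine(weights, bias, vec):
    """Compute W . vec + b for a weight matrix given as a list of rows."""
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weights, bias)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step; returns the new hidden state and cell state."""
    concat = h_prev + x_t  # [h_{t-1}, x_t]
    f_t = [sigmoid(z) for z in affine(params["W_f"], params["b_f"], concat)]    # forget gate
    i_t = [sigmoid(z) for z in affine(params["W_i"], params["b_i"], concat)]    # input gate
    c_tilde = [math.tanh(z) for z in affine(params["W_c"], params["b_c"], concat)]  # temporary state
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_tilde)]      # cell state
    o_t = [sigmoid(z) for z in affine(params["W_o"], params["b_o"], concat)]    # output gate
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]                          # hidden state
    return h_t, c_t


def zero_params(hidden_size, input_size):
    """Hypothetical helper: all-zero weights and biases, useful for a sanity check."""
    width = hidden_size + input_size
    params = {}
    for gate in ("f", "i", "c", "o"):
        params[f"W_{gate}"] = [[0.0] * width for _ in range(hidden_size)]
        params[f"b_{gate}"] = [0.0] * hidden_size
    return params


if __name__ == "__main__":
    h, c = lstm_step([0.3, 0.7], [0.0], [2.0], zero_params(1, 2))
    print(h, c)
```

With all-zero parameters every gate evaluates to 0.5 and the temporary cell state to 0, so the cell state is simply halved, which makes the role of the forget gate easy to verify by hand.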

As the proposed model uses multi-category features, it was necessary to adjust the input dimension [36] to adapt to the MLSTM network. The dimension settings are shown in Table 1. The Tc in Table 1 contained 204 measurements during the experiment, and each measured Tc consisted of five water stress treatments. Similarly, Ta and RH were two-dimensional data. CWSI was also two-dimensional data consisting of five water stress treatments. Tr and PAR were contained in the photosynthetic parameters, with 388 data collections in the experiment. The data were divided 70%:30% into a training dataset and a testing dataset, respectively. After feature extraction by the LSTM layer and the FC layer and combination in the merged MLSTM layer, the predicted Pn was finally obtained.
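The transformation of a multi-feature time series into supervised samples with a 70%:30% split can be sketched as follows. The window length, the chronological (unshuffled) split, and the function names are assumptions made for illustration; the text does not state how the authors windowed or ordered the data.

```python
def make_supervised(features, target, window):
    """Pair each window of past feature rows with the target value that follows it."""
    samples = []
    for end in range(window, len(target)):
        samples.append((features[end - window:end], target[end]))
    return samples


def split_train_test(samples, train_fraction=0.7):
    """Chronological split: the first part for training, the remainder for testing."""
    cut = int(len(samples) * train_fraction)
    return samples[:cut], samples[cut:]


if __name__ == "__main__":
    # Ten hypothetical time steps with two features each and one target value.
    features = [[float(i), float(i * 2)] for i in range(10)]
    target = [float(i) for i in range(10)]
    samples = make_supervised(features, target, window=3)
    train, test = split_train_test(samples)
    print(len(samples), len(train), len(test))
```

Keeping the split chronological avoids leaking later measurements into the training set, which matters when consecutive samples share overlapping windows.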

Support Vector Regression (SVR) Algorithm
The SVR method was used as a baseline regression algorithm against which to compare the model proposed in this study. Stulp et al. [37] reviewed common regression algorithms such as Gaussian process regression (GPR), extreme learning machine (ELM), and SVR. The results show that SVR had advantages in multivariate nonlinear regression and convergence speed. Therefore, SVR was applied to estimate performance regarding the prediction of Pn in this study. SVR was first proposed by Cortes et al. [38] in 1995. A characteristic of SVR is that a kernel function is applied to achieve nonlinear regression. The principle is to fit the data to the optimal hyperplane of the nonlinear model. In addition, SVR transforms the search for the optimal hyperplane into an optimization problem by introducing relaxation variables and penalty coefficients, which is then solved iteratively. The constraints of SVR are expressed in terms of the following quantities: ‖ω‖ is the Euclidean norm of the weight vector ω; (xi, yi) are data points; ω·xi denotes the inner product of the vectors ω ∈ R^N and xi ∈ R^N; ε1 and ε2 are relaxation variables; C is the penalty coefficient, which is 4 in this study. In the regression model of SVR, (αi − αi*) ≠ 0 denotes a support vector of SVR, and k(xi, x) is the radial basis kernel function.
Agronomy 2021, 11, 2145
In this paper, the root mean square error (RMSE) was applied to evaluate the error of the MLSTM and SVR models, and the determination coefficient (R²) was used to represent the reliability of the two models. The variance of the predicted values was also calculated as a statistical parameter to compare the performance.
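The two evaluation metrics can be written out explicitly. The sketch below uses the usual definitions, RMSE as the square root of the mean squared residual and R² as one minus the ratio of the residual to the total sum of squares; whether the authors computed R² this way or as the squared correlation of a regression fit is not stated, so this is an assumption, and the sample values are made up.

```python
import math


def rmse(measured, predicted):
    """Root mean square error between measured and predicted values."""
    residuals = [m - p for m, p in zip(measured, predicted)]
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))


def r_squared(measured, predicted):
    """Determination coefficient: 1 - SS_res / SS_tot."""
    mean_measured = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    measured = [1.0, 2.0, 3.0, 4.0]      # hypothetical measured Pn values
    predicted = [1.1, 1.9, 3.2, 3.8]     # hypothetical model outputs
    print(f"RMSE = {rmse(measured, predicted):.3f}, R2 = {r_squared(measured, predicted):.3f}")
```

RMSE carries the units of Pn, so it is directly comparable between the MLSTM and SVR models only when both are evaluated on the same test set.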

Method to Estimate Growth Range (GR) of Chinese Brassica
The canopy leaf area is an important index to represent the GR of Chinese Brassica. In this study, the GR was evaluated according to the canopy leaf area through periodic measurement and calculation. To achieve a nondestructive measurement of the canopy leaf area, GR was defined as the projected area of the canopy leaves. The measurement method [2] involved approximating the projected area of the canopy leaves as an ellipse. The maximum length A and maximum width B were measured with a ruler with an error of 1 cm. Thus, the ellipse area calculation formula π*A*B was used to evaluate GR.
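The GR calculation is a one-line formula, sketched below as stated in the text. Note that π·A·B equals the geometric area of an ellipse only when A and B are semi-axes; if A and B are the full length and width, the geometric area is π·A·B/4. The two differ by a constant factor of 4, so comparisons of GR between treatments are unaffected. The function names and sample measurements are hypothetical.

```python
import math


def growth_range_as_stated(length_cm, width_cm):
    """GR index using the formula given in the text: pi * A * B."""
    return math.pi * length_cm * width_cm


def ellipse_area_full_axes(length_cm, width_cm):
    """Geometric area of an ellipse whose full axes are length and width."""
    return math.pi * (length_cm / 2.0) * (width_cm / 2.0)


if __name__ == "__main__":
    length_cm, width_cm = 20.0, 14.0  # made-up canopy measurements
    print(f"GR index = {growth_range_as_stated(length_cm, width_cm):.1f} cm^2")
    print(f"Geometric area = {ellipse_area_full_axes(length_cm, width_cm):.1f} cm^2")
```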

Modeling on Lower Limit of ∆T and VPD
Warren et al. [39] demonstrated that ∆T and VPD have a linear relationship under different water stress levels. In this study, the growth stages of Chinese Brassica were divided into the vegetative stage (V stage) and reproductive stage (R stage). The V stage was from 27 November to 12 December 2020, and the R stage was from 12 to 31 December 2020. The linear regression fitting results for ∆T and VPD in the two growth stages are shown in Figure 6. The results show that ∆T and VPD had different regression equations in the two growth stages, i.e., y = −2x + 1.99 and y = −1.73 + 2.69x, respectively. As can be seen in the scatter diagram, because the growth duration of the V stage was longer than that of the R stage, there were more data points in the V stage than in the R stage for the same data collection periods. The R² values of the two equations were 0.846 and 0.840, respectively, with a significant correlation (p < 0.001), indicating that the fitting results had good reliability [40]. The lower limits of ∆T for the V and R stages were −3.29 and −1.53 °C, respectively. This demonstrates that the lower limit of ∆T of the R stage was higher than that of the V stage, meaning that the Tc of the R stage might be more susceptible to water stress. The ∆T of the V stage was less than 0 °C, while that of the R stage was more than 0 °C. The main reason was that the leaf area during the V stage was small and insensitive to water stress. In the R stage, the leaves were mature and larger and thus became more sensitive to water stress and external environmental conditions that could affect leaf transpiration, resulting in a ∆T difference between the V stage and the R stage.
According to the fitting results of Figure 6, the non-water-stress baseline method was applied to calculate the daily CWSI of the experiment. The results are shown in Figure 7. The closer the CWSI value was to 0, the less severe the degree of water stress experienced by the Chinese Brassica. Figure 7 shows that the CWSI values of the five water stress treatments had obvious differences. The overall CWSI values of T1 were below 0.2, with an average of 0.186, indicating that it was suitable in terms of water status. The CWSI values of T2 and T3 were below 0.4 and 0.6, with averages of 0.329 and 0.524, respectively. When the Brassica reached the R stage, the CWSI gap between T2 and T3 widened, and the maximum CWSI gap reached 0.3. The result meant that the T3 water stress treatment had produced a certain degree of water stress in the Brassica leaves. The CWSI values of T4 were in the range of (0.51, 0.75), and the CWSI value of T4 intersected with that of T3 on 27 December 2020, which meant that the water stress of T3 had exceeded that of T4 on that day. Furthermore, this indicated that the water stress treatment of T4 had an impact on the growth of Brassica. The CWSI of T5 fluctuated widely, and the maximum value reached 0.92, with an average of 0.8, which meant that T5 was subjected to severe water stress. At the beginning of the V stage, the CWSI values of T4 and T5 crossed on 30 November 2020, at which time the leaves were small and insensitive to water stress. With the continuous expansion of the leaf area, the Brassica became sensitive to water stress, and the maximum CWSI gap between T4 and T5 reached 0.35. The results showed that long-term water stress had a significantly negative effect on leaf growth.

The Upper Limit of ∆T Results with HDBSCAN
The HDBSCAN clustering results of the upper limit of ∆T and Ta, determined according to the upper limit equation of ∆T and the measured Ta and RH data, are shown in Figure 8. Figure 8 shows that Ta was positively correlated with the upper limit of ∆T, i.e., the upper limit of ∆T increased with Ta in both the V stage and the R stage. This was mainly because the Pn and Tr of Brassica leaves first accelerated and were then inhibited as Ta increased [41]. The path of water transport was soil-root-stem-leaf-stomata-air. An excessively high Ta led to a decrease in the stomatal conductance of leaves, which caused a decrease in Tr. The ability of transpiration to bring down Tc was then also weakened, so the upper limit of ∆T increased gradually [42]. During the experiment, the lowest Ta was 16 °C, and the maximum Ta was 35 °C. The large Ta difference resulted in a total of eight clusters generated by HDBSCAN, of which five were in the V stage and three were in the R stage. The result shows that Ta had a significant impact on the clustering algorithm, and larger differences in Ta led to more clusters. Each of the two clustering results contained one dominant cluster, the centers of which were 2.7 and 3.3 °C in the V stage and the R stage, respectively. The −1 labels in Figure 8a,b represent outlier noise points, which lay at the edge of the range of Ta and the upper limit of ∆T and were removed in the subsequent fitting calculation. The points with the −1 label in Figure 8b appeared sparse because the leaf area became large during the R stage. The leaves under water stress were then influenced by both the water status and Ta, which enlarged the gap between the different upper limits of ∆T. This result also reflects the widening gap of CWSI under different water stress treatments, as shown in Figure 7.
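The handling of the −1 noise label and the cluster centers described above can be illustrated with a minimal sketch. It assumes that cluster labels have already been produced by an HDBSCAN implementation and that a cluster "center" is the mean upper limit of ∆T within the cluster; the paper does not state how the centers were computed. The function name and the sample values are hypothetical and are not measured data.

```python
from collections import defaultdict
from statistics import mean


def largest_cluster_center(upper_dt, labels):
    """Return (label, mean upper-limit dT) of the largest non-noise cluster.

    Assumption: the cluster center is the arithmetic mean of the
    upper-limit dT values assigned to that cluster.
    """
    clusters = defaultdict(list)
    for value, label in zip(upper_dt, labels):
        if label == -1:  # label -1 marks outlier noise points; drop them
            continue
        clusters[label].append(value)
    if not clusters:
        raise ValueError("all points were labelled as noise")
    # Pick the cluster that contains the most points
    best = max(clusters, key=lambda k: len(clusters[k]))
    return best, mean(clusters[best])


# Illustrative values only (not measured data)
upper_dt = [2.6, 2.7, 2.8, 3.9, 4.1, 0.4]
labels = [0, 0, 0, 1, 1, -1]
label, center = largest_cluster_center(upper_dt, labels)
```

Dropping the noise points before averaging mirrors the statement that they were removed in the subsequent fitting calculation.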

In this study, T1 and T5 are the conditions with no water stress and the most severe water stress, respectively. Therefore, the CWSI values of these two treatments were used as a reference. Considering that the upper limits of the V stage and R stage from the clustering results were 2.7 and 3.3 °C, respectively, fixed upper limit temperatures of 2 and 4 °C were also selected to calculate the CWSI values. These values were compared to the CWSI values of T1 and T5 to evaluate the effect of the HDBSCAN clustering results using RMSE. The results are shown in Figure 9. As can be seen in Figure 9, the RMSE of the CWSI calculated with the upper limit of ∆T obtained by HDBSCAN clustering was the lowest in both the V stage and the R stage. Among the RMSE results, the RMSEs for 2.7 °C in the V stage were 0.0105 and 0.0165 when comparing with the T1 and T5 treatments, which were at least 82.8% and 86.9% lower than when using the fixed upper limit temperatures of 2 and 4 °C, respectively. The RMSEs in the R stage also decreased by 74.2% and 80.6%, respectively. The results indicate that the HDBSCAN method effectively represented the upper limits of ∆T of the V stage and R stage. At the same time, the clustering method determined the upper limits of ∆T, which represent the overall CWSI in the two growth stages, in a statistically significant manner. Therefore, the clustering results were applied to the prediction of Pn in this study.
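The RMSE comparison above can be sketched as follows. This is a minimal illustration and not the authors' code: the function names are hypothetical, and the numbers in the usage lines are placeholders rather than values from Figure 9.

```python
import math


def rmse(reference, estimate):
    """Root mean square error between two equally long sequences."""
    if len(reference) != len(estimate) or not reference:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    return math.sqrt(squared / len(reference))


def percent_reduction(rmse_fixed, rmse_clustered):
    """Relative RMSE reduction (%) of the clustered upper limit
    with respect to a fixed upper limit."""
    return 100.0 * (rmse_fixed - rmse_clustered) / rmse_fixed


# Placeholder numbers for illustration only
example_rmse = rmse([0.0, 0.0], [0.3, 0.4])
example_reduction = percent_reduction(0.05, 0.01)
```

A reduction computed this way is relative to the fixed-limit RMSE, which is how statements such as "82.8% lower" are usually read.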

Performance of Model Fitting
The model fitting performance is shown in Table 2. MLSTM and SVR denote the models that use Tc, Ta, RH, Tr, and PAR as input features, while MLSTM-CWSI and SVR-CWSI denote the models that use CWSI in addition to these features. As can be seen in Table 2, the RMSEs of MLSTM-CWSI and MLSTM were 0.108 and 0.198 µmol·m⁻²·s⁻¹, respectively, and the variances of the two were 0.529 and 0.586, respectively. The same pattern appeared in the SVR algorithm: the RMSEs of SVR-CWSI and SVR were 0.169 and 0.277 µmol·m⁻²·s⁻¹, respectively, and the variances were 0.611 and 0.642, respectively. Both SVR variances were larger than those of the MLSTM model, meaning that the proposed model yielded more stable predictions. The lower RMSE obtained with CWSI in both algorithms provides evidence that CWSI plays an important role in the prediction of Pn. In addition, the R² of MLSTM-CWSI was 0.899, meaning that the prediction results of the model had good reliability. The R² of SVR-CWSI was 0.819, which was lower than that of the proposed model. Furthermore, the R² of MLSTM was better than that of SVR. The results showed that the proposed model performed better than SVR in processing multi-source input features. The fitting equations and scatter diagrams of the four input feature modes are shown in Figures 10 and 11. As can be seen in Figure 10, with CWSI included as an input feature, the slope and intercept of the MLSTM and SVR regression equations differed by 0.07 and 0.21, respectively, indicating that the fitting results were similar. The measured values of Pn, which lay mostly in the range of (1, 2.5) µmol·m⁻²·s⁻¹, were not evenly distributed. If the measured Pn was above 2.5 µmol·m⁻²·s⁻¹, Pn was overestimated by both MLSTM and SVR under the given conditions. This may be due to the distribution of the raw data. When the CWSI parameters were removed, the R² values of the MLSTM and SVR models were 0.808 and 0.748, respectively. Although the intercept difference between the two models was 0.16, the slopes of the fitting equations were 1.1 and 0.85, respectively. Thus, CWSI had an important impact on the model fitting results. Although the SVR algorithm could converge to the ideal hyperplane with the support of the kernel function, it could not avoid a tendency toward underfitting when the CWSI parameters were removed from the fitting process. This also indicates that water stress had a significant, indirect influence on the Pn of Brassica.
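The regression fitting between measured and predicted Pn can be sketched as below. The paper does not state which R² definition was used, so this sketch assumes the coefficient of determination 1 − SS_res/SS_tot and an ordinary least squares line of predicted on measured values. All names and sample values are hypothetical.

```python
def linear_fit(x, y):
    """Ordinary least squares line y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(measured, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_m = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean_m) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


# Illustrative Pn values only (not measured data)
measured = [1.0, 1.5, 2.0, 2.5]
predicted = [1.1, 1.4, 2.1, 2.4]
slope, intercept = linear_fit(measured, predicted)
r2 = r_squared(measured, predicted)
```

A slope close to 1 and an intercept close to 0 indicate that predictions track the measurements without systematic over- or underestimation, which is the sense in which the slopes of 1.1 and 0.85 are compared in the text.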

Effect of Water Stress on GR of Chinese Brassica
The GR of Chinese Brassica in the described experiment was evaluated on 27 November, 12 December, 21 December, and 31 December 2020. The results are shown in Figure 12. The initial maximum and minimum GR were 9.68 cm² in T2 and 6.83 cm² in T4, respectively. When the Brassica was in the V stage, the GR increased rapidly from 27 November to 12 December. The GR increased most rapidly in T1, by 2.53 times compared to the initial state. The maximum GR of T2 was 37 cm², only 10.7% smaller than that of T1, while the water consumption of T2 was 15% less than that of T1. This result demonstrates the potential of water stress evaluation for water saving. The maximum GR of T3 ultimately remained at the level that T2 had reached on 21 December, indicating that 70% water stress had a significant negative effect on the growth of Brassica. The GR of T4 and T5 showed a similar phenomenon, i.e., their final GR was close to the level of the preceding water stress gradient on 21 December. The reason was that water stress retarded the growth of Brassica. The GR of T5 increased by only 15.3% from 12 December to the end of the experiment. The growth of T5 almost entirely halted due to severe water stress. The study results could provide references for settings related to water stress; that is to say, there exists a threshold in water supply below which the growth of Brassica would not be explicitly affected or improved in the utilization rate of water resources.

Discussion
As a commonly used empirical estimation formula, the non-water-stress baseline is a simple and rapid method for calculating CWSI. In this study, two linear regression fitting equations were established based on the lower limit of ∆T and VPD of the T1 water stress treatment. The CWSI values were then determined based on the non-water-stress baseline condition using the established equations. The results show that the fitting equations in the V stage and R stage were different, i.e., y = −2x + 1.99 and y = −1.73x + 2.69, respectively. Dejonge et al. [43] also confirmed the difference in the lower limit of ∆T in different growth stages. Tc and infrared temperature measurements are effective tools for monitoring and quantifying CWSI. In this study, an infrared thermometer was used to measure multiple sets of Tc data of Brassica in the V stage and R stage, and the relationship between the lower limit of ∆T and VPD over the complete growth period was established. Han et al. [44] studied CWSI using the non-water-stress baseline method to obtain the lower limit of ∆T for a single growth stage of maize. Their study pointed out that the regression coefficients of the lower limit equation of ∆T were not completely consistent with other results for the same crop, which may be due to wind speed, crop growth, and other factors. The results of this study demonstrate that the crop growth stage does affect the slope and the intercept of the fitted linear equations. The mean CWSIs of T1-T5 under the non-water-stress baseline were 0.185, 0.329, 0.524, 0.621, and 0.8, respectively, which accurately indicate the water stress associated with each treatment. Therefore, CWSI suitably reflects the degree of water stress. With an increase in water stress, crop growth is affected. The direct phenomenon observed in our study was that the CWSI gap between treatments widened from the V stage to the R stage. For example, the mean CWSI gap between T2 and T3 was 0.17 in the V stage and 0.22 in the R stage; between T4 and T5, it was 0.16 and 0.22, respectively. The study of Colak et al. [45] showed that water stress not only reduces the water use efficiency (WUE) of crops but also has a significant negative impact on leaf length, leaf width, and other growth parameters. Cogato et al. [46] conducted a study on vineyards and showed that water stress causes stomatal closure and decreases stomatal conductance and stem water potential. These physiological changes have a significant impact on the growth speed of the crop. The daily CWSI data of the V and R stages responded to the above changes. Although the physiological changes protect crops from irreversible damage, the negative influence on crop growth might sometimes be inescapable. Therefore, further in-depth research is necessary to quantify the influence of physiological parameters on growth. The Brassica of this study represents a small sample set. Further research is necessary to carry out water stress experiments in a large field to quantitatively study the effect of CWSI on the yield of Brassica.
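As an illustration of how the two baseline equations enter the CWSI calculation, the following sketch assumes the standard empirical form CWSI = (∆T − ∆T_ll) / (∆T_ul − ∆T_ll), with ∆T = Tc − Ta, x in the fitted equations taken to be VPD (assumed here to be in kPa), and y the lower limit of ∆T. The upper limits are the clustered values of 2.7 and 3.3 °C reported in the Results. Clipping to [0, 1] and all identifier names are assumptions of this sketch, not part of the paper.

```python
# (slope, intercept) of the lower-limit line dT_ll = slope * VPD + intercept,
# taken from the fitted equations y = -2x + 1.99 (V) and y = -1.73x + 2.69 (R)
LOWER_LIMIT = {"V": (-2.0, 1.99), "R": (-1.73, 2.69)}
# Clustered upper limits of dT (degrees C) for each growth stage
UPPER_LIMIT = {"V": 2.7, "R": 3.3}


def cwsi(tc, ta, vpd, stage):
    """Empirical CWSI for one observation.

    tc, ta: canopy and air temperature (degrees C)
    vpd:    vapour pressure deficit (kPa is an assumption)
    stage:  "V" or "R"
    """
    slope, intercept = LOWER_LIMIT[stage]
    dt = tc - ta                       # canopy-air temperature difference
    dt_ll = slope * vpd + intercept    # non-water-stress (lower) limit
    dt_ul = UPPER_LIMIT[stage]         # water-stressed (upper) limit
    value = (dt - dt_ll) / (dt_ul - dt_ll)
    return min(1.0, max(0.0, value))   # clipping to [0, 1] is an assumption


# Example call with made-up readings
example = cwsi(tc=27.13, ta=25.0, vpd=1.0, stage="R")
```

Keeping the stage-specific coefficients in lookup tables makes it explicit that both the lower and the upper limit change between the V and R stages.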
Based on the non-water-stress baseline, the fixed upper limit equation of ∆T was established to calculate CWSI. The upper limit of ∆T may be affected by many other factors [47], such as Ta, solar radiation intensity, soil moisture, and leaf stomatal conductance. The advantage of this study was that the HDBSCAN method was applied to cluster a large number of upper limit values of ∆T together with Ta by calculating the mutual reachability distance. Because the upper limit of ∆T changes dynamically, such changes might occur throughout the whole growth period. Thus, the significance of clustering by the HDBSCAN algorithm lay in the fact that each generated cluster label aggregated most changes in the upper limit of ∆T of Brassica in the experiment. Similar to the results of the non-water-stress baseline, the cluster centers of the V stage and R stage were different. Although the gap was only 0.6 °C, the statistical significance of this result was to confirm the sensitivity of Brassica to water stress at different growth stages. In other words, the small canopy in the V stage made the Brassica less sensitive to water stress. According to the change process in Figure 12, as GR increased, Tr increased at an accelerating rate because the demand for water gradually increased. The water stress gradient of the experiment limited the metabolic intensity of the Brassica heart differently under the different water treatments. This directly resulted in discrete clustering results of the upper limit of ∆T in the R stage, in which the cluster center was 3.3 °C, while the gap between the maximum and minimum ∆T reached 0.5 °C. Han et al. [48] studied canopy dynamics with a clustering algorithm, and their results also supported the conclusion of this study that the clustering results of crops differ among growth stages. The overall RMSEs obtained with the clustered upper limit of ∆T were clearly lower than those obtained with the classical non-water-stress baseline method. However, the current HDBSCAN clustering method focused only on the upper limit of ∆T and Ta, so it is necessary to explore clustering with more integrated parameters in future studies.
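The mutual reachability distance mentioned above is the quantity on which HDBSCAN builds its cluster hierarchy. The sketch below computes it for a small set of (Ta, upper limit of ∆T) pairs using only the standard library. It is illustrative rather than a full HDBSCAN implementation: the neighbour count k, the convention that a point is not counted as its own neighbour, and the sample points are all assumptions.

```python
import math


def core_distances(points, k):
    """Distance from each point to its k-th nearest neighbour.

    Assumption: the point itself is excluded from its neighbour list;
    some implementations count it.
    """
    cores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        cores.append(dists[k - 1])
    return cores


def mutual_reachability(points, k):
    """Matrix of d_mreach(a, b) = max(core_k(a), core_k(b), d(a, b))."""
    cores = core_distances(points, k)
    n = len(points)
    return [
        [
            0.0 if i == j
            else max(cores[i], cores[j], math.dist(points[i], points[j]))
            for j in range(n)
        ]
        for i in range(n)
    ]


# Illustrative points only: (Ta, upper limit of dT), not measured data
points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
matrix = mutual_reachability(points, k=1)
```

Because the distance between two points can never drop below either point's core distance, isolated points are pushed away from dense regions, which is why points at the edge of the Ta range tend to end up with the −1 noise label.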
The MLSTM model and the SVR algorithm were applied to predict Pn using CWSI together with other parameters. According to the research of Matese et al. [15], CWSI was significantly correlated with Pn, with a correlation coefficient of −0.80. Since Ta and RH were applied to calculate the lower limit equation of ∆T and CWSI, the two parameters were also used as input features to the models. The proposed MLSTM model outperformed SVR in terms of the RMSE, R², and variance in the fitting of predicted and measured Pn. Compared with the study of Xin et al. [49], which used an artificial neural network, the R² of the MLSTM model was lower; however, their RMSE was over 2.3 µmol·m⁻²·s⁻¹, whereas the largest RMSE in our study was only 0.277 µmol·m⁻²·s⁻¹. The comparison confirmed that the proposed model in this study was effective. If the CWSI parameters were removed from the model, the performance of both MLSTM and SVR worsened. Together with Ta, RH, Tc, PAR, and other parameters, CWSI was pertinent for predicting Pn. Photosynthesis is the key process of crop growth and development and directly affects crop growth. According to the results of Figure 12, water stress treatments T1-T5 resulted in significant differences in GR. The more serious the water stress, that is, the closer CWSI was to 1, the more severely GR was restricted. Gonzalez-Dugo et al. [50] demonstrated the value of CWSI in quantifying water stress, which is consistent with this study. The water content of Brassica leaves usually exceeds 90% [51], which coincides with the conclusion that water supply is an important factor for Brassica growth. The final GR of T2 was only 10.7% smaller than that of T1, but with a 15% reduction in water usage. In addition, the final GR of T3 was 18% smaller than that of T1, with a 30% reduction in water usage. This finding has important implications for the water control of Brassica in areas that are arid or experiencing water shortage. Due to the small sample size in this study, further research on the practical application of water control in a large field may be necessary. Considering the disadvantages of a portable photosynthetic system, such as its heavy weight and potential damage to leaves in the process of measuring Pn, it is of great significance to predict Pn indirectly, efficiently, and accurately. The proposed model combining CWSI, Tc, and environmental parameters appears to be reliable in predicting Pn. However, although the present study demonstrated that CWSI plays an important role in predicting Pn, it is still necessary to explore the quantitative relationship between CWSI and Pn and the total yield under conditions where external environmental parameters such as Ta, RH, wind speed, and solar radiation intensity are controlled. It is also necessary to determine the relationship between Pn and other environmental parameters, such as Ta and RH. Such a quantitative study could provide valuable guidance for the water control and yield evaluation of Brassica, with the aim of increasing both yield and the water resource utilization rate.
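The trade-off between water saving and growth reported above can be made explicit with a small calculation. The ratio used here (GR reduction per unit of water reduction) is a hypothetical summary metric introduced only for illustration; the input percentages are the ones stated in the text for T2 and T3 relative to T1.

```python
# (GR reduction %, water usage reduction %) relative to T1, as stated in the text
REPORTED = {"T2": (10.7, 15.0), "T3": (18.0, 30.0)}


def gr_loss_per_water_saved(gr_reduction_pct, water_reduction_pct):
    """Hypothetical metric: percent of GR lost per percent of water saved."""
    return gr_reduction_pct / water_reduction_pct


ratios = {
    treatment: gr_loss_per_water_saved(gr, water)
    for treatment, (gr, water) in REPORTED.items()
}
```

A ratio below 1 means that GR shrank proportionally less than the water supply, which holds for both treatments with the reported figures.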

Conclusions
In this study, different water stress treatments were set, and the GR was determined for different growth stages. The CWSI was calculated based on the non-water-stress baseline method. Meanwhile, the HDBSCAN clustering algorithm was applied to study the upper limit of ∆T in calculating CWSI. The proposed MLSTM model combines Tc, Ta, RH, CWSI, Tr, and PAR in the prediction of Pn. Furthermore, its performance was compared with that of SVR. The main conclusions are as follows:
• The linear regression fitting equation of the lower limit of ∆T and VPD under the non-water-stress baseline shows that the regression equations of the V stage and R stage differ, and the R² of the two stages was 0.846 and 0.840, respectively. The lower limits of ∆T for the V stage and R stage were determined as −3.29 and −1.53 °C. The values calculated for CWSI show that it is suitable for quantifying the degree of water stress.
• The upper limits of ∆T in the V and R stages obtained by the HDBSCAN algorithm were 2.7 and 3.3 °C, respectively. The CWSI values calculated based on the clustered upper limits of ∆T showed lower RMSEs compared to the non-water-stress baseline method.
• The MLSTM model proposed in this study outperformed the SVR algorithm in predicting the photosynthetic rate. Combined with the CWSI value in the case of the clustered upper limit of ∆T, the results demonstrate that CWSI plays an important role in improving the accuracy of Pn prediction.
• There were significant differences in GR under different water stress gradients, which could provide valuable guidance in water control to improve the water resource utilization rate in areas that are arid or experiencing water shortage.
This paper has shown that the upper limit of ∆T determined by the clustering method decreased the RMSEs of the calculated CWSI by at least 82.8% in the V stage and 74.2% in the R stage. The method offers a new way of finding the upper limit of ∆T. This paper also found that water stress retards the growth of Chinese Brassica. At the same time, when the water stress gradient was reduced from 100% to 85%, the GR decreased by only 10.7%. This finding is helpful for water saving. However, further study could examine quality indicators of Chinese Brassica, such as vitamins, trace elements, and yield. Regarding the proposed model, although the MLSTM model based on CWSI for predicting Pn was intended to provide valuable guidance for water control in the management of Brassica crops, further experiments should be carried out in a large field to validate the results of this study. In addition, more parameters could be integrated to study the upper limits of ∆T, and research can be conducted to quantify the relationships between the CWSI, yield, and Pn of Brassica so as to allow for a more precise water supply control strategy and thus realize water savings and improve the utilization ratio of water resources.