Deep Learning Sensor Fusion in Plant Water Stress Assessment: A Comprehensive Review

Featured Application: This paper presents a comprehensive review of deep learning sensor fusion for plant water stress assessment, together with its challenges and future perspectives.
Abstract: Water stress is one of the major challenges to food security, causing significant economic loss for the nation as well as for growers. Accurate assessment of water stress will enhance agricultural productivity through optimization of plant water usage, support for plant breeding strategies, and prevention of forest wildfire for better ecosystem management. Recent advancements in sensor technologies have enabled high-throughput, non-contact, and cost-efficient plant water stress assessment through intelligent system modelling. Advanced deep learning sensor fusion techniques have been reported to improve the performance of machine learning applications that process the collected sensory data. This paper extensively reviews the state-of-the-art methods for plant water stress assessment that use the deep learning sensor fusion approach, together with the future prospects and challenges of the application domain. Notably, 37 deep learning solutions fell under six main areas, namely soil moisture estimation, soil water modelling, evapotranspiration estimation, evapotranspiration forecasting, plant water status estimation and plant water stress identification. In addition, eight deep learning solutions are compiled for the challenges of three-dimensional (3D) data and plant varieties, including unbalanced data arising from isohydric plants and variation within the same species cultivated in different locations.


Introduction
Water stress, also known as drought stress, is a form of plant abiotic stress and a pressing threat to plant productivity if sustained over a long period [1]. Yields of most major crops can fall by more than half when water is lacking [2,3]. Such agricultural losses in turn create a food security challenge, affecting the nation's economy and threatening farmers' livelihoods. Routine plant water stress assessment could help minimize the risk of productivity loss through timely detection and appropriate intervention.
On the other hand, moderate stress induced in plants can improve the quality of yields, as supplying excess irrigation can lead to loss of nutrients in soil [4]. A moderate level of water stress applied during a suitable growth period of mandarin and peach cultivation can enhance fruit quality parameters without significantly compromising yield [5]. Regulated deficit irrigation (RDI) is the key technology that can help improve plant water use efficiency while reducing the staggering amount of agricultural water utilization [6]. Knowledge of plant water stress assessment methods is therefore highly significant for precision control of irrigation so as not to induce unrecoverable stress [7]. Moreover, to some extent, assessing the water status in plants could even prevent wildfire [8], help in monitoring infested trees which exhibit water-stress-like symptoms [9] and become a valuable tool for breeding water-stress-resistant plants [10].
There are many factors that can inflict water stress in plants. Soil condition, in particular the moisture level, is directly correlated with the water available for plant consumption. Water stress occurs as a result of an imbalance between water availability and water consumption, which has led to soil moisture being a widely used indicator of water stress in plants [11,12]. Environmental factors such as air temperature, relative humidity and solar radiation can influence plant transpiration and soil water evaporation, resulting in low water content in plants. Hence, several methods of plant water stress assessment are based on ecological measurements [13]. Recently, plant physiological responses to limited water conditions have been used as indicators of stress and are considered more sensitive than soil moisture [14]. Kramer [15] has argued that plant water stress assessment should be performed directly on the plant, as growth is directly affected by the tissue water condition and only indirectly by soil water deficit. However, most of the techniques used to measure the responses in plants are destructive, time consuming, and labour intensive.
Advances in sensing technologies have truly revolutionized the methods for plant water stress assessment as illustrated in Figure 1, allowing rapid, automated and cost-efficient soil-plant ecophysiology monitoring [16]. These days, environmental sensors have become inexpensive owing to the popularity of IoT devices that can measure agricultural data in real-time. Various state-of-the-art remote sensing (RS) techniques have also been introduced for fast and non-invasive plant water status estimation [17]. Alvino and Marino [18] have recently reported on several RS approaches for plant water stress assessment in the cultivated environment for irrigation management. Numerous platforms and sensor types have been used, providing large volumes of heterogeneous data analysed using a range of modelling approaches.
Machine learning (ML) has rapidly become the standard for data processing, especially in agriculture, thanks to its ability to process large amounts of information in a non-linear framework [19,20]. Despite the significant achievements of ML applications, the technique has a fundamental limitation in that performance is subject to the features used, the quality of data collected and the specific targeted application [21].
Deep learning (DL), an advanced branch of ML, has been widely investigated for big data analysis in remote sensing [22-24] and computer vision [25] applications. DL has recently attracted considerable interest in the agricultural field, for example in plant disease detection and yield prediction; however, its application to plant water stress assessment is still quite new and research has been limited [26].
Several review papers on DL application in plant stress phenotyping have been published [27-29], showing the high potential of the technique; however, these papers have mainly focused on leaf-based biotic stress detection from the image processing point of view. This review aims to cover comprehensively the major sub-approaches to plant water stress assessment that use DL. Furthermore, the future prospects of DL application in the field of plant water stress assessment, and its challenges, are also presented.
The structure of this paper is as follows: the review methodology is presented in Section 2 while Section 3 provides the background of deep learning network architecture. Section 4 presents the implemented approach of recent DL sensor fusion work based on the planning protocol established in Section 2, as well as the performance metrics when available. Section 5 discusses the challenges as well as the future prospects in the application domain, followed by the conclusion in Section 6.

Review Methodology
A comprehensive and systematic review approach is widely used to identify, evaluate and interpret relevant research on a specific issue, area or phenomenon of interest. Such a review surveys research with the same scope, critically evaluates the methodologies used and, where possible, brings the studies together in a meta-analysis.

Literature Review Planning Protocol
This paper considers the following planning protocol for the review:

Execution
The search for papers was methodically performed to identify studies that are related to the scope of this work using academic databases such as Science Direct, IEEE Xplore, Springer, Wiley, Taylor & Francis, MDPI, Google Scholar, and other Scopus indexed journals and conference proceedings. Combinations of the main keyword "deep learning" with keywords related to plant water stress assessment parameters such as "plant water status", "soil moisture" and "evapotranspiration" were used. More than 100 research articles were found, most of them published from 2016 onwards. After removing all duplicates, article selection was done with emphasis placed on peer-reviewed articles from reputable journals and conference papers. Papers that relate to deep learning but are not applied to plant water stress assessment, or that merely make proposals without presenting any experimental results, were excluded. The remaining papers were carefully read and analysed according to the protocol established in Section 2.1. Section 4 summarizes all the selected studies.

Background on Deep Learning Network Architecture
Deep learning's superiority in performance against conventional machine learning was first demonstrated by Hinton and Salakhutdinov [30]. Deep learning can be described as a model that represents non-linear processing consisting of multiple layers of artificial neural networks (ANN). The method of learning used can be supervised or unsupervised, with feature representation in the form of high-level abstraction [31]. Schmidhuber [32] described a Deep Neural Network (DNN) as having more than one hidden layer with thousands of units of neurons making small and simple calculations. The increased structural complexity means DL can process large-scale and complex data with more learning capacity to characterize input and target data. In modern agriculture, where the use of wireless sensor networks (WSN) [33] is prevalent, the huge amount of data produced makes DL well suited and usually leads to better performance.
The conventional machine learning method often requires specifically defined preprocessing to obtain features as input variables. In plant science, this involves manually extracting plant features such as colour (e.g., RGB colour ratio) [34], texture (e.g., grey-level co-occurrence matrix (GLCM)) [35], spectral reflectance (e.g., vegetation indices (VI)) [36] and thermal radiation (e.g., crop water stress index (CWSI)) [37] for analytical processing. Processing the information is a labour-intensive and time-consuming endeavour which depends highly on expert knowledge. Feature selection is also required to eliminate low-quality variables and reduce the dimension of the input. Hendrawan and Murase [35] developed several algorithms dedicated to the feature selection process for water stress detection in Sunagoke moss. Deep learning can reduce the need for hand-engineered feature extraction and selection, which allows for automatic transformation for a faster analysis, providing end-to-end data processing.
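As a minimal illustration of the hand-engineered features mentioned above, the sketch below computes one vegetation index and one thermal index. It assumes the commonly used definitions NDVI = (NIR - Red) / (NIR + Red) and the empirical CWSI = (Tc - Twet) / (Tdry - Twet); the function names and sample values are illustrative assumptions and are not taken from the cited studies.

```python
def ndvi(nir: float, red: float) -> float:
    """Normalized Difference Vegetation Index from NIR and red reflectance."""
    total = nir + red
    if total == 0:
        raise ValueError("NIR + red reflectance must be non-zero")
    return (nir - red) / total


def cwsi(t_canopy: float, t_wet: float, t_dry: float) -> float:
    """Empirical crop water stress index from canopy temperature and
    wet/dry reference temperatures (all in the same unit), clipped to [0, 1]."""
    if t_dry <= t_wet:
        raise ValueError("dry reference must be warmer than wet reference")
    value = (t_canopy - t_wet) / (t_dry - t_wet)
    return min(1.0, max(0.0, value))


# Hypothetical readings for a single plot (assumed values).
plot_ndvi = ndvi(nir=0.5, red=0.1)
plot_cwsi = cwsi(t_canopy=30.0, t_wet=25.0, t_dry=35.0)
```

Each index condenses a raw sensor signal into a single expert-defined number, which is the manual step that end-to-end DL models aim to replace.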
DL is now easily usable by many communities, including plant science. There are many supportive materials available on deep learning applications. The availability of user-friendly DL libraries, together with easy-to-learn tutorials [38] in popular programming languages [39], has propelled the wide adoption of the method. Platforms such as Google Colab, with GPU-backed notebooks, have made online software development and teaching much simpler [40,41]. There are also many tools provided by the DL community that could be useful in the study of plant stress phenotyping [42]. In addition, much research has been conducted to understand more about the DL model and why its performance is exceptional [43]. The problem of limited data sources has also slowly been resolved as many research groups have shared their datasets with others [44,45].

Deep Belief Network
The deep belief network (DBN) is arguably the first successfully trained DL model. The architecture is constructed of stacked Restricted Boltzmann Machines (RBMs) [46]. A single RBM is an energy-based probabilistic generative model that consists of two connected layers, visible and hidden, with no connections within a layer. The hidden layers are regarded as higher-order features that capture the characteristics of the input data. In a multi-layer DBN, the output of the preceding RBM is used as input for the following RBM (Figure 2), allowing for deeper feature extraction and dimension reduction. The training process is done in two steps: first, the deep network weights are initialized using an efficient, fast, greedy unsupervised learning strategy; then the final fine-tuning of weights takes place in a supervised manner through backpropagation, with the addition of a linear classifier on the top layer of the DBN.
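The greedy layer-wise step described above can be sketched as follows. This is a toy illustration under stated assumptions: it uses binary units, one-step contrastive divergence (CD-1) as the unsupervised update, and omits the supervised fine-tuning stage entirely. The class, function names and the tiny dataset are hypothetical and do not reproduce any reviewed implementation.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class RBM:
    """Tiny Bernoulli RBM trained with one-step contrastive divergence (CD-1)."""

    def __init__(self, n_visible, n_hidden, rng):
        self.rng = rng
        self.w = [[rng.gauss(0.0, 0.1) for _ in range(n_hidden)]
                  for _ in range(n_visible)]
        self.b_vis = [0.0] * n_visible
        self.b_hid = [0.0] * n_hidden

    def hidden_probs(self, v):
        return [sigmoid(self.b_hid[j] + sum(v[i] * self.w[i][j] for i in range(len(v))))
                for j in range(len(self.b_hid))]

    def visible_probs(self, h):
        return [sigmoid(self.b_vis[i] + sum(h[j] * self.w[i][j] for j in range(len(h))))
                for i in range(len(self.b_vis))]

    def cd1_update(self, v0, lr=0.1):
        # Positive phase: hidden activations driven by the data.
        h0 = self.hidden_probs(v0)
        h0_sample = [1.0 if self.rng.random() < p else 0.0 for p in h0]
        # Negative phase: one reconstruction step.
        v1 = self.visible_probs(h0_sample)
        h1 = self.hidden_probs(v1)
        for i in range(len(v0)):
            for j in range(len(h0)):
                self.w[i][j] += lr * (v0[i] * h0[j] - v1[i] * h1[j])
            self.b_vis[i] += lr * (v0[i] - v1[i])
        for j in range(len(h0)):
            self.b_hid[j] += lr * (h0[j] - h1[j])


def pretrain_dbn(data, layer_sizes, epochs=20, seed=0):
    """Greedy pretraining: each RBM learns from the hidden output of the previous one."""
    rng = random.Random(seed)
    rbms, layer_input = [], data
    for n_hidden in layer_sizes:
        rbm = RBM(len(layer_input[0]), n_hidden, rng)
        for _ in range(epochs):
            for v in layer_input:
                rbm.cd1_update(v)
        rbms.append(rbm)
        layer_input = [rbm.hidden_probs(v) for v in layer_input]
    return rbms, layer_input


# Hypothetical binary inputs with two obvious patterns.
data = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]
rbms, features = pretrain_dbn(data, layer_sizes=[3, 2])
```

The returned `features` are the top-layer hidden activations; in a full DBN these would feed the classifier or regressor that is then fine-tuned with backpropagation.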

Convolutional Neural Network
Convolutional Neural Network (CNN) is a type of feedforward deep learning model most commonly used for two-dimensional input data, such as an image. It is the most popular model in computer vision applications, with growing popularity in agriculture [47]. The typical architecture is hierarchical and consists of several convolutional, pooling and fully connected layers as presented in Figure 3. Convolutional layers act as feature extractors from the input image, employing kernels or filters. The output dimensionality is then reduced by the pooling layers, which extract the most significant features through max pooling or average pooling. The last fully connected layers usually act as classifiers through the softmax function. There are many popular architectures, such as AlexNet [48] and Visual Geometry Group (VGG) [49], with different applications and advantages. Recently, 3D CNN has attracted attention in various vision-based applications utilizing image sequences, such as video data [50].
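The layer sequence described above (convolution, pooling, fully connected layer, softmax) can be traced on a toy input. The sketch below is a forward pass only, with a single hand-set kernel and hand-set dense weights chosen purely for illustration; no training is performed and none of the values come from the reviewed studies.

```python
import math


def conv2d_valid(image, kernel):
    """Valid 2-D cross-correlation (the operation DL libraries call convolution)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]


def relu(fmap):
    return [[max(0.0, x) for x in row] for row in fmap]


def max_pool(fmap, size=2):
    """Non-overlapping max pooling with a square window."""
    return [[max(fmap[r + i][c + j] for i in range(size) for j in range(size))
             for c in range(0, len(fmap[0]) - size + 1, size)]
            for r in range(0, len(fmap) - size + 1, size)]


def dense(features, weights, biases):
    return [sum(f * w for f, w in zip(features, ws)) + b
            for ws, b in zip(weights, biases)]


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Toy 5x5 "image" with a vertical edge, and a kernel that responds to it.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
edge_kernel = [[-1, 1], [-1, 1]]

pooled = max_pool(relu(conv2d_valid(image, edge_kernel)))
flat = [x for row in pooled for x in row]
# Hand-set weights for two hypothetical classes.
scores = dense(flat, [[0.5, 0.0, 0.5, 0.0], [0.0, 0.5, 0.0, 0.5]], [0.0, 0.0])
probs = softmax(scores)
```

In a trained CNN the kernels and dense weights are learned from data, and many kernels are applied in parallel at each convolutional layer.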

Long Short-Term Memory Network
Introduced by Hochreiter and Schmidhuber [51], the Long Short-Term Memory (LSTM) network is an augmented version of the Recurrent Neural Network (RNN) used for time series data. RNNs were commonly used for discrete sequence analysis, behaving like a deep feedforward network when unrolled over time; however, they struggle to store information over long intervals. LSTM was constructed by adding a memory cell (Figure 4) to overcome the vanishing gradient problem that arises when RNN weights are trained using backpropagation through time (BPTT) [52]. The cell state allows information to pass through the forget gate layer, input gate layer and output gate layer, enabling the recurrent unit to capture long-term information at different time scales. The ability of LSTM to preserve and learn from earlier information in long time series has been demonstrated in natural language processing [53] and speech recognition [54].
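The role of the three gates can be illustrated with a single LSTM step. The sketch below assumes the standard gate formulation with scalar input and scalar state for readability; the parameter dictionary and its values are hypothetical and are set by hand rather than learned.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, params):
    """One LSTM time step with scalar input, hidden state and cell state.

    params maps each gate name to (input weight, recurrent weight, bias).
    """
    def gate(name, activation):
        w_x, w_h, b = params[name]
        return activation(w_x * x + w_h * h_prev + b)

    f = gate("forget", sigmoid)        # how much of the old cell state to keep
    i = gate("input", sigmoid)         # how much new information to write
    g = gate("candidate", math.tanh)   # the new information itself
    o = gate("output", sigmoid)        # how much of the cell state to expose
    c = f * c_prev + i * g
    h = o * math.tanh(c)
    return h, c


# Hand-set parameters: forget gate nearly open, input gate nearly closed,
# so the cell should hold its initial content across many steps.
remember = {
    "forget": (0.0, 0.0, 10.0),
    "input": (0.0, 0.0, -10.0),
    "candidate": (1.0, 0.0, 0.0),
    "output": (0.0, 0.0, 10.0),
}
h, c = 0.0, 0.7
for _ in range(50):
    h, c = lstm_step(1.0, h, c, remember)
```

Because the cell state is updated additively and scaled only by the forget gate, information written early in a sequence can persist over many steps, which is the property exploited for long time series.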

Deep Learning in Plant Water Stress Assessment
DL applications were divided into sub-categories via the main methods used in plant water stress assessment including soil moisture estimation, soil water modelling, evapotranspiration estimation, evapotranspiration forecasting, plant water status estimation, and plant water stress identification. The DL techniques used in the reviewed literature are discussed below.

Soil Moisture Estimation
Soil moisture (SM) is an important parameter for assessing plant water stress as it is directly related to water availability for plant consumption. Over the years, SM measurement has been used for agricultural drought monitoring and irrigation control [55]. It is critical that the percentage of SM does not approach the permanent wilting point, which is the lower limit of plant-available water [56]. The traditional method of sampling soil at a few points for moisture analysis is time-consuming and does not accurately reflect conditions across the whole field. In situ measurement, on the other hand, is not feasible for large field application due to the high cost of sensor installation [14]. Advanced remote sensing techniques offer low-cost, wide-scale monitoring. DL techniques have been used for fast and accurate SM estimation from several remote sensing platforms.
Several studies have used DL to analyse aerial images, which can be easily captured using an unmanned aerial vehicle (UAV) or drone, for SM estimation. Sobayo et al. [57] proposed a CNN-based regression model to estimate SM content from aerial thermal images captured over three different farm areas. The model was able to predict SM content more accurately than the plain DNN model. The technique shows promise; however, adequate local ground measurement data at the scale of the captured images are currently lacking because of cost and labour. Tseng et al. [58] tried to overcome this shortcoming by developing an image simulator that can generate pseudo-real plant images from the available SM dissipation rate data. The CNN proposed in the study produced less error in predicting the soil moisture dissipation rate from the simulated test images than the traditional ML methods. It is also worth noting that the CNN method was robust against the noise introduced into the simulated images.
Large scale SM monitoring has been explored based on satellite observation. DL has been proposed as an alternative to conventional physically-based models developed for SM estimation using satellite data, which often perform less well due to limited processing capacity. Zhang et al. [59] used DNN for upscaling in-situ soil moisture estimation using Visible Infrared Imaging Radiometer Suite (VIIRS) raw data records (RDR). The DL model was able to achieve better accuracy than Soil Moisture Active Passive (SMAP) active sensor products and the Global Land Data Assimilation System (GLDAS) model. In another study, Lee et al. [60] also employed DNN to estimate soil moisture over the Korean peninsula using surface and thermal variables from satellite. Compared with Advanced Microwave Scanning Radiometer-2 (AMSR2) and GLDAS SM products, the model showed excellent agreement with the ground measured data.
Wang et al. [61] used DBN to extract features from Fengyun-3D (FY-3D) Medium Resolution Spectral Imager-II (MERSI-II) imagery to estimate soil moisture in the Ningxia Hui Autonomous Region of China. The developed model, called SM-DBN, outperformed the conventional linear regression (LR) and ANN models in terms of agreement with the actual ground measurement data. Ge et al. [62] compared the performance of CNN and conventional ANN for estimating in-situ SM using satellite data of L-band Soil Moisture and Ocean Salinity (SMOS) brightness temperature (TB), C-band Advanced Scatterometer (ASCAT) backscattering coefficients, the Moderate Resolution Imaging Spectroradiometer (MODIS) Normalized Difference Vegetation Index (NDVI), and soil temperature. The results showed that CNN outperformed conventional ANN, with better correlation with in-situ soil moisture measurement.
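The comparisons above rest on agreement between estimated and ground-measured soil moisture, typically expressed as an error measure and a correlation. As a minimal sketch, assuming root mean square error (RMSE) and the Pearson correlation coefficient as the two metrics, this can be computed as follows; the sample values are invented for illustration and do not come from any reviewed study.

```python
import math


def rmse(predicted, observed):
    """Root mean square error between paired predictions and observations."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return math.sqrt(squared / len(predicted))


def pearson_r(predicted, observed):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    return cov / math.sqrt(var_p * var_o)


# Hypothetical volumetric soil moisture values (m3/m3).
model_sm = [0.20, 0.25, 0.30]
ground_sm = [0.22, 0.25, 0.27]
error = rmse(model_sm, ground_sm)
correlation = pearson_r(model_sm, ground_sm)
```

Reporting both metrics is useful because a model can correlate well with ground data while still carrying a systematic bias that only the error term reveals.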

Soil Water Modelling
Soil water content can be affected by many factors including soil properties, climatic changes and plant growth dynamics; hence it is challenging to determine the variation in soil water content accurately and in a timely manner. Modelling the soil water distribution using collected data can help predict soil moisture conditions a few days in advance. This will not only support the planning of an appropriate irrigation schedule, but can also reduce the impact of drought conditions [63]. However, modelling soil water distribution is not an easy task due to the complex hydrological nature of soil and the correlation with plant and environmental variables. Efficient algorithms are necessary to deal with these complex nonlinear characteristics.
DL has played an important and increasing role in soil water modelling using soil, plant and environmental measurements and has shown better performance than conventional ML models. Song et al. [64] proposed DBN as a feature extraction method combined with the macroscopic cellular automata (MCA) model to simulate spatio-temporal soil hydrological changes in an irrigated corn field using several determining environmental parameters. In comparison to the conventional Multilayer Perceptron (MLP), DBN showed better results for predicting soil water content, reducing the error by 18%.
Cai et al. [65] used a DNN-based regression model to predict soil moisture at a depth of 20 cm using selected meteorological variables and soil water content data. Comparison with conventional models such as linear regression, support vector machine (SVM), and ANN showed that the DNN predictions correlated more strongly with the actual data.
Yu et al. [66] proposed a hybrid convolutional neural network-gated recurrent unit (CNN-GRU) for predicting soil moisture in the maize root zone using soil moisture content and meteorological variables from five different cultivation areas as inputs. The validation results showed that CNN-GRU performs better than the CNN and GRU models alone. Over an extended forecasting period, the soil moisture predicted by the proposed models was comparable to the sensor-based soil moisture measurements.
In another study, Yu et al. [67] further investigated the application of DL models and proposed a combination of a CNN-based ResNet and a Bidirectional Long Short-Term Memory (BiLSTM) model for soil water content prediction at four different depths, using integrated meteorological, soil water content and growth stage records for training. The deep learning model had better prediction accuracy than the traditional machine learning models of support vector regression (SVR), MLP and random forest (RF).
DL has also been used for large scale soil moisture prediction based on satellite data such as SMAP. Fang et al. [68] developed an LSTM model that predicts the SMAP level-3 moisture product with atmospheric forcing, model-simulated moisture, and static physiographic attributes as inputs. The results showed that LSTM outperformed the conventional methods of multiple linear regression (MLR), autoregressive models (AM) and one-layer ANN. In a more recent study, Fang and Shen [69] further improved the work by introducing a novel data integration (DI) kernel to assimilate the most recent SMAP observations for near-real-time forecasting of the SM product. Compared with the original LSTM model, the DI-LSTM gave less error between the forecasted values and the actual measurements.

Evapotranspiration Estimation
Evapotranspiration (ET) corresponds to atmospheric water loss due to plant transpiration and soil evaporation. Crop evapotranspiration (ETc) specifically provides understanding of how fluctuations in climatic parameters can affect plant water consumption under the active phase without restriction from nutritional supply [70]. Measuring field ETc conventionally using a lysimeter is expensive, complex and labour-intensive [71]. The FAO-56 [72] model is the well-established method for ETc calculation based on energy and water balance schemes at field scale, shown in Equation (1):
$$ET_c = K_c \times ET_o \qquad (1)$$
where Kc is the crop coefficient and ETo is the reference evapotranspiration. ETo incorporates the effects of weather into the ETc estimate, whilst the properties of the crop which affect ETc are quantified by Kc [73]. ETo is traditionally calculated using the Penman-Monteith (PM) equation [72] based on measurements from meteorological sensors. However, the method is limited by its need for a large number of input variables.
Deep learning has been used for faster estimation of ETo with limited meteorological variables as inputs. Saggi and Jain [74] used optimized, processed meteorological data as input to a DNN for estimating evapotranspiration values in Punjab, Northern India. Comparison with the standard methods of Generalized Linear Model (GLM), RF, and Gradient-Boosting Machine (GBM) showed DNN's better performance, providing closer agreement with the PM-based ETo while avoiding the overfitting issue. In another study, Ferreira and da Cunha [75] showed that a CNN model performed well when used to estimate daily ETo directly from limited hourly meteorological data, compared to other models such as RF, extreme gradient boosting (XGBoost), and ANN.
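To make the FAO-56 relation and the data demands of the PM equation concrete, the sketch below computes daily ETo with the standard FAO-56 Penman-Monteith form and then scales it by a crop coefficient. The function names, the sample weather values and the Kc value are illustrative assumptions; real applications would derive net radiation and vapour pressures from station data following FAO-56.

```python
import math


def saturation_vapour_pressure(t_c):
    """Saturation vapour pressure (kPa) at air temperature t_c (deg C)."""
    return 0.6108 * math.exp(17.27 * t_c / (t_c + 237.3))


def reference_et_pm(t_mean, rn, g, u2, es, ea, pressure=101.3):
    """Daily FAO-56 Penman-Monteith reference evapotranspiration (mm/day).

    t_mean: mean air temperature (deg C); rn, g: net radiation and soil heat
    flux (MJ m-2 day-1); u2: wind speed at 2 m (m/s); es, ea: saturation and
    actual vapour pressure (kPa); pressure: atmospheric pressure (kPa).
    """
    # Slope of the saturation vapour pressure curve (kPa per deg C).
    delta = 4098.0 * saturation_vapour_pressure(t_mean) / (t_mean + 237.3) ** 2
    # Psychrometric constant (kPa per deg C).
    gamma = 0.000665 * pressure
    numerator = (0.408 * delta * (rn - g)
                 + gamma * 900.0 / (t_mean + 273.0) * u2 * (es - ea))
    denominator = delta + gamma * (1.0 + 0.34 * u2)
    return numerator / denominator


def crop_et(kc, eto):
    """Crop evapotranspiration from Equation (1): ETc = Kc x ETo."""
    return kc * eto


# Assumed inputs for a mild summer day; Kc = 1.15 is a hypothetical mid-season value.
eto = reference_et_pm(t_mean=16.9, rn=13.28, g=0.0, u2=2.078,
                      es=1.997, ea=1.409, pressure=100.1)
etc = crop_et(1.15, eto)
```

Even this compact form needs temperature, radiation, wind and humidity inputs, which is why the DL studies above focus on estimating ETo from a reduced set of variables.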
Afzaal et al. [76] used LSTM and BiLSTM for estimating ETo using air temperature and relative humidity as the only variables. Both DL models showed high accuracy compared to the actual ETo, with little difference in performance between the two models. Chen et al. [77] evaluated the performance of three DL methods, namely DNN, temporal convolution network (TCN) [78], and LSTM, for reference evapotranspiration estimation. The results showed that the DL models outperformed other conventional models, and that TCN and LSTM showed outstanding performance when temperature features were used.
Recently, remote sensing has been used for large scale ET estimation, providing high spatial and temporal resolution. Several models and algorithms, such as Mapping Evapotranspiration at High Resolution with Internalized Calibration (METRIC), the Two-Source Energy Balance (TSEB) model, and machine learning, have been used and discussed [79]. More recently, DL has been introduced to improve ET estimation based on remotely sensed data. Cui et al. [80] developed a gap-filling algorithm based on the temperature surface-vegetation index (TSVI) model to provide continuous daily actual ET estimation at regional scale. Results showed good correlation between the ET estimated by the TSVI-DNN model and ground observations. García-Pedrero et al. [81], for the first time, introduced CNN for estimating spatially distributed ET using remote sensing data without the need for a surface energy model. Comparison with data from METRIC showed that the CNN produced satisfactory ET maps.
At individual plant scale, stress variation is expressed directly in plant transpiration (T). Measuring plant transpiration can give a better understanding of the relation between plant stress and the environment. However, direct measurement of T by sap flow is expensive, difficult and time-consuming. Several studies have used DL for plant transpiration estimation using sensor-measured environmental variables. Shuaishuai et al. [82] utilized DBN and least square support vector machine (LSSVM) to directly predict the leaf transpiration rate of strawberry in a greenhouse environment. The DBN-LSSVM model outperformed two conventional models with higher prediction accuracy, suggesting that DBN is an efficient feature extractor. Fan et al. [83] compared several ML algorithms, namely SVM, XGBoost and ANN, with DNN for estimating maize transpiration using meteorological variables, soil water and leaf area index as inputs. The results showed that DNN outperformed the other techniques by a slight margin.

Evapotranspiration Forecasting
ET forecasting can bring important insights into future climate conditions, which can be beneficial in assessing plant water stress and in irrigation planning. It can also compensate for gaps in the data that arise when fewer sensors are installed because of high cost and limited manpower. Recently, the ability of DL to consider temporal features in the data has been utilized for accurate ET forecasting. Ferreira and da Cunha [84] proposed a CNN-LSTM combination for multi-step-ahead prediction of daily ETo, achieving relatively good performance in comparison with traditional ML models and with CNN and LSTM alone. Lucas et al. [85] also explored the use of three different CNN architectures in the prediction of ETo time series. Using real climatic data, the models were able to predict daily ETo with lower uncertainty. Yin et al. [86], on the other hand, considered a hybrid BiLSTM to forecast short-term daily ETo from meteorological data collected in the semi-arid region of central Ningxia, China. The results showed that the model consistently improved forecast performance compared to the general model.
DL has also been investigated as a highly capable technique for ETc forecasting based on historical field measurement data. Chen et al. [87] estimated ETc for maize using TCN and two years of field ETc measurements obtained with a lysimeter. The new form of DL, comprising LSTM and DNN, was able to estimate maize ETc based on plant height, air temperature, relative humidity, solar radiation, leaf-area index and soil temperature better than the DNN and LSTM methods alone. Elbeltagi et al. [88] used the DNN model to estimate and predict ETc for several locations of wheat cultivation in Egypt based on recorded historical and future meteorological data. The results showed high correlation between ETc values from the standard FAO-56 method and the values predicted by the DL model.
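Multi-step-ahead forecasting of the kind described above is usually framed as a supervised problem by slicing the series into past and future windows. The sketch below shows one common way of doing so; the function name, window lengths and the short ETo series are illustrative assumptions, and the reviewed studies may prepare their data differently.

```python
def make_windows(series, n_in, n_out):
    """Pair each window of n_in past values with the n_out values that follow it."""
    if n_in <= 0 or n_out <= 0:
        raise ValueError("window lengths must be positive")
    pairs = []
    for start in range(len(series) - n_in - n_out + 1):
        past = series[start:start + n_in]
        future = series[start + n_in:start + n_in + n_out]
        pairs.append((past, future))
    return pairs


# Hypothetical daily ETo values (mm/day).
daily_eto = [3.1, 3.4, 3.9, 4.2, 4.0, 3.6]
samples = make_windows(daily_eto, n_in=3, n_out=2)
```

Each pair then serves as one training example: the past window is the model input and the future window is the multi-step target for a CNN, LSTM or hybrid forecaster.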

Plant Water Status Estimation
Many techniques have been developed over the years to describe plant water status (PWS), such as relative water content (RWC) [89], equivalent water thickness (EWT) and fuel moisture content (FMC) [90]. Other techniques are based on changes in the plant physical system as a response to water content, such as stem diameter variations (SD) [91,92]. These parameters have been used as indicators for assessing plant conditions related to drought (as a water stress index) and for irrigation management [93]. Recent progress has seen remote sensing measurements used to estimate plant water status non-destructively, with various modelling methods used to correlate the measurements with ground-measured plant water status, ranging from simple linear regression to machine learning techniques [94][95][96].
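For reference, the three water status parameters named above can be computed from leaf weights as sketched below. The formulas are the commonly used gravimetric definitions and are given here as an assumption, since the text does not state them; the function names and the sample weights are hypothetical.

```python
def relative_water_content(fresh_g: float, turgid_g: float, dry_g: float) -> float:
    """RWC in percent: leaf water relative to the water held at full turgor."""
    if turgid_g <= dry_g:
        raise ValueError("turgid weight must exceed dry weight")
    return 100.0 * (fresh_g - dry_g) / (turgid_g - dry_g)


def equivalent_water_thickness(fresh_g: float, dry_g: float, area_cm2: float) -> float:
    """EWT in g/cm^2: mass of leaf water per unit leaf area."""
    if area_cm2 <= 0:
        raise ValueError("leaf area must be positive")
    return (fresh_g - dry_g) / area_cm2


def fuel_moisture_content(fresh_g: float, dry_g: float) -> float:
    """Fuel moisture content in percent: mass of water per unit dry mass."""
    if dry_g <= 0:
        raise ValueError("dry weight must be positive")
    return 100.0 * (fresh_g - dry_g) / dry_g


# Hypothetical leaf sample (illustrative values only).
fresh, turgid, dry, area = 0.80, 1.00, 0.20, 20.0

rwc = relative_water_content(fresh, turgid, dry)
ewt = equivalent_water_thickness(fresh, dry, area)
fmc = fuel_moisture_content(fresh, dry)
print(round(rwc, 2), round(ewt, 4), round(fmc, 2))
```

These ground-measured values are the regression targets that the remote sensing models are trained to reproduce.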
DL models are currently being investigated as a fast and robust method for PWS estimation using aggregated data collected from various sensory measurements. Fariñas et al. [97] evaluated CNN and random forest (RF) as regression models to estimate RWC in plant leaves from the transmission coefficient of frequency data collected using non-contact resonant ultrasound spectroscopy (NC-RUS). The CNN utilizes the entire ultrasonic spectra collected non-invasively from the leaf, while RF takes only four relevant input spectra, resulting in better performance of the CNN model, as shown by the higher correlation and lower median prediction error.
DL has also been used to extract features from multi-modal data to increase inversion performance. Kaneda et al. [98] combined environmental measurements with leaf wilting features extracted from plant images, using a CNN-based model as a multi-modal feature extractor, to predict changes in stem diameter and thereby assess water status in tomato plants. Comparison with several conventional regression algorithms showed the superiority of the proposed model due to its low prediction error. The proposed method, however, did not consider the temporal aspect of the data, which could further increase the prediction accuracy. More recently, Wakamori et al. [99] improved the existing work by proposing an LSTM network with clustering-based drop (C-Drop). The C-Drop neural network supports the regression analysis by giving the environmental features equal consideration. Based on the results, the newly proposed model improved the prediction accuracy by 21% compared to the previous method.
Plant spectral information has been used to estimate PWS based on vegetation indices such as the water index (WI) and the normalized difference water index (NDWI) [100]. Moreover, ML has been widely investigated as a means of finding the relevant spectral bands that give high correlation with the PWS parameters [90,101,102,103,104]. DL, however, has hardly been applied to this problem. Rehman et al. [105] were possibly the first to have used a 1D-convolutional neural network to extract the mean spectral reflectance from the hyperspectral image to predict RWC in maize. The model, called DeepRWC and based on an updated Inception module, was able to learn the features from hyperspectral data without the need for spectral selection or dimensionality reduction. DeepRWC achieved better accuracy in comparison with the two standard approaches of partial least square regression (PLSR) and SVM.
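The two indices mentioned above are simple band ratios, which is what makes them a useful baseline against learned spectral features. The sketch below assumes the commonly cited formulations, WI as the ratio of reflectance at 900 nm to that at 970 nm and NDWI as the normalized difference of reflectance at 860 nm and 1240 nm; the band choices in [100] may differ, and the reflectance values are hypothetical.

```python
from typing import Dict


def water_index(reflectance: Dict[int, float]) -> float:
    """WI = R900 / R970 (assumed formulation)."""
    return reflectance[900] / reflectance[970]


def ndwi(reflectance: Dict[int, float]) -> float:
    """NDWI = (R860 - R1240) / (R860 + R1240) (assumed formulation)."""
    nir, swir = reflectance[860], reflectance[1240]
    return (nir - swir) / (nir + swir)


# Hypothetical leaf reflectance keyed by wavelength in nm (illustrative only).
spectrum = {860: 0.45, 900: 0.46, 970: 0.42, 1240: 0.30}

wi_value = water_index(spectrum)
ndwi_value = ndwi(spectrum)
print(round(wi_value, 4), round(ndwi_value, 4))
```

An index of this kind uses two hand-picked bands, whereas a 1D convolutional model takes the full reflectance curve as input and learns which regions matter.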
In another study, Rao et al. [106] integrated a physical model with RNN to accurately predict live fuel moisture content (LFMC), defined as the mass of canopy leaf water per unit of dry biomass, which is a key parameter for assessing wildfire risk. The model takes input variables from microwave remote sensing of Sentinel-1 backscatter and Landsat-8 optical reflectance, with additional soil properties, for training on LFMC field samples. They found that the DL model performance exceeded that of the conventional process-based methods and that precision can be improved with the addition of the microwave RS data.

Plant Water Stress Identification
Plant water stress identification refers to the detection of water stress by distinguishing water-stressed from non-stressed plants in labelled data. Computer vision systems for plant water stress identification have been developed over the years using conventional ML algorithms such as ANN [107], adaptive neuro-fuzzy classifier [108], and gradient boosting decision tree (GBDT) [109]. However, the need for manual feature extraction from the images and for selection of a suitable conventional ML algorithm has limited the performance of the analysis.
The superiority of DL in image recognition, which comes from automatically learning patterns, has been leveraged in plant water stress identification, with CNN becoming the standard model for automated feature extraction and transformation. An et al. [110] were possibly the first to attempt to identify plant water stress in maize using the pre-trained CNNs ResNet50 and ResNet120, based on three stress treatments: optimum moisture, light drought, and moderate drought stress. The models achieved between 91% and 98% accuracy, with the fastest training time close to 8 min. Compared to manually extracted features and classification using the conventional GBDT model, CNN performed better, with ResNet50 showing the highest accuracy.
Jiang et al. [111] attempted to improve the performance of the identification by introducing a Gabor filter to extract texture features from the same dataset used in [110] for the Zhengdan maize variety, reducing the dimension before feeding the features to the pre-trained CNN model AlexNet. The results showed only a slight improvement in model accuracy but good adaptation to illumination changes and angle transformation. Zhuang et al. [112] further investigated the Zhengdan maize dataset by designing their own CNN architecture, which was relatively simple but efficient in feature extraction, by selecting the most relevant feature maps. The proposed method used SVM as the final classifier, and the results showed better accuracy in comparison with the other well-known CNN models of VGG16, ResNet50, Xception and xPLNet.
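As an illustration of the texture extraction step, a Gabor filter is a Gaussian envelope multiplied by a cosine carrier, oriented at a chosen angle. The sketch below builds one real-valued kernel from that standard definition; it is not the configuration used in [111], and the kernel size, parameter values and the helper `filter_response` are hypothetical.

```python
import math
from typing import List


def gabor_kernel(size: int, sigma: float, theta: float, wavelength: float,
                 gamma: float = 0.5, psi: float = 0.0) -> List[List[float]]:
    """Real part of a Gabor filter on a size x size grid (size must be odd).

    sigma: width of the Gaussian envelope; theta: orientation in radians;
    wavelength: period of the cosine carrier; gamma: spatial aspect ratio;
    psi: phase offset.
    """
    if size % 2 == 0:
        raise ValueError("size must be odd so the kernel has a centre pixel")
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates so the carrier runs along direction theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            carrier = math.cos(2 * math.pi * xr / wavelength + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


def filter_response(patch: List[List[float]], kernel: List[List[float]]) -> float:
    """Response of one image patch: sum of element-wise products."""
    return sum(p * k for prow, krow in zip(patch, kernel)
               for p, k in zip(prow, krow))


# Hypothetical parameters: 5x5 kernel oriented at 45 degrees.
kernel = gabor_kernel(size=5, sigma=2.0, theta=math.pi / 4, wavelength=4.0)
print(round(kernel[2][2], 3))
```

In practice a bank of such kernels at several orientations and wavelengths is applied, and the responses are used as texture features for the classifier.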
Chandel et al. [113] instead evaluated three popular CNN models, AlexNet, GoogLeNet and Inception V3, for plant water stress classification of maize, okra and soybean images collected at different growth stages. The performance of GoogLeNet was found to be superior to the others, with an accuracy of 98.3%, 97.5% and 94.1% for maize, okra and soybean, respectively.
Realizing the potential of CNN as an image classifier for plant water stress identification, recent studies have been assessing the applicability of such techniques in field conditions. Plant images were collected at certain heights, providing wide monitoring of multiple plants within the same cultivation area. Soffer et al. [114] used the pre-trained CNN VGG16 for real-time classification of water stress treatment of five different groups of corn. The proposed method used the image data concatenated with the plant images as an input to the DL model. The results showed excellent accuracy for the five treatment classifications, although most studies reported lower performance when dealing with a higher number of classes. Freeman et al. [115] assessed the use of cloud-based CNN training to identify plant water stress across six container-grown ornamental shrub species. Using near infrared images captured in the field from a small unmanned aircraft system (sUAS), the framework was able to achieve high accuracy despite the constraints of small sample size, low image resolution, and lack of clear visual differences.
Although CNN is the most commonly used model in image processing, the sequential characteristic of time-series images was not considered in the previous classification tasks. The temporal dependence relationship of a specific plant growth period may hold information that can be used for water stress identification. In consideration of the temporal feature, Li et al. [116] applied BiLSTM networks for the first time to extract features from sequential digital images of maize and sorghum. BiLSTM showed good performance for plant water stress identification compared to the vanilla CNN, LSTM and RNN models. They also found that BiLSTM can identify plant water stress at an early stage and under mild stress, where fewer variations can be noted in the plant. Table 1 compiles the highlighted methods, input data and models, with remarks, for the DL applications.

Discussion and Future Perspectives
Deep learning is a powerful and versatile tool that can be applied to solve a wide range of problems through either regression or classification analysis. Its superior performance compared to conventional ML makes it a suitable method for modelling highly complex plant water stress conditions. Remote sensing is another advanced method with much potential in plant water stress assessment, providing high spatial, temporal and spectral resolution. However, it has been noted that DL application in remote sensing plant water stress assessment is still at the initial stage of development. This is especially true in the case of ETc estimation. Many strategies can still be explored, and a few studies have started to pave the way in utilizing DL for solving problems in remote sensing applications related to ET estimation [80,81,117].
Remote sensing thermal and hyperspectral images have been extensively investigated for plant water stress assessment based on vegetation indices and the crop water stress index technique. However, challenges remain regarding the effect of plant geometric structure and image background, which could influence the analysis. RGB images have been proposed for remote sensing estimation of plant water content, utilizing morphological features (leaf area) [118,119] and colour [120] to estimate plant physiological response to water stress. They are preferable because the sensor is low-cost and easy to use [120]. DL has been successfully used for segmenting plant and background in RGB images for plant water stress detection with good results [121]. However, the lack of spectral bands, with only three basic colours in the image, limits detailed information about the status of the plant [122]. Khan et al. [123] proposed an innovative approach to using DL for predicting a vegetation index from an RGB aerial image. This development shows that it is possible to have a low-cost and efficient method for plant water stress assessment while improving agricultural production with the help of an advanced algorithm such as DL.
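For readers unfamiliar with the crop water stress index mentioned above, a minimal sketch of its empirical form is given here, assuming the common definition that normalizes canopy temperature between a wet (fully transpiring) and a dry (non-transpiring) reference. The function name, the clipping to the range 0 to 1 and the temperatures are assumptions for illustration.

```python
def crop_water_stress_index(t_canopy: float, t_wet: float, t_dry: float) -> float:
    """Empirical CWSI = (Tc - Twet) / (Tdry - Twet), clipped to [0, 1].

    0 indicates a fully transpiring canopy, 1 a non-transpiring canopy.
    All temperatures must be in the same unit (for example degrees Celsius).
    """
    if t_dry <= t_wet:
        raise ValueError("dry reference must be warmer than wet reference")
    cwsi = (t_canopy - t_wet) / (t_dry - t_wet)
    return min(1.0, max(0.0, cwsi))


# Hypothetical thermal image readings in degrees Celsius (illustrative only).
cwsi_value = crop_water_stress_index(t_canopy=30.0, t_wet=26.0, t_dry=34.0)
print(cwsi_value)
```

Background pixels such as warm soil bias the canopy temperature upwards, which is one reason the image background issue noted above matters for this index.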
The inherent bottleneck of the DL technique has always been the size and quality of the datasets available for training and validating the model. Data collected through remote sensing mostly involves images, and although satellite-based images are commercially available, data on aerial plant images under water stress is limited, leading most of the studies to use self-collected images while adopting transfer learning in their application. Transfer learning reuses a model pre-trained on one task as the starting point for a new application, so that the new model does not have to be trained from scratch. It has already been successfully used for solving the problem of small sample size with improved accuracy. For a robust model to be developed, a large variety of datasets is required, including data collected from different sensors and plant species.
Despite successful applications and superior performance relative to conventional ML methods, DL utilization in water stress assessment is still in its early phase and there is still a lot to explore. Some of the future prospects are discussed below, together with the possible key issues.

Deep Learning for 3-Dimensional Data
The combination of deep learning and 3D data has thus far been infrequently explored, despite the possibility of improving assessment. Most studies have only used either one- or two-dimensional data with the DL processing algorithm. Integration of data from soil, atmospheric and plant-based measurements would create a three-dimensional input that can improve the accuracy of assessment [14].
Furthermore, there are sensors that can collect three-dimensional data, which can provide more information for plant water stress assessment. For example, a hyperspectral image, which contains both spatial and spectral information, can be utilized for accurately assessing plant water status. Joint spectral and spatial information from remote sensing data has been utilized to improve the prediction task using a deep learning approach [124,125]. The application of 3D digital RGB image data to DL is also foreseeable in the future, as a 3D image can be used to extract features like the bending angle of a leaf that can be related to water stress [99].
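To make the three-dimensional structure of a hyperspectral image concrete, the sketch below treats it as a cube indexed by row, column and band, and averages the spectra of the pixels marked as plant. This is an illustrative reduction under assumed names (`mean_spectrum`, `cube`, `mask`) and made-up reflectance values; a joint spatial-spectral DL model would instead operate on the full cube.

```python
from typing import List


def mean_spectrum(cube: List[List[List[float]]],
                  mask: List[List[bool]]) -> List[float]:
    """Average the spectra of masked pixels in a cube[row][col][band]."""
    n_bands = len(cube[0][0])
    totals = [0.0] * n_bands
    count = 0
    for r, row in enumerate(cube):
        for c, pixel in enumerate(row):
            if mask[r][c]:  # keep plant pixels, skip background
                count += 1
                for b, value in enumerate(pixel):
                    totals[b] += value
    if count == 0:
        raise ValueError("mask selects no pixels")
    return [total / count for total in totals]


# Hypothetical 2 x 2 image with 3 bands (illustrative reflectance values).
cube = [
    [[0.1, 0.4, 0.3], [0.3, 0.6, 0.5]],
    [[0.9, 0.9, 0.9], [0.2, 0.5, 0.4]],
]
# The bottom-left pixel is treated as background.
mask = [[True, True], [False, True]]

spectrum = mean_spectrum(cube, mask)
print([round(v, 3) for v in spectrum])
```

Averaging discards the spatial arrangement of the pixels, which is exactly the information that joint spectral and spatial models aim to retain.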
Using advanced data can come with challenges. Firstly, the inaccuracies in 3D image reconstruction can affect the features extracted by the DL model resulting in subpar performance. Secondly, the high cost of the advanced sensors might limit the adoptability of such a method. This in turn will reduce the data production which then might further limit the much-needed data varieties pertinent to the successful learning of the DL model.

Plant Varieties Challenge
The variety of plant species has always been a challenge when using ML for plant water stress detection. A model developed for a particular plant species usually does not hold the same accuracy when applied to other varieties, as the physiological responses of plants to water stress and their relative importance for crop productivity vary with species, soil type, nutrients and climate [115].
There is also a need to study the effect of variations that occur within the same species when cultivated in different locations. Although plants may look similar, they vary widely in shape, chemical composition and the concentration of water in the leaf intercellular spaces due to environmental influences. Thus far, no study has been found to have investigated this case, although the result could be significant for improving the scalability of the technique. Theoretically, the large learning capacity of DL can be used to cater for plant varieties, provided sufficient data is available. Khaki et al. [126] used CNN to identify corn hybrids based on tolerance to environmental stress from the inputs of yield, soil and weather data.
Unfortunately, unbalanced data can occur because isohydric plants have better control mechanisms for coping with a lack of water and can last longer, while anisohydric plants, which have a less effective control mechanism, will die within days without water [127]. The use of a Generative Adversarial Network (GAN), a deep learning model, may nevertheless provide the required dataset through synthetically generated data [128].
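Before resorting to synthetic data, the degree of imbalance can be quantified and partly offset with class weights. The sketch below is a simpler alternative to a GAN and not a substitute for it: it assumes the inverse-frequency heuristic w_c = N / (K * n_c), with N samples, K classes and n_c samples in class c, and the label names and counts are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def balanced_class_weights(labels: List[str]) -> Dict[str, float]:
    """Inverse-frequency weights: w_c = N / (K * n_c).

    Rare classes receive weights above 1 and common classes below 1,
    so each class contributes equally to a weighted loss.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * n) for cls, n in counts.items()}


# Hypothetical label set: few stressed samples, many well-watered ones.
labels = ["stressed"] * 2 + ["well_watered"] * 6

weights = balanced_class_weights(labels)
print(weights)
```

Weighting only rebalances the loss on existing samples; it cannot add the missing variability, which is where synthetically generated images may help.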

Conclusions
This paper presents a comprehensive review of plant water stress assessment, in which cutting-edge sensor fusion technologies have been applied at a rapid pace, contributing significantly towards improving agricultural production, plant breeding, efficient agricultural water consumption and fast monitoring of forest wildfires. The huge and complex data generated by the sensors require advanced analytical algorithms for processing and data representation. Deep learning sensor fusion, as a state-of-the-art technique, has been implemented as model inversion for the estimation of plant water stress parameters and for water stress identification. As seen from the literature, the high learning capability of DL is useful for fast processing of field data from multisource sensors. DL has also been successfully applied as an advanced tool for image processing, with huge potential for fast detection of plant water stress in the field. However, sufficient datasets and sample variation are required for creating a robust method. Combining DL with an advanced data type such as 3D images can be considered for increasing accuracy.