Patch-TS: A Fast and Accurate PatchMixer-Based Model for Medium- and Long-Term Sap Flow Prediction with Environmental Factors

Li, Yane; Hu, Yunhao; Wang, Weibo; Ren, Zhen; Weng, Xiang; Feng, Hailin

doi:10.3390/f16040606


by Yane Li ^1,2,3,†, Yunhao Hu ^1,†, Weibo Wang ^1, Zhen Ren ^1, Xiang Weng ^1,2,3 and Hailin Feng ^1,2,3,*

^1 College of Mathematics and Computer Science, Zhejiang A&F University, Hangzhou 311300, China
^2 Zhejiang Province Key Think Tank, Institute of Ecological Civilization, Zhejiang A&F University, Hangzhou 311300, China
^3 Institute of Carbon Neutrality, Zhejiang A&F University, Hangzhou 311300, China
^* Author to whom correspondence should be addressed.
^† These authors contributed equally to this work.

Forests 2025, 16(4), 606; https://doi.org/10.3390/f16040606

Submission received: 16 January 2025 / Revised: 17 March 2025 / Accepted: 28 March 2025 / Published: 30 March 2025

(This article belongs to the Section Forest Inventory, Modeling and Remote Sensing)


Abstract

In this study, we propose a fast and accurate PatchMixer-based framework (Patch-TS) for sap flow prediction. After the data were preprocessed, including missing-value imputation and normalization, the environmental factors were selected via the Pearson correlation coefficient method, and the newly developed sap flow prediction model was then trained. The results show that the coefficient of determination (R²), mean squared error (MSE), and mean absolute error (MAE) are 0.921, 0.00824, and 0.0497, respectively. The R² of Patch-TS further improved to 0.929 after seven factors were selected via the Pearson correlation method. Furthermore, we comparatively analysed the mitigating effects of RevIn (Reversible Instance Normalization) and Dish-TS on data drift, as well as the predictive performance of the models under different prediction windows, where Patch-TS outperforms the other models. The results demonstrate that the model developed in this paper is an effective tool for accurately predicting sap flow, which is a valuable contribution to the practical management of trees and forests.

Keywords:

sap flow; environmental factors; long-term prediction

1. Introduction

As an important resource for production and life, water is crucial for the sustainable development of human society and ecosystems [1]. Freshwater accounts for only approximately 2% of the total water on Earth, and based on current freshwater utilization, 2 billion people living in 44 countries may suffer from water shortages by 2050, 95% of whom may live in developing countries [2]. The terrestrial water cycle has therefore become an important research topic. Evapotranspiration is an important component of the terrestrial water cycle [3], and plant transpiration, as the largest terrestrial water flux, accounts for 80% to 90% of terrestrial evapotranspiration [4]. Sap flow is a key link in the soil–plant–atmosphere continuum, and approximately 98% of the absorbed water is returned to the atmosphere through leaf transpiration, which determines the transpiration of the whole plant and can reflect water transport in the plant. Therefore, the measurement of sap flow can be used to accurately estimate the transpiration of trees and then evaluate the transpiration of regional vegetation, which is highly important for water resource management in terrestrial ecosystems and for evaluating water use efficiency and health status. Furthermore, precise estimation of sap flow facilitates the examination of transpiration and hydrological alterations, optimizing water use and irrigation scheduling [5].

The factors affecting daily changes in sap flow are mainly environmental factors and a tree’s own physiological characteristics, of which environmental factors can be subdivided into meteorological factors and site conditions. Many studies have shown that the air temperature and humidity, soil moisture content, saturated water vapor pressure difference, solar radiation, and other meteorological factors are the main factors affecting sap flow [6,7,8,9]. The influence of environmental factors on sap flow essentially influences the transpiration rate and transpiration amount of trees. Lin et al. found that water retained on leaves after rainfall, dew, or fog can affect transpiration, and there was a higher transpiration inhibition rate in broadleaf species than in conifer species for the same amount of leaf-retained water [10]. Among a tree’s own physiological characteristics, the size of diameter at breast height and crown width can affect the photosynthesis of plants, and vessel density, size, and fibrotic xylem characteristics are the physiological structural factors affecting stem sap flow [11]. Short-term prediction of sap flow can help agroforestry managers understand the water demand and water consumption of plants [12], rationalize the time and amount of irrigation [13], monitor tree health, and detect water deficits and other physiological problems in time [14]. The long-term prediction of sap flow can provide data support for tree selection [15] and ensure the health and productivity of forests [16].

Research shows that a time lag exists between the occurrence of changes in sap flow and that of changes in the surrounding environment. In some cases, changes in sap flow occur before or after changes in the environment. For example, Wang et al. reported that the daily peak of sap flow lagged 2–4 h behind the peaks of air temperature, water and air pressure deficit, and net radiation [17]. Ford et al. reported that older trees (and thus typically larger and taller) had longer time lags than younger trees did (and thus typically smaller and shorter) [18]. Tu et al. improved the performance of a back-propagation neural network by incorporating a phenology index and hysteresis effects, with a coefficient of determination (R²) and accuracy of fit (Acc) greater than 0.9 and 80%, respectively [19]. Suárez et al. used a mixed linear model to predict not only the differences in the sap flow of Theobroma cacao in three different agroforestry composite systems but also the nocturnal transpiration of the trees and the nocturnal reverse sap flow [20]. Li et al. combined the correlation between sap flow and environmental factors as well as the time lag effect with a back-propagation neural network (BPANN). The BPANN model (R² = 0.90) had a better fitting performance than the MLR (Mixed Logistic Regression) (R² = 0.8915), which more reasonably explains the nonlinear relationship between transpiration and the control factors [21]. Ouyang et al. developed a copula-based method to predict sap flow based on the readily available vapor pressure difference (VPD), which produced reliable statistical measurements that successfully replicated typical diurnal sap flow patterns [22]. Nalevanková et al. used a simple linear algorithm (LM) to predict European beech (Fagus sylvatica) sap flow, achieving a better result than other more sophisticated machine learning methods [23]. Zhao et al. designed a classical autoregressive model to compare and analyse the performance of univariate and multivariate models for predicting Eucalyptus spp. sap flow, and the results revealed that considering both seasonal factors and exogenous variables can improve the accuracy of daily sap flow prediction [24]. Peng et al. used a random forest model to build a grape sap flow prediction model, and the coefficient of determination and the Willmott consistency index of the model exceeded 0.78 and 0.90, respectively [25]. Liu et al. simulated the prediction of sap flow in sandy pears (Pyrus pyrifolia) via an artificial neural network. The results revealed that the correlation coefficient, mean relative error, and root mean square error were 0.953, 10.0%, and 5.33 Ld⁻¹, respectively, which were superior to those of the multiple linear regression model [26].

In recent years, deep learning methods have been used by many researchers to predict time series data. Time series models based on deep learning can be broadly categorized into several types, such as attention-, MLP-, and convolution-based models [27]. Among these, linear models have become popular since DLinear introduced the idea that a simple linear model can be better suited to time series forecasting [28]. In our previous studies, we designed a CGRU structure, a CNN-GRU-BiLSTM structure, and a DLinear-based structure to predict sap flow with respect to environmental factors, all of which exhibited good performance [29,30,31]. Deep learning models predict time series data better than traditional machine learning methods do, but most of the models studied for sap flow focus on short-term prediction, and few experimental studies have been carried out on the long-term prediction of sap flow.

The purpose of this study was to explore a model for predicting sap flow from environmental factors in the medium and long term and to construct a fast, real-time sap flow prediction model via a new linear algorithm. In the course of developing the model, we compared the LSTM (Long Short-Term Memory), GRU (Gated Recurrent Unit), Autoformer, Informer, and DLinear models, and the results showed that the linear-based model predicted sap flow better than the other models built in this paper. Therefore, we improved the linear model PatchMixer, and the resulting model, Patch-TS, further improves prediction performance.

By setting a lookback window and a prediction window, the sap flow prediction model is able to learn the history of sap flow and environmental variables to predict future sap flow. Theoretically, the longer the length of the prediction window is, the worse the model performance. The present study compares and analyses the performance of models constructed with varying prediction window lengths. The flowchart is shown in Figure 1 below.
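The lookback/prediction window setup can be illustrated with a minimal sketch. The function name `make_windows`, the random toy array, and the convention that sap flow is the last column are assumptions made for illustration; the window lengths 96 and 720 are the ones used later in the experiments.

```python
import numpy as np

def make_windows(series, lookback, horizon):
    """Slice a (T, C) array into (input, target) pairs.

    Each input holds `lookback` consecutive steps of all C channels; each
    target holds the following `horizon` steps of the last channel
    (assumed here to be sap flow).
    """
    series = np.asarray(series, dtype=float)
    n_samples = series.shape[0] - lookback - horizon + 1
    if n_samples <= 0:
        raise ValueError("series is shorter than lookback + horizon")
    x = np.stack([series[i:i + lookback] for i in range(n_samples)])
    y = np.stack([series[i + lookback:i + lookback + horizon, -1]
                  for i in range(n_samples)])
    return x, y

# Toy data only: 1000 time steps, 11 channels (10 factors + sap flow).
rng = np.random.default_rng(0)
toy = rng.random((1000, 11))
x, y = make_windows(toy, lookback=96, horizon=720)
print(x.shape, y.shape)  # (185, 96, 11) (185, 720)
```

Each sample pairs 96 past steps with the 720 steps that follow, so a longer prediction window leaves fewer samples and asks the model to extrapolate further from the same history.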

The innovations of this paper are as follows:

This paper proposes a prediction model for predicting sap flow based on environmental factors in the medium and long term, which can better estimate the water consumption of tree transpiration and thus assess the transpiration of regional vegetation.
In this paper, we improve PatchMixer and propose a faster linear algorithm, Patch-TS, which uses a novel patch design to partition the data into multiple segments and extracts the information in the time series via depthwise separable convolution.
In this work, the feature screening method, the Pearson correlation coefficient method, is considered. The screened features can better reflect the relationship with the predicted values, enhance the prediction accuracy, and improve the performance of the model.
This paper compares and analyses two methods for solving the data drift problem: RevIn and Dish-TS. These methods mitigate the impact of data drift caused by inconsistent data distributions and thereby improve the accuracy of the prediction results.

2. Materials and Methods

2.1. Data Sources and Data Processing

The data presented in this paper were selected from a publicly available dataset in SAPFLUXNET (Global Database of Sap Flow Tests). The SAPFLUXNET database was developed by the Centre for Research on Ecology and Forestry Applications and others, in coordination with Rafael Poyatos [32]. The data used for the experiments in this paper are from the kauri, Agathis australis, and were uploaded by researchers from the University of Auckland. They comprise 29,184 data points from 1 January 2014 to 31 August 2015, each consisting of 10 environmental factor data points and 1 sap flow data point, with 30 min measurement intervals. Trees in this dataset were measured at the University of Auckland Science Reserve, approximately 15 hectares of forest in Waipai, in the northern region of the Waitakere Ranges, west of Auckland, New Zealand [33]. The environmental factors include air temperature (Ta, °C), vapour pressure deficit (VPD, kPa), photosynthetic photon flux density (PPFD_IN, μmol·m⁻²s⁻¹), precipitation (PRECIP, mm), wind speed (WS, m/s), the shallow soil water content (SWC_SL, cm³·cm⁻³), the deep soil water content (SWC_DP, cm³·cm⁻³), shortwave incoming radiation (SW_IN, W·m⁻²), relative humidity (RH, %), and extraterrestrial radiation (EXT_RAD, W·m⁻²). These 29,184 time series records are divided into a training set, a testing set, and a validation set in the ratio of 7:2:1, including 20,428, 5836, and 2918 records, respectively. We used spline interpolation to fill in the missing values in the dataset and the three-sigma rule (3σ) to remove outliers. The values of individual variables vary considerably throughout the year, and the values of different variables differ substantially in magnitude because of their different units. To eliminate the effect of these value ranges and to improve training stability and speed, min–max normalization is used to rescale the data to values between 0 and 1, as given in Formula (1):

X' = \frac{X - X_{min}}{X_{max} - X_{min}}

(1)

where X and X' represent the original data and normalized data, respectively, and X_{min} and X_{max} denote the minimum and maximum values, respectively.
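The outlier rule and Formula (1) can be sketched in a few lines of NumPy. This is an illustrative sketch, not the authors' preprocessing code: the function names and toy values are assumptions, and the source does not state whether the minimum and maximum are taken from the full dataset or from the training split only.

```python
import numpy as np

def three_sigma_mask(x):
    """Return True where a value lies within mean ± 3 standard deviations."""
    x = np.asarray(x, dtype=float)
    return np.abs(x - x.mean()) <= 3.0 * x.std()

def min_max_normalize(x):
    """Formula (1): X' = (X - X_min) / (X_max - X_min)."""
    x = np.asarray(x, dtype=float)
    x_min, x_max = x.min(), x.max()
    if x_max == x_min:
        raise ValueError("a constant series cannot be min-max normalized")
    return (x - x_min) / (x_max - x_min), x_min, x_max

def min_max_denormalize(x_norm, x_min, x_max):
    """Invert Formula (1) to map predictions back to the original units."""
    return x_norm * (x_max - x_min) + x_min

# Toy values only (not taken from the dataset).
values = np.array([12.0, 15.5, 9.0, 21.0, 18.0])
scaled, v_min, v_max = min_max_normalize(values)
print(scaled)  # every value now lies in [0, 1]

with_outlier = np.array([1.0] * 20 + [100.0])
print(three_sigma_mask(with_outlier)[-1])  # False: 100.0 is flagged
```

Keeping `x_min` and `x_max` makes it possible to report predictions in the original units after the model has been trained on the normalized data.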

2.2. Methods

PatchMixer: This is a novel model built on convolutional structures that uses a novel patch mixing design to reveal complex temporal patterns in time series [34]. After generating patches through a sliding window of fixed length and step size, the model maps the sequence data within the patch into embedding directly through an MLP. Local and global information is extracted through depthwise separable convolution, and the results are divided into linear and nonlinear parts via multihead prediction. The linear part is mainly used to predict the trend term of the time series, the nonlinear part takes the output of the convolution through the nonlinear MLP to obtain the rest of the prediction terms, and the final result is the sum of the two.

In this study, a PatchMixer-based sap flow prediction model, Patch-TS, is proposed. Patch-TS adds a normalization method, trend decomposition, and a feature-mixing layer (Mixer Layer) to PatchMixer. After the input variables are normalized and post-processed, the input data are decomposed into seasonal and trend terms by the moving average method. For the trend terms, we adopt a channel-independent strategy and map them using only a single linear layer; for the seasonal terms, patching and embedding are completed before they enter the feature-mixing layer. The feature-mixing layer maps the input feature domain using only a simple linear layer to better utilize the cross-variate information, and residual connections are added to it to improve the learning efficiency of the model for deeper architectures. Finally, the output is mapped after passing through a depthwise separable convolutional layer. The seasonal and trend outputs are summed, and the final prediction is obtained through inverse normalization. The model can be roughly divided into four modules: a normalization layer, a feature-mixing layer, a patch embedding layer, and a depthwise separable convolutional layer. The structure of the model is shown in Figure 2.
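The moving-average decomposition into trend and seasonal terms can be sketched as follows. The kernel size of 25 and the edge-replication padding are assumptions borrowed from common practice in DLinear-style models; the source does not state the values used in Patch-TS.

```python
import numpy as np

def decompose(x, kernel_size=25):
    """Split a 1-D series into a seasonal remainder and a moving-average trend."""
    x = np.asarray(x, dtype=float)
    # Replicate the edge values so that the trend keeps the input length.
    pad_front = (kernel_size - 1) // 2
    pad_back = kernel_size - 1 - pad_front
    padded = np.concatenate([np.full(pad_front, x[0]), x, np.full(pad_back, x[-1])])
    kernel = np.ones(kernel_size) / kernel_size
    trend = np.convolve(padded, kernel, mode="valid")
    seasonal = x - trend
    return seasonal, trend

# Toy series: a slow linear drift plus a cycle of 48 steps (one day at 30 min).
t = np.arange(96)
series = 0.01 * t + np.sin(2 * np.pi * t / 48)
seasonal, trend = decompose(series)
print(seasonal.shape, trend.shape)  # (96,) (96,)
```

By construction the two parts sum back to the input, so the trend branch (a single linear layer) and the seasonal branch (patching, mixing, convolution) can be predicted separately and added at the end.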

(1) Data normalization: The general framework for recent time series forecasting is shown in Figure 3. It consists of three core components: a reversible normalization layer; a temporal feature extractor, such as an attention layer, an MLP layer, or a convolutional layer; and a linear projection layer for mapping the final forecast.

Data distribution drift mainly refers to changes over time in the statistical properties of a time series, such as the mean and variance. In time series prediction tasks, the training set and the test set are usually divided according to time, which naturally introduces inconsistency between the data distributions of the two sets and leads to inaccurate predictions. RevIn and Dish-TS are data normalization methods that can be easily applied to a variety of time series models; each consists of two parts, normalization and denormalization, and is designed to address data drift [35].
In this paper, we introduce two normalization methods, RevIn (Reversible Instance Normalization) and Dish-TS, to address the problem of data drift. RevIn is a normalization method proposed by Kim et al. [35] specifically for time series forecasting tasks. The core idea is to remove the mean and variance of the data at the model input stage and restore the original distribution of the data after the model forecast output. In this way, the distributional drift of the time series is reduced, allowing the model to focus on learning the patterns in the data rather than being affected by the overall scale of the data. Dish-TS is a recently proposed time series normalization method that is mainly used for long-term time series forecasting [36]. Dish-TS mainly uses multiscale normalization to address features of different temporal granularity, together with de-trending to remove long-term trend disturbances.
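The normalize-then-restore idea behind RevIn can be sketched as below. This simplified version omits the learnable affine parameters of the published method, and the class name `SimpleRevIN` is hypothetical. Dish-TS is not sketched here because it relies on learned coefficients.

```python
import numpy as np

class SimpleRevIN:
    """Per-instance normalization with a stored inverse (no learnable affine)."""

    def __init__(self, eps=1e-5):
        self.eps = eps
        self.mean = None
        self.std = None

    def normalize(self, x):
        # x: (lookback, channels); statistics are computed over the time axis.
        self.mean = x.mean(axis=0, keepdims=True)
        self.std = np.sqrt(x.var(axis=0, keepdims=True) + self.eps)
        return (x - self.mean) / self.std

    def denormalize(self, y):
        # y: (horizon, channels), produced by the model in normalized space.
        return y * self.std + self.mean

# Toy window whose level (about 100) is far from zero.
rng = np.random.default_rng(0)
window = rng.random((96, 3)) * 5 + 100
revin = SimpleRevIN()
normalized = revin.normalize(window)
restored = revin.denormalize(normalized)
print(np.allclose(restored, window))  # True
```

Because the statistics come from each input window rather than from the training set, a shift in the level of the series between training and test periods is removed before the model sees the data and is added back to the forecast.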

(2) Patch embedding layer: The structure is shown in Figure 4, and this layer can be divided into two parts: patch and embedding.

The patching method uses a sliding window, which unfolds each input univariate time series X_1D through a window of length P and step size S. The overlap length between neighbouring patches is P − S, so that the final number of patches is N = ⌊(L − P)/S⌋ + 2 for a one-dimensional time series of length L.
Since the CNN structure itself has alignment variance, there is no need to use positional embedding in the model. Therefore, our embedding can be represented by Equation (2), where only a single linear layer is used to accomplish the embedding operation. In the formula, VE denotes value embedding, N × S denotes the input dimension of the variable, and N × D denotes the output dimension after embedding.

Embedding(X) = VE : x^{N \times S} \to x^{N \times D}

(2)
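Patching and value embedding can be sketched as follows. Several choices here are assumptions: the patch length P = 16 and embedding size D = 32 are hypothetical, the end of the series is padded with S copies of its last value (the convention under which the "+ 2" in the patch count holds), and the embedding weights are random rather than trained. In this sketch the embedding acts on the last axis of the patch array, which has length P.

```python
import numpy as np

def make_patches(x, patch_len, stride):
    """Unfold a 1-D series into overlapping patches of length `patch_len`.

    The end is padded with `stride` copies of the last value (an assumption),
    which gives (L - P) // S + 2 patches.
    """
    x = np.asarray(x, dtype=float)
    padded = np.concatenate([x, np.full(stride, x[-1])])
    n_patches = (len(x) - patch_len) // stride + 2
    return np.stack([padded[i * stride:i * stride + patch_len]
                     for i in range(n_patches)])

rng = np.random.default_rng(0)
series = rng.random(96)                                   # lookback L = 96
patches = make_patches(series, patch_len=16, stride=8)    # N = 12 patches
weight = rng.standard_normal((16, 32)) * 0.1              # hypothetical D = 32
bias = np.zeros(32)
embedded = patches @ weight + bias                        # single linear layer
print(patches.shape, embedded.shape)  # (12, 16) (12, 32)
```

With P = 16 and S = 8, neighbouring patches share P − S = 8 values, so each time step (apart from the edges) appears in two patches.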

(3) Feature-mixing layer: The structure is shown in Figure 5, borrowed from the mixer layers in the TS-Mixer model, which can be divided into two main parts: time-mixing MLP and feature-mixing MLP. The MLP is applied alternately in the time and feature domains to better utilize the cross-variate information. To make the model more effective in learning the deep architecture, we also add residual connections to it. After repeated tests, we found that the time-mixing MLP does not perform well in predicting sap flow, so we discarded the time-mixing MLP and retained only the feature-mixing MLP.
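A feature-mixing step with a residual connection can be sketched as a single linear map over the feature axis. The shapes, the random weights, and the absence of activation, dropout, and normalization are simplifying assumptions; the source only states that a simple linear layer and a residual connection are used.

```python
import numpy as np

def feature_mixing(x, weight, bias):
    """Mix information across the feature axis and add a residual connection.

    x: (time, features); weight: (features, features); bias: (features,).
    The same weights are shared by every time step.
    """
    return x + (x @ weight + bias)

rng = np.random.default_rng(0)
x = rng.random((12, 7))                    # toy input: 12 steps, 7 features
weight = rng.standard_normal((7, 7)) * 0.1
bias = np.zeros(7)
mixed = feature_mixing(x, weight, bias)
print(mixed.shape)  # (12, 7)
```

Each output row depends only on the same row of the input, which is what distinguishes feature mixing from the discarded time-mixing MLP; the residual term means the layer reduces to the identity when the weights are zero.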

(4) Depthwise separable convolutional layer: As shown in Figure 6, depthwise convolution is a specific type of grouped convolution in which the number of groups is equal to the number of patches, denoted as N. To expand the receptive field, we use a larger kernel size equal to the default patch step S, so K = 8. In this process, each of the N patches in the input feature maps is convolved individually. This operation generates N feature maps, each corresponding to a specific patch. These feature maps are then sequentially concatenated to obtain an output feature map with N channels. Depthwise convolution effectively uses group convolution kernels that are identical for patches sharing the same spatial location. This allows the model to capture potential periodic patterns in the patches.

Pointwise convolution is shown in Figure 7. Since depthwise convolution may not effectively capture feature correlations between patches, temporal interactions between patches are implemented using pointwise convolution after depthwise convolution. In pointwise convolution, the kernel size is K = 1, so the convolution acts only on the channel dimension without affecting the temporal dimension. Residual connections are also used to improve gradient flow in the model, making the training more stable.
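The two convolution steps can be sketched in NumPy as below. This is a simplified illustration under stated assumptions: the input has shape (N, D) with N = 12 patches and a hypothetical embedding size D = 32, the residual connection is placed around the depthwise step, activation and batch normalization are omitted, and `np.correlate` with `mode="same"` stands in for a padded 1-D convolution (its alignment for even kernel sizes may differ from that of deep learning frameworks).

```python
import numpy as np

def depthwise_conv(x, kernels):
    """Filter each patch (row of x) with its own kernel along the embedding axis."""
    return np.stack([np.correlate(row, k, mode="same")
                     for row, k in zip(x, kernels)])

def pointwise_conv(x, weight, bias):
    """Kernel-size-1 convolution: mixes the N patch channels at every position."""
    return weight @ x + bias[:, None]

def separable_block(x, kernels, pw_weight, pw_bias):
    hidden = x + depthwise_conv(x, kernels)   # residual around the depthwise step
    return pointwise_conv(hidden, pw_weight, pw_bias)

rng = np.random.default_rng(0)
n_patches, d_model, kernel_size = 12, 32, 8   # K = 8 as in the text
x = rng.random((n_patches, d_model))
kernels = rng.standard_normal((n_patches, kernel_size)) * 0.1
pw_weight = rng.standard_normal((n_patches, n_patches)) * 0.1
pw_bias = np.zeros(n_patches)
out = separable_block(x, kernels, pw_weight, pw_bias)
print(out.shape)  # (12, 32)
```

The depthwise step uses N kernels of size K (N·K weights) and never mixes patches; the pointwise step then mixes patches with an N × N matrix. Together they need far fewer parameters than a full convolution with N input channels, N output channels, and kernel size K.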

2.3. Performance Assessment Indices

In this experiment, the MSE, MAE, and R² are used to evaluate the performance of the sap flow prediction model, as shown in Formulae (3)–(5).

MSE = \frac{1}{m} \sum_{i = 1}^{m} {(y_{i} - {\hat{y}}_{i})}^{2}

(3)

MAE = \frac{1}{m} \sum_{i = 1}^{m} |y_{i} - {\hat{y}}_{i}|

(4)

In Formulae (3) and (4), y_{i} represents the true value, {\hat{y}}_{i} represents the predicted value, and m is the number of samples.

R^{2} = 1 - \frac{\sum_{i} {({\hat{y}}_{i} - y_{i})}^{2}}{\sum_{i} {(\bar{y} - y_{i})}^{2}}

(5)

In Formula (5), y_{i} represents the true value, {\hat{y}}_{i} represents the predicted value, \bar{y} represents the sample mean, \sum_{i} {({\hat{y}}_{i} - y_{i})}^{2} represents the prediction error, and \sum_{i} {(\bar{y} - y_{i})}^{2} represents the mean error.
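The three indices can be computed directly from their definitions. The sketch below uses toy values chosen for easy checking, not results from the experiments. Note that MAE takes the absolute value of each error before averaging, so positive and negative errors do not cancel.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error."""
    return float(np.mean((y_true - y_pred) ** 2))

def mae(y_true, y_pred):
    """Mean absolute error: average of the per-sample absolute errors."""
    return float(np.mean(np.abs(y_true - y_pred)))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_pred - y_true) ** 2)
    ss_tot = np.sum((y_true.mean() - y_true) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Toy values only.
y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.1, 1.9, 3.2, 3.8])
print(mse(y_true, y_pred), mae(y_true, y_pred), r2(y_true, y_pred))
# approximately 0.025, 0.15, 0.98
```

In this toy case the signed errors sum to zero, yet the MAE is 0.15, which shows why the absolute value must be applied to each term rather than to the sum.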

3. Results

3.1. Sap Flow Analysis and Data Dimensionality Reduction

As shown in Figure 8, the sap flow exhibited a cyclic variation, with a slow downward trend from January to July 2014, a gradual increase from August to December, and a similar slow downward trend from January to July 2015.

Correlation analysis: After the data were preprocessed, the correlations between sap flow and the environmental factors were analysed via the Pearson correlation coefficient, a statistic that measures the degree of linear correlation between two variables X and Y [37]. In practice, the Pearson correlation coefficient is widely used in scientific research, data analysis, and machine learning to quantify the strength and direction of the linear relationship between two variables. On the basis of the results of the analysis, a heatmap was drawn, as shown in Figure 9. The results reveal that sap flow is positively correlated with Ta (air temperature), VPD (vapour pressure deficit), PPFD_IN (photosynthetic photon flux density), SW_IN (shortwave incoming radiation), EXT_RAD (extraterrestrial radiation), and WS (wind speed) and negatively correlated with RH (relative humidity), whereas its correlations with PRECIP (precipitation), SWC_SL (the shallow soil water content), and SWC_DP (the deep soil water content) are not significant.

Therefore, seven environmental factors, Ta, VPD, PPFD_IN, WS, SW_IN, RH, and EXT_RAD, were ultimately selected to construct the sap flow prediction model.
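Correlation-based screening of this kind can be sketched as follows. The threshold of 0.5, the function names, and the synthetic series are assumptions for illustration; the source does not state a numeric cut-off, and the toy features below are not the measured environmental factors.

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation coefficient between two 1-D arrays."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc * yc).sum() / np.sqrt((xc ** 2).sum() * (yc ** 2).sum()))

def select_features(features, target, threshold):
    """Keep the features whose absolute correlation with the target is high enough."""
    return [name for name, column in features.items()
            if abs(pearson(column, target)) >= threshold]

# Synthetic example: one positively related, one negatively related,
# and one unrelated feature.
rng = np.random.default_rng(0)
t = np.linspace(0, 4 * np.pi, 200)
target = np.sin(t)
features = {
    "pos": 2.0 * target + 1.0,
    "neg": -target,
    "noise": rng.random(200),
}
selected = select_features(features, target, threshold=0.5)
print(selected)
```

Using the absolute value keeps strongly negative predictors (such as RH in the analysis above) alongside positive ones, while weakly correlated inputs are dropped.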

3.2. Performance Comparison of Sap Flow Prediction Models

To confirm and compare the effectiveness of the sap flow prediction models, we chose the classical LSTM and GRU models, the Transformer-based Autoformer and Informer models, and the linear DLinear model for comparison. Ten environmental factors with a lookback window of 96 and a prediction window of 720 were selected for the long-term prediction experiment, and the other parameters are shown in Table 1. Table 2 shows the results of the long-term prediction, and the evaluation indices are the MSE, MAE, and R².

As shown in Table 2, Patch-TS performs best in long-term forecasting, and the baseline model, PatchMixer, outperforms the other comparison models in terms of MSE, MAE, and R². Patch-TS improves significantly on PatchMixer. DLinear outperforms the attention-based models in the long-term prediction of sap flow, which suggests that trend decomposition is sensitive to changes in the lookback window, provides a better linear fit in long-term prediction, and improves the prediction accuracy. Autoformer and Informer have some drawbacks in long-term prediction tasks. This may be because the attention mechanism suffers from attenuation when modelling long sequences, making it difficult to capture ultra-long-term dependencies effectively; Autoformer and Informer may therefore be more suitable for short-term forecasting. LSTM and GRU perform worse than the other models in long-term prediction, which may be due to their difficulty in capturing long-term dependencies. Their efficiency is also limited by the sequential computation paradigm, which tends to accumulate error as the prediction length increases; thus, these two models are less suitable for long-term prediction. The performance of the Transformer-class models differs little from that of the linear models, but their training time is much longer, which also shows the advantage of linear models in time series tasks.

3.3. Analysis of Data Drift

The data drift problem is one of the major challenges in time series forecasting. In this paper, we introduce RevIn and Dish-TS, which have recently become popular, to address it. To verify the effectiveness of these two normalization methods, we constructed a new model named Patch, which is Patch-TS with the normalization module removed. Long-term prediction models of trunk sap flow were then constructed using PatchMixer and Patch, each combined with the two normalization methods, and the results are shown in Table 3.

This table compares the effects of two different normalization methods (RevIn and Dish-TS) on the structure of Patch and PatchMixer, and Patch combined with Dish-TS normalization substantially improves the long-term prediction performance compared to PatchMixer. The structure of the Patch model, with or without the incorporation of the normalization method, is slightly better than that of PatchMixer, indicating that the improved model has better feature extraction capability in long-term prediction tasks and has further optimized architectural design compared to PatchMixer. Overall, the Dish-TS normalization method is more effective than RevIn. On the Patch model, Dish-TS reduces MSE by 9.3% and MAE by 4.4% compared to RevIn, and Dish-TS is also slightly better than RevIn on PatchMixer, indicating that Dish-TS is more effective compared to RevIn for long-term time series. The normalization effect is also stronger, which can more effectively mitigate data drift and improve long-term prediction stability.

3.4. Ablation Study

As can be seen from the previous section, Patch-TS is constructed based on the Patch mechanism model PatchMixer, with the addition of three new modules: normalization, trend sequence decomposition, and feature mixture layer. In order to verify and evaluate the effect of these three modules to enhance the predictive ability of the model, we conducted ablation experiments, and the results are shown in Table 4 below.

The addition of modules A, B, and C effectively improves the metrics. Of the three, sequence decomposition has the greatest effect on model performance: MSE decreased by 17.6% compared with the unmodified model, and R² improved significantly. Decomposing the time series enables the model to learn long- and short-term trend information more clearly, and the sap flow data have obvious trends and seasonality, so the decomposition helps the model to capture long-term dependence and improves the prediction performance. The normalization module (Dish-TS) yields a relatively small improvement, probably because the sap flow data are generally smooth and the impact of data drift is not significant. The feature-mixing module (mix layer) improves less than the sequence decomposition but still contributes more than Dish-TS, because it improves the feature extraction of PatchMixer and enhances the interaction of temporal features. Finally, when all three improvements are added to the model at the same time, the R² is improved by 1.54% over that of the unmodified model. These experimental results validate the three improvements, all of which achieve the expected effect and improve the prediction performance.

3.5. Effect of Feature Selection on Sap Flow Prediction Models

To analyse the effect of feature selection on the sap flow prediction model, we constructed sap flow prediction models using the seven features selected through correlation analysis and using all ten features. The results of the comparative analysis are presented in Table 5.

As shown in Table 5, the performance of all models improved more significantly after feature dimensionality reduction, indicating that feature screening effectively reduces redundant information and improves the model generalization ability. PatchMixer and Patch-TS are most affected by feature selection when compared to 10 features, with MSE reduced by 14.8% and 18.8%, MAE reduced by 12.3% and 13.9%, and R² improved by 1.1% and 0.8%, respectively. Compared to these two models, the enhancement of DLinear, Autoformer, and Informer is smaller, and the optimization effect brought by dimensionality reduction is limited. This suggests that feature screening of environmental factors is effective in long-term sap flow prediction tasks. The correlation analysis revealed a more pronounced relationship between the seven environmental factors and the predicted values of sap flow, which improved the predictive performance of the model.

3.6. Performance for Different Prediction Window Lengths

The lengths of the lookback and prediction windows are key parameters for model construction, and different prediction window lengths are analysed for their effects on the performance of each prediction model. Therefore, we constructed sap flow prediction models with different prediction window lengths for the same lookback window, using 10 factors and maintaining consistency in the remaining parameters for each model. The evaluation indicators are shown in Table 6 below and are plotted in Figure 10.

As shown in Figure 10, the performance of every model is affected by the length of the prediction window. At a prediction window length of 96, all models perform better; as the prediction window length increases to 720, the error of all models increases and R² decreases. In other words, the longer the prediction window is, the worse the sap flow prediction models perform. The reason is that the longer the prediction window is, the weaker the correlation between the historical environmental data at the initial moment and the sap flow, so it is difficult for the models to extract effective features. Furthermore, although data drift is mitigated by the normalization method, its impact remains, especially in long-term prediction. In addition, since long-term prediction requires the model to infer complex patterns from limited information, it is prone to overfitting.

From Table 6, we can see that Patch-TS performs best under all prediction windows, and especially shows stronger robustness in the long-term prediction task. At 96 steps, PatchMixer’s performance is similar to that of Patch-TS, but at 720 steps, PatchMixer’s performance decreases more than that of Patch-TS, which also shows that PatchMixer is not as adaptable as Patch-TS for long-term prediction, and the additional improvement module improves the long-term prediction stability. DLinear performs stably on the medium-term prediction task but has limited long-term prediction capability, while Autoformer and Informer perform poorly in long-term prediction, and the Transformer structure has difficulties in modelling long-time dependencies, which leads to a decline in long-term prediction performance.

Figure 11 compares the predictions of PatchMixer and the improved Patch-TS model with the measured sap flow at a prediction window of 720, which demonstrates the applicability of the improved model to sap flow prediction.

4. Discussion

In this study, different sap flow prediction models were constructed, and the following aspects were discussed to improve the performance of the sap flow prediction models.

In this work, the model is improved in the following aspects: the data are decomposed into seasonal and trend terms, which makes the series smoother and more conducive to forecasting; a dual forecasting head integrates linear and nonlinear components to better simulate the future curves; an MLP is applied to the feature domain to better extract feature information; and a normalization operation is applied to the data to reduce the effect of data drift on time series forecasting and improve forecasting accuracy.
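The seasonal and trend decomposition mentioned above can be illustrated with a moving-average split. This is a sketch under stated assumptions, not the Patch-TS implementation: the moving-average form, the function name `decompose`, the kernel size of 25, and the synthetic series are all chosen for illustration.

```python
import numpy as np

def decompose(x, kernel_size=25):
    """Split a 1-D series into a moving-average trend and a seasonal remainder."""
    pad_front = (kernel_size - 1) // 2
    pad_back = kernel_size - 1 - pad_front
    # Repeat the edge values so the trend has the same length as the input.
    padded = np.concatenate([np.repeat(x[0], pad_front), x, np.repeat(x[-1], pad_back)])
    kernel = np.ones(kernel_size) / kernel_size
    trend = np.convolve(padded, kernel, mode="valid")
    seasonal = x - trend
    return seasonal, trend

t = np.arange(480, dtype=float)
x = 0.01 * t + np.sin(2 * np.pi * t / 24)  # synthetic: slow drift plus a 24-step cycle
seasonal, trend = decompose(x)
print(seasonal.shape, trend.shape)
print(np.allclose(seasonal + trend, x))
```

The trend term follows the slow drift, while the seasonal term keeps the cycle; each can then be forecast by its own head before the two forecasts are summed.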

First, for model selection, LSTM, GRU, Autoformer, Informer, DLinear, PatchMixer and the improved Patch-TS algorithm were used to build sap flow prediction models, and the results show that the improved algorithm predicts sap flow more accurately and quickly. Relative to LSTM, GRU, Autoformer, Informer, DLinear, and PatchMixer, Patch-TS improves R² by 21.98%, 29.9%, 4.3%, 7.46%, 2.33%, and 1.54%, respectively (computed from the values in Table 2).
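The percentages quoted above are consistent with relative improvements in R² computed from Table 2. The check below reproduces them; the helper name `relative_gain` is an illustrative assumption, and the last digit can differ from the quoted figures depending on whether values are rounded or truncated.

```python
# R² values copied from Table 2 (prediction length 720, 10 factors).
r2 = {
    "Patch-TS": 0.921,
    "PatchMixer": 0.907,
    "DLinear": 0.900,
    "Autoformer": 0.883,
    "Informer": 0.857,
    "LSTM": 0.755,
    "GRU": 0.709,
}

def relative_gain(new, old):
    """Relative improvement of `new` over `old`, in percent."""
    return (new - old) / old * 100.0

gains = {
    model: relative_gain(r2["Patch-TS"], value)
    for model, value in r2.items()
    if model != "Patch-TS"
}
for model, gain in gains.items():
    print(f"Patch-TS vs {model}: {gain:.2f}% higher R²")
```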

Second, the correlations between sap flow and the environmental factors were analysed via Pearson correlation analysis, and the features with stronger correlations were selected for constructing the sap flow prediction model. The models built after screening generally performed better than those built with all the environmental factors (Table 5), which suggests that subsequent studies could compare different feature selection methods.
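A correlation-based screening step of this kind might look as follows. The sketch is illustrative only: the factor names, the synthetic data, and the rule of keeping the top `k` factors by absolute Pearson coefficient are assumptions, since the exact selection threshold is not restated here.

```python
import numpy as np

def select_by_pearson(X, y, names, k):
    """Rank factors by |Pearson r| with the target and keep the top k."""
    r = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
    order = np.argsort(-np.abs(r))[:k]
    return [names[j] for j in order], r

rng = np.random.default_rng(42)
n = 500
X = rng.normal(size=(n, 4))  # synthetic stand-in for environmental factors
# Hypothetical target: driven by the first two factors only.
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.1 * rng.normal(size=n)
names = ["factor_a", "factor_b", "factor_c", "factor_d"]

selected, r = select_by_pearson(X, y, names, k=2)
print(selected)
print(np.round(r, 3))
```

Ranking by the absolute value keeps strongly negative correlations as well as strongly positive ones, which matters for factors that vary inversely with sap flow.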

Third, data drift has always been a major challenge in time series forecasting. To address the inconsistent distribution of statistical attributes, such as the mean squared deviation, between the training and test sets of the sap flow dataset, we compared two mainstream methods for handling distribution shift, RevIn and Dish-TS. The results show that Dish-TS works better than RevIn in this experiment.
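The idea behind reversible instance normalization can be sketched in a few lines. This is a simplified illustration, not the implementation used in the experiments: the class name `SimpleRevIN` is hypothetical, the learnable affine parameters of RevIn are omitted, and Dish-TS, which learns its distribution coefficients, is not reproduced.

```python
import numpy as np

class SimpleRevIN:
    """Per-instance normalization whose statistics are reused to undo it."""

    def __init__(self, eps=1e-5):
        self.eps = eps
        self.mean = None
        self.std = None

    def normalize(self, x):
        # x has shape (batch, time, features); statistics are taken over time,
        # separately for every instance and feature.
        self.mean = x.mean(axis=1, keepdims=True)
        self.std = np.sqrt(x.var(axis=1, keepdims=True) + self.eps)
        return (x - self.mean) / self.std

    def denormalize(self, y):
        # Map model outputs back to the original scale of each instance.
        return y * self.std + self.mean

rng = np.random.default_rng(1)
# Two synthetic windows with very different levels and spreads.
x = np.stack([rng.normal(5.0, 1.0, size=(96, 3)),
              rng.normal(50.0, 10.0, size=(96, 3))])

revin = SimpleRevIN()
z = revin.normalize(x)
restored = revin.denormalize(z)
print(np.allclose(z.mean(axis=1), 0.0, atol=1e-6))
print(np.allclose(restored, x))
```

After normalization both windows have a similar scale, so a shift in level between training and test data is removed before the model sees the input and restored on the output.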

Fourth, this paper analyses the impact of the prediction window length. The results show that the window length has a large impact on model performance and that performance deteriorates as the prediction window becomes longer. This may be because, as the prediction horizon is extended, the model still predicts on the basis of the same historical data, and the influence of early data on later predictions gradually diminishes. The model may then fail to capture long-term trends or cyclical changes, while its complexity and number of parameters increase, resulting in performance degradation due to overfitting or overdependence on the data.

In conclusion, the linear-based MLP model is well suited for sap flow prediction and provides support for calculating tree transpiration and for stand-level prediction. Moreover, normalization methods such as RevIn and Dish-TS, together with trend decomposition, improve the performance of the sap flow prediction model, and they are easy to apply to other time series models to alleviate data drift and improve generalization ability. The feature screening method can exclude environmental factors that are ineffective for predicting sap flow, improve computational efficiency, and enhance model performance.

Furthermore, the following issues need to be addressed in further research. First, the dataset covers a single tree species, and no sap flow prediction studies were conducted for other tree species to validate the transferability of the model. Second, only the effect of environmental factors on sap flow prediction is considered in the present research; other factors such as tree diameter at breast height and crown width should also be taken into account. Third, the sap flow prediction model proposed in this paper estimates sap flow only for a single tree, whereas real applications require estimates for the whole forest or ecosystem, so further research is needed on how to scale transpiration water consumption from the sap flow or water consumption of one or several trees to the whole forest stand. In addition, when a tree is damaged, its sap flow departs from the pattern of a healthy tree, so the prediction and diagnosis of such anomalies also need to be studied.

5. Conclusions

In this study, we systematically developed and optimized a tree sap flow prediction model through multiple methodological innovations. The proposed Patch-TS algorithm outperforms six other comparative models, demonstrating the superior efficiency and accuracy of the new model for sap flow prediction. Feature selection based on Pearson correlation analysis significantly improves the model performance compared to using all environmental factors, highlighting the importance of feature selection. Dish-TS is a more effective solution than RevIn in mitigating the data drift problem. Prediction window length has a critical impact on performance, with model accuracy decreasing with longer prediction times due to decreased long-term dependency capture and increased risk of overfitting. The optimized Patch-TS model not only improves the accuracy of sap flow predictions but also provides practical value for tree transpiration quantification and forest water management. Methodological contributions—including dual prediction heads, feature-domain MLP integration, and normalization strategies—provide portable solutions to a wider range of time series prediction challenges.

However, several limitations require further research. Current validation is limited to a single tree species, necessitating cross-species model evaluation. Key biological parameters (e.g., DBH, crown width) are not yet included in the feature set. Scaling mechanisms from single-tree predictions to ecosystem-level estimates need to be systematically developed. The ability to diagnose sap flow anomalies in damaged trees is a pressing issue. Future work should prioritize multi-tree datasets, integration of trunk measurement variables, and hierarchical modelling frameworks to link single-tree measurements to forest-scale hydrologic dynamics.

Author Contributions

Y.L.: writing (original draft), funding acquisition, methodology, investigation, and visualization. Y.H.: writing (original draft), data curation, funding acquisition, software, investigation, methodology, validation, and visualization. W.W.: investigation, methodology, funding acquisition, visualization, validation, and data curation. Z.R.: investigation, software, methodology, validation, and visualization. X.W.: investigation, methodology, validation, visualization, and data curation. H.F.: conception, supervision, data curation, formal analysis, methods, and visualization. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by a grant (Research of SAP flow prediction model based on environmental factors) from the 2023 Joint Research and Development Center Project of the College of Mathematics and Computer Science from Zhejiang A&F University, National-level Undergraduate Innovation Training Program of College Students’ Innovation and Entrepreneurship Training Project (202410341067), National-level Undergraduate Innovation Training Program of College Students’ Innovation and Entrepreneurship Training Project (Research on Liquid Flow Analysis and Prediction Methods of Different Tree Species), Key R&D Projects in Zhejiang Province (2022C02020, 2022C02009, and 2022C02044), and the Research Development Foundation of Zhejiang A&F University (2019RF065).

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Figure 1. Process diagram of this study.

Figure 2. Framework of Patch-TS model.

Figure 3. The general framework for time series forecasting comprises a reversible normalization layer, a temporal feature extractor, and a linear projection layer.

Figure 4. Patch embedding layer.

Figure 5. Feature-mix layer.

Figure 6. Depthwise: capture potential periodic patterns within the temporal patches.

Figure 7. Pointwise: capture feature correlations between patches.

Figure 8. Continuous interday variation in sap flow.

Figure 9. Correlation analysis yields a heatmap of the interrelation of each variable. The closer the colour is to red, the stronger the positive correlation between variables; conversely, the closer the colour is to purple, the stronger the negative correlation.

Figure 10. Influence of different lengths of the prediction window on the model.

Figure 11. Comparison of sap flow measurements with PatchMixer and Patch-TS predictions for a prediction window = 720.

Table 1. Network parameter settings of each model.

Parameter	Method or Value
Sequence length	96
Predicted length	720
Factors	10
Learning rate	0.001
Optimization iteration algorithm	Adam
Loss function	MSE
Activation function	gelu
Epochs	100

Table 2. Performance metrics for each model.

Models	MSE	MAE	R²
Patch-TS	0.00824	0.0497	0.921
PatchMixer	0.00961	0.0543	0.907
DLinear	0.01039	0.0649	0.900
Autoformer	0.01214	0.0699	0.883
Informer	0.01483	0.0843	0.857
LSTM	0.0138	0.0761	0.755
GRU	0.01646	0.0747	0.709

Table 3. Performance comparison of models using different normalization methods.

Models	MSE	MAE	R²
Patch-	0.00910	0.0523	0.912
Patch-TS + RevIn	0.00909	0.0520	0.913
Patch-TS + Dish-TS	0.00824	0.0497	0.921
PatchMixer	0.00961	0.0543	0.907
PatchMixer + RevIn	0.00922	0.0522	0.910
PatchMixer + Dish-TS	0.00909	0.0496	0.912

Table 4. Changes in the evaluation indicators as different modules are added to the model.

Models	MSE	MAE	R²
PatchMixer	0.00970	0.0518	0.907
Dish-TS (A)	0.00936	0.0531	0.910
Sequence decomposition (B)	0.00799	0.0486	0.917
Mixer (C)	0.00914	0.0521	0.912
A + B + C	0.00824	0.0497	0.921

Table 5. Comparison of performance for models established with factors in different numbers.

Models	Factors	MSE	MAE	R²
Patch-TS	7 selected factors	0.00672	0.0430	0.929
Patch-TS	All 10 factors	0.00824	0.0497	0.921
PatchMixer	7 selected factors	0.00817	0.0470	0.913
PatchMixer	All 10 factors	0.00961	0.0543	0.907
DLinear	7 selected factors	0.00916	0.0606	0.903
DLinear	All 10 factors	0.01039	0.0649	0.900
Autoformer	7 selected factors	0.01038	0.0651	0.890
Autoformer	All 10 factors	0.01214	0.0699	0.883
Informer	7 selected factors	0.01288	0.0719	0.863
Informer	All 10 factors	0.01483	0.0843	0.857

Table 6. Evaluation metrics for models with different prediction window lengths.

Models	Metrics	96	192	336	720
Patch-TS	MSE	0.00513	0.00630	0.00647	0.00824
	MAE	0.0357	0.0403	0.0435	0.0497
	R²	0.951	0.940	0.938	0.921
PatchMixer	MSE	0.00555	0.00712	0.00792	0.00961
	MAE	0.0386	0.0438	0.0479	0.0543
	R²	0.947	0.932	0.923	0.907
DLinear	MSE	0.00690	0.00820	0.00902	0.01039
	MAE	0.0507	0.0555	0.0601	0.0649
	R²	0.933	0.921	0.913	0.900
Autoformer	MSE	0.00801	0.00935	0.01000	0.01214
	MAE	0.0549	0.0600	0.0633	0.0699
	R²	0.923	0.910	0.903	0.883
Informer	MSE	0.00925	0.01127	0.01406	0.01483
	MAE	0.0666	0.0723	0.0815	0.0843
	R²	0.911	0.892	0.865	0.857

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

