Applicability of Smoothing Techniques in Generation of Phenological Metrics of Tectona grandis L. Using NDVI Time Series Data

: Information on phenological metrics of individual plant species is meager. Phenological metrics generation for a speciﬁc plant species can prove beneﬁcial if the species is ecologically or economically important. Teak, a dominating tree in most regions of the world has been focused on in the present study due to its multiple beneﬁts. Forecasts on such species can attain a substantial improvement in their productivity. MODIS NDVI time series when subjected to statistical smoothing techniques exhibited good output with Tukey’s smoothing (TS) with a low RMSE of 0.042 compared to single exponential (SE) and double exponential (DE). Phenological metrics, namely, the start of the season (SOS), end of the season (EOS), maximum of the season (MAX), and length of the season (LOS) were generated using Tukey-smoothed MODIS NDVI data for the years 2003–2004 and 2013–2014. Post shifts in SOS and EOS by 14 and 37 days respectively with a preshift of 28 days in MAX were observed in the year 2013–2014. Preshift in MAX was accompanied by an increase in greenness exhibiting increased NDVI value.LOS increased by 24 days in the year 2013–2014, showing an increase in the duration of the season of teak. Dates of these satellite-retrieved phenological occurrences were validated with ground phenological data calculated using crown cover assessment. The present study demonstrated the potential of a spatial approach in the generation of phenometrics for an individual plant species, which is signiﬁcant in determining productivity or a crucial trophic link for a given region.


Introduction
The occurrence dates of phenophases such as blooming, full leaf expansion, leaf coloration, or senescence are keys for the determination of phenological metrics, viz., start of the season (SOS), end of the season (EOS), and maximum of the season (MAX) of any tree [1]. These metrics can prove to be of crucial importance in the tree's productivity assessment. They can also prove significant in modeling and monitoring climate change [2], explaining the seasonal changes, determining net terrestrial carbon dioxide flux [3][4][5][6], and assessing leaf cycle functioning.
Information is meager on phenological data of individual plant species in many parts of the world [7]. More focus is placed on generating phenological metrics of forest types [8][9][10][11] rather than on a particular natural species [12,13]. Hence, evaluating the capability of satellite data for effective and reliable phenological metrics generation for one individual forest species, namely, Tectona grandis L. (teak) will be of great importance. Teak, a deciduous tree of the Verbenaceae family, is considered a valuable timber and has been widely used in India for more than 200 years due to its elegance and durability. In addition, it is an economically important tree as it has a tight grain and high oil content and tensile strength and can also be grown under various climatic regimes. Moreover, this tree occupies major worldwide areas and is also a dominant tree of tropical dry deciduous forests (DDF) of the Narmada district. Due to a paucity of ground data on this system, there is a great need to develop remote sensing techniques to characterize its biology. Phenological metrics generation of teak growing in the DDF can provide useful information about both spatial and temporal patterns of its productivity. Such forecasts can attain a substantial improvement in its productivity.
Appraisal of phenophases of any plant species mainly involves two approaches; ground-based phenology measurements and satellite-based phenology monitoring [14]. Ground-based measurements entail direct traditional insitu visual recordings of phenological events [15][16][17][18][19], manual periodic photography, or fixed-position camera-based digital repeated photography [20][21][22]. These methods, though they provide good outputs, have their limitations concerning lack of synoptic coverage and time [23]. An alternative space approach overcomes such limitations both with respect to time and coverage. Recent studies, therefore, are based on using the potential of spatial data in understanding plant phenology [24][25][26][27][28][29][30][31]. Specifically, time series normalized difference vegetation index (NDVI) data derived from moderate resolution imaging spectroradiometer (MODIS) [32][33][34][35][36][37][38] and advanced very high resolution radiometer (AVHRR) [39][40][41][42][43][44] are largely used for the generation of phenological metrics. This satellite-based vegetation index is recognized as an effective tool in the quantification of vegetation greenness [45][46][47], canopy carbon [48][49][50][51][52], and water fluxes [53][54][55]. It has proved its utility in monitoring the phenological changes in vegetation across large spatial scales and over long time periods. Many researchers have used NDVI and enhanced vegetation index (EVI) time series in the identification of different phenological parameters at both single pixel scale and large spatial scales by developing various methodologies [25,[56][57][58][59][60][61]. Studies are also available wherein Landsat time series data have been used for retrieving phenological metrics [62][63][64]. Some have built statistical relationships between NDVI or EVI and climatic parameters for monitoring the entire growing season [65][66][67][68]. The phenological parameters derived from vegetation indices may not correspond directly to conventional ground-based phenological events but do provide indications for understanding these phenophases which now have become pertinent in climate change analysis [69,70]. In addition, differences among biomes and their drivers of phenology (e.g., dry deciduous vs. cold deciduous) may necessitate different interpretations of image-derived phenophases. As such, detailed comparisons of these satellite measures and ground-based phenological events are needed across a range of biomes with different ecological drivers.

In Situ Phenological Data Collection
The study area selected for the present study is DDF of Narmada district, Gujarat, India (21.86 N, 73.56 E) ( Figure 1). This area exhibits the dominance of teak trees. Occurrence dates of different phenophase events for the year 2003-2004 were obtained from Sardar Sarovar Narmada Limited (SSNL) report [90]. The data for the year 2013-2014 were generated by monitoring ten 30 m × 30 m (900 m 2 ) homogenous teak patches at regular intervals of 20-25 days from May 2013 till April 2014. Analysis of within-population phenophase frequency at every sampling date was performed based on the modified qualitative method of Ohshan [91]. The population was randomly selected and labeled before the beginning of the sampling for ten healthy adult teak trees. During each field visit, the degree of incidence was determined for the different phenophases of teak, viz., leaf initiation, maximum greenness, and leaf fall, for which each tree was examined precisely. A frequency index was assigned for particular phenophases at the time of their presence on a tree, depending on the percentage of the crown where it occurred: 1 = presence less than 5% (leaf fall), 2 = presence between 5-25% (leaf initiation), and 3 = presence over 25% of the crown (maximum greenness).The average of the ten sampled plants' frequency indicesrepresents the calculation of the incidence of a phenophase in the whole population for each date. These ground-based phenological observations were then compared to phenological metrics derived using time series satellite data for the validation purpose.
Sens. 2021, 13, x 3 of 18 use the best reconstructed time series NDVI MODIS data to generate different phenological metrics, namely, SOS, EOS, MAX, and LOS of teak growing in Narmada forests, Gujarat, India.

In Situ Phenological Data Collection
The study area selected for the present study is DDF of Narmada district, Gujarat, India (21.86 N, 73.56 E) ( Figure 1). This area exhibits the dominance of teak trees. Occurrence dates of different phenophase events for the year 2003-2004 were obtained from Sardar Sarovar Narmada Limited (SSNL) report [90]. The data for the year 2013-2014 were generated by monitoring ten 30 m × 30 m (900 m 2 ) homogenous teak patches at regular intervals of 20-25 days from May 2013 till April 2014. Analysis of within-population phenophase frequency at every sampling date was performed based on the modified qualitative method of Ohshan [91]. The population was randomly selected and labeled before the beginning of the sampling for ten healthy adult teak trees. During each field visit, the degree of incidence was determined for the different phenophases of teak, viz., leaf initiation, maximum greenness, and leaf fall, for which each tree was examined precisely. A frequency index was assigned for particular phenophases at the time of their presence on a tree, depending on the percentage of the crown where it occurred: 1 = presence less than 5% (leaf fall), 2 = presence between 5-25% (leaf initiation), and 3 = presence over 25% of the crown (maximum greenness).The average of the ten sampled plants' frequency indicesrepresents the calculation of the incidence of a phenophase in the whole population for each date. These ground-based phenological observations were then compared to phenological metrics derived using time series satellite data for the validation purpose.

Application of Smoothing Techniques
Different smoothing algorithms were applied to the time series NDVI MODIS data to diminish noise effects: single exponential (SE), double exponential (DE), and Tukey's smoothing (TS). All three techniques are in ascending order in terms of sophistication.

Single Exponential Smoothing (SE)
SE is a simple and accessible tool for smoothing time series data. A simple average calculation is used to assign exponentially decreasing weights, starting with the most recent observations. New observations are weighted more in the average calculation than earlier observations. This method is for univariate data and does not include trend or seasonality. It uses only one parameter named alpha (α), even known as smoothing factor or smoothing coefficient. This smoothing technique is widely employed due to its simplicity and success. The notation for the same is as given below: where S i is the smoothed value of time series at time i, y i is the actual value of time series at time i, and α is the smoothing constant and 0.0 < α < 1.0.

Double Exponential Smoothing (DE)
DE is used on data sets involving seasonality and for handling trend analysis. It is an extension to exponential smoothing, adding explicit support for trends in the univariate time series. It is used when there is a linear trend in the data. This involves an additional smoothing factor along with alpha (α) parameter. This is to control the decay of the influence of the change in a trend called beta (β).
For data exhibiting linear trend as: where b 0 and b 1 can change with time at a slow pace. The basic equations named Holt's method are as below: where µ t is the exponentially smoothed value of time series at time t, y t is the actual observation of time series at time t, T t is the trend estimate, α is the exponential smoothing constant for the data, and β is the smoothing constant for the trend.

Tukey's Smoothing (TS)
TS uses running medians to provide flexible but straightforward curves and is a robust smoother. Median smoothing methods were introduced by Tukey in 1977 for extracting smooth patterns which tend to hide due to non-linear spikes in time series data [92]. Such filtering smooths any existing volatile behavior that occurs in trends or seasonal behavior. This method smooths out the data acquired from equally spaced, linearly ordered intervals such as every year, every month, every quarter, etc.
NDVI data reconstructed using all three types of smoothing techniques were compared based on their potential in removing outliers.
Performance assessment of the smoothing techniques was also carried out by calculating root mean squared error (RMSE). RMSE between observed raw NDVI time series data and predicted smoothed NDVI time series data was used for evaluating the performance of each smoothing technique. The best optimal technique that would reconstruct the best denoised data sets was considered to be the one generating the least RMSE. The best-reconstructed data was used further for extracting phenological metrics of teak by detecting the inflection point (i.e., date) when the NDVI time series begins to ascend or descend for the specific year. Occurrence dates obtained using these smoothed NDVI time series were compared to the ground-based phenological observation and these along with RMSE were considered for selecting the optimum smoothing technique.

Determination of Phenological Metrics from NDVI Time Series Data
The intra-annual variations in the NDVI time series were used as the base for determining SOS, MAX, EOS, and LOS, using the NDVI ratio. This is the derivative method where the maximum value of NDVI ratio corresponds to the greatest change of NDVI time series. Equation (5) is given as: where NDVI(t) is the NDVI value at time t, and NDVI ratio (t) is the calculated relative change at time t. SOS was determined as the time t or the day with the maximum NDVI ratio [93]. Likewise, EOS date was determined as the time t or the day having the minimum NDVI ratio. The time or duration between SOS and EOS with NDVI ratio closest to zero was identified as MAX date. Furthermore, the LOS was obtained by defining the period between SOS and EOS in each grid point.

Results and Discussion
RMSEs generated to check the potential of SE, DE, and TS are 0.072, 0.097, and 0.048, respectively. Comparison of reconstructed NDVI time series using SE, DE, and TS techniques showed that both SE and DE smoothed higher values but could not encompass all the outliers (Figures 2 and 3). These methods are comparatively less effective in accounting for missing values or correcting outliers. TS smoothing approach is identified as a more effective and robust smoothing method, bringing out high-quality NDVI time series as it fairly handled the outliers (Figure 4). The scatterplot diagram distinctly highlights the presence of outliers that needed to be removed to generate precise output. Statistically, the smoothing techniques are supposed to remove these outliers. At this point, the selection of Tukey's technique is performed in comparison to the other two smoothing techniques as it is considered to be a resistant smoothing technique which uses running medians. Despite the fact that it lacks mathematical generalization, the main purpose of the smoothing technique is to provide a general idea of relatively slow changes in values with little attention to the close matching of data values. Phenological attributes exhibit such characteristics. Further, running medians are thought to be fast exploratory tools to allow a quick view of data components. On comparing with ground data, TS again exhibited the greatest effectiveness over SE and DE ( Figure 5, Table 1). The Tukey-smoothed NDVI data provided greater vegetation phenology information, i.e., even minor changes in phenological dates which cannot be correctly identified from raw NDVI data can be obtained from Tukey-smoothed NDVI data. TS retrieved phenological data and ground phenological data showed very close values, thereby validating the TS results. Thus, TS potentially proved to be the optimal technique to best reconstruct the NDVI time series data and hence TS reconstructed NDVI time series is used for delineating shifts in SOS, EOS, MAX, and LOS of teak. Annual time series of NDVI data generated for teak enabled the differentiation of various phenological metrics such as the SOS (Figures 6 and 7), EOS (Figures 8 and 9), MAX ( Figures 10 and 11), and LOS (Figures 12 and 13). Such data applications have proven to be useful in several studies on global environmental change [94][95][96][97][98].                       For the validation of the phenological metrics generated from MODIS TS NDVI time series, the dates of different phenological occurrences are compared to ground measured crown cover. In situ crown cover assessment of 2013-2014 provides the information on different phenophases (Table 1). A plot of crown cover versus sampling dates matches the NDVI curve of the period (Figure 14). Results of the crown cover assessment can be used for the validation of the results generated through MODIS data. Average crown cover between 5 and 25% that corresponds to leaf initiation on ground and SOS on satellite data For the validation of the phenological metrics generated from MODIS TS NDVI time series, the dates of different phenological occurrences are compared to ground measured crown cover. In situ crown cover assessment of 2013-2014 provides the information on different phenophases (Table 1). A plot of crown cover versus sampling dates matches the NDVI curve of the period (Figure 14). Results of the crown cover assessment can be used for the validation of the results generated through MODIS data. Average crown cover between 5 and 25% that corresponds to leaf initiation on ground and SOS on satellite data was observed on 30 July 2013. Maximum greenness or MAX that corresponds to crown cover over 25% was noted on 27 August 2013 and leaf fall on the ground, i.e., less than 5% crown cover that corresponds to EOS was recorded on 6 March 2014. These ground-based phenological observations are comparable to phenological metrics derived using time series data, thereby validating our results. Results of in situ observations also show that teak trees considered for the present study shifted their phenophases between the years 2003-2004 and 2013-2014. Leaf initiation on the ground showed post shift of 8 days. In situ leaf fall recorded showed a delay of 24 days. Maximum greenness noted on the ground also compliments satellite-retrieved MAX results which showed a preshift of 15 days. The results highlight that phenological metrics from satellite data such as SOS, EOS, and MAX with temporal dynamics similar to those of ground phenology measurements such as leaf initiation, leaf fall, and maximum greenness through crown cover assessment can be generated.
Substantial interannual variability in the start and end of the growing season of the teak of DDF forests can be explained by studying many factors such as year-to-year variability in weather, especially climatic factors including rainfall, temperature, elevated CO 2 , altered precipitation regimes, etc. Such types of variability may be responsible for the observed shifts in phenology which ultimately can be robust indicators of the impacts of climate change. This makes derivation of phenological metrics imperative as they can serve as important inputs for better understanding of the climatic drivers controlling phenology. These alterations or shifts in phenophases can influence the climate system at both global and regional levels through feedback mechanisms of surface albedo, CO 2 fluxes, and evaporation [99]. Improved inputs of phenological measurements into global biosphere models conducted via satellite data will also enhance the understanding of temporal and spatial global carbon dynamics. This will be more useful for the areas where there exists a lack of ground phenological observations due to inaccessibility [38].

Conclusions
The present study tested the potential of three statistical techniques, viz., single exponential, double exponential, and Tukey smoothing. Comparison of these three techniques demonstrates the effectiveness of the Tukey smoothing technique in denoising the NDVI time series and removing the outliers. Statistical Tukey smoothing enhances the visibility of unique patterns present in the data and thereby aids in identifying distinct decadal shifts in the phenological metrics. Significant shifts in the growing seasons of DDF teak of Narmada district from the years 2003-2004 to 2013-2014 are identified on the basis of changes in SOS, EOS, LOS, and MAX delineated using Tukey-smoothed MODIS NDVI time series data. These shifts are validated by ground phenology measurements that are calculated using crown cover assessment. Earlier studies using remote sensing more commonly focused on phenological studies of forest types but the present study highlights the fact that satellite data has great capability and can be used for generating phenological metrics of individual plant species as well. Funding: This research was funded by ISRO RESPOND, project id OGP134" and we give thanks to MoEFCC for their added support.
Institutional Review Board Statement: Ethical review and approval were waived for this study.

Informed Consent Statement: Not applicable.
Data Availability Statement: Not applicable.