Parametric Models to Characterize the Phenology of the Lowveld Savanna at Skukuza, South Africa

: Mathematical models, such as the logistic curve, have been extensively used to model the temporal evolution of biological processes, though other similarly shaped functions could be (and sometimes have been) used for this purpose. Most previous studies focused on agricultural regions in the Northern Hemisphere and were based on the Normalized Difference Vegetation Index (NDVI). This paper compares the capacity of four parametric double S-shaped models (Gaussian, Hyperbolic Tangent, Logistic, and Sine) to represent the seasonal phenology of an unmanaged, protected savanna biome in South Africa’s Lowveld, using the Fraction of Absorbed Photosynthetically Active Radiation (FAPAR) generated by the Multi-angle Imaging SpectroRadiometer-High Resolution (MISR-HR) processing system on the basis of data originally collected by National Aeronautics and Space Administration (NASA)’s Multi-angle Imaging SpectroRadiometer (MISR) instrument since 24 February 2000. FAPAR time series are automatically split into successive vegetative seasons, and the models are inverted against those irregularly spaced data to provide a description of the seasonal ﬂuctuations despite the presence of noise and missing values. The performance of these models is assessed by quantifying their ability to account for the variability of remote sensing data and to evaluate the Gross Primary Productivity (GPP) of vegetation, as well as by evaluating their numerical efﬁciency. Simulated results retrieved from remote sensing are compared to GPP estimates derived from ﬁeld measurements acquired at Skukuza’s ﬂux tower in the Kruger National Park, which has also been operational since 2000. Preliminary results indicate that (1) all four models considered can be adjusted to ﬁt an FAPAR time series when the temporal distribution of the data is sufﬁciently dense in both the growing and the senescence phases of the vegetative season, (2) the Gaussian and especially the Sine models are more sensitive than the Hyperbolic Tangent and Logistic to the temporal distribution of FAPAR values during the vegetative season, and, in particular, to the presence of long temporal gaps in the observational data, and (3) the performance of these models to simulate the phenology of plants is generally quite sensitive to the presence of unexpectedly low FAPAR values during the peak period of activity and to the presence of long gaps in the observational data. Consequently, efforts to screen out outliers and to minimize those gaps, especially during the rainy season (vegetation’s growth phase), would go a long way to improve the capacity of the models to adequately account for the evolution of the canopy cover and to better assess the relation between FAPAR and GPP.


Introduction
Vegetation regulates the carbon, energy, and water fluxes between the land and the atmosphere. Plants affect and respond to environmental and especially climatic conditions: they typically exhibit annual variations that match the seasonal evolution of temperature, water availability, solar irradiance, and other critical variables. Hence, they are also sensitive to climate fluctuations and changes. Phenology, the study of the seasonal evolution of plants, documents these variations and therefore provides evidence of the impact of climate change on the environment.
The Special Report on Emissions Scenarios (SRES) socio-economic CO 2 emission scenario A2 of the Intergovernmental Panel on Climate Change (IPCC) provides a suitable context to evaluate conditions appropriate for the African continent. Under this scenario, Friedlingstein et al. [1] estimate that climate warming during the 21st century may reduce the net ecosystem productivity of the African continent and contribute up to 38% of the global climate-carbon cycle feedback. Properly characterizing vegetation's phenology is crucial to document this evolution and to accurately parameterize the global land-atmosphere models [2]. In Southern Africa, air temperature near the ground has been increasing rapidly over the last five decades, at about twice the rate of the globally average surface temperature [3][4][5]. These changes, and associated fluctuations in the water cycle, are expected to have a significant impact on vegetation cover in the savanna biome, which occupies about 20% of the global land surface and 40% of Africa, according to Kutsch et al. [6].
Phenology, may have notable economic implications, as plant growth and development depend on the occurrence of essential weather events. Natural hazards, such as drought, flood, or fire, can jeopardize agricultural yield in a given year, or sometimes the sustainability of that economic activity. Phenology also informs on the land use and land cover type, providing useful information to manage societies, their economic systems, as well as to monitor the evolution of unmanaged areas, such as nature reserves. In the particular context of African ecosystems, savanna woody vegetation provides valuable ecosystem services [7,8], and the effective estimation of those resources is required to satisfy commitments to monitor and manage change within them [9].
Repeated observations of the density and diversity of woody plant species in African savanna regions suggest that variations in species composition and diversity are to a great extent also influenced by land use and anthropogenic disturbances [10]. In South Africa, and in the Lowveld in particular, rural households critically depend on fuelwood as a source of energy because of the prohibitive cost of gas and fuels. This socio-economic constraint has resulted in the exploitation of natural resources well beyond their capacity to regenerate, and therefore in the progressive deforestation of those environments. That situation leads to wood scarcity, as well as potential, and, in many cases, actual, conflicts over access to resources [11]; it is further exacerbated by a continuous exodus from rural areas to towns and cities, as well as by immigration from neighboring countries. As a result, settlements have been expanding and their environmental footprints have become more severe over time [12].
This paper leverages two resources to bear on this problem: remote sensing and field measurements to provide empirical evidence, and modeling to provide a rational framework for the interpretation of the data, despite the uncertainties inherent to all measurements, the irregular distribution of observations in time and missing data for various reasons. The aims of this methodological study were (i) to assess the suitability of four parametric double S-shaped functions (the Gaussian, Hyperbolic Tangent, or Logistic and Sine) to describe the phenology of savanna vegetation in the Lowveld, and (ii) to evaluate the capability of these data and models to account for the Gross Primary Productivity (GPP) of the vegetation around the Skukuza flux tower in South Africa's Kruger National Park. This is of particular interest insofar as GPP is one of the contributing factors to estimate biomass accumulation or yield (in an agricultural context).

Background
Observations of land surfaces from spaceborne platforms started in earnest in the early 1970s with the launch of the Earth Resources Technology Satellite (ERTS-1), subsequently renamed Landsat 1. In turn, the availability of multispectral measurements stimulated the development of methods to explore the significance and implications of those observations in practical applications.
Rouse Jr. et al. [16] and Reference [17] suggested that the ratio subsequently known as the Normalized Difference Vegetation Index (NDVI) was adequate to monitor "the vernal advancement and retrogradation (Green wave effect) of natural vegetation", i.e., to monitor plant phenology as we understand it today. These events mark the start of a sustained effort to monitor the biosphere on the basis of space observations [18].
Over the past 50 years, significant progress has been achieved, and thousands of publications have been published [19], to characterize the state and evolution of plant canopies using observations of the reflectance of the planet in the solar spectral range. A full review of the literature dealing with the documentation of plant phenology on the basis of remote sensing measurements is outside the scope of this paper, but it is worth noting that progress has been achieved on multiple fronts: 1. Major advances in understanding the processes of radiation transfer in complex environments resulted in the design and implementation of much improved algorithms. These aimed to effectively account for processes unrelated to vegetation (such as atmospheric clouds and aerosols, anisotropic effects, contributions to the measured signals due to variations in soil moisture, etc.) and to deliver reliable and accurate products to describe plant processes. The Fraction of Absorbed Photosynthetically Active Radiation (FAPAR) is a case in point, since it has been identified as one of the Essential Climate Variables (ECV) by the Global Climate Observing System (GCOS) [20]. These developments, in turn, led to a more quantitative representation of plant phenology (in Reference [21][22][23]). 2. The technology of measuring reflected radiation on spaceborne platforms has improved drastically, and is currently able to deliver up to hundreds of spectral bands at spatial resolutions of the order of tens of meters or less, sometimes on a daily basis, or to acquire data from multiple directions or in multiple polarizations. Data fusion with observations obtained with different instruments (e.g., LiDARs: see Reference [24][25][26]) or in entirely different spectral regions (e.g., microwaves: see Reference [27][28][29]) have also been explored, though much remains to be done in order to provide a holistic description of the environment, or of plant phenology in particular. 3. Progress in the field of information technology delivered computing, networking and data archiving resources and capabilities orders of magnitude faster and more performant than even a few decades ago, and permitted the investigation of complex issues. Although the drive to acquire more and better observations continues unabated, the major bottleneck today concerns the actual analysis of those data and the extraction of useful information from the immense archives that have already been accumulated.
As a result, Earth observing satellites have yielded a long-term record of changes in vegetation photosynthetic activity, while ground-based eddy covariance measurements are providing automated, continuous and spatially averaged measures of carbon exchanges between the surface and the atmosphere. A global increase in 'greening' has been evidenced through remotely sensed indices, such as the Leaf Area Index (LAI) and Normalized Difference Vegetation Index (NDVI), suggesting an increase in the seasonally-integrated LAI of up to 25-50%, while less than 4% of the terrestrial areas exhibit a decrease of LAI [30]. Key drivers of phenology include temperature and photoperiod in the mid-and high-latitudes [2,[31][32][33], while precipitation is regarded as the dominant driver in the tropics [34,35]. Satellite-based observations have been widely used to characterize the evolution of vegetation in the Northern Hemisphere, but other regions have received far less attention: A systematic review of phenology in Africa over the last decade reveals a recent increase in the number of phenological studies based on remote sensing but suggests that further investigations are needed across this continent, especially to focus on the relationship between climate and vegetation phenological changes [36].
Remote sensing research in the Lowveld has tended to focus on the use of mono-directional spectral measurements (from airborne and/or spaceborne sensors) which typically inform on the chemical composition of the targets (in Reference [37][38][39][40]). More recently, sporadic airborne campaigns collecting Light Detection And Ranging (LiDAR) measurements have been used to document the structural properties of the vegetation canopies, e.g., to map the differences in canopy and sub-canopy cover height and distribution [26,41,42]. One of the main advantages of using LiDAR observations is the ability to capture sub-canopy structural changes specifically to examine bush encroachment, otherwise unattainable in mono-directional spectral measurements which only account for the aerial extent of the vegetation canopy [43]. Hybrid techniques combining structural observations from LiDAR and spectral measurements from hyperspectral and multispectral data sets demonstrate the ability to classify individual tree species [25,44]. To a lesser extent, Advanced Synthetic Aperture Radar (ASAR) measurements, which can observe the surface irrespective of the cloud cover, have been used to create regional scale baseline maps of savanna woody cover in the Lowveld [9]. Cutting-edge approaches utilizing LiDAR, ASAR, and hybrid-techniques are very interesting but infrequent and limited in geographical scope, constrained in part by the limited spatial coverage and the cost of obtaining the data sets.

Multi-Angular Observations
Anisotropy is typically considered a source of noise when analyzing remote sensing data from mono-angular spectral instruments but constitutes a definite advantage when analyzing measurement from multi-angular instruments [45]. Over the last three decades, the potential benefits of multi-angular observations have been recognized through the operation of innovative Earth Observation (EO) instruments, such as the Along-Track Scanning Radiometer (ATSR-1 and ATSR-2), the Polarization and Directionality of the Earth's Reflectance (POLDER), and the Multi-angle Imaging SpectroRadiometer (MISR) instrument on-board National Aeronautics and Space Administration (NASA)'s Terra platform, which has been continuously operating since 24 February 2000. These instruments generate an improved characterization of the atmospheric and terrestrial environments because multi-angular measurements permit an accurate, quantitative decoupling of the distinct spectral and directional signatures of those geophysical media.
In this context, the recent availability of land surface products at the relatively high spatial resolution of 275 m generated by the MISR-HR (for High Resolution; version 2.01-0) processing system [46], about weekly over the period 2000-2018, offers new opportunities to explore the relation between surface properties and the fluxes of matter and energy in the environment. Appendix A summarizes the specifications of the MISR instrument and outlines the products generated by the MISR-HR processing system. Specifically, the Joint Research Centre-Two-stream Inversion Package (JRC-TIP) algorithm [47,48] derives the Fraction of Absorbed Photosynthetically Active Radiation (FAPAR) product from an analysis of albedo values obtained by integrating the Rahman-Pinty-Verstraete (RPV) model [49,50]. FAPAR is a measure of the absorption of incoming solar radiation in the canopy, within the photosynthetically relevant spectral region (400 to 700 nm) (in Reference [51,52]). The JRC-TIP inversion process returns the final value of the cost function, as well as estimates of the variance, associated with each product: the former is a measure of the distance between the model and the data, while the latter measures the uncertainty associated with the retrieved product.

MISR-HR FAPAR Product for Skukuza
This MISR-HR FAPAR product has been extensively evaluated [53][54][55][56] and serves as a benchmark in various contexts, as noted earlier. It also offers three advantages over the traditional NDVI used in many phenological studies: (1) FAPAR is much better grounded in radiation transfer theory and has been identified as an "Essential Climate Variable" by the GCOS, which is not the case for NDVI, (2) this particular product is derived using the JRC-TIP algorithm, a state of the art inversion procedure that has been extensively validated (see, e.g., https://fapar.jrc.ec.europa.eu/Home.php, last visited on 21 November 2020), and routinely used by the Committee on Earth Observation Satellites (CEOS) Working Group on Calibration and Validation (WGCV) and the European Commission Quality Assurance for Essential Climate Variables (QA4ECV) project, for instance, and (3) this product is made available at a higher spatial resolution (275 m) than any of the traditional global vegetation index products (500 m at best, and often 1 km or more) available over the last 20 years.
The Skukuza flux tower site falls within Block 110 of MISR Paths 168 and 169 (Figure 1, panel (a)). The repeat period of this instrument for a single Path is exactly 16 days, but the study area is located within the overlap of those Paths, giving a staggered revisit period of seven and nine days between those two Paths. A total of 859 combined Orbits for both Paths (i.e., 429 and 430, respectively) were acquired between 15 March 2000 and 30 December 2018; 17 of them were irretrievable from NASA's Atmospheric Science Data Center (see https://asdc.larc.nasa.gov/project/MISR/, last accessed on 14 October 2020.), due to technical issues with the platform. As a result, the longest period between successive acquisitions is 32 days. However, cloudiness further reduces the availability of usable surface observations, so the time gaps between successive observations is typically 7 or 9 days during the dry season, but may reach up to 100 days or more during the wet season.
The MISR-HR FAPAR product is made available on the Space Oblique Mercator (SOM) map projection, like all standard MISR data products. This projection minimizes shape distortion and scale errors throughout the length of the Orbit near the satellite ground track. However, each Path possesses its own unique SOM map projection parameters, so assembling data sets involving two or more Paths imply re-projecting the data onto a common grid, in this case, the geographic spatial reference system, i.e., European Petroleum Survey Group (EPSG) code: 4326, as explained below in Section 4.2.2.

Flux Tower Observations
Flux towers use micrometeorological measurements and the Eddy Covariance (EC) method (first published by Reference [57]) to directly observe air motion and gas concentrations at two or more heights, and to estimate the exchanges of gas, energy, and momentum between the terrestrial biosphere and the overlying atmosphere on an ecological scale from the covariance between fluctuations in those variables [58][59][60]. GPP (expressed in units of g C m −2 d −1 ) is derived from an analysis of those data. Figure 2 shows the relation between the GPP and precipitations, in this case, estimated as part of the European Centre for Medium-Range Weather Forecasts (ECMWF) Re-Analysis (ERA)-Interim reanalysis project.
It is important to note, from the outset, that unmanaged, protected areas in a National Park are not monitored or used like an agricultural field: there is no planting or harvesting date, and nobody observes the emergence of leaves or flowers. This would be difficult in any case, given the diversity of species involved. Hence, the characterization of the phenology of the canopy must be derived from remote sensing measurements and access to local observations from an EC flux tower provides a unique opportunity to establish possible relations between remote and in situ measurements, even though these address quite different biogeophysical variables.
The Skukuza EC flux tower is located at 25 • 1 10.54 S, 31 • 29 48.6 E within the Kruger National Park, near the Skukuza camp, at an altitude of 365 m above sea level. It has been in continuous operation since 2000, and stands on the ecotone between broad-leafed Combretum savanna on sandy soil and fine-leafed Acacia savanna on clayey soil [61]. Woody vegetation height ranges between eight and ten m, and the flux sensors on the tower are 17 m above ground, implying a sensor footprint of about 500 m [62]. The Skukuza flux tower currently makes use of the LiCor 7500 open-path infrared gas analyzer to measure carbon dioxide (CO 2 , µmol mol −1 ) and water vapor (H 2 O, mmol mol −1 ) concentrations, as well as ultrasonic anemometers (Gill Wind Master Pro and Campbell Scientific CSAT3), to measure horizontal and vertical fluctuations in wind speed (m s −1 ) and air temperature (K).
NASA erected this flux tower as part of the Southern African Regional Science Initiative (SAFARI-2000), an international field campaign investigating emissions, transport, transformation and deposition of trace gases and aerosols over southern Africa between 1999 and 2001 [61,[63][64][65]. The tower was subsequently donated to the Council for Scientific and Industrial Research (CSIR). The Skukuza EC flux tower measurements confirm the close inter-connection between the water and carbon fluxes: ecosystem respiration is highest at the onset of the wet season, continuously decreasing during the transition towards the dry season [6]. Variations in precipitation between years, which influence variations in the amount of green leaf area, seasonal patterns, and soil water availability are the main drivers of Net Ecosystem Exchange [62]. This approach has been used to estimate GPP [66], as well as to partition the energy fluxes, at the surface [67].
For the purpose of this study, the Skukuza flux tower data set processed by the FLUXNET organization (Site Id: ZA-Kru) was downloaded from the FLUXNET2015 data archive (Available from the portal https:// doi.org/10.18140/FLX/1440188, last accessed on 14 October 2020.). This organization coordinates regional and national networks to harmonize the compilation, archival and distribution of data to the scientific community. It comprises over 900 flux tower sites globally (See https://fluxnet.fluxdata.org/, last accessed on 14 October 2020.). Data were originally released in December 2015 and subsequently updated in July 2016 and November 2016, following the procedures and quality controls described in Reference [68,69]. The reliability of phenological information derived from satellite remote sensing observations is evaluated by confronting them to flux tower results [70,71]. The availability of both remote sensing and field measurements at the Skukuza site thus offers a unique opportunity to characterize the phenology of that biome.

Measurement Uncertainties and Missing Data
Each plant species typically follows its own phenological development, itself dependent on the locally prevailing conditions. For their part, spaceborne EO instruments acquire measurements (usually over areas large with respect to the dimensions of plants, hence containing many plant species), over periods of time which may or may not coincide with the specific observational needs at any given location. Individual observations are associated with measurement uncertainties, which result from systematic, as well as random, errors, due to incorrect or biased calibration, and intrinsic limitations on the precision of the instrument, respectively. In addition, optical instruments measure the combined contributions of the planetary surface and the overlying atmosphere: clouds and aerosols can perturb or prevent observations of the surface. Measurements actually useful to characterize the phenology of plant canopies are therefore scattered in space and time, especially during rainy seasons [72].
Models, such as the ones described below, are required to provide a reliable representation of the phenology of plant canopies despite these challenges. Once they have been adjusted to optimally fit the available empirical evidence, they can be used to provide a simplified stable characterization of the evolution of plant canopies. The combined empirical evidence on phenology derived from remote sensing and ground-based observations has been shown to improve the accuracy of models simulating surface-atmosphere interactions (in Reference [73][74][75]).

Screening Outliers
The reliability of an analysis based on measurements clearly depends on the accuracy of the inputs, so outliers and doubtful items should be culled out of the data. As explained above and in Appendix A.2, FAPAR values are themselves obtained through a model inversion procedure, and their accuracy and reliability can be quantitatively assessed by inspecting two pieces of information: the variance associated with the reported values, and the value of the cost function at the end of the inversion process. Whenever the latter exceeds a given threshold (in this case, set to the 90th percentile of its distribution), the mismatch between the model and the empirical evidence is deemed excessive and the corresponding data items are discarded, following the approach described in Liu et al. [76]. This is preferable to traditional statistical methods (such as box-and-whisker statistics) because it eliminates not only obvious outliers but also data values that may look reasonable but are probably questionable: a high cost function value betrays a mismatch between the JRC-TIP model and the input data and often results in one or more of the model parameters taking on unusual or obviously unbelievable values. This process, thus, tends to discard more data points than would be eliminated on the basis of extreme values only but yields a time series of high quality.
The threshold of acceptability can be fine-tuned by inspecting the histogram of cost function values, which typically exhibits a long tail of occasional large values, so the selection of a particular value within that tail is not very crucial, as it screens out a small number of points. Figure 3 shows that keeping data points associated with slightly higher cost function values (for instance 0.12 instead of 0.10) would add a few more data points to the time series, though most of them cluster in late 2016 and early 2017, at a time when they would not provide much additional constraint on the model inversion. Selecting much smaller values does have a significant impact on the number of remaining data points, as documented in Table 1.
In practice, the choice should be driven by the requirements of the user of the information: is it better to have very reliable data on a smaller number of cases, or more approximate and potentially less reliable information more frequently? The interesting feature of this approach is that there is no need to impose a single solution for all possible problems: the process can be customized to meet the specific requirements of the user.  In this case, the threshold of acceptable cost function values is set to 0.1. All cost values larger than this threshold and the corresponding FAPAR values in the top panel, are marked in red. This process eliminates the occasional slightly negative FAPAR values, but also other values that might otherwise appear to be reasonable at face value. Of course, a lower threshold would prune even more data points and jeopardize the feasibility of characterizing the vegetative season, while a larger threshold would retain more data but possibly lead to an incorrect interpretation because of the presence of dubious values. Table 1 shows how selecting progressively stricter threshold values for the cost function increases the number of data points considered outliers or potentially suspicious and decreases the number of data points remaining available for further analysis. Table 1. Relation between the largest allowable cost function value (Threshold), the number of data points identified as outliers, and the number of data points remaining available for further analysis in the time series of Figure 3.

Assembling the FAPAR Data Cube
The next step consists in merging data from the two relevant MISR Paths. Raster images of five samples by five lines, centered on the geographic location of the flux tower, were extracted from the FAPAR product for each of the 842 available Orbits and reprojected onto a common geographical grid, since both the MISR data and the MISR-HR products are provided in the SOM projection specific to each Path. Figure 6 explains this process graphically: Three grids are superimposed on a background map of the area around Skukuza, symbolized by the black triangle at the center of the display: the yellow and blue grids show idealized representations of the locations of the areas observed by MISR in Path 168 and 169 acquisitions, respectively. The red grid is the common geographical projection used as the target, and all grid squares are 275 m on the side, the ground sampling distance of MISR-HR products. This process roughly doubles the number of observations at each location, and helps constrain the phenology.

Identifying Individual Vegetative Seasons
The successive vegetative seasons must be identified before attempting to fit a double S-shaped model through the data, accounting for the fact that their growth and senescence phases can start and last differently in different years, mostly as a result of climatic fluctuations. This process is implemented as follows. A power spectrum for the entire time series is computed first, using the Lomb-Scargle algorithm [77,78] since the remote sensing data are provided on irregular dates. The resulting periodogram indicates whether the record exhibits primarily an annual variation (as is the case in Skukuza), or a biannual cycle, as would occur in agricultural regions where two crops can be grown per year. Figure 7 shows the Lomb periodogram for the FAPAR time series at the pixel location of the Skukuza Flux tower. This area exhibits a strong annual cycle (peak power: 71.72, frequency: 0.0027 day −1 , and period: 364.75 days) and a minor semi-annual fluctuation (peak power: 16.63, frequency: 0.00548 day −1 , and period: 182.37 days).  Since satellite records may start and end at arbitrary times relative to climatic or ecological cycles, the second step consists in identifying the start of the first complete vegetative season that can be documented on the basis of the data record. This is achieved as follows: • compute the median of the entire FAPAR record, • search, from the start of the record, the date of the first observation whose value is lower than this threshold, • define a time interval equal to 1/3 of the expected length of the vegetative period (as determined from the Lomb-Scargle algorithm), starting on this date, and • set the start date of the first vegetative season to the date of acquisition of the minimum FAPAR value within this limited time period.
From that point onwards, the end date of each vegetative season is automatically considered the start of the next one. The end of each season date is determined as follows: This process is repeated until the nominal date of the end of the nth vegetative season exceeds the length of the remote sensing record. As a result, some data may be discarded at the start and end of the entire record, since they only document parts of vegetative seasons.

Modeling Phenology
Once each individual vegetative season has been identified, the evolution of the plant canopy properties during that period can be modeled. This is beneficial (i) to estimate the timing of events, such as the start, end, and length of the season (SOS, EOS, and LOS, respectively), (ii) to document the probable value of biogeophysical variables, such as FAPAR, when direct observations are missing, and (iii) to characterize the long-term evolution of the environment, in this case, for the environment around the Skukuza flux tower, in the Lowveld savanna of north-eastern South Africa, during the period 2000-2018, despite the presence of noise in the data.
Global and local methods have been used in the literature for similar purposes. Local approaches take existing data at face value and generate estimates of missing observations on the basis of the values of neighboring measurements [79]. They include rank based filters, e.g., minimum, median, and maximum filters [80], linear/polynomial filters, such as moving average and adaptive Savitzky-Golay filters [81][82][83], locally weighted scatterplot smoothing (in Reference [79,84]), and Whittaker smoother (in Reference [85]) to name a few. Advantages include the applicability to any time series, irrespective of their semantic or thematic context, and an ability to match arbitrary fluctuations, but at the risk of simulating fluctuations that may be suggested by noisy or erroneous data.
Global models, by contrast, impose a functional shape that is intended to represent the likely evolution of the variable of interest, in this case, the phenology of vegetation, which is expected to exhibit a growth phase, followed by a senescence phase. They include function-fitting approaches, e.g., polynomial, double logistic, asymmetric Gaussian [86][87][88][89], and signal smoothing or decomposition techniques, e.g., wavelet transform, Fourier analysis, Lomb-Scargle [90][91][92][93][94][95]; see [79]. Signal decomposition techniques provide forecasts over a season by decomposing the time series into noise, seasonal variability and trend [96]. The Lomb-Scargle algorithm, initially developed by Lomb [97] and subsequently elaborated by Reference [98], implements an approach similar to Fourier analysis [99], adapted to estimate the power spectral density of unevenly sampled data [77,78].
Local and global phenology models both have advantages and drawbacks: The selection of a global versus a local model must depend on the intended usage of the resulting information. Global models are deemed particularly suited to represent the temporal evolution of vegetation when the empirical evidence exhibits unusual or unexpected fluctuations (as is the case here): they are used as low-pass filters to screen out the influence of extraneous or uncontrolled processes, such as biomass burning, random browsing, the possible contamination due to thin clouds or aerosols, or simply noise in the data. A local modeling approach would definitely be more appropriate if the objective were to specifically monitor biomass burning events, for instance.
This study explores the suitability of four different global models, each constructed as a pair of S-shaped functions (one to simulate the growth phase and the other to represent the senescence phase): Gaussian, hyperbolic tangent, logistic (these two functions are mathematically equivalent, but they exhibit differences in terms of numerical efficiency, as will be shown below), and sine. Of those four models, the logistic function has been used most frequently in phenology (and other ecological) studies (in Reference [21,34,88]). A smaller number of investigations explored the suitability of the hyperbolic tangent [100][101][102], probably because of its similarity with the logistic function. Rather fewer researchers seem to have considered the Gaussian [103,104] or the sine functions. This paper aimed in part to explore which of those formulations might be more appropriate or more advantageous for this purpose.
These four mathematical models are described in detail in Appendix B. The shape of each elementary S-shaped function is determined by four independently adjustable parameters, typically to specify the (Julian) date and value of the start and of the end of the simulated fluctuation. Imposing a continuity condition between the two functions of each pair reduces the number of parameters to be adjusted to seven. The process of fitting a double S-shaped model to the data reduces to an inversion procedure, where the goal is to optimize a goodness of fit criterion: it is typically implemented as an iterative algorithm to minimize a figure of merit function, such as the root mean square difference between the model and the observed values (χ 2 ). When this condition is realized, the values of the model parameters are deemed to optimally describe the seasonal evolution of the simulated geophysical variable, and the model, together with these newly established parameters, can be used to generate a complete time series, at whatever frequency is required.
Data points can be assigned weights during this inversion process, to allow those with smaller uncertainties to have more influence on the outcome than points with larger uncertainties. The inverse of the estimated standard deviation provided by the JRC-TIP algorithm for each FAPAR value was used as the weight in this investigation. The iterative inversion process is terminated when the value of χ 2 changes by less than 10 −3 between successive iterations, i.e., when there is little benefit to be gained by searching for another minimum in the current neighborhood.
This approach requires at least four FAPAR values within each of the growing and decay phases, and more than seven values over the entire vegetative season to ensure that the number of measurements exceeds the number of model parameters to be determined [105]. The average number of valid FAPAR observations across all pixel locations in the FAPAR record is 288, which, over a period of 18 years, yields around 16 observations per season. However, those are very unevenly distributed, with many more data points during the dry season (under clear skies) than during the rainy season, as was shown in Section 4.2.1. In fact, the process of inverting the models against those data can fail or lead to unrealistic results whenever there are fewer than four observations during the growing phase, which is always the most cloudy.
In addition to systematically fitting each of these double S-shaped models to each vegetative season, a combined approach was also explored, whereby the best model for each individual season is selected, to increase the number of successful model inversions and decrease the average χ 2 statistic.
Once a parametric double S-shaped model has been successfully fitted to the remote sensing measurements, classical techniques based on an inspection of the first and second derivative of this model can be used to determine the start, end, and length of the growing season (in Reference [22,106], among many others).

Results and Discussion
This section documents the results obtained from the application of the methods described in Section 4 to the data sets identified in Section 3. Section 5.1 discusses to what extent the parametric double S-shaped models are capable of representing the phenology of vegetation as reported by MISR-HR FAPAR time series. Section 5.2 explores the effectiveness of this joint approach to document the ecological evolution of the target area by comparing those results with the Gross Primary Productivity (GPP) estimates derived from in situ measurements at the Skukuza flux tower. Section 5.3 reports on the numerical efficiency of these algorithms, while Section 5.4 summarizes the results.

Simulation Capability
All four parametric double S-shaped models are capable of simulating a seasonal signal when the number of data points is large enough to individually document the growth and senescence phases. Each of those models relies on seven adjustable parameters to match the empirical evidence, so eight measurements are required at a minimum before attempting to invert the model against the data for a particular season. It is generally easy to document the evolution of the phenology during the dry season, i.e., when the vegetation senesces or becomes dormant. The difficulty arises when persistent cloudiness generates long gaps in the remote sensing measurements during the rainy season, precisely when the vegetation grows. Figure 9 exhibits the number of FAPAR values available per year over the target of interest for the entire record available. As expected, there are fewer observations during the rainy season (growth phase). Whenever too few FAPAR values are present during the growth (more rarely senescence) phases, the inversion process fails. It is interesting to note that the various models exhibit different sensitivities to the presence of those data gaps. Figure 10 summarizes the number of successful inversions for each of the four models in the target area. It can be seen that the hyperbolic tangent model offers the most frequent opportunities to successfully simulate a vegetative season, as well as that the sine model is the most sensitive to the presence of gaps in the data, i.e., to the temporal distribution of empirical data throughout the season.

Ecological Effectiveness
Comparisons between models adjusted to fit remote sensing data and in situ field measurements must take place over approximately similar areas. Since the Gross Primary Productivity (GPP) estimates derived from flux tower Eddy Covariance (EC) measurements are deemed to be representative of an area of approximately 500 m around the tower (an area of 53.5 ha), the FAPAR product is averaged over the subset of 3 by 3 pixels centered on the Skukuza flux tower (an area of 68.1 ha). The corresponding average standard deviation for each date in the remote sensing time series is computed as the square root of the average variance of the nine pixel values. The four models described earlier are then adjusted to this spatially averaged time series, and the results are described in the following subsections.   Table 2 quantifies the correlation between daily FAPAR estimates generated by the four models (PDGF, PDHyTgF, PDLF, and PDSF) and the corresponding daily GPP record retrieved from an analysis of the EC flux tower, for each of the 13 vegetative seasons available. The first column reports the dates of the start and end of those seasons and the second refers to the season number. The third and fourth columns indicate the number and the proportion (in%) of days in the corresponding seasons for which GPP measurements were available, respectively. The next four columns report the Pearson correlation coefficient between model-generated daily estimates of FAPAR and the daily GPP values derived from in situ measurements for the same days. The last column shows the same statistics for the combined model, i.e., when the best (smallest χ 2 ) model is chosen for each year. Label A in the four model columns flags cases where fewer than half of the daily GPP values were available for the concerned season (as shown in the fourth column, and also in Figure 11). Label B in those columns flags cases where the model values are unavailable, typically because the temporal distribution of the FAPAR data did not permit the model inversion. Vegetative season numbers set in bold face within the second column point to cases where the number of daily GPP was sufficient (more than 50% of the days) and where all four parametric double S-shaped functions could be successfully inverted against the FAPAR data.

Comparing Daily Values
Correlation coefficients between the daily GPP and daily estimates derived from the PDGF, PDHyTgF, PDLF and combined model approaches are always equal to or larger than 0.69 in seasons 2 to 5 and 9 to 12. This coefficient is much lower in the case of season 1 because (1) a long gap in the remote sensing data, due to persistent cloudiness, hampers the model inversion procedure and (2) the GPP data suggest a noticeable but unexpected double vegetative season during that period, unconfirmed by the remote sensing record. The number of daily GPP measurements available in seasons 6, 7, and 8 was too low to compute that statistic. Season 13 was also exceptional in two other ways: (1) the FAPAR data includes an anomalously low value near the peak of the growing phase (possibly due to incomplete or insufficient cloud screening), while the GPP data suggests a sustained activity throughout the period. The PDSF model, for its part and in its current implementation, appears to be more sensitive to the distribution of data points during the season than other models. However, when the model inversion does work, it tends to yield very good results. Figure 12 shows the non-linear relation between daily simulated FAPAR and daily GPP measurements across all seasons, as represented by each of the indicated models: remote sensing measurements (in this case, of FAPAR) appear quite sensitive to small variations of GPP when the latter is itself low (i.e., at the start an end of the vegetative seasons), as well as tends to saturate during the peak of the growing phase. This finding may need to be confirmed in other locations but suggests that FAPAR is a sensitive indicator of GPP in the range of 0 to 5 g C m −2 d −1 but then saturates and is not usable to estimate larger carbon exchanges. Table 2. Correlation coefficients (r) between the daily GPP and daily FAPAR estimates from the PDGF, PDHyTgF, PDLF, PDSF, and combined models for each vegetative season and for the area around the Skukuza flux tower. Entry A indicates that fewer than half of the expected GPP values were available, while entry B means that the model could not simulate the available FAPAR data. However, Figure 11 also shows why that relation is complex: daily model values for seasons 4 and 5 are very similar, while daily GPP values in season 5 are almost double the values in season 4.

SOS/EOS
The impact of lacking remote sensing evidence on the shape of the model can also be highlighted by comparing seasons 10 and 11: In those cases, the GPP daily values are comparable but the absence of observational constraint near the peak of the season probably resulted in substantial overestimations by the model in season 11.

Comparing Seasonally Integrated Values
A high correlation between daily FAPAR values (retrieved from remote sensing or simulated) and daily GPP estimates (derived from flux tower measurements) is not expected because the former correspond to instantaneous assessments of the reflectance of the canopy, while the latter accounts for the evolution of that canopy throughout the day. In situ observations are also sensitive to a wide range of local perturbing factors that do not affect the observations gathered from space, such as wind speed and direction, for instance. On the other hand, it is customary to integrate these time series over the lengths of vegetative seasons. This approach tends to reduce the inherent variability of those signals (mainly the GPP in this case) and to focus on the seasonal environmental properties rather than the daily fluctuations. Table 3 shows the result of this process: The first two columns are the same as for Table 2; the third one shows the integrated GPP value for the specified seasons, and the next four columns provide the integral of the four models over these same seasons, again using the results obtained for the spatially averaged 3 by 3 pixel area around the Skukuza flux tower. Table 3. Correlations coefficients (r) between the integrated GPP and Cumulated FAPAR (CFAPAR) obtained by integrating the PDGF, PDHyTgF, PDLF, PDSG, and combined models during the indicated vegetative seasons. Computations were skipped whenever fewer than half of the expected GPP observations were available during the season, or when the correlation between the daily GPP and model values dropped below 0.70 in Table 2 (see text for details).

Comparison Outcome
These results call for various comments: • The GPP estimates derived from the flux tower measurements exhibit a very high day-to-day variability, as a result of fluctuations in local conditions, including turbulence. While progressively larger values are expected during the vegetation growth phase than during senescence, there is no expectation that successive values generate a smooth sequence. • By contrast, vegetation growth and senescence occurs as a generally slow, smooth process, with more branches and leaves occurring during the rainy season, and the plants wilting progressively during the dry season. Hence, it makes sense to fit a smooth model through the FAPAR data, while this would not be appropriate for the GPP data, at least on a daily time scale. • Inspecting again Figure 11, it is seen that FAPAR and GPP both increase quickly and together at the start of the rainy seasons, but that the decrease in FAPAR tends to lag behind the decline in GPP: plants continue to appear photosynthetically active from space longer than the in situ measurements of CO 2 indicate.
• GPP values measured at the Skukuza flux tower range from 0.0 to about 16 gC m −2 d −1 , while FAPAR rarely drops below 0.1 or raises above 0.6 at this site: these variables experience quite different ranges of variability.

•
As expected, the PDHyTgF (Hyperbolic Tangent) and the PDLF (Logistic) models generate essentially the same traces, since their parameters can be adjusted to generate the same values; however, they exhibit somewhat different levels of numerical efficiency, as will be seen shortly in Section 5.3.

Numerical Efficiency
The numerical performance of the four parametric double S-shaped models described above can be assessed using four criteria.

Successful Inversions
This first criterion explores how often the selected model is actually able to fit the empirical evidence provided by remote sensing, or, equivalently, the sensitivity of the models to the temporal distribution of the input data. For this purpose, the entire record of 18 years of FAPAR data is used, over the 5 by 5 pixel target area around the Skukuza EC flux tower, so there are 450 possible cases. Table 4 provides the statistics, and it can be seen that the PDHyTgF (Hyperbolic Tangent) model provides the greatest proportion of successful model inversions for the 450 possible cases in the 5 by 5 pixel target area.

Iterations
A second performance criterion is the number of iterations required to reach a solution. This depends strongly on the algorithm used to conduct the inversion process. In this case, the IDL CURVEFIT function is used in all cases, and analytic versions of the partial derivatives of the model with respect to the model parameters were explicitly provided (see Appendix B). Table 5 shows the results, and it is clear that the PDSF model converged to a solution in fewer iterations than the other models, although it achieved a successful inversion much less often than other models, as has been shown above.

Computing Time
The third relevant criterion is the time required to achieve the inversion process. Again, this depends largely on the power of the computer, its clock frequency, the number of cores available, the ability of the software to take advantage of multithreading, etc. All results described in this paper were generated on the same computer, and the results are normalized by the fastest time achieved for each category (successful and unsuccessful), to provide a rating independent of the particular hardware and software available. Table 6 reports the average amount of time spent inverting each of the four models against the FAPAR data for each of the 25 locations in the target area and for each of the 18 seasons in the input data set. All four models can be inverted against the seasonal data in about the same amount of time when presented with an optimal data distribution. However, as the pattern of FAPAR data starts to diverge from expectations or to feature longer gaps of missing data, the inversion process takes more time to reach a solution. Interestingly, the PDLF and the PDGF models take less time to conclude in the worst cases than the PDSF and the PDHyTgF. On the other hand, inverting the PDHyTgF model is the fastest process to conclude with an unsuccessful fit, followed by the PDSF; the PDLF and especially the PDGF models appear to require more time before deciding to quit in unsuccessful cases.

Simulation Effectiveness
Last but not least, simulation models can be compared in terms of how well they are able to represent the variability present in the empirical evidence, in this case, in the MISR-HR FAPAR records. This is done by contrasting the χ 2 values at the end of the inversion. Table 7 summarizes the results and shows that the PDSF actually best matches the available remote sensing data, when they are available. However, as was shown above, it is also the model that achieved the smallest number of successful inversions. As noted above, the PDHyTgF and PDLF models yield very similar results: This was expected because they can be made to match each other exactly by manipulating the model coefficients. However, these functions are not identical: both are defined on the domain R of real numbers, or in the context of this application, on the subdomain R J of real numbers suitable to represent Julian days, but their co-domains are the intervals [−1.0, 1.0] and [0.0, 1.0], respectively. More importantly, they do differ from each other in terms of their numerical efficiency: the PDLF model achieves, on average, a slightly better fit than the PDHyTgF, though the latter did manage to achieve a successful inversion in more cases than the former.
Again, these last results depend on the availability of a floating-point coprocessor, or on the implementation of the algorithms to compute transcendental functions, such as exponential and trigonometric functions. However, since all results were obtained on the same computer, the results exhibited here may be indicative of the relative performance of these models. These findings may need to be confirmed over a larger number of cases to provide general guidance about the selection of such models.

Discussion
Comparing FAPAR values retrieved from an analysis of remote sensing data to GPP values derived from field measurements acquired at a flux tower is a complex endeavor.
One difficulty arises from the fact that remote sensing measurements correspond to instantaneous snapshots of the radiative state of the surface at the time of the satellite overpass (around 10:36 AM in the case of the MISR instrument), while in situ observations are taking place over the entire day and generate daily-averaged GPP values.
Another drawback is the irregular availability of FAPAR data, as prevailing cloudiness prevents the observation of land surfaces during the rainy season, which is the period of intense growth for vegetation. This issue could be alleviated in at least four different ways: • Spatial analysis: Terrestrial ecosystems exhibit a high degree of spatial variability, due to the large number of plant species (especially in protected environments, such as the Kruger National Park), combined with a potentially similar variability in soil properties. These spatial fluctuations may be further enhanced by patchy precipitations, in particular in convective systems. However, spatial patterns may subsist over periods of weeks or more, because plants remain in place. Hence, if relevant spatial correlations can be established (e.g., by kriging), it may be possible to infer the likely value of FAPAR at one location where it is missing on the basis of its value at another location, provided both places share a common evolution.

•
Temporal analysis: When excessively long gaps arise in the remote sensing data, it may be sufficient to interpolate the missing values on the basis of the last value available before and the first value after the gap. This approach tends to underestimate the actual variability of the variable (FAPAR in this case), as it will never generate higher (or lower) values than those effectively observed.

•
Other sources: It may also be possible to ingest other data products. Indeed, FAPAR products are available from other instruments, such as NASA's MODIS instrument (over the period of interest here). However, that approach carries its own set of complexities, as the algorithms used to generate these products (including to implement atmospheric corrections) are different so that the products may not match, or may introduce systematic biases or shifts.

•
As far as the future is concerned, a particularly promising approach consists in operating constellations of identical instruments, in order to expand the number of opportunities to observe the surface despite the cloud coverage.
Another lesson learned from this study is that it is difficult to fit a parametric double S-shaped model to a seasonal phenology where one or a few data points with an anomalously low value occurs right in the middle of the peak period of photosynthetic activity. These events can occur as a result of multiple causes, including instrumental or environmental noise (e.g., a lingering cloud or aerosol contamination effect), or a legitimate process, such as biomass burning, for example. Additional external data sources are needed to reliably attribute definite causes to these events. If those are unavailable, it may be necessary to sift those points from the data set, so that the analysis can proceed.

Conclusions
The state and evolution of plant canopies has been monitored from space for decades because the phenology of vegetation foreshadows the yield of crops and therefore serves as a useful indicator of economic output and impacts. The crucial role of vegetation in the carbon cycle has also stimulated a strong interest in the assessment of the net primary productivity of the biosphere. However, the bulk of published studies (1) are mostly concerned about agricultural productivity in monocultures over large areas in the Northern Hemisphere, where phenology follows a regular, highly predictable seasonal evolution, (2) are based on an analysis of the Normalized Difference Vegetation Index (NDVI), mostly using the logistic or hyperbolic tangent models, and (3) rely on globally available data products at a spatial resolution of 500 m or coarser.
This paper explored the feasibility of modeling natural, unmanaged vegetation canopy in the protected Kruger National Park in South Africa, where the vegetation features significant biodiversity and exhibits much larger spatial and temporal fluctuations than crops (in particular, from year-to-year and within-seasons). This study relied on a state of the art FAPAR data product available since March 2000, derived from NASA's MISR instrument, using the MISR-HR processing system, and generated at a spatial resolution of 275 m.
This paper compared the suitability of four different parametric double S-shaped models to simulate the phenology of plant canopies: the Gaussian (PDGF), the Hyperbolic Tangent (PDHyTgF), the logistic (PDLF), and the Sine (PDSF) models, and it showed that they are all capable of explaining the bulk of the variability present in the FAPAR record when constrained with ample empirical data, well distributed in time. These models do differ, though, in their efficiency and effectiveness with respect to the presence of gaps in the data, an issue that is intrinsic to the use of remote sensing observations in the solar spectral range. Specifically, PDHyTgF appears to be the least sensitive to the particular distribution of input data during the vegetative season, though it is somewhat more computationally expensive. The PDSF model fits the FAPAR data with the smallest χ 2 but is also the most sensitive to the presence of gaps in the input data. The PDLF model is functionally equivalent to PDHyTgF, in the sense that both models can be made to match each other perfectly, though they do differ somewhat in terms of their sensitivity and computational efficiency.
It has also been shown that the presence of gaps in the empirical evidence over extended periods, especially during the growing phase (rainy season), is detrimental to the successful inversion of global models over the vegetative season. This suggests that further efforts to reduce the gaps in the input data will go a long way to improve the benefits to be derived from this modeling approach.
FAPAR values simulated by those models were compared to Gross Primary Productivity (GPP) estimates derived from an analysis of the Eddy Covariance flux tower at Skukuza, as this environmental variable is useful to investigate biomass accumulation, agricultural yield, or elements of the carbon cycle over land, for instance. A general non-linear relation between FAPAR and GPP has been confirmed when all data sets are adequate for the purpose, and a useful relation exists between FAPAR and GPP when the latter varies in the range from 0 to 5 gC m −2 d −1 . That relation saturates at larger values of GPP, so FAPAR may no be appropriate to monitor large values of GPP. However, long gaps were also occurring in this GPP data set, and the uneven distribution or even absence of FAPAR data during the vegetation growth phase in multiple seasons prevented meaningful comparisons, or resulted in unrealistic assessments in those seasons. Further efforts should examine the potential for improving the association between GPP with model fits by applying spatial or temporal interpolation techniques to provide best estimates during excessively long gaps in the FAPAR record.

Abbreviations
The following abbreviations are used in this manuscript: Remote sensing from space traditionally relies on the analysis of spectral, spatial and temporal variations to characterize the state and evolution of the environment. MISR enhances this approach by also collecting quasi-simultaneous multi-angular observations through its nine separate cameras, each acquiring spectral measurements in four spectral bands (blue: 446 nm, green: 558 nm, red: 672 nm,  [45,[107][108][109][110], as well as a recent issue of the MDPI journal Remote Sensing dedicated to that instrument (See https://www.mdpi.com/journal/remotesensing/ special_issues/misr_rs, last accessed on 14 October 2020.)).
The native spatial resolution (technically the ground sampling distance) of MISR is 275 m in all 36 data channels (actually 250 m at nadir, before the resampling implemented at Level 1). However, the resulting data rate exceeds the download bandwidth assigned to that instrument. To meet that constraint, 12 of those data channels are actually downloaded at the full spatial resolution (the 4 spectral bands of the nadir-pointing camera, and the red spectral channels of the 8 off-nadir-pointing cameras). The remaining 24 data channels, i.e., the non-red spectral channels from the off-nadir-pointing cameras, are averaged on-board the satellite and downloaded at the spatial resolution of 1100 m, achieving a 16:1 compression ratio in those channels. This is known as the MISR Global Mode (GM) of operation. MISR can also occasionally be operated in Local Mode (LM), whereby all 36 data channels are acquired at the full spatial resolution of 275 m but only over a limited region of about 300 km along the track of the platform.
As a result, all standard MISR Level-2 data products available from NASA are also provided at the spatial resolution of 1100 m (or coarser). They are openly accessible and permanently archived at the NASA Atmospheric Science Data Center (ASDC) in Hampton, VA, USA.
Each location on Earth is successively observed by all nine cameras in less than seven minutes. To combine and analyze the data from multiple cameras, these observations must be projected onto a common geographical grid, in this case, the orbit-specific Space Oblique Mercator (SOM) projection. This projection minimizes the distortion resulting from assigning raw data (from the detectors) into a common grid to co-register the outputs of the various cameras. All MISR data products at Level 1B2 and Level 2 are geo-referenced to the SOM appropriate for the orbit.
Since the Terra platform repeats its orbital sequence every 16 days, the instruments observe the same locations from the same angular position every 233 orbits. These repeating orbital patterns are called Paths in MISR parlance. An Orbit refers to one Terra satellite revolution around the planet, numbered sequentially since launch. The elementary MISR data granule is a full Orbit (from pole to pole), but data are logically assigned to successive Blocks to ease the ordering and processing. Some 180 Blocks have been defined for each Path, and their geographical locations are fixed with respect to the planetary surface. At any one time, about 140 of those Blocks contain useful data: they correspond to the areas actually illuminated by the Sun on the selected day. The set of Blocks extends beyond both terminators to account for the seasonal variation in latitudinal illumination associated with the inclination of the Earth's rotational axis with respect to the ecliptic. Each Block covers an area of 563.2 km across-track by 140.8 km along-track, though only about 380 km of the data across-track contain usable data [111]. This width is somewhat larger for the more inclined cameras than for the nadir-pointing camera.

Appendix A.2. The MISR-HR Processing System
Land surfaces often exhibit significant heterogeneity at scales finer than the spatial resolution of standard MISR products (1100 m). Other instruments operating during the same period, such as the NASA Landsat program or the CNES SPOT instrument, deliver imagery at finer spatial resolutions but offer a lower revisit frequency because of their narrow swath width; they also lack the ability of systematically acquiring multi-angular observations. Recently, the trend towards finer and finer spatial resolutions has led to ever more limited spatial coverage (for engineering reasons), and stimulated the operation of constellations of satellites to remedy this problem. However, the interpretation of data products generated by very high spatial resolution instruments (e.g., at sampling distances commensurate with the size of the objects in the scene, such as trees) is very complex, as it requires radiation transfer models that represent the scene with that amount of detail.
The MISR-HR (for High Resolution) processing system [46] is designed to deliver a suite of land surface products at the native spatial resolution of 275 m, which is finer than the standard MISR products but nevertheless coarse enough to allow the use of classical (e.g., two-stream) radiation transfer models. This processing system requires four standard MISR data products as input: 1. The Ancillary Geographic Product (Earth Science Data Type or ESDT identifier: MIANCAGP) contains information on the latitude, longitude, and altitude distribution of each Block within the corresponding Path, as well as a scene classifier, for each 1100 m pixel, on the SOM map projection grid. 2. The Geometric Parameters Product (ESDT identifier: MI1B2GEOP) contains information on the applicable illumination and observation geometry (zenith and azimuth angles), as well as the scatter (angular distance between the Sun and camera directions) and glitter (angular distance to specular reflection) angles for each of the 9 cameras. The MISR-HR processing system (currently at Version 2.01) proceeds through six steps to generate four sets of output products: 1. The first step consists in replacing missing values in the MISR L1B2 GRP, as well as in the Radiometric Camera-by-Camera Cloud Mask (RCCM) data sets, as described in Reference [112,113]. These intermediary products are currently not saved but used directly in the subsequent steps. 2. A sharpening algorithm is then applied to this updated L1B2 GRP product to generate the L1B3 product, i.e., the 36 bidirectional spectral reflectance data channels at the nominal Top of the Atmosphere (ToA), at the full spatial resolution of the instrument (275 m). This product resembles the MISR Local Mode product, except that (1) it can be generated everywhere (and does not depend on Local Mode acquisitions), (2) the data values in the 24 non-red, off-nadir data channels are obviously reconstructed (since the only available observations are at the spatial resolution of 1100 m), and (3) the swath width of the L1B3 product does not exceed that of the nadir camera since the latter is used in this reconstruction. 3. An algorithm is then applied to the L1B3 product to estimate the bidirectional spectral reflectance factor (BRF) of the surface, sometimes also called Bottom of the Atmosphere (BoA) reflectance. This process takes into account the contribution of the atmosphere itself to the original measurements and yields 36 data channels at the native spatial resolution of the instrument. 4. The parametric bidirectional reflectance model of Rahman et al. [49] is subsequently inverted against the nine surface directional measurements available for each location, in the four spectral bands, to document the spectral anisotropy of the surface. Three model parameters are retrieved in each band, together with estimates of their associated uncertainties: a base reflectance ρ, which determines the brightness level of the reflectance, the Minnaert parameter k, which defines the bowl or bell shape of the reflectance as a function of the geometry of illumination and observation, and the asymmetry parameter Θ, which controls whether the reflectance is predominantly due to forward or backward scattering (also see Reference [114]). 5. Those RPV results are spectrally and directionally integrated to generate the broadband albedos in the visible (VIS) and near-infrared (NIR) broadband regions required by the next step. These intermediary results are physically contained in the output files of the last processing step. 6. Finally, the Joint Research Centre (JRC) Two-stream Inversion Package (TIP) is inverted against the albedos just computed to offer a complete description of the radiation fluxes at the surface, specifically through estimates of the reflectance, transmittance and absorptance of the plant canopy, in addition to estimates of the Leaf Area Index (LAI), the Fraction of Absorbed Photosynthetically Active Radiation (FAPAR), and optical properties of the leaves, as well as the implied albedo of the underlying surface. This inversion procedure also delivers estimates of the variance associated with each output variable, and the final value of the goodness of fit criterion, which can be used to cull data items that are obviously wrong or of doubtful reliability [76].
These products have been found useful in a variety of applications, including to describe the heterogeneity of plant canopies [115,116], the foliage projective cover of plant canopies [117], the structural parameters of forests [26], or the response of grasslands to rainfall regimes [118].

Appendix B. Parametric Double S-Shaped Functions
Each of the four models mentioned in this paper is constructed as a superposition of an initial free parameter p 0 and two elementary S-shaped functions, representing the growing and the senescence phases, respectively (see Equations (A1), (A12), (A23), and (A34)).
This Appendix documents the mathematical representation of each of the four Parametric Double S-Shaped Functions, namely the Gaussian, Hyperbolic Tangent, Logistic, and Sine models. It also provides the analytic derivatives of those functions, which are used to accelerate the process of inverting those models against time series data.
The software to compute model values, written in the IDL (IDL is a Trademark of Harris Geospatial Solutions, Inc.) language, is available under the MIT license from the GitHub web site https://github. com/mmverstraete. This Appendix is a summary of the more exhaustive User Manual for those routines, available from the same source. In each of these four models, the independent variable is named x and the dependent variable is y = f (x).
• p 4 = Amplitude of f 2 (x) (positive for an increase, negative for a decrease), independently from the base value. • p 5 = Phase shift along the x axis for the inflection point of f 2 (x): a positive (negative) value shifts the function to the right (left) of x = 0. • p 6 = Slope of f 2 (x) at the inflection point.
• p 4 = Amplitude of f 2 (x) (positive for an increase, negative for a decrease), independently from the base value. • p 5 = Phase shift along the x axis for the inflection point of f 2 (x): a positive (negative) value shifts the function to the right (left) of x = 0. • p 6 = Slope of f 2 (x) at the inflection point.