Sun-Induced Chlorophyll Fluorescence III: Benchmarking Retrieval Methods and Sensor Characteristics for Proximal Sensing

: The interest of the scientiﬁc community on the remote observation of sun-induced chlorophyll ﬂuorescence (SIF) has increased in the recent years. In this context, hyperspectral ground measurements SIF values. Results show that the SFM approach applied to high-resolution spectra provided the most reliable SIF retrieval with a relative error ( RE ) ≤ 6% and < 5% for F 687 and F 760 , respectively. Furthermore, although the SFM was the least a ﬀ ected by an inaccurate deﬁnition of the absorption spectral window ( RE = 5%) and / or interpolation strategy ( RE = 15–30%), we observed a sensitivity of the SIF retrieval for the simulated training data underlying the SFM model implementation.


Introduction
Plant photosynthesis is the primary process in terrestrial ecosystems. Accurate estimates of photosynthesis and its dynamics are pivotal to understand complex feedbacks and exchange interactions in the land-atmosphere system [1,2]. The assessment of photosynthesis also presents a key challenge of the remote sensing (RS) community. Sun-induced chlorophyll fluorescence (SIF) is considered the most direct RS signal to track photosynthetic activity and its dynamics at leaf, canopy, ecosystem, or even global scale [3,4]. Therefore, an accurate retrieval of SIF in crucial to understand photosynthesis and its dynamics.
SIF is emitted by chlorophyll-a (Chl a ), whose intensity depends on the incoming radiation and Chl a concentration. The full chlorophyll fluorescence spectrum covers the wavelength range from 650 up to 800 nm, from the red (F R ) to the far-red (F FR ) range of the spectrum [5] (Figure 1). The weak SIF signal is convolved with the vegetation's reflected radiance. This radiance is typically 10 orders of magnitude greater than SIF, making the decoupling of both a significant challenge. In order to retrieve SIF from spectroradiometric measurements, absorption features in the solar or Earth's atmosphere can be exploited. In particular, the solar Fraunhofer lines-Fe (758.8 nm) and KI (770.1 nm) [6]-or the Earth's two O 2 absorption features-O 2 B (687 nm) and O 2 A (760 nm) bands-are typically used due to their spectral proximity to the peaks of the chlorophyll SIF emission spectrum [7]. SIF retrieval using Fraunhofer lines requires high spectral resolution (SR) (e.g., << 0.1 nm), high radiometric resolution, and very high signal-to-noise ratio (SNR) [8]. Most proximal sensing instruments do not satisfy these requirements. Hence, this paper will focus on the retrieval of SIF at the oxygen absorption features, which do not require such a high-performance instrument. Meroni et al. [9] proposed that the retrieval methods used to quantify SIF can be divided into two major categories: Reflectance-based and radiance-based approaches. Particularly, radiance-based retrieval schemes using on the Fraunhofer line depth (FLD) principle, e.g., standard FLD (sFLD [10]), three bands FLD (3FLD [11]), improved FLD (iFLD [12]), and the spectral fitting method (SFM [13]) are the most frequently used methods for the proximal sensing of SIF (i.e., based on a bibliographic survey in Web of Science-core collection-during the last year 21 papers were published using the keyword sFLD, 22 using 3FLD, 22 using iFLD, and 23 using SFM) (FLD methods and SFM are described more in Section 2).
Many recent publications focus on the retrieval and interpretation of SIF from satellite platforms like GOSAT, GOME-2, and OCO-2 [6,14,15]. The first approach proposed for GOSAT took advantage of its very high spectral resolution (0.025 nm) to evaluate the in-filling by SIF of single solar Fraunhofer lines [6,16,17]. The same approach is now being used for OCO-2 [15,18]. This type of retrieval was later extended to wider fitting windows also including Earth's atmospheric absorption features in the case of GOME-2, SCIAMACHY, and more recently, TROPOMI, which have a coarser spectral resolution (0.5 nm) than GOSAT. Data-driven methods are used to empirically model atmospheric radiative transfer effects (e.g., References [19][20][21]). On the other hand, with the recent development of new in situ-based SIF measuring systems (e.g., FloX, JB Hyperspectral, Dusseldorf, Germany, PhotoSpec [8], Piccolo [22,23], and HyScreen [24]), the number of publications investigating proximal sensing approaches of SIF steadily increases [7,[25][26][27][28][29] and will possibly grow in the near future. Moreover, hyperspectral ground measurements play a crucial role in the calibration and validation of recent (e.g., Sentinel-2) and future (e.g., FLEX) satellite missions and can be used to upscale ground signals to the satellite.
Since SIF only represents a fraction of the radiance measured by the spectrometer, insufficient characterization of the sensor, inadequate measurement protocols, or an incorrect implementation of the retrieval method will lead to an erroneous estimation and interpretation of the SIF signal. For this reason, many researchers have worked within the framework of a European cooperation in science and technology (COST) action, "Innovative optical tools for proximal sensing of ecophysiological processes" (OPTIMISE, ES1309; https://optimise.dcs.aber.ac.uk/) over the last four years to address these challenges. The OPTIMISE community has compiled three papers: One on instrument characterization [30], one on measurement setups and protocols [31], and this paper on retrieval methods to make gathered information available. This compilation of papers aims to summarize the state-of-the-art for high quality SIF measurements and bring up-to-date the review papers of Meroni et al. [9] and Damm et al. [32].
This study aims to provide a guiding document to evaluate possible SIF retrieval accuracies for F R and F FR considering the characteristics of novel instrumentation (e.g., SR, spectral sampling interval (SSI), signal-to-noise ratio (SNR)) in combination with frequently applied retrieval schemes (e.g., sFLD, 3FLD, iFLD, and SFM). Further, this document aims to outline critical steps in the implementation of the retrieval schemes. The insights provided need to be considered in the total uncertainty budget of SIF retrieval. However, it is important to note that other factors such as canopy structure and atmospheric effects also need to be taken into account when estimating SIF uncertainties (cf. Reference [31]).
The paper is structured as follows: In Section 2 we give an overview of the most commonly used retrieval methods and outline their advantages and disadvantages. Section 3 describes sensitivity analysis evaluating how the retrieval accuracy depends on the retrieval methods and the dataset used to evaluate them. Section 4 presents and discusses the results of this work and, finally, the main findings are summarized in Section 5.

Background on Frequently Used SIF Retrieval Methods for Proximal Sensing
Vegetation fluorescence under solar illumination conditions adds a weak signal to the reflectance solar radiation. In the red and far-red spectral region, the up-welling radiance (L ↑ ) is the result of two contributions: The reflected radiance L R = R·E ↓ /π from the vegetation and fluorescence, where λ is the wavelength, R is the actual reflectance, E ↓ is the down-welling irradiance incident to the surface, and F the top of the canopy SIF radiance in the direction of observation. According to the international system of units, spectral L ↑ is defined as the radiant flux emitted, reflected, transmitted, or received by a surface, per unit projected area per unit solid angle per wavelength (e.g., mW m −2 sr −1 nm −1 ). L ↑ is a directional quantity. On the other hand, spectral E ↓ is defined as the radiant flux received by a surface per unit area per wavelength (e.g., mW m −2 nm −1 ). E ↓ is not a directional quantity. Therefore, to compute the reflected radiance, E ↓ needs to be divided by π. Finally, the reflectance obtained from L ↑ and E ↓ measurements and containing the contribution of F is defined as apparent reflectance (R app ) [9] (see Figure 2), (2) Figure 2. Actual reflectance (R, black line) and apparent reflectance (R app , red line). The actual reflectance was computed as the ratio between reflected radiance (L R ) and down-welling irradiance (E ↓ ); the apparent reflectance was computed as the ratio between total up-welling radiance (L ↑ = L R + F) and down-welling irradiance (E ↓ ). The small plots highlight the apparent reflectance. That is the contribution of fluorescence to the reflected radiance spectrum.
Most retrieval algorithms used to estimate SIF from ground measurements are based on the FLD principle. Conceptually, FLD approaches exploit the different relative contribution of F to the L ↑ and the E ↓ spectra inside and outside of an absorption feature ( Figure 3). FLD approaches exploit the ratio between these up-and down-welling fluxes in the form of the apparent reflectance factor (R app ). The method aims to separate F from L ↑ estimating, therefore, the actual reflectance (R), the ratio between reflected radiance and E ↓ , assuming a smooth R profile in the absorption region. Importantly, an inaccurate estimation R has a high impact on the retrieval of F [12]. Therefore, a precise and accurate function to interpolate R inside the absorption line is needed. This function should take into account the following factors: (1) The appropriate definition of both shoulders of the used absorption feature, (2) the selection of measurements outside the used absorption feature as end points for the interpolation, and (3) the selection of an appropriate interpolation method. The choice of these three factors differentiates each FLD-based retrieval scheme. In contrast, the SFM method decouples F and R using spectral curve fitting. The method relies on parameterized mathematical functions representing R and F within narrow spectral windows centered at the O 2 absorption features.

FLD-Based SIF Retrieval Methods
Retrieval methods based on the FLD principle take advantage of the reduced E ↓ in the O 2 B and O 2 A absorption features reaching the surface, which increases the relative contribution of F to L ↑ ( Figure 3). Such retrieval allows estimating F in a specific spectral band, 687 nm (F 687 ) at O 2 B and 760 nm (F 760 ) at O 2 A, which correspond to the smallest E ↓ and L ↑ value in the oxygen absorption windows ( Figure 1). While the O 2 B absorption window is close to the maximum peak of fluorescence at 685 nm (maxF 685 ), the O 2 A window is shifted towards longer wavelengths. Hence, F 760 represents only around 70% of the radiated F signal at the F FR peak (maxF 740, Figure 1).
Since the seminal review provided by Meroni et al. [9], various new SIF retrieval methods were proposed by the scientific community, for instance, the singular vector decomposition (SVD) [17,33], the nFLD approach [34], and the peak height method [35]. However, as presented in the introduction, sFLD, 3FLD, and iFLD are still the most frequently applied methods for proximal sensing of SIF. Therefore, only these approaches are included for further discussion and their main assumptions, strengths, and shortcomings are described.

sFLD
The Fraunhofer line depth (FLD) principle [10] relies on measurements of E ↓ and L ↑ inside and outside an absorption window (λ in and λ out ) such as solar Fraunhofer lines or Earth's atmosphere O 2 absorption features.
Fluorescence is calculated from a linear system of equations by comparing the measured signal inside and outside an absorption feature, A fundamental assumption is that F and R remain constant over the exploited absorption feature. The original FLD (sFLD) approach is relatively simple and requires the measurement of E ↓ and L ↑ inside the absorption window and only one measurement at the shoulder of the absorption feature. The main limitation of this approach is the frequent violation of the base assumption-that F and R are spectrally constant over the used absorption feature [12,36,37].

3FLD
In order to overcome the limitations of the sFLD approach, Maier et al. [11] proposed the use of three bands to solve Equation (4). The single reference band (λ out ) of sFLD is replaced by a linear interpolation of two bands on the left and right shoulders of the absorption feature. The advantage of this modification is that it considers a linear variation of R and F over the absorption window, which is a good approximation when the red-edge shoulder is displaced towards short wavelengths. Even though this approach is theoretically more advanced than the sFLD, non-linear variations of R and F often result in inaccurate estimates in particular for the O 2 B band.

iFLD
The iFLD method uses two correction factors (α R and α F ) to account for the non-linear variation of R and F inside and outside the absorption feature and to account for using the apparent reflectance instead of actual reflectance in the measurement. The method proposed by Alonso et al. [12] makes use of the full E ↓ and L ↑ spectral information in the region around the absorption to estimate the α R factor. F is expressed as: In order to calculate the value of α R and α F correction factors, it is necessary to know, in advance, F and the actual reflectance. Since this is not possible due to the variable nature of both parameters, Alonso et al. [12] proposed the use of R app to calculate the correction factors for reflectance (α R ) and F (α F ):α where R app (λ out ) is the apparent reflectance measured outside the absorption feature, and R app (λ in ) is the apparent reflectance inside the absorption feature λ in . R app (λ in ) is obtained from the non-linear interpolation of the apparent reflectance using the continuous reflectance spectrum at the left and right shoulders. Analogously, E ↓ (λ in ) is obtained by interpolation of the irradiance in order to predict E ↓ unaffected by atmospheric absorption. Replacing α R and α F by the apparent coefficients (α R ,α F ), F can be expressed as:

Implementation of the FLD-Based Methods
In summary, these FLD-based retrieval schemes can be distinguished by the approximation of R and F over the spectral extent of the absorption feature used, while the sFLD requires a single measurement of E ↓ and L ↑ inside and outside the absorption feature. The 3FLD requires one measurement of E ↓ and L ↑ inside the absorption feature and two outside, at the left and right shoulders. The iFLD makes use of the complete E ↓ and L ↑ spectral information to computeα R andα F correction factors. Importantly, E ↓ and L ↑ must be defined in the same units. In the case of measuring E ↓ and L ↑ pointing a fiber optics towards a Spectralon ® panel (Labsphere, Inc., North Sutton, NH, USA) and vegetation target, respectively, both measurements are directional, i.e., are dependent of the observation direction. Therefore, both are defined as the radiant flux reflected by a surface, per unit projected area per unit solid angle per wavelength (e.g., mW m −2 sr 1 nm 1 ). However, if E ↓ is measured with a fiber optic attached to a cosine diffuser pointing the sky, E ↓ is not directional, i.e., E ↓ is integrated over the entire hemisphere. It is defined as the radiant flux received by a surface per unit area per wavelength (e.g., mW m −2 nm −1 ). However, provided that the cosine receptor responds according to Lambert's cosine law, the total luminous flux (global sky and sun irradiance) F ↓ tot = π × E ↓ max , being E ↓ max the peak luminous intensity at nadir incidence of the global irradiance. It is possible to divide E ↓ by π to define E ↓ and L ↑ with the same units (e.g., mW m 2 sr 1 nm 1 ).
The first, and a critical, step is to define the wavelength range covering the full O 2 B and O 2 A absorption windows. This is particularly relevant when the spectral resolution is sub-optimal and spectral bands not affected by absorption are not easily identifiable. In our study, we determined that the spectral ranges affected by O 2 absorption are: 686-697 nm (O 2 B) and 759-770 nm (O 2 A) (see Appendix C for further discussion). We consider that outside these ranges, the influence of O 2 absorption is minimal. Once the absorption region is identified, the spectral band representing maximum absorption can be easily identified by searching for the smallest E ↓ value in the respective wavelength range (Appendix A-sFLD, 3FLD, and iFLD red dot). It is recommended not to use a fixed wavelength position to define the band representing the minimum absorption. If the spectroradiometer used to measure is not accurately spectrally calibrated or there is a spectral shift due to variable measurement temperature (cf. Reference [24]), the use of a fixed wavelength position will introduce an error in the F retrieved. In addition, in case of using a spectroradiometer with low SNR, it is recommendable to define L ↑ (λ in ) and E ↓ (λ in ) as the average value of L ↑ (λ in ) and E ↓ (λ in ) of adjacent wavelengths. The next step is to determine the spectral lines defining the left and right shoulders of the absorption feature. This can be achieved by searching for E ↓ local maxima at both shoulders. According to the method assumptions, one or several wavelengths have to be used to compute E ↓ and L ↑ outside the absorption feature. Again, it is not recommended to use a fix wavelength to define the left and right shoulders position and by computing the average values of L ↑ (λ out ) and E ↓ (λ out ) using adjacent wavelengths it is possible to increase the SNR.
The interpolation method, and its implementation, is a key step in the most advanced retrieval algorithms. The main purpose of the interpolation is to eliminate the effect of the absorption feature (Earth or solar atmosphere) by constructing the L ↑ , E ↓, and/or R app spectra shape between the absorption features shoulders ( Figure 4, green line). Through the interpolation, a reference band inside the absorption feature can be computed, mitigating the effect of having a different reflectance inside and outside of the absorption feature, which has been the weak point in the earlier retrieval algorithms, such as sFLD. In the case of the E ↓ , interpolation is easier, since the solar spectrum is very well characterized. The reflected radiance is more complex, since it is modulated by the reflectance and the fluorescence spectral shapes, which are complex, especially in the red region around the O 2 B absorption. Therefore, it requires more flexible interpolating functions capable to fit its curvature (see Figure 4B,D). For the interpolation, it is necessary to determine the known data points from which construct the new data points ( Figure 4, green dots). Properly selecting those points is the most critical step for an accurate interpolation. Similar to previous steps, this can be achieved by searching for E ↓ local maxima at both shoulders of the absorption feature (green dots and lines in Figure 4A,C). Then, those bands can be used to interpolate L ↑ and/or R app . Importantly, the small Earth's atmospheric absorptions (e.g., water) present in the left and right shoulders of the oxygen absorption feature must be avoided. Otherwise, if the points inside those absorptions are included as key points for the interpolation, the resulting curve will be shifted downwards, which can introduce an error of a similar or larger magnitude than the fluorescence itself (blue dots and lines in Figure 4A,C). This problem is great for spectrometers of finer spectral resolution, since they resolve deeper fine features. Instruments with coarser spectral resolution are less sensitive, since they cannot resolve this level of detail.
The presence of noise also needs to be considered as it distorts the spectrum, and this might, in turn, distort the interpolation. Therefore, the use of spectrometers with very high SNR is recommended. R app is particularly affected by noise, since noise will be amplified by dividing L ↑ by E ↓ (Figure 4B,D). In this case, it is possible to mitigate the effect of noise by two approaches: One would be to avoid using spline or any other method that obliges the resulting curve to pass precisely through the key points used for interpolating; the other is to use noise-reduction algorithms to mitigate noise, pixel binning being the simplest one, or by applying a low-pass filter. However, care should be exercised to ensure that spectral features are not smoothed out when denoising the spectrum.
Appendix C provides additional detail on wavelength setting, spectral window intervals, and interpolation method used in this study for the sFLD, 3FLD, and iFLD retrieval methods. We would like to highlight that these settings worked for this specific dataset (cf. Section 3) and should not be seen as a standard setting. It is important to adapt the herein-described setting to the characteristics of the used instruments and observed vegetation [38].

Background of the Spectral Fitting Method
The SFM is a technique to decouple SIF and reflectance from high spectral resolution radiance observations. The method relies on general mathematical functions representing canopy R and F within narrow spectral windows centered at oxygen absorption features. As previously defined, in the case of ground-based spectrometer, radiance detected by sensors includes contributions from canopy L R and emitted F. The parameters of the functions employed to represent F and R are thus optimized by non-linear least square optimization process, comparing instrument observations with radiance computed accordingly with Equation (1). In this way, the a-priori F and reflectance functions can be spectrally decoupled on the base of their contribution at the different spectral lines.
The use of Spectral Fitting for proximal sensing measurements is easier compared with airborne or satellite measurement because the surface irradiance is directly measured, therefore Equation (1) can be solved without additional information (i.e., indirectly estimating top of canopy irradiance). In the past, several functions were proposed to fit the SIF and reflectance spectral behavior within narrow spectral windows as suggested in References [9,13,37,39]. However, SIF is generally represented as peak-like functions such as Gaussian, Lorenzian, or Voigt; whereas canopy reflectance is usually described as second-or third-order polynomial or more complex piecewise cubic splines.
The major advantage of SFM is the exploitation of a large number of spectral bands because the mathematical system, which describes L ↑ at different wavelengths, is better conditioned and, consequently, estimations are less affected by instrument noise. The SFM method is therefore particularly useful for spectroradiometers with very high spectral resolution that suffer from higher noise levels. On the other hand, this approach has a number of limitations that are due to its complex setup and the overall computational time. Recently, the SFM approach has been further extended to a unique and broad spectral window that covers the entire SIF emission region (650−800 nm) to estimate the full SIF emission spectrum consistently [13].

SFM Retrieval Method Implementation
As previously described, this method relies on general mathematical functions representing the R and F spectrum within a narrow spectral window. Before going into the description of the SFM implementation, the mathematical functions used to model R and F are defined. Since R (not distorted by F) and F spectral emission are not available from field measurements, first, a set of representative R and F spectra are modeled using a canopy reflectance-SIF radiative transfer model. Then, the accuracy of different functions used to represent R and F reference spectra are tested using the reference dataset. For instance, Cogliati et al. [13] used the FluoSpec radiative transfer model (RTM) to build a reference dataset simulating the spectral characteristics of the FLEX spectrometer FLORIS. In their study, different functions were tested to model R (i.e., linear, quadratic, cubic spline) and F (i.e., linear, quadratic, cubic, Gaussian, Lorentzian, and Voigt). From this analysis, they concluded that a cubic spline function and a Voigt function are the best functions to model R and F, respectively. On the other hand, Julitta et al. [40], also using FluoSpec to build a reference dataset but simulating the spectral response of Ocean Optic's QE Pro spectrometer, suggest to use a cubic spline function to model R and a Gaussian function to model F.
In the study presented here, the Gaussian function performed better than the Voigt function in modeling F (and R 2 = 0.96 and R 2 < 0.5 for the Gaussian and Voigt functions, respectively). Cogliati et al. [13] propose that different sensor spectral responses (e.g., SR, SSI, and SNR) may lead to different functions to best represent R and F. This may explain why, for proximal sensing spectrometers, a Gaussian function was the most suitable algorithm to model F [40] instead of a Voigt function as suggested by Cogliati et al. [13] for FLORIS, FLEX space-borne sensor. Therefore, in this study, R was approximated using a cubic spline function and a Gaussian function was used to model F, where a stands for the heights of the red (O 2 B) or far-red (O 2 A) F curve's peak, λ represents the wavelengths of the defined F and R fitting window interval, λ 0 is the first wavelength of the defined F and R fitting window interval, c is the central wavelength of F peaks, and b controls the width of the red and far-red F spectrum (Appendix C).
To retrieve F with the SFM, high-resolution E ↓ and L ↑ spectral measurements are needed. Similar to FLD retrieval methods, E ↓ and L ↑ must be defined with the same units (cf. Section 2.1.4). For this study, the F and R fitting spectral windows are defined between 750-780 nm (O 2 A) and 680-698 nm (O 2 B) (Appendix C). The first step to retrieve SIF using the SFM is to compute R app and interpolate within the defined absorption range (O 2 A: 759-770 nm and O 2 B: 686-697 nm) (Appendix C). The output parameters of the interpolated R app are used in the optimization cost function as the R 'first guess' parameters. Secondly, by using the iFLD method (Section 2.1), F at each O 2 absorption feature is estimated. The retrieved F is used in the optimization cost function as a first-guess parameter for the height of the red (O 2 B) and far-red (O 2 A) F peak curve (a parameter of the Gaussian function used in this study). Third, the first guess as well as the lower and upper boundaries for the R and F functions need to be defined. Finally, the parameters of the functions employed to model F and R are optimized by non-linear least square optimization process, comparing instrument observations with radiance computed by Equation (1). In order to estimate the accuracy of the retrieval method, it is recommended to compare the modeled and reference L ↑ . For a good implementation of the SFM, the differences between modeled and measured L ↑ must be minimal.
In Appendix C, a diagram that sets out the most relevant steps when implementing the SFM retrieval method is presented. Appendix C then summarizes the input parameters, wavelength settings, wavelength intervals, and R and F model functions used in this study. Here, again, we would like to highlight that these settings specific for this dataset (cf. Section 3). These should not be considered as standard settings.

Assessment of SIF Retrieval Uncertainties-Sensitivity Analysis
The aim of this analysis is to assess possible SIF retrieval uncertainties for the combination of consolidated SIF retrieval approaches presented using off-the-shelf spectrometers currently used for proximal sensing of SIF. A summary of considered sensors and their characteristics is presented in Table 1 and Figure 5. Table 1. Characteristics of the typical off-the-shelf spectrometers for the measurement of sun-induced fluorescence: Spectral range, spectral sampling interval (SSI), optical resolution measured as full width at half maximum (FWHM), and signal-to-noise ratio (SNR). * OceanOptics spectroradiometers can be built with these or other characteristics by demand.

Spectrometer
Range (  For the data simulation, we employed a combination of the RMT FluorSAIL3 and MODTRAN-5 [41] to simulate radiance (L TOC ) and top-of-canopy (TOC) irradiance (E TOC ), respectively. FluorSAIL3, an extended version of the physically based FluorSAIL model [42], provides simulations of surface reflectance and emitted SIF with a spectral resolution of 1 nm. The coupling with the atmosphere was realized using the four-stream theory [43], while atmospheric functions were calculated with MODTRAN-5 [41]. Tables 2 and 3 list the simulation parameters and variables used in this study representing 16 typical vegetation targets, observed under a cloudless sky. We deliberately considered vegetation types as defined in Damm et al. [32] to relate our results with previous investigations and allow progress in the field of SIF proximal sensing to be tracked. The advantage of using simulated data is that SIF emission and surface reflectance are precisely known ( Figure 6). E TOC ↓ and L TOC ↑ were simulated assuming Lambertian surface reflectance and neglecting adjacency effects following where E o s represents the combined direct and diffuse top-of-atmosphere irradiance flux for a given sun-incident angle θ s . τ ss is the atmospheric transmittance for direct downwelling irradiance, τ sd is the atmospheric transmittance for diffuse downwelling irradiance, and ρ dd is the spherical albedo.
Angle brackets indicate functions that need to be convolved individually to avoid artefacts stemming from this processing step.
To evaluate the impact of sensor parameters on the SIF retrieval accuracy, simulated E TOC ↓ and L TOC ↑ were spectrally resampled using a two-step approach that includes the convolution of the input signal using a Gaussian function and the sampling of the convolved signal considering predefined sensor characteristics, following the approach described in Damm et al. [32]. We then added random Gaussian distributed noise with a mean value of zero and a standard deviation equal to the ratio between signal level (e.g., L TOC ↑ ) and SNR to E TOC ↓ and L TOC ↑ as: where N is a function providing a random value according to a normal distribution (µ,σ) defined by its mean (µ) and standard deviation (σ). For the SNR, we identified representative values per sensor from literature and assumed the noise contribution constant across wavelengths (cf. Table 2). Finally, retrieved SIF signals were compared to the modeled reference SIF at the wavelength representing the maximum absorption (e.g., lowest L TOC ↑ ). We then calculated several quality statistics including, the total relative error (RE), the coefficient of determination (R 2 ), and the root mean square error (RMSE).

Impact of Sensor Specification on SIF Retrieval Methods
The following Section describes the impact of the different sensor specifications (i.e., SR, SSI, and SNR) on the precision, accuracy, and relative error of four different retrieval methods: sFLD, 3FLD, iFLD, and SFM. First, we compare the reference (modeled) and retrieved F 687 and F 760 of the 16 simulated canopy cases without noise (Figure 7 and Table 4) to assess the effect of SR and SSI. Afterwards, the relative impact of SNR on F 687 and F 760 retrieval is quantified by adding the corresponding noise ( Figure 8 and Table 4). The coarse SR and SSI of the ASD only allows assessing the performance of sFLD, 3FLD and iFLD method, while for sensors with higher SR and SSI (e.g., MAYA, QE Pro and HR4000), all four approaches were tested.
Considering retrieval in the O 2 A band using high-resolution instruments, we obtain nearly identical results for F 760 estimates using 3FLD, iFLD, and SFM methods where the regression lines of F 760 ( Figure 7E-H) and R ( Figure 9E-H) are close to the 1:1 line (R 2 ≈ 0.98 and RMSE ≈ 0.08 mW m −2 sr −1 nm −1 ). In more detail, the SFM approach has a tendency to underestimate R, while 3FLD, iFLD, and SFM overestimate F 760 for higher F values. On the other hand, sFLD and 3FLD applied to low-resolution sensors constantly overestimated F 760 . However, when using the iFLD, an R 2 = 0.91 and RMSE = 0.18 mW m −2 sr −1 nm −1 was found. The sFLD performs worst in estimating F 760 with a significant overestimation for all sensors (R 2 = 0.93-0.94 and RMSE = 0.3-0.32 mW m −2 sr −1 nm −1 ). As described by Damm et al. [32], the strong overestimation of F 760 by the sFLD method is caused by a violation of the assumption that R and F are spectrally constant over the O 2 A band. The 3FLD and iFLD methods assume a linear and non-linear relationship respectively between R and SIF, which results in a better prediction of R and F 760 .    The retrieval of F 687 at the O 2 B band is more challenging and was found to be unfeasible with low-resolution instruments such as the ASD (Figure 7A-D). Higher resolution sensors MAYA, QE Pro, and HR4000, however, all perform reasonably well for F 687 retrieval (e.g., for iFLD and SFM a R 2 between 0.88-0.91 and an RMSE between 0.34-0.38 mW m −2 nm −1 sr −1 was observed). Due to the non-linear relationship of R and F across the O 2 B band, both sFLD and 3FLD show a significant larger error compared to iFLD and SFM (R 2 ≤ 0.5 and RMSE = 1-8 mW m −2 sr −1 nm −1 ). By using a more complex interpolation method, iFLD and SFM can better estimate R at the O 2 B (R 2 = 1 and RMSE = 0.001 mW m −2 sr −1 nm −1 ), a fact that substantially impacts F retrieval.
When retrieving F 687 , the iFLD shows a tendency to underestimate high F 687 values, while low F 687 values are overestimated by both iFLD and SFM. This result does not agree with results presented by Jullita et al. [40] and Zhang et al. [44]. In their study, the same SFM implementation [40] was used as in the present study, but they found a 1:1 relation between retrieved and reference F 687 . We hypothesize that this could be explained by differences on how the F 687 and R reference data were simulated. In the case of Zhang et al. [44], FluoSpec was used to generate the reference dataset. Jullita et al. [42] also defined the R and F 687 functions implemented in their R code using FluoSpec reference data. In our study, we used FluorSAIL3 to generate the reference dataset (to be consistent with the study by Damm et al. [32]) but used the SFM fitted for FluoSpec. It is known that F 687 modeled by FluorSAIL3 presents a taller and narrower red SIF peak compared to FluoSpec. Thus, most probably the shape of spectra of F 687 and R simulated with FluorSAIL3 may not be captured with the FluoSpec-trained SFM approach. To ensure that the results presented were not due to incorrect implementation of the SFM approach, we retrieved F 687 and F 760 with the original R code from Julitta et al. [40] available on GitHub platform at https://github.com/tommasojulitta and we obtained the same results as with our SFM approach. We suggest the systematic assessment of the effect in the near future to ensure that the SFM is flexible enough to cope with possible dynamics of SIF. Figure 8 shows the RE of retrieved F obtained when including noise. In spite of its high SNR, the ASD shows the highest RE for F 760 and F 687 , where sFLD performs worst (RE = 234.5 mW m −2 nm −1 sr −1 , 370.1%) and iFLD best (RE = 11.8 mW m −2 nm −1 sr −1 , 41.2%). The high-resolution sensors QEPro, MAYA, and HR4000 provide better F estimates with lower RE for SFM (F 760 = 4.5-4.9%, F 687 = 5.9-7.2%) compared to the iFLD method (F 760 = 4.7-9.3%, F 687 = 9.7-13.4%). From these three sensors, the QE Pro produces lowest RE for F 760 and the HR4000 lowest RE for F 687 . The SFM furthermore shows the lowest error range and lowest mean RE for F 760 and F 687 (6.2-6.5% and 25.3-27.8%).
Importantly, when retrieving SIF with the sFLD or 3FLD, the relative error was 2-6 (QE Pro-F 760 ) and 6-11 (HR4000-F 687 ) times higher than with the SFM. Moreover, when retrieving SIF using the ASD simulated data, the error was 2-10 times higher than when using spectrometers with a finer spectral resolution, suggesting the need for care when interpreting SIF results retrieved with the sFLD and 3FLD and/or with spectrometers with SSI ≥ 1.4 nm and FWHM ≥ 3 nm.
In summary, the results presented in this study corroborate Damm et al. [32] results, where SNR was the most important parameter affecting the retrieval accuracy, followed by SR and SSI. Nevertheless, since in the study by Damm et al. [32], the spectrometers used present a relative high SNR, the accuracy and precision of retrieved F 760 and F 687 mainly depend on the SR and SSI. A finer spectral resolution generally reduces retrieval uncertainties and allows the use of the more sophisticated iFLD and SFM methods.

Uncertainties Caused by the Setup of Retrieval Methods
This section describes the impact of an incorrect implementation of the FLD and SFM methods for the SIF retrieval. For this purpose, retrieved and modeled F signals of a representative vegetation target were compared. The selected vegetation target represent the spectrum of a planophile canopy with a leaf area index of 1, a chlorophyll content of 80 µg/cm 2 , and SIF efficiency factor of 2 ( Table 3). The major sources of error, when retrieving SIF, can be attributed to: Error source 1: The F retrieval errors related to the definition of the wavelength representing the full oxygen absorption feature are shown in Figure 10. For this experiment, the left and right shoulder limits were simultaneously changed. For both retrievals, F 687 ( Figure 10A,B) and F 760 (Figure 10C), the relative error of FLD-based methods, increases with an increased absorption range. The 3FLD was the most affected retrieval method (RE-F 687 100-1200% and RE-F 760 10-20%), followed by sFLD (RE-F 687 10-70% and RE-F 760 30-70%), and iFLD (RE-F 687 2-70% and RE-F 760 1-30%), which led to an over-and/or underestimation of E ↓ and L ↑ outside the absorption feature (Appendix D, green and blue crosses). The SFM was not affected by changes in the absorption range. For the FLD-based methods, the oxygen absorption range affects the definition of the shoulder wavelength position bands (error source 2) as well as the interpolation to model E ↓ , L ↑ , and R (error source 3). This is the reason an incorrect characterization of the oxygen absorption window has a significant impact in the retrieval of SIF when using FLD-based method. For instance, for the 3FLD method, an uncorrected definition of the O 2 B absorption range ( Figure 10A, RE-F 687 = 1200%) led to an underestimation of E ↓ and to an overestimation of L ↑ outside the absorption feature (Appendix D, blue crosses), which in the end results in an underestimation of the retrieved fluorescence (F = −0.1049 mW m −2 nm −1 ). Figure 10. Fluorescence relative error as function of the absorption range position for sFLD (black), 3FLD (cyan), iFLD (red), and SFM (blue) retrieval methods at O 2 B (F 687 , A,B) and O 2 A (F 760 , C) bands. Plot B show the same information as A but for a narrow y-axes range. For this exercise, we modified the left and right shoulders limits simultaneously. Note that Y-axes of A-C differ.
Error source 2: Retrieval errors related to definition of the shoulder wavelength position is shown in Figure 11. For this experiment, the absorption range was fixed between 686-697 nm (O 2 B, Appendix C) and 759-770 nm (O 2 A, Appendix C), and systematically changed the shoulder wavelength position and computed corresponding F retrieval errors (i.e., RE). First, the left shoulder wavelength was set to a fixed position and the right shoulder wavelength changed ( Figure 11, O 2 B from 695-710 nm and O 2 A from 770-780 nm). Second, the left shoulder wavelength was modified by fixing the right one at a specific band (Figure 11 O 2 B from 670-685 nm and O 2 A from 750-760 nm). In general, the further away the shoulder wavelength position is from the absorption range feature, the greater is the error. For F 687 ( Figure 11A,B), the 3FLD was most sensitive to changes in the shoulder wavelengths position (RE-F 687 5-2500%), followed by sFLD (RE-F 687 20-200%), and iFLD (RE-F 687 5-35%). For F 760 ( Figure 11C,D), the sFLD was most sensitive to changes of the shoulder position (RE-F 760 200-800%), followed by 3FLD (RE-F 760 10-20%), and iFLD (RE-F 760 5-10%) method. SIF retrieval errors by the sFLD were two times higher when using the right shoulder, which indicates that the use of reference bands at the right shoulder should be avoided. The iFLD method is less sensitive to the shoulder wavelength position because, in contrast to sFLD and 3FLD methods, it uses the full spectrum to interpolate E ↓ and R in the absence of the oxygen absorption feature. Another source of error can occur when selecting the local maximum on both shoulders ( Figure 11, green dots). The incorrect selection of these points can result in a RE of 60% for F 687 and RE 35% for F 760 (Figure 11, blue dots).  Figure 12D, O 2 A 720-750 nm). Figure 12 shows that the RE increase with increasing interpolation range resulting in values of 120% at F 687 and 30% at F 760 for the iFLD and a RE of 15% at F 687 and 30% at F 760 for the SFM. In this experiment, for both retrieval methods, a cubic spline function was used to interpolate R in the absorption feature. A splines function will go exactly through all the key points provided to perform the interpolation. This decreases the interpolation accuracy at the absorption feature when the key points selected are located some distance away from the absorption region. The SFM yields a lower RE than the iFLD because the parameters of the R interpolation are used as a first guess in a non-linear least square optimization function. Therefore, a small error in the R parameters first guess is corrected in the optimization process (Appendix C). Error source 4 only relates to the SFM approach. For a good implementation of this method, the crucial issue is to define a set of functions that are able to model the broad range of R and F dynamics found under natural conditions. As introduced in Section 2 in this study, and after testing a different set of functions to model R and F, a cubic spline function to model R and a Gaussian function to model F were implemented [40]. Hence, for this specific implementation of the algorithm, the parameters that could be considered for the sensitivity analysis are (i) absorption feature range, (ii) F and R fitting interval, (iii) R interpolation, and (iv) the Gaussian function parameter b, which controls the width of the red and far-red SIF modeled peaks. We will now focus on the b parameter characterization. Figure 13 shows the impact on the retrieved SIF when the b factor first guess changes from 0 to 100. The F retrieval is mostly sensitive to low b values, for instance the RE varies between 10-310% (F 687 ) and 5-20% (F 760 ) for b values between 1 to 3. When b > 3, the RE decreases to 6% (F 687 ) and 1% (F 760 ). Similar to the R parameters, the value assigned to the b factor is used as a first estimate in a non-linear least square optimization function. A small error in the b factor first estimate is corrected in the optimization process (Appendix B).

Conclusions
For the first time, the retrieval errors of F 687 and F 760 have been systematically analyzed, taking into account two known error sources, the sensor configuration (i.e., SR, SSI, or SNR), and the retrieval method applied. Our extended analysis reveals further sources of error caused by the definition and implementation of retrieval approaches, including the selection of appropriate wavelengths representing the absorption feature and related shoulder values.
We identified the HR4000 and QE Pro spectrometers in combination with the SFM approach to provide the most reliable SIF retrieval with a relative error ≤ 6% and < 5% for F 687 and F 760 , respectively, followed by iFLD with a relative error for F 687 ≤ 10% and F 760 < 5%. Regarding the retrieval method implementation, again, the SFM was the least affected by the incorrect definition of the oxygen absorption spectral window, interpolation strategy, or model parameters characterization. Although the SFM approach was found to yield highest SIF retrieval accuracies, we also observed a sensitivity of the SIF retrieval to the simulated training data underlying the SFM model implementation. In contrast to iFLD approach, the SFM is less sensitive to the instrument noise but more sensitive to R and F modeling. We suggest further investigation of this sensitivity is required.
We would like to encourage the scientific community to validate their implementation of the different retrieval methods by generating graphs of the most common error sources in the retrieval of SIF described in this paper (cf. Section 4.2). Furthermore, we recommend measuring fluorescence (e.g., grass) and non-fluorescence targets (e.g., soil) as part of the validation exercise (Dr. Andreas Burkart, personal communication). In the case of retrieving negative and/or extremely high SIF values in a fluorescence target (cf. Meroni et al. [9], Section 5 for SIF range variation) and SIF values different to zero in a non-fluorescence target, a revision of the implemented retrieval method must be performed.
Finally, we would like to emphasize the importance of instrument characterization [30], measurement protocol [31], and implementation of the retrieval methods (this paper). We propose that the insights provided in the three review papers can assist the scientific community to achieve higher accuracies and replicability of their SIF measurements to significantly contribute to advancing vegetation research.

Acknowledgments:
We are deeply thankful to all members of the ESSEM COST Action ES1309 OPTIMISE for many stimulating meetings that helped to develop the science and formed networks of collaborators and friends. This was made possible through the funding and support ofCOST (https://www.cost.eu), which we gratefully acknowledge. Moreover, we would specifically like to thank all the local organisers of workshops, teachers, working group leaders, core group members and others that helped to keep OPTIMISE running. We thank Vicente Peiro-Silva for help in preparing Appendices A and B. Finally, yet importantly, we thank the three anonymous reviewers that helped to improve the manuscript.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations Terms frequently used in studies of sun-induced chlorophyll fluorescence.
Description Acronym Units Up-welling radiance Spectral canopy-leaving radiance in the observation direction, including both the reflected component (L R ) and the emitted component (F). It is defined as the radiant flux emitted, reflected, transmitted, or received by a surface, per unit projected area per unit solid angle per wavelength. It is a directional quantity.

Reflected radiance
Spectral reflected radiance in the observation direction. L R , L refl Down-welling irradiance Spectral incoming irradiance integrated over the entire hemisphere. It is defined as the radiant flux received by a surface per unit area per wavelength. It is not a directional quantity.

Reflectance factor
The spectrally resolved ratio of the amount of radiation reflected by a surface to the amount of radiation incident on the surface, for a specific observation geometry.

Apparent reflectance factor
The spectrally resolved ratio of the amount of radiation reflected and emitted by a surface (i.e., including fluorescence) to the amount of radiation incident on the surface, for a specific observation geometry.

R app , R* [-]
Spectroradiometer A device designed to measure spectrally resolved radiance/irradiance over a defined region of the electromagnetic spectrum.

Description Acronym Units Spectral resolution
Describes the ability of a spectrometer to define fine wavelength intervals. The finer the spectral resolution, the narrower the wavelength range for a particular band.

Spectral sampling interval
Distance between the central wavelength of two consecutive spectral bands.

SSI [nm, µm]
Full width half maximum Width, at half of its maximum amplitude, of a function describing the spectral response of a spectral band.

FWHM [nm, µm]
Signal-to-noise ratio Ratio between the power of a signal to the power of instrument noise, measured for the same spectral band.

SNR [-, dB]
Spectral band/Spectral line A certain region of the electromagnetic spectrum sampled with an instrument, defined by its FWHM and central wavelength. Resulting from the emission, reflection, absorption, or transmission of light in a narrow frequency range.

Spectral window
A certain region of the electromagnetic spectrum, defined inside a minimum and maximum wavelength range.

Radiative transfer model
A set of equations describing the interaction between the electromagnetic radiation and a certain medium (e.g., atmosphere, vegetation).

Solar and Earth atmosphere absorption features
Spectral regions in which the incoming radiance at ground level is strongly reduced due to absorption by specific chemical compounds. Absorption features due to absorption in the solar atmosphere (i.e., solar Fraunhofer lines). These spectral regions appear "dark" also at top of Earth atmosphere.

Shoulder of the absorption features
The closest spectral region to an absorption feature that is not influenced by the absorption, usually referred to as left shoulder (towards shorter wavelengths) or right shoulder (towards longer wavelengths). Description Acronym Units F emitted in the red region of the spectrum at a specific wavelength (not an integrated value), depending on the retrieval method used. F emitted in the far-red region of the spectrum at a specific wavelength (not an integrated value), depending on the retrieval method used.