New Spectral Fitting Method for Full-Spectrum Solar-Induced Chlorophyll Fluorescence Retrieval Based on Principal Components Analysis

The full-spectrum Solar-Induced chlorophyll Fluorescence (SIF) within the 650-800 nm spectral region can provide important information regarding physiological and biochemical activities in vegetation. This paper proposes a new Full-spectrum Spectral Fitting Method (F-SFM) for the retrieval of SIF spectra based on Principal Components Analysis (PCA). Using F-SFM, both the full-spectrum reflectance and SIF within the 650-800 nm region were modeled by PCA based on a training dataset simulated by the Soil Canopy Observation, Photochemistry and Energy fluxes (SCOPE) model, and the weighting coefficients of the principal components were estimated by the least-squares fitting method. An iterative process was employed to improve the accuracy of the estimation of the reflectance. In each iteration, the SIF spectra retrieved from the last run were removed from the total upwelling radiance to minimize the small contribution of the SIF to the apparent reflectance outside the absorption bands. Then, the F-SFM algorithm was tested using both simulated and field-measured data with different Spectral Resolutions (SRs) and Signal-to-Noise Ratios (SNRs). For data with an SR of 0.3 nm and without noise, the Relative Root Mean Square Error (RRMSE) was less than 14% within the spectral region that was studied, and the peak-value ratio (SIF735/SIF685) was accurately estimated with an RRMSE of 3.56%. In addition, the F-SFM algorithm proved less sensitive to the SR than the three-band Fraunhofer Line Discrimination (3FLD) and improved FLD (iFLD) methods. In the case of the field spectral data with SRs of 3 nm and 0.3 nm, the double-peak shape and the diurnal variation trend of the SIF spectra could be reasonably reconstructed by F-SFM, and the retrieved SIF values at the O2-A and O2-B bands were consistent with those retrieved by 3FLD from data with a high SR (0.3 nm) and SNR (1000). Therefore, the F-SFM method can provide full-spectrum SIF information with high accuracy even at relatively low SRs and SNRs, and shows promise for use in applications involving the SIF shape information.

Remote Sens. 2015, 7 (open access)


Introduction
A characteristic spectral emission known as Solar-Induced chlorophyll Fluorescence (SIF) can be observed in the red and far-red spectral regions. Photosynthesis is driven by the energy absorbed by chlorophyll molecules. When a chlorophyll molecule absorbs a photon, it is transferred from the ground state to an excited state and becomes extremely unstable. There are three pathways for the excited molecule to return to the ground state [1]. First, the chlorophyll can re-emit a photon with a longer wavelength (because part of the energy is converted into heat); this is the process of fluorescence. Second, all the energy of the excited chlorophyll can be converted into heat with no emission of a photon. Third, the energy can be transferred to other molecules to be used in the chemical reactions needed for photosynthesis. Typically, only about 1% of the absorbed sunlight is emitted through fluorescence [2].
Together with another dissipative pathway, non-photochemical quenching (NPQ), fluorescence competes with photosynthesis for the use of absorbed light [3]. Therefore, SIF is strongly related to the photosynthetic process. In earlier studies, the Photochemical Reflectance Index (PRI) [4], which is linked to the xanthophyll cycle, has been used as a proxy for photosynthesis but is strongly affected by the canopy structure, leaf pigments and background [5]. In contrast, SIF seems to be a better indicator of photosynthesis [6]. Numerous studies have shown that measuring SIF is a reliable, fast and non-invasive way of obtaining physiological information about plants. For instance, Meroni et al. [7] and Ni et al. [8] showed that stress in an early phase can be detected by SIF measurements; the studies of Delalieux et al. [9] indicated that SIF measurements could be used in the early detection of vegetation disease; and a strong relationship between SIF and Gross Primary Production (GPP) has been demonstrated in several studies at both canopy and global scales [10-14]. Therefore, it is important to measure the SIF signal in a fast and accurate way.
The SIF spectrum covers the approximate spectral range of 650-850 nm. It has a distinct spectral shape with one peak at around 685-690 nm and another at 730-740 nm [15,16]. There are two photosystems in plants, PS I and PS II, which work in tandem and have different fluorescence emissions. The majority of fluorescence emissions originate from PS II. PS II contributes to the SIF emission in both the red and far-red spectral regions, whereas PS I contributes to the SIF emission only in the far-red region and has a much smaller yield [17,18]. As a result, the intensity and shape of the SIF spectrum are related to the amount of energy absorbed by PS II and PS I [19,20]. The left-hand peak in the red region partially overlaps with a peak in the chlorophyll absorption spectrum. This means that light emitted by plants in this spectral region is partially re-absorbed by foliar chloroplasts or other biochemical components. The amount of re-absorption is determined mainly by the canopy chlorophyll content [21] and hence the ratio of the two SIF peak-values (SIF735/SIF685) can also serve as an indicator of foliar chlorophyll content [21,22].
The use of remote sensing techniques is necessary for the detection of SIF at the canopy level. Compared with the reflected radiance, the SIF signal is quite weak, so the solar Fraunhofer lines or telluric atmospheric absorption bands need to be used to decouple the fluorescence from the reflection based on the Fraunhofer Line Discrimination (FLD) principle [23,24]. Consequently, most of the commonly used FLD-based methods (comprehensively reviewed by Meroni et al. [3]) can only be used for the retrieval of SIF at certain separate absorption bands; the shape of the SIF spectrum and the peak-values, which are also important, are unavailable. Therefore, a new algorithm for full-spectrum SIF retrieval is needed.
The differences between the FLD-based methods are mainly concerned with how the reflectance and SIF are assumed to vary near the absorption band. The standard FLD method assumes that the reflectance and SIF near the absorption band are constant [23], which is quite a weak assumption for data with a Spectral Resolution (SR) of about 1 nm. The 3FLD method instead assumes that the variation in the reflectance and SIF is linear [25]. In the iFLD method, two correction coefficients estimated from the smoothed apparent reflectance are introduced to deal with the variation in the reflectance and SIF [26]. In addition to the FLD-based methods, Meroni et al. [27] proposed the Spectral Fitting Method (SFM), in which the reflectance and SIF spectra are described with simple mathematical functions such as polynomial and Gaussian functions. In this way, the variation in the reflectance and SIF near the absorption bands can be more accurately estimated. Meroni et al. [28] investigated the performance of the SFM at the O2-A and O2-B bands at canopy level and found that the SFM was more accurate than FLD under any noise configuration that was considered. Mazzoni et al. [29] successfully used the SFM-based algorithm for SIF retrieval from satellite data. Compared to the FLD-based methods, the SFM can retrieve the SIF spectrum within the spectral window being used and provides both the SIF intensity and shape information. Therefore, the SFM algorithm has the potential to retrieve the full-spectrum SIF. Recently, the SFM algorithm was used by Cogliati et al. [30] for the retrieval of full-spectrum SIF from airborne HyPlant imagery and the performance of different fitting functions for reflectance and SIF was tested.
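The difference between the constant (standard FLD) and linear (3FLD) assumptions can be illustrated with a small numerical sketch. It is illustrative only: it assumes the usual radiance model L = R·E/π + SIF, the function names `sfld` and `fld3` are hypothetical, and the band values are made-up numbers rather than measurements.

```python
import math

def sfld(e_in, l_in, e_out, l_out):
    """Standard FLD: reflectance and SIF are assumed equal inside and outside the band."""
    return (e_out * l_in - e_in * l_out) / (e_out - e_in)

def fld3(wl_in, e_in, l_in, wl_left, e_left, l_left, wl_right, e_right, l_right):
    """3FLD: the single outside band is replaced by a wavelength-weighted
    average of a left and a right shoulder band (linear variation)."""
    w_left = (wl_right - wl_in) / (wl_right - wl_left)
    w_right = (wl_in - wl_left) / (wl_right - wl_left)
    e_out = w_left * e_left + w_right * e_right
    l_out = w_left * l_left + w_right * l_right
    return sfld(e_in, l_in, e_out, l_out)

# Toy scene (assumed values): reflectance and SIF both vary linearly with wavelength.
def refl(wl):
    return 0.40 + 0.001 * (wl - 760.0)

def sif(wl):
    return 1.00 - 0.01 * (wl - 760.0)

def radiance(wl, irradiance):
    return refl(wl) * irradiance / math.pi + sif(wl)

wl_left, wl_in, wl_right = 758.0, 760.0, 770.0
e_shoulder = 100.0 * math.pi   # irradiance outside the absorption band
e_band = 20.0 * math.pi        # irradiance inside the absorption band

l_left = radiance(wl_left, e_shoulder)
l_in = radiance(wl_in, e_band)
l_right = radiance(wl_right, e_shoulder)

sif_sfld = sfld(e_band, l_in, e_shoulder, l_left)
sif_3fld = fld3(wl_in, e_band, l_in,
                wl_left, e_shoulder, l_left,
                wl_right, e_shoulder, l_right)
print(f"true SIF = {sif(wl_in):.3f}, sFLD = {sif_sfld:.3f}, 3FLD = {sif_3fld:.3f}")
```

In this toy case the shoulder irradiances are equal, so the linear interpolation used by 3FLD recovers the in-band SIF exactly, whereas the constant assumption of the standard FLD leaves a bias of a few percent.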
The accuracy of the retrieval of SIF by the SFM algorithm depends on the accurate estimation of the reflectance and SIF spectra. In earlier studies, the reflectance spectra were usually simulated with polynomials [28] or splines [29]. However, the shape of the reflectance spectra around the O2-B band is quite complex because this band is located near the "red-edge" and the contribution of SIF is relatively large. This means that it is quite difficult to simulate the reflectance spectra accurately at the O2-B band. The SIF spectra were often modeled using double Gaussian or Voigt functions [29,30], but this meant that many parameters needed to be determined and thus the algorithm was less robust. To deal with these problems, Principal Components Analysis (PCA), which is widely used in the decomposition and reconstruction of signals, can serve as an alternative method for the SIF retrieval because both the canopy reflectance and SIF have some unique spectral characteristics that can be extracted from a proper training dataset. Zhao et al. [31] proposed a Fluorescence Spectrum Reconstruction (FSR) method to reconstruct the full-spectrum SIF based on the Singular Value Decomposition (SVD) technique, which is also a method for feature extraction similar to PCA. This method uses the SIF values from within five absorption bands, these values having been retrieved by the SFM (with the reflectance and SIF spectra around each band being modeled by polynomials). Liu and Liu [32] presented a pFLD method to improve the accuracy of FLD-based methods for SIF retrieval. The core idea of the pFLD method is to simulate the reflectance at absorption bands by a linear combination of the principal components (PCs) derived from a training dataset. It has been shown that the reflectance at the O2-B band can be more accurately simulated in this way, especially when the SR or Signal-to-Noise Ratio (SNR) is relatively low.
As the PCA-based method gives a satisfactory performance for the simulation of both reflectance and SIF spectra, it has the potential to be a powerful tool for the retrieval of full-spectrum SIF. Inspired by the earlier studies of SIF retrieval based on feature extraction (PCA or SVD) [31,32], here we propose a new Full-spectrum Spectral Fitting Method (F-SFM) for the retrieval of SIF. The main objectives of this paper are: (1) to present a new PCA-based full-spectrum SIF retrieval method (F-SFM); (2) to assess the accuracy of F-SFM under different SR and SNR conditions using a simulated dataset; (3) to test the performance of F-SFM using field-measured data.

Simulated Data
The Soil Canopy Observation, Photochemistry and Energy fluxes (SCOPE) model (version 1.53) by Van der Tol et al. [33] was employed to simulate the training reflectance and SIF spectra. SCOPE is a vertical (1-D) integrated radiative transfer and energy balance model, which can simulate the canopy reflectance, photosynthesis, heat flux and chlorophyll fluorescence by linking the within-canopy radiative transfer with micro-meteorological processes. In order to ensure that both the training and test datasets corresponded to the majority of the range of conditions encountered in practice, the parameter values for the training and test datasets were set to be different but not random (details listed in Table 1). The ranges of the main leaf and canopy parameters were set according to the statistics of the LOPEX'93 dataset [34] and empirical knowledge. As the fluorescence quantum yield efficiency (FQE) has no influence on the shape of fluorescence spectra, its value was set at a fixed 0.04 in the training dataset. Consequently, a total of 2880 samples were simulated for training and 2304 for testing.
We combined the MODTRAN 5 [35] and SCOPE models to describe the interaction of the Top of Canopy (TOC) surface with the atmosphere and the incident irradiance from the sun and the sky, as described in Equations (1) and (2) [36].
In these equations, $E_s^o$ is the direct solar flux arriving at the top of atmosphere. ρ and τ are the reflectance and transmittance of the atmosphere, respectively, r is the surface reflectance, and the attached double subscripts refer to the associated incident and exiting flux types, where s stands for direct flux and d for diffuse flux. $\theta_s$ is the solar zenith angle. $E_s^o$, ρ and τ can be calculated by MODTRAN 5. As we did not take the influence of the atmospheric radiation transfer into account, the atmospheric parameters were fixed (atmospheric profile: mid-latitude summer; aerosol model: rural; visibility: 23 km; surface height: 0 m). r can be simulated by the SCOPE model. In the MODTRAN 5 calculations, SRs of 0.3 nm, 0.5 nm, 1 nm and 3 nm were used, and the Spectral Sampling Intervals (SSIs) were set to be half of the SRs.
Therefore, the downwelling irradiance and total upwelling radiance at the TOC can be calculated using Equations (3) and (4), respectively, where R is the canopy reflectance. The SSI of SCOPE is 1 nm. To match the different SSIs of the simulated downwelling irradiance $E_g$, the reflectance and SIF spectra were resampled by splines as they are continuous and smooth in the visible and near-infrared regions [37].
In order to simulate different SNR conditions, random Gaussian-distributed noise was added [38].The SNR is defined as the ratio of the signal intensity (outside the absorption bands) to the standard deviation of the random Gaussian-distributed noise; SNR values of 300, 500, 1000, 5000, and 10,000 were used.
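The noise model described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the radiance is assumed to follow the common form L = R·E/π + SIF, the helper names `toc_radiance` and `add_noise` are hypothetical, and the "signal intensity outside the absorption bands" is approximated by the mean of a flat, band-free toy spectrum.

```python
import numpy as np

def toc_radiance(reflectance, irradiance, sif):
    """Total upwelling radiance (assumed form: reflected term plus SIF)."""
    return reflectance * irradiance / np.pi + sif

def add_noise(radiance, snr, signal_level, rng):
    """Add zero-mean Gaussian noise whose standard deviation is signal_level / snr."""
    sigma = signal_level / snr
    return radiance + rng.normal(0.0, sigma, size=radiance.shape)

rng = np.random.default_rng(42)

# A dense, flat toy spectrum is used only so that the noise statistics are stable.
wl = np.linspace(650.0, 800.0, 200_001)
reflectance = np.full_like(wl, 0.4)
irradiance = np.full_like(wl, 100.0 * np.pi)
sif = np.full_like(wl, 1.0)

clean = toc_radiance(reflectance, irradiance, sif)
signal_level = clean.mean()            # stand-in for the out-of-band signal level
noisy = add_noise(clean, snr=500, signal_level=signal_level, rng=rng)

empirical_snr = signal_level / np.std(noisy - clean)
print(f"requested SNR = 500, empirical SNR = {empirical_snr:.1f}")
```

For real spectra the reference signal level would be taken from the continuum next to each absorption band rather than from the spectrum mean.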

Field-Measured Data
For further testing, three field experiments were also carried out. We measured the diurnal canopy spectra of winter wheat (Triticum aestivum L.) with a customized Ocean Optics QE Pro spectrometer (SR = 0.31 nm; SSI = 0.155 nm; SNR > 1000; wavelength range: 645-805 nm; Ocean Optics, Dunedin, FL, USA) at the National Precision Agriculture Demonstration Base located north of Beijing (40°11′N, 116°27′E) from 8:00 to 17:30 (local time) on 14 April (jointing stage, with a Leaf Area Index (LAI) of 1.94 and leaf chlorophyll content (Cab) of 62.2 μg/cm²), 25 April (booting stage, with an LAI of 2.40 and Cab of 61.3 μg/cm²), and 20 May 2015 (filling stage, with an LAI of 2.23 and Cab of 56.7 μg/cm²). In order to test the algorithm at different SRs, the canopy spectra were also measured by an ASD FieldSpec4 spectrometer (SR = 3 nm; SSI = 1 nm; SNR > 4000; Analytical Spectral Devices, Boulder, CO, USA) at the same time as the measurements made with the QE Pro spectrometer (except for the experiment on 14 April 2015). The measurements were taken every half hour under stable weather conditions. In total, 53 valid measurements were made by the QE Pro and 30 valid measurements by the ASD FieldSpec4 on the three days. The fibers of the two spectrometers were fixed on the horizontal rod of a tripod at a height of about 2 m above the canopy for nadir observation. In order to measure the TOC downwelling solar irradiance, a 40-cm by 40-cm BaSO4 calibration panel was used. The LAIs were measured with an in situ scanning method, while the Cab values were measured using the SPAD-502 chlorophyll meter (Konica Minolta, Tokyo, Japan) and calibrated using an empirical model [39].

Band Selection
It is challenging to decouple the SIF contribution from the total upwelling radiance because SIF emission is quite weak compared with the reflected radiance. Although we aimed to retrieve the full-spectrum SIF, only the solar Fraunhofer lines or telluric atmospheric absorption bands could be used because only there is the emitted SIF comparable to the reflected radiance.
The SIF spectra cover the spectral region from 650 to 850 nm. Figure 1 shows the total upwelling radiance and SIF spectrum at the TOC for the configuration of the QE Pro spectrometer that was used in the field experiments. There are four absorption bands in this spectral region that have the potential for SIF retrieval: the Hα line centered at 656.5 nm, the O2-B band at 687.1 nm, the water vapor band at 718.8 nm and the O2-A band at 760.7 nm. However, as the water vapor absorption band is quite unstable when used with field-measured data [31], it was excluded in this study. Three absorption bands (Hα, O2-B and O2-A) were, therefore, employed.
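As a small illustration of the band selection, the sketch below builds a boolean mask that marks the wavelengths belonging to the three employed bands. The band centers are those given above; the 2 nm half-width and the names `BAND_CENTERS_NM` and `absorption_mask` are assumptions made for this example, since the exact fitting windows are not stated here.

```python
import numpy as np

# Band centers from the text; the water vapor band (718.8 nm) is deliberately left out.
BAND_CENTERS_NM = {"H-alpha": 656.5, "O2-B": 687.1, "O2-A": 760.7}

def absorption_mask(wavelengths, centers=BAND_CENTERS_NM, half_width_nm=2.0):
    """Return True where a wavelength lies within half_width_nm of a band center.

    half_width_nm is an assumed window size, not a value from the paper.
    """
    wavelengths = np.asarray(wavelengths, dtype=float)
    mask = np.zeros(wavelengths.shape, dtype=bool)
    for center in centers.values():
        mask |= np.abs(wavelengths - center) <= half_width_nm
    return mask

# Wavelength grid resembling the QE Pro configuration (645-805 nm, SSI = 0.155 nm).
wl = np.arange(645.0, 805.0, 0.155)
in_band = absorption_mask(wl)
print(f"{in_band.sum()} of {wl.size} samples fall inside the three bands")
```

The complement of this mask (`~in_band`) selects the spectral regions outside the absorption bands, which are the ones used to fit the reflectance.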

Reconstruction of Reflectance and SIF Spectra
For full-spectrum SIF retrieval, the use of the SFM is necessary. According to the principle of the SFM, the key step is to make an accurate estimation of the canopy reflectance and SIF spectra beforehand. However, for full-spectrum SIF retrieval, it is difficult to model the reflectance and SIF spectra accurately using simple mathematical functions, as was commonly done in earlier research. Inspired by earlier studies [31,32], PCA, a commonly used method for feature extraction, was employed in this study for the accurate estimation of both reflectance and SIF spectra.
PCA is a statistical method that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called Principal Components (PCs), as described by Equation (5):

$$\Phi = XW \qquad (5)$$

where X is the matrix representing the training dataset. In this case, the matrix has 2880 columns representing the 2880 samples of reflectance (or SIF) spectra in the simulated training dataset. The sample mean of each column is shifted to zero. Φ is the matrix representing the full set of PCs of the reflectance (or SIF); W is a 2880-by-2880 matrix whose columns are the eigenvectors of $X^{T}X$. The reflectance and SIF can then be reconstructed using a linear combination of the first few PCs in Φ, as described by Equations (6) and (7):

$$\tilde{R}(\lambda) = \sum_{i=1}^{n_r} k_i\,\omega_i(\lambda) \qquad (6)$$

$$\widetilde{SIF}(\lambda) = \sum_{i=1}^{n_F} j_i\,\varphi_i(\lambda) \qquad (7)$$

where $\tilde{R}(\lambda)$ and $\widetilde{SIF}(\lambda)$ are the reconstructed spectra of the reflectance and SIF, respectively; $\omega_i(\lambda)$ and $\varphi_i(\lambda)$ are the ith PC of the reflectance and SIF, respectively; $k_i$ and $j_i$ are the weight coefficients of the ith PC; and $n_r$ and $n_F$ are the numbers of the reflectance and SIF PCs that were used, respectively. Based on the results of the study by Liu and Liu [32] and the results of the test with the simulated dataset, the first eight PCs of the reflectance and the first five PCs of the SIF, which account for about 99.999% of the spectral variance, were used in this study (Figure 2).

The problem now turns to the estimation of $k_i$ and $j_i$. The reflected radiation is the main component of the total upwelling radiance, especially in the atmospheric window region, where the contribution of the SIF is very small. The apparent reflectance $\hat{R}(\lambda)$ (including the contribution of the SIF) outside the absorption bands can, therefore, be used to make an estimation of $k_i$, as described by Equation (8) [32].

It needs to be noted that the reconstructed reflectance spectrum $\tilde{R}(\lambda)$ is not exactly equal to the real reflectance R(λ), because the weights of the PCs were fitted using the apparent reflectance spectra without absorption bands, which were still coupled with a relatively small SIF contribution. Within each narrow absorption band, the relationship between the true reflectance R(λ) and the reconstructed reflectance $\tilde{R}(\lambda)$ can be assumed to be linear (as shown in Figure 3). Two correction coefficients were, therefore, employed to estimate the true reflectance R(λ) using the reconstructed reflectance $\tilde{R}(\lambda)$. As the relative intensity of the SIF is different for each band, the correction coefficients should be set separately for each narrow absorption band. The reflectance near the three absorption bands that were used can then be expressed in terms of these coefficients (Equation (9)).
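As an illustration of the PCA reconstruction in Equations (5)-(7), the following sketch extracts PCs from a synthetic training set and reconstructs a new spectrum as a weighted sum of the leading PCs. All data are made up: the two Gaussian peaks merely stand in for SCOPE-simulated SIF spectra, the PCs are obtained with an SVD (the left singular vectors equal the columns of Φ = XW up to scaling), and mean-centering is omitted for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)
wl = np.linspace(650.0, 800.0, 151)

def gaussian(center, width):
    return np.exp(-0.5 * ((wl - center) / width) ** 2)

# Synthetic "training" SIF spectra: random mixtures of a red and a far-red peak.
shapes = np.column_stack([gaussian(685.0, 10.0), gaussian(740.0, 25.0)])
coeffs = rng.uniform(0.5, 2.0, size=(2, 200))
X = shapes @ coeffs                       # rows: wavelengths, columns: samples

# X = U diag(s) Vt, so X @ Vt.T = U * s: the columns of U are the PCs up to scale.
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Keep the smallest number of PCs that explains 99.999% of the variance.
explained = np.cumsum(s**2) / np.sum(s**2)
n_pc = int(np.searchsorted(explained, 0.99999) + 1)
pcs = U[:, :n_pc]

# Reconstruct an unseen spectrum: least-squares weights, then a weighted sum of PCs.
target = 1.3 * shapes[:, 0] + 0.7 * shapes[:, 1]
weights, *_ = np.linalg.lstsq(pcs, target, rcond=None)
reconstructed = pcs @ weights

print(f"PCs kept: {n_pc}, max abs error: {np.max(np.abs(reconstructed - target)):.2e}")
```

Because the toy training set has only two underlying shapes, two PCs suffice; real reflectance and SIF spectra need more (eight and five PCs, respectively, in this study).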

Spectral Fitting
Using $\widetilde{SIF}(\lambda)$ to replace SIF(λ) in Equation (4), the total upwelling radiance at the top of canopy near the three absorption bands can be written as Equation (10). The undetermined coefficients α, β and j can be estimated with the least-squares fitting method. Although correction coefficients (α and β) are employed, the accuracy of the estimated reflectance $\tilde{R}(\lambda)$ still has an influence on the accuracy of the SIF retrieval. The difference between R(λ) and $\tilde{R}(\lambda)$ is caused by the small contribution due to the SIF outside the absorption bands, the value of which is unknown beforehand. Inspired by the work of Mazzoni et al. [29], here we introduce an iterative process to minimize the SIF contribution, as described below:
(1) Calculate the apparent reflectance $\hat{R}(\lambda)$ using the total upwelling radiance and downwelling irradiance.
(2) Estimate the weights of the reflectance PCs using the apparent reflectance $\hat{R}(\lambda)$ without absorption bands according to Equation (8) and then reconstruct the reflectance $\tilde{R}(\lambda)$ using the weights of the different reflectance PCs according to Equation (6).
(3) Estimate the weights of the SIF PCs according to Equation (10) using the least-squares fitting method for the parts of the spectrum within the absorption bands and reconstruct the SIF spectrum $\widetilde{SIF}(\lambda)$ according to Equation (7).
(4) Remove the retrieved SIF spectrum from the total upwelling radiance, recalculate the apparent reflectance, and repeat steps (2) and (3).
The efficiency of the iterative process was also investigated. Figure 3 shows the correlation between the true reflectance R(λ) and the reconstructed reflectance $\tilde{R}(\lambda)$ for the first three iterations. As a result of the iterative process, the accuracy of the estimated reflectance can be significantly improved. The Root Mean Square Error (RMSE) (integrated over the spectral range of each absorption band; the ranges are as indicated in Equation (10)) for the second run is only about 3-10 percent of that for the first run. For the second and third iterations, the difference between the true and reconstructed reflectance is very small, with an RMSE of less than 0.06% for all three bands.
Figure 4 shows how the average Relative Root Mean Square Error (RRMSE) (integrated over the full spectral range of 645-805 nm) in the value of the SIF retrieved from the 2304 test datasets (with an SR of 0.3 nm and without noise) varies with the number of iterations. The results show that the RRMSE for the second run is significantly smaller than that for the first run. However, when the number of iterations is greater than 3, the RRMSE in the SIF becomes quite steady. Therefore, the iterative process is effective at improving the accuracy of full-spectrum SIF retrieval. In this study, we used three iterations because a higher number of iterations does not help to improve the accuracy and makes the algorithm less efficient.
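The effect of the iteration can be reproduced with a deliberately simplified toy model. The sketch below is not the F-SFM implementation: it omits the per-band correction coefficients α and β, uses a constant and a linear term as stand-ins for the reflectance PCs and a single Gaussian as a stand-in for the SIF PCs, and assumes an idealized irradiance that drops to 10% inside two bands. All names and numbers are hypothetical.

```python
import numpy as np

wl = np.linspace(650.0, 800.0, 301)
in_band = (np.abs(wl - 687.0) <= 1.5) | (np.abs(wl - 760.5) <= 2.0)

# Idealized irradiance: E/pi is 1 outside the bands and 0.1 inside them.
E = np.where(in_band, 0.1, 1.0) * np.pi

# Stand-ins for the PCs (assumptions, not SCOPE-derived components).
refl_basis = np.column_stack([np.ones_like(wl), (wl - 725.0) / 75.0])
sif_basis = np.exp(-0.5 * ((wl - 740.0) / 25.0) ** 2)[:, None]

k_true = np.array([0.30, 0.15])
j_true = np.array([0.02])
L = (refl_basis @ k_true) * E / np.pi + sif_basis @ j_true   # "measured" radiance

def iterate_retrieval(L, E, n_iter=3):
    """Alternate between fitting reflectance outside the bands and SIF inside them."""
    sif = np.zeros_like(L)
    history = []
    for _ in range(n_iter):
        # Apparent reflectance after removing the current SIF estimate.
        r_app = np.pi * (L - sif) / E
        # Reflectance weights from the regions outside the absorption bands.
        k, *_ = np.linalg.lstsq(refl_basis[~in_band], r_app[~in_band], rcond=None)
        r_rec = refl_basis @ k
        # SIF weights from the residual radiance inside the absorption bands.
        resid = L - r_rec * E / np.pi
        j, *_ = np.linalg.lstsq(sif_basis[in_band], resid[in_band], rcond=None)
        sif = sif_basis @ j
        history.append(j.copy())
    return r_rec, sif, history

r_rec, sif_est, history = iterate_retrieval(L, E, n_iter=3)
rel_errors = [abs(j[0] - j_true[0]) / j_true[0] for j in history]
print("relative error of the SIF weight per iteration:", rel_errors)
```

In this toy setting the error of the SIF weight shrinks by roughly the ratio of in-band to out-of-band irradiance at every pass, which is why a few iterations are enough.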

Accuracy Assessment with the Simulated Dataset
The simulated test dataset described in Section 2.1.1 was employed to quantify the accuracy of the full-spectrum SIF retrieval algorithm, because the true values of the SIF were not available from the field measurements.
Firstly, we analyzed the full-spectrum accuracy of the retrieved SIF using simulated test data with an SR of 0.3 nm and without noise. Figure 5a shows an example of a retrieved SIF spectrum (which has an average accuracy level) together with the true SIF spectrum from the test dataset. Figure 5b shows the full-spectrum RRMSE of the 2304 retrieved SIF spectra in the test dataset. The average SIF spectrum and the central position of the three employed absorption bands are also plotted for reference. It can be seen that the retrieved and true SIF spectra fit very well. The RRMSE is less than 5% in the spectral region around 650-770 nm, which covers most of the SIF emission region, and is less than 14% within the full spectral region that we studied. The error is relatively small in the region of the three absorption bands.

The ratio of the fluorescence spectrum peaks (SIF735/SIF685) is an important shape parameter in numerous applications (reviewed by Porcar-Castell et al. [20]). For example, it has been shown that this ratio is related to the physiological status of vegetation in terms of properties such as the chlorophyll content [21] and the energy exchange process between PS I and PS II [19]. Therefore, we analyzed the accuracy of the value of SIF735/SIF685 retrieved by the proposed F-SFM algorithm. Figure 6 shows the correlation between the true values of SIF735/SIF685 from the test dataset simulated by SCOPE and those retrieved by F-SFM for an SR of 0.3 nm and without noise. The points are located near the 1:1 line with an R² of 0.98 and an RRMSE of 3.56%, which represents a satisfactory accuracy for the retrieved value of SIF735/SIF685. These results show that it is feasible to estimate the spectral characteristics of SIF (intensity and shape information) based on the proposed full-spectrum SIF retrieval method.

Secondly, we tested the accuracy of the algorithm using simulated data with different SRs and SNRs. Contour maps were employed to illustrate how the variation in the full-spectrum RRMSEs in the retrieved SIF depends on the change in SR (for the range 0.3-3 nm) and SNR (for the range 300-10,000) (Figure 7). The RRMSE contours show that the full-spectrum SIF retrieval method is not very sensitive to the SR. The sensitivity to SNR increases with a decrease in SR because the absorption depth is small for a low SR and the relative intensity of the SIF is low. When the SR and SNR are relatively low (3 nm and 300, respectively), the full-spectrum RRMSE is still less than 30%. For the data measured by the QE Pro (SR = 0.3 nm, SNR > 1000) and ASD FieldSpec4 (SR = 3 nm, SNR > 4000) spectrometers, the RRMSE is less than 10%, which is sufficient for common applications.

In addition, we compared the accuracy of the full-spectrum SIF retrieval method with the commonly used 3FLD [25] and iFLD [26] methods at the O2-A and O2-B bands under different SR or SNR conditions. The results are also shown as contour maps in Figure 8. More vertical contours indicate that the algorithm is more sensitive to the SR, while more horizontal contours indicate that the algorithm is more sensitive to the SNR. Because of the more accurate estimation of reflectance and SIF inside the absorption bands, the full-spectrum SIF retrieval algorithm is less sensitive to the SR and SNR, especially at the O2-B band. As Figure 8c shows, F-SFM is not sensitive to the SR for the O2-A band and an SR of 3 nm is sufficient here. For the O2-B band, the influence of the SR is more obvious because this absorption band is much narrower. When the SNR is higher than 1000, the accuracy of F-SFM becomes relatively stable for an SR above a certain value (Figure 8f). The accuracy of the F-SFM algorithm is comparable to that of the 3FLD and iFLD methods when the SR and SNR are sufficiently high and the accuracy is much higher than that of 3FLD and iFLD when the SR or SNR is relatively low. This indicates that the newly proposed PCA-based spectral fitting algorithm is not only useful for full-spectrum SIF retrieval but can, in fact, improve the accuracy of single-band SIF retrieval for data with a relatively low SR or SNR. For the configuration of the widely used ASD FieldSpec4 spectrometer (SR = 3 nm, SNR > 4000), the RRMSEs of 3FLD and iFLD at the O2-B band are higher than 80%, whereas the performance of F-SFM is still acceptable (RRMSE < 15%).
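The two accuracy measures used in this assessment can be computed as in the sketch below. The RRMSE is not defined explicitly in the text, so the code assumes the common definition (RMSE divided by the mean of the true values); the helper names `rrmse` and `peak_ratio` and the four-point toy spectrum are hypothetical.

```python
import numpy as np

def rrmse(true, estimate):
    """Relative RMSE, assumed here to be the RMSE divided by the mean true value."""
    true = np.asarray(true, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    return np.sqrt(np.mean((estimate - true) ** 2)) / np.mean(true)

def peak_ratio(wl, sif, far_red_nm=735.0, red_nm=685.0):
    """Ratio SIF735/SIF685, with the spectrum linearly interpolated to both wavelengths."""
    return np.interp(far_red_nm, wl, sif) / np.interp(red_nm, wl, sif)

# Toy spectrum sampled at four wavelengths (made-up values).
wl = np.array([680.0, 690.0, 730.0, 740.0])
sif_true = np.array([1.0, 1.2, 2.0, 2.4])
sif_est = sif_true + 0.1                   # a retrieval with a constant offset

print(f"RRMSE = {rrmse(sif_true, sif_est):.4f}")
print(f"true ratio = {peak_ratio(wl, sif_true):.3f}, "
      f"retrieved ratio = {peak_ratio(wl, sif_est):.3f}")
```

Applied per wavelength across all test samples, `rrmse` gives a full-spectrum error curve of the kind shown in Figure 5b; applied to the retrieved and true peak ratios, it gives a single shape-accuracy figure.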

Testing of the Field-Measured Data
Although the values of the vegetation parameters were set to be different, the training and test datasets were simulated by the same model. As F-SFM is a statistically based method, it was necessary to test its applicability with real observations. Therefore, the field experiment datasets obtained using the QE Pro and ASD FieldSpec4 spectrometers were used for further testing.
Figure 9 shows the full-spectrum SIF retrieved from the diurnal field measurements made by the QE Pro on 25 April 2015 at intervals of one hour. The results show that the double-peak characteristic of the SIF spectrum has been successfully reconstructed; the diurnal variation trend in the SIF is also reasonable and consistent with the results from the literature (e.g., [40-42]).
As the true values of the SIF were not available in the field data, it was not possible to assess the accuracy of the retrieved full-spectrum SIF directly using the field-measured dataset. It was only possible to estimate the accuracy of F-SFM using the SIF values retrieved at some specific absorption wavelengths with the FLD-based method. As described before, the commonly used 3FLD method is fairly reliable (RRMSE < 25%) for SIF retrieval at the O2-A and O2-B bands if the configuration of the QE Pro spectrometer is used. Therefore, the SIF retrieved by the 3FLD method at the O2-A and O2-B bands was employed as a reference to estimate the reliability of the proposed F-SFM algorithm. Taking the SIF retrieved by 3FLD from the dataset obtained by the QE Pro as a reference, we analyzed the accuracy of the SIF retrieved by F-SFM from the datasets obtained by both the QE Pro and ASD FieldSpec4 (Figure 10). For the data obtained by the QE Pro, most of the points are located near the 1:1 line with an R² of 0.97 and RRMSE of 12.6% for the O2-A band and an R² of 0.92 and RRMSE of 15.4% for the O2-B band, which indicates that the values of SIF retrieved by the two methods match well. For the data obtained by the ASD FieldSpec4, the points are still distributed around the 1:1 line but the correlation is somewhat weaker (R² of 0.88 and 0.66 and RRMSE of 18.7% and 29.4% for the O2-A and O2-B bands, respectively). The SNRs of both the QE Pro and ASD FieldSpec4 are sufficient (higher than 1000) for the F-SFM algorithm, so the much higher SR of the QE Pro (0.3 nm vs. 3 nm) leads to more accurate SIF retrieval, especially for the O2-B band. In addition, we compared the retrieved peak-value ratios (SIF735/SIF685) for the data obtained by the QE Pro and ASD FieldSpec4, as shown in Figure 11. The points are located around the 1:1 line with an R² of 0.71. These results show that the F-SFM algorithm is not very sensitive to the SR and can be applied to data obtained by spectrometers whose SR is relatively low, such as the widely used ASD FieldSpec series.

Advantages of the Proposed F-SFM Algorithm
Several recent studies have focused on full-spectrum SIF retrieval based on the SFM algorithm [30,31]. The reconstruction of reflectance and SIF spectra is a major problem in full-spectrum SIF retrieval. Cogliati et al. [30] tested the performance of different mathematical functions for the reconstruction of reflectance and SIF spectra. Zhao et al. [31] used polynomials for the reconstruction of reflectance but used the SVD technique for the reconstruction of SIF spectra. However, the shape of reflectance spectra is complex and difficult to accurately determine using simple mathematical functions. This is especially true for the O2-B band, which is located near the "red-edge". To deal with this problem, in F-SFM, both the reflectance and SIF spectra are modeled by PCA based on a simulated training dataset. We have also introduced an iterative process to improve the accuracy of the reflectance estimations. In each iteration, the SIF spectra retrieved from the previous run are removed from the total upwelling radiance to minimize the small contribution of the SIF to the apparent reflectance outside the absorption bands. Using this iterative process, the accuracy of the estimated reflectance can be significantly improved.
As shown by the tests using a simulated dataset, F-SFM can provide accurate estimates of full-spectrum SIF. In addition, compared to the 3FLD and iFLD methods, F-SFM is less sensitive to the SR or SNR. For example, the widely used ASD FieldSpec spectrometer has an SR of 3 nm, which is not sufficient for SIF retrieval by the 3FLD or iFLD methods at the O2-B band. The F-SFM algorithm, however, can provide a satisfactory SIF retrieval performance using the ASD FieldSpec spectrometer, giving an RRMSE of less than 20%.

Limitations of the Proposed Method
There are still some limitations in the proposed F-SFM algorithm. Firstly, the spectral characteristics of the reflectance and SIF data are extracted from the simulated training dataset. This means that the full-spectrum accuracy of the retrieved SIF spectra depends on how representative the training dataset is. In addition, it is very difficult to validate the full-spectrum SIF retrieval directly by field-measured data because no direct measurement method is available. Secondly, only the SIF-infilling information at three quite narrow absorption bands can be used for SIF retrieval due to the limited spectral resolution. For spectral data with a higher resolution, it should be possible to detect more Fraunhofer lines (such as the Fe line at 757.8 nm and K I line at 770.1 nm) and, consequently, the algorithm will be more stable. However, a higher SR will normally lead to a lower SNR and so it is necessary to find an optimal balance between these two key parameters of the spectrometer.
Thirdly, the values of the atmospheric parameters used in this study were fixed, which means that the effects of varying atmospheric conditions were not included. In addition, the SRs and SNRs of the two field spectrometers used for the algorithm testing are quite high; this would be difficult to achieve for space-borne or airborne spectrometers.
In future work, optimization of the training dataset and the balance between the SR and SNR should be investigated. The F-SFM algorithm also needs to be further developed so that it can be used for space-based SIF retrieval.

Conclusions
In previous research, most remote sensing SIF retrieval methods could only be used for the quantification of the SIF at individual bands. However, the shape of the SIF spectrum is also very important in many applications. In this paper, we proposed a new PCA-based full-spectrum spectral fitting method (F-SFM) for the retrieval of SIF spectra. We used PCA to extract the features of both the reflectance and the SIF spectra, and used an iterative process to improve the accuracy of the reflectance estimations.
The F-SFM algorithm was tested using both simulated and field-measured datasets that had different SRs and SNRs. The results obtained using the simulated test dataset with an SR of 0.3 nm and without noise showed that the RRMSE was less than 14% within the full spectral region that was studied; they also showed that the peak-value ratios (SIF735/SIF685) could be accurately estimated. Compared with the 3FLD and iFLD methods, F-SFM was less sensitive to the SR and SNR and performed well under all the SR and SNR conditions that were tested. The testing of the field data showed that F-SFM can provide credible full-spectrum results for data obtained by the QE Pro spectrometer, which has an SR of 0.3 nm and an SNR of more than 1000. The results for the data obtained by the ASD FieldSpec4 spectrometer (which has an SR of 3 nm and an SNR greater than 4000) showed the same trend as the results for the data obtained by the QE Pro, which indicates that the F-SFM algorithm is also applicable to data obtained by spectrometers with relatively low SRs. Therefore, the F-SFM algorithm is suitable for accurate retrieval of full-spectrum SIF information, even at relatively low SRs and SNRs, and shows promise for use in applications involving SIF information relating to both the intensity and the spectral shape.

Figure 1. Total upwelling radiance and Solar-Induced chlorophyll Fluorescence (SIF) spectrum at the Top of Canopy (TOC) for the spectral range 650–800 nm with a spectral resolution (SR) of 0.3 nm and Spectral Sampling Interval (SSI) of 0.15 nm.

Figure 2. The first eight PCs of the reflectance (a) and first five PCs of the SIF (b) generated from the 2880 simulated spectra in the training dataset.
is the pseudoinverse of the matrix Φ_o. R̂_o is a vector consisting of the apparent reflectance (R̂(λ)) without absorption bands. In this way, the global shape of the reflectance spectra can be determined before the least-squares fitting for SIF retrieval is implemented, which should make the algorithm more robust.
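The pseudoinverse step described here can be sketched as follows: the rows of the basis matrix that fall inside the absorption bands are dropped, the weighting coefficients are obtained from the remaining rows, and the reflectance is then rebuilt on the full grid. The quadratic polynomial basis standing in for the reflectance PCs, the nominal band centres, the 2 nm half-width, and all variable names are assumptions made for this example only.

```python
import numpy as np

wl = np.linspace(650.0, 800.0, 1001)
x = (wl - 725.0) / 75.0

# Hypothetical stand-in for the reflectance PCs (one basis spectrum per column).
Phi = np.stack([np.ones_like(x), x, x ** 2], axis=1)

# Mask of samples inside the absorption bands (assumed centres and width).
band_centres = (656.3, 687.0, 760.5)  # H-alpha, O2-B, O2-A
half_width = 2.0                      # nm
in_band = np.zeros(wl.size, dtype=bool)
for centre in band_centres:
    in_band |= np.abs(wl - centre) <= half_width
out_band = ~in_band

# Toy apparent reflectance: smooth outside the bands, distorted inside them
# (mimicking the in-filling effect of SIF).
R_true = Phi @ np.array([0.30, 0.15, 0.05])
R_app = R_true.copy()
R_app[in_band] += 0.1

# Least-squares coefficients from the out-of-band samples only.
Phi_o = Phi[out_band]
R_o = R_app[out_band]
coeffs = np.linalg.pinv(Phi_o) @ R_o

# Reflectance reconstructed on the full grid, including the absorption bands.
R_fit = Phi @ coeffs
```

Because the in-band samples are excluded from the fit, the distortion inside the bands does not affect the recovered coefficients in this toy case.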

Figure 3. The correlation between the true reflectance (R(λ)) and the reconstructed reflectance for the first three iterations ((a-c), (d-f), and (g-i)) at the Hα (left), O2-B (middle), and O2-A (right) bands. The values of the coefficient of determination (R²) and RMSE are also given in each subfigure.

(4) Remove the estimated SIF spectrum from the total upwelling radiance and then calculate a new apparent reflectance R̂(λ); go back to steps (2) and (3) to reconstruct a new SIF spectrum.
(5) Iterate until the reconstructed SIF spectrum is stable and then take it as the final SIF(λ).
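The iterative scheme in steps (4) and (5) can be illustrated with a toy, noise-free example. The sketch assumes the usual top-of-canopy model L = R·E/π + SIF, takes steps (2) and (3) to be an out-of-band reflectance fit followed by an in-band least-squares fit of the SIF coefficients, and replaces the PCA bases with simple stand-ins (a quadratic polynomial for reflectance and two Gaussians for SIF). The irradiance shape, band positions, fixed iteration count, and every name below are hypothetical; this is not the authors' implementation.

```python
import numpy as np

wl = np.linspace(650.0, 800.0, 1001)
x = (wl - 725.0) / 75.0

def gauss(centre, sigma):
    return np.exp(-0.5 * ((wl - centre) / sigma) ** 2)

# Hypothetical stand-ins for the PCA bases (one basis spectrum per column).
Phi_r = np.stack([np.ones_like(x), x, x ** 2], axis=1)
Phi_s = np.stack([gauss(685.0, 10.0), gauss(740.0, 25.0)], axis=1)

# Toy irradiance term E/pi with three absorption features (assumed shape).
centres, sigma_abs, depth = (656.3, 687.0, 760.5), 1.0, 0.8
E_pi = 1.0 - depth * sum(gauss(c, sigma_abs) for c in centres)
in_band = np.zeros(wl.size, dtype=bool)
for c in centres:
    in_band |= np.abs(wl - c) <= sigma_abs

# Synthetic "truth" and the resulting upwelling radiance.
R_true = Phi_r @ np.array([0.30, 0.15, 0.05])
sif_true = Phi_s @ np.array([0.02, 0.03])
L = R_true * E_pi + sif_true

def retrieve(L, E_pi, n_iter=20):
    """Alternate reflectance and SIF fits for a fixed number of iterations."""
    sif = np.zeros_like(L)
    history = []
    for _ in range(n_iter):
        # Remove the current SIF estimate and recompute the apparent reflectance.
        R_app = (L - sif) / E_pi
        # Fit the reflectance coefficients outside the absorption bands.
        c_r = np.linalg.pinv(Phi_r[~in_band]) @ R_app[~in_band]
        R_est = Phi_r @ c_r
        # Fit the SIF coefficients to the in-band residual radiance.
        resid = L - R_est * E_pi
        c_s, *_ = np.linalg.lstsq(Phi_s[in_band], resid[in_band], rcond=None)
        sif = Phi_s @ c_s
        history.append(sif)
    return sif, R_est, history

sif_est, R_est, history = retrieve(L, E_pi)
err_first = np.max(np.abs(history[0] - sif_true))
err_final = np.max(np.abs(sif_est - sif_true))
```

In this idealized setting the SIF error shrinks from one iteration to the next, which mirrors the qualitative behaviour described above; a practical implementation would stop on a convergence criterion rather than after a fixed number of iterations.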

Figure 4. The variation in the average RRMSE in the SIF retrieved from the 2304 test datasets against the number of iterations.
bands that were used to estimate the weighting coefficients of the PCs of the SIF by least-squares fitting. The results, therefore, indicate that the F-SFM algorithm is quite accurate for full-spectrum SIF retrieval.

Figure 5. (a) An example of a retrieved reflectance spectrum (black dashed line) together with the true reflectance spectrum from the test dataset (solid gray line); (b) An example of a retrieved SIF spectrum (dashed black line) together with the true SIF spectrum from the test dataset (solid gray line); (c) The full spectral RRMSE of the 2304 retrieved SIF spectra in the test dataset (black) together with the mean SIF spectrum (gray). These results were derived from data with an SR of 0.3 nm and without noise. The dashed vertical lines show the central position of the three absorption bands.

Figure 6. The correlation between the true values of SIF735/SIF685 and those retrieved from the simulated test dataset (2304 samples with SR of 0.3 nm and without noise).

Figure 7. Contours of the full-spectrum RRMSE (%) in the SIF retrieved by the F-SFM algorithm using simulated test datasets with different SRs and SNRs.

Figure 8. Contours of the RRMSE (%) in the SIF at the O2-A (a-c) and O2-B (d-f) bands retrieved by the 3FLD (left), iFLD (middle) and F-SFM (right) algorithms using simulated test datasets with different SRs and SNRs.

Figure 9. The full-spectrum SIF retrieved from the diurnal field measurements made by the QE Pro on 25 April 2015. For clarity, only measurements made on the hour from 9:00 to 17:00 are plotted.

Figure 10. Comparison between the SIF retrieved by 3FLD using data obtained by the QE Pro and the SIF retrieved by F-SFM using data obtained by the QE Pro (blue points) or ASD FieldSpec4 (green triangles). The graphs shown are for the SIF at the O2-A (a) and O2-B (b) bands.

Figure 11. Comparison between retrieved ratios of SIF peak-values (SIF735/SIF685) for data measured by QE Pro and ASD FieldSpec4 spectrometers.