Adjustments to SIF Aid the Interpretation of Drought Responses at the Caatinga of Northeast Brazil

: Sun-Induced chlorophyll Fluorescence (SIF) relates directly to photosynthesis yield and stress but there are still uncertainties in its interpretation. Most of these uncertainties concern the influences of the emitting vegetation’s structure (e.g., leaf angles, leaf clumping) and biochemistry (e.g., chlorophyll content, other pigments) on the radiative transfer of fluorescent photons. The Caatinga is a large region in northeast Brazil of semiarid climate and heterogeneous vegetation, where such biochemical and structural characteristics can vary greatly even within a single hectare. With this study we aimed to characterize eleven years of SIF seasonal variation from Caatinga vegetation (2007 to 2017) and to study its responses to a major drought in 2012. Orbital SIF data from the instrument GOME-2 was used along with MODIS MAIAC EVI and NDVI. Environmental data included precipitation rate (TRMM), surface temperature (MODIS) and soil moisture (ESA CCI). To support the interpretation of SIF responses we used red and far-red SIF adjusted by the Sun’s zenith angle (SIF-SZA) and by daily Photosynthetically Active Radiation ( d SIF). Furthermore, we also adjusted SIF through two contrasting formulations using NDVI data as proxy for structure and biochemistry, based on previous leaf-level and landscape level studies: SIF-Yield and SIF-Prod. Data was tested with time-series decomposition, rank correlation, spatial correlation and Linear Mixed Models (LMM). Results show that GOME-2 SIF and adjusted SIF formulations responded consistently to the observed environmental variation and showed a marked decrease in SIF emissions in response to a 2012 drought that was generally larger than the corresponding NDVI and EVI decreases. Drought sensitivity of SIF, as inferred from LMM slopes, was correlated to land cover at different regions of the Caatinga. This is the first study to show correlation between landscape-level SIF and an emergent property of ecosystems (i.e., resilience), showcasing the value of remotely sensed fluorescence for ecological studies.


Introduction
Monitoring and measuring vegetation responses to climate has long been an area of interest for biologists and ecologists and it merits attention considering the ongoing global climate change and the closely related interplay between carbon flux, climate and the biosphere [1]. Among the plethora of methods that can be used to assess vegetation responses to climate, remote sensing offers great advantage over field techniques in regard to the size of the population that can be sampled in a given time interval since it allows the sampling of large areas, including remote places and also allows for frequent revisits that would be impractical for field campaigns considering logistics and costs.
Spectral vegetation indices (SVIs) such as NDVI, the Normalized Difference Vegetation Index [19] and EVI, the Enhanced Vegetation Index [20,21] are capable of providing information about abiotic stress effects by observation of leaf pigment composition and structural changes to vegetation and, consequently, about the fraction of Absorbed Photosynthetically Active Radiation (f APAR) [22]. However, chlorophyll fluorescence is more sensitive to environmental changes and can show the presence of stress before it has caused alterations to the plants that would be detectable by other remotely sensed vegetation indices [23,24].
Nevertheless, recent works have shown that over 70% of SIF variability can be attributed to phenology-related structural (e.g., leaf shedding, leaf aging) and biochemical (e.g., chlorophyll synthesis and degradation) changes that can be sampled through reflectance and SVIs [11,25]. This is understandable considering the above-mentioned influences of leaf and plant community structure and biochemistry on the radiative transfer of SIF but, the overall effect of such influences is uncertain, particularly when interpreting results from heterogeneous vegetation [15,16,25]. Therefore, we have chosen to test photosynthetic responses of vegetation using SIF data and also through adjustments made to SIF based on work investigating the effects of vegetation structure and biochemistry on chlorophyll fluorescence [14,15,26] as well as on previous observations concerning SIF data characteristics and radiative transfer [27][28][29].
Droughts are a major source of stress for vegetation and agriculture alike and can have devastating consequences for ecosystems and for society. While the potential of SIF for monitoring the onset of drought and its effects on plant populations has already been tested by previous works [30][31][32][33], some including in their methods comparisons of SIF with proxies of structure and biochemistry like f APAR [30] and SVIs [31,32], none has attempted the combination of these data, as we propose, or has included in their analysis measures of SIF R and the ratio between red and far-red SIF (SIF R/FR ). The Caatinga region located in northeast Brazil ( Figure 1) is a suitable biogeographical area to test the usability of SIF as a proxy for photosynthesis seasonality and abiotic stress due to its heterogeneous vegetation, its pronounced water-dependent seasonality, and a recent extreme drought reported there in the year 2012 [34].
Accordingly, our specific objectives were to: • Describe seasonal and spatial patterns of chlorophyll fluorescence dynamics as estimated by the GOME-2 orbital instrument in a eleven-year period from February 2007 until December 2017; • Model Sun-induced fluorescence (SIF) and spectrally adjusted SIF, as functions of environmental parameters, testing their responses to climate in the period; • Compare the responses of vegetation from the different ecoregions of the Caatinga, as defined by Velloso et al. [35], to the observed environmental variation in the period.
Our main hypothesis was that SIF is a good proxy to photosynthesis seasonality and droughtresponses, even when measured from heterogeneous vegetation and, especially when combined with different spectral data like vegetation indices.

Study Area
The Caatinga is a semiarid region of northeast Brazil with 862,640 km 2 (larger than the sum of the areas of continental France and Italy) harbouring over 28 million inhabitants [36]. The region is bounded by the humid Atlantic forests to the east and by the Cerrado savannas to the west and south. In this extensive region the vegetation is frequently armed with thorns and also commonly deciduous, it includes a range of woody and succulent species and a few ephemeral herbs and grasses distributed in a complex mosaic of physiognomies [36][37][38][39]. Most of its area is not subject to frequent fires, common to savanna vegetation and, the local physiognomies include mostly sparse and dense shrublands (scrublands), savannas, Seasonally Dry Tropical Forests (STDF), some areas suffering desertification and also enclaves of humid tropical forests usually isolated in landscape features favoured by orographic precipitation [36,40]. The region has been scantly studied [40,41] and its relatively high levels of endemism [36,39] point to an urgent need for further study and protection. The Caatinga has also been identified as a region of extreme fluctuations in its productivity dynamics and as an area of uncertainty for GPP observations and modeling efforts [42,43]. Despite its diversity and considerable levels of endemism, this region has often been neglected by scientific research [40,41].
In this study, we subset the region into smaller ecoregions, as defined by [35], and use these as our experimental units ( Figure 1). These ecoregions were defined on the basis of geomorphologic, climatologic and ecologic characteristics [35] which will likely influence local seasonality and abiotic stress occurrence [38,44]. Apart from translating the names of these ecoregions from the Portuguese originals, we have subdivided the large Southern Sertaneja Depression into Major and Minor since we have observed marked differences on climate between these sub-regions a posteriori. The region we call the Minor Southern Sertaneja Depression (Figure 1h), is situated near the coast and locked from the rest of the Sertaneja Depression (a geomorphological feature) by a relatively drier ecoregion called Raso da Catarina ( Figure 1g) and an elevated ecoregion called the Borborema Plateau (Figure 1d).

Land Cover Classification Data
To characterize Caatinga vegetation, we used ESA GlobCover 2009 [45], which has a gridded resolution of approximately 0.0028 • by 0.0028 • . While the year 2009 is within the studied period, we assume that the changes in land cover that have taken place in the remainder of the period are negligible for our limited use of this dataset. The classification system used in GlobCover 2009 include different vegetation types within the same classes with relative dominance ranges (50-70% or 20-50%), phenological characteristics (evergreen and semi-deciduous or deciduous) and flooding regime, when applicable (permanently flooded or temporarily flooded). Correspondence to FAO Land Cover Classification System (LCCS) is supplied by authors at the data distribution website [45] but is not necessary for the interpretation of our results.

Environmental Indicators
To cover the basic temperature and water-availability aspects of environmental characterization we used surface temperature, precipitation rate and soil moisture data. While the relative influence of the different environmental variables involved is species-specific and because the effects of micro-climate can not be adequately incorporated in our experimental design, we assumed that surface temperature, precipitation and soil moisture together are the main drivers of the observed plant responses to environmental conditions. Daytime land surface temperature from the Moderate Resolution Imaging Spectroradiometer (MODIS) instrument (MOD11C3, version 6), at the monthly temporal resolution was used to represent temperature at the observed communities level. This data was used in a 0.05 • by 0.05 • grid and at monthly temporal resolution [46].
Monthly precipitation rate from the Tropical Rainfall Monitoring Mission (TRMM-TMPA 3B43, version 7), which combines active and passive measurements [47], was employed as a measure of precipitation at the observed communities' level. This data is originally gridded at 0.25 • by 0.25 • and monthly temporal resolution. We regridded TMPA data to a 0.05 • by 0.05 • grid using the Nearest Neighbor method. This method, also known as Pixel Replication, does not effectively interpolates data but only regrids it into a finer or coarser grid by repeating the original information into corresponding pixels without changing its values or location and thus, maintaining the statistical structure of the original gridded data [48,49].
Since precipitation and water availability can be uncorrelated due to various edaphic and ecological factors, we have chosen to also include a measure of soil moisture in the analysis. For that end we used ESA's Soil Moisture CCI product [50], which combines both active and passive soil moisture measurements at daily frequency and in a 0.05 • by 0.05 • grid. We, therefore, resampled daily soil moisture data to monthly means for compatibility with other data sources used in this study.
All environment-related data were obtained for the period from February 2007 to December 2017.

MODIS-MAIAC Reflectance and Spectral Vegetation Indices
Surface reflectance and spectral vegetation indices used here were obtained from MODIS MCD19A1-C6 product at 1 km spatial resolution [51], atmospherically corrected by the Multi-Angle Implementation of Atmospheric Correction (MAIAC) algorithm [52], normalized for Sun-sensor geometry effects using the bidirectional reflectance distribution function (BRDF) and the Ross-Thick Li-Sparse (RTLS) model parameters provided by Lyapustin et al. [52], and aggregated into monthly composites (for detailed information on this implementation please see Dalagnol et al. [53]). Spectral vegetation indices were regridded to 0.05 • by 0.05 • from their original resolution through the Nearest Neighbor method. The SVIs we have employed in this study were EVI [20,21] and NDVI [19]. All MODIS-MAIAC data were obtained for the period from February 2007 to December 2017.

Sun-Induced Fluorescence
Among the available data sources, we have chosen to use SIF derived from GOME-2 instruments data due to the combination of several advantageous characteristics, namely: adequate spatial and temporal continuity, favorable measurement time (more in this same paragraph) and, availability of SIF R data [27,28]. All other SIF data sources have had shorter sampling extent: OCO-2 has data for the period starting on December 2014 until the present [54], TROPOMI's data starts on November 2017 [55] and TanSat on February 2017 [56]. The instruments OCO-2, TROPOMI and TanSat sample SIF at 13:30 local solar-time [54][55][56] and GOSAT samples at 13:00 solar-time [57] and therefore, their data is likely biased by mid-day peaks on Vapor Pressure Deficit (VPD), further confounding its interpretation. SCIAMACHY's SIF data is also not ideal for our aims because of its circular footprint and spatial discontinuity [57]. For a more thorough comparison between the different instruments currently capable of measuring SIF from orbit, see Mohammed et al. [3].
The GOME-2 SIF data we used were downloaded from NASA's AVDC (Aurora Data Validation Center) at version 2.8. This data was produced with methods described in Joiner et al. [27] and Joiner et al. [28]. The period we chose to study comprises most of GOME-2 SIF available data and it spans nearly eleven years, from February 2007 to December 2017. This data was converted from in its original gridded spatial resolution of 0.5 • by 0.5 • to a finer 0.05 • by 0.05 • to match the used MODIS products using the Nearest Neighbor method.
There are two GOME-2 instruments in orbit on board satellites MetOP-A and MetOP-B from Eumetsat. Both are in Sun-synchronous orbits, with overpass time at approximately 9:30 solar-time but, although MetOP-A's GOME-2 data is available since January 2007 until now, MetOP-B data covers a shorter period, starting from March 2013. Unfortunately, the GOME-2 sensor on MetOP-A has suffered a continuous degradation since launch and an artificial negative trend of approximately −1.1% per year has been consequently introduced into MetOP-A SIF data [28,58]. While the degradation effects on source data were corrected [58], Joiner et al. [28] argue that this correction is likely imperfect due to the specifics of SIF signal disentangling. Accordingly, we have chosen to use MetOP-A data from February 2007 until February 2013, switching to MetOP-B's data for the rest of the period used in this work (i.e., from March 2013 until December 2017) since the GOME-2 instrument in MetOP-B has not presented a degradation problem [28]. Additionally, to remove this degradation-related negative trend, data analysis in this study was performed using data detrended through time-series decomposition with Locally Estimated Scatterplot Smoothing (LOESS) when adequate.
Considering the previously mentioned dependence of SIF on the emitting vegetation's biochemistry and structure [15,25] we have chosen to include alternative formulations of SIF aiming to adjust the sampled signal to better account for these effects.
Firstly, beyond standard red and far-red GOME-2 SIF, we also included in our comparison SIF adjusted to account for variations in the Sun's zenith angle (SZA), already supplied with GOME-2 level 3 dataset. This SZA normalization (SIF-SZA) is obtained by calculating the quotient of the measured absolute SIF by the cosine of the Sun's zenith angle [27,28], which has been proven to have a normalizing effect proportional to the seasonal variation in PAR incidence [29].
Secondly, we have also employed SIF-Yield. This term has been used in slightly different ways in the literature (e.g., Miao et al. [59], Ryu et al. [11] and Verma et al. [60]) but here, we have calculated it in a similar way as to the "real fluorescence" of Gitelson et al. [26]. While in that study chlorophyll fluorescence was divided by reflectance and transmitance at fluorescence's wavelengths (representing the structural and biochemical influences on its signal) we will define SIF-Yield as the quotient of SIF by corresponding NDVI, since NDVI is a known proxy of chlorophyll content and f APAR [61].
Thirdly, we have defined a tentative productivity-related modification to SIF (SIF-Prod) following from the well-known scheme of Gross Primary Productivity (GPP) from Monteith [62], Monteith and Moss [63] described in the following formula: where f APAR is the fraction of absorbed photosynthetically active radiation (PAR), PAR is the incident PAR and LUE (Light Use Efficiency) is a term relating to photosynthesis physiology that has been defined in different ways in scientific literature [64]. This classic equation has been frequently used to estimate GPP through the LUE-paradigm [65] and SIF has been proposed as a suitable proxy for LUE [2,10], although studies testing this proposition have reached confounding results [11,59]. However, preliminary results from another study testing different biochemistry-and structure-related adjustments to SIF has found that the product of SIF by a proxy of f APAR showed the highest observed correlation to FLUXNET GPP data in comparison to various tested formulations [66]. The findings of that study suggest that SIF is not proportional to LUE by itself but rather to the instantaneous light-use efficiency under a given light incidence level (i.e., PAR * LUE). Thus, we decided to used it here in the same formulation as defined by Monteith's seminal work, assuming that SIF encapsulates the product of LUE by PAR, and using NDVI as a proxy to f APAR. Beyond SIF FR and the above described SIF-SZA, SIF-Yield and SIF-Prod formulations we also included SIF at the red wavelength peak, SIF R (supplied with GOME-2 data), the ratio between SIF at both wavelengths, SIF R/FR (calculated from extracted GOME-2 data) and finally, SIF daily average based on a clear-sky proxy of PAR, dSIF FR (supplied with GOME-2 data). All tested SIF-related response variables are listed in Table 1. Due to the lower spatial resolution of its data and the lower signal-to-noise ratio of its data, SIF R was not used for testing the spectral adjustments discussed above (i.e., SIF-SZA, SIF-Yield and SIF-Prod) beyond preliminary tests that produced similar effects as those observed with SIF FR (results not shown). Table 1. List of SIF response variables used in this study. For detailed description of these variables, their source and calculation, please refer to the Introduction section and to the Sun-Induced Fluorescence sub-section of Materials and Methods.

Variable Label Description Data Source
SIF FR Sun-induced chlorophyll fluorescence at the far-red wavelength peak. GOME-2 MetOP-A + MetOP-B SIF R Sun-induced chlorophyll fluorescence at the red wavelength peak.

GOME-2 MetOP-A SIF R/FR
The ratio between SIF at both wavelength peaks.

GOME-2 MetOP-A dSIF FR
Daily average of SIF FR based on a clear sky PAR proxy.
The quotient of SIF FR by the cosine of the Sun's zenith angle (SZA).

GOME-2 MetOP-A + MetOP-B and MODIS SIF FR -Yield
The quotient of SIF FR by NDVI-analog to "real fluorescence".

Statistics and Software
All available data discussed above were used in tests for the period from February 2007 to December 2017, with the exception of GOME-2 SIF R . While red SIF was available for the period from Considering that SIF R is also produced from MetOP-A data and therefore, it also subject to the above mentioned degradation problem, we decided to include a period similar to that of MetOP-A SIF FR data to maintain correspondence of SIF at both wavelengths. Specific tested periods will also be declared at the Results section to aid interpretation. The overall study design and data application is illustrated in Figure 2.
To evaluate the seasonality and the long-term tendencies of all data we carried out time-series decomposition with Locally Estimated Scatterplot Smoothing (LOESS) [67,68]. This statistical technique also known as STL decomposition, separates the data into three components: a seasonal component, a trend component and the remainder which can be generally interpreted as random variation or white noise [68].
To test the coherence between SIF and SVIs, we carried out several ecoregion-specific correlation analysis. These were made using data detrended and deseasonalized through time-series decomposition [67,68]. Hence, all correlation tests have been applied to the remainders of time-series. Furthermore, we chose to use Kendall's rank correlation test since it doesn't assume parametric data distribution and is more conservative than the popular Pearson's Product Moment Correlation Coefficient [69].
To study the effects of environmental variables over SIF and other predictors we employed Linear Mixed Models (LMM), fitting one holistic model to all ecoregions (per response variable) and a series of simpler ecoregion-specific models, parameterized according to our data and to model quality limitations inferred from model diagnostics. The various response variables used in model selection are presented in previous sections and listed in Table 1. The selected holistic model was defined with each given response variable as a Gaussian function of "Surface Temperature", "Soil Moisture" (and interaction between these continuous variables) and "Ecoregion". Random factors were "Month" individually and "Month" nested into "Year" to ensure adequate error structure [70,71]. Ecoregion-specific models were parameterized differently due to limitations imposed by a smaller number of observations than were available for holistic models. After model selection, the best model applicable to each ecoregion was coded with the response variable as a Gaussian function of "Surface Temperature", "Soil Moisture" (and their interaction), with "Month" and "Year" as independent random factors. All data used in LMMs were rescalled for compatibility and precipitation was not included in our models after initial data exploration showed Variance Inflation Factor (VIF) and correlation favouring Soil Moisture [72]. Model diagnostics was performed according to Zuur and Ieno [71] and Harrison et al. [73], and model selection carried out by examining diagnostic plots and Bayesian Information Criterion (BIC) analysis [71]. Holm-Bonferroni post-hoc tests were also applied to adjust LMM outputs. Root Mean Square Error (RMSE) and R 2 (marginal and conditional) were calculated for all LMMs [74].  Aiming to aid the visualization of our results in general, but particularly those concerning the previously identified 2012 drought, we conducted diverse spatial analysis of SIF, SVIs and environmental indicators, calculating spatial averages and spatial correlation between the variables of interest. For this analysis we subset our data into three periods identified in preliminary time-series analysis: 2007 to 2011, 2012 and 2013 to 2017. Beyond the spatial analysis, we also used this subset scheme to present and discuss other results relating to the 2012 drought.

Caatinga Ecoregion Land Cover Analysis
Data from the ESA GlobCover 2009 product shows that the following land cover classes are the most representative among all ecoregions of the Caatinga: shrubland (averaging 27.66% relative cover among all ecoregions), mosaic of vegetation and cropland (averaging 23.93%), mosaic of cropland and vegetation (averaging 20.4%) and rainfed crops (averaging 14%). Human influence on the region can be seen in the high proportion of the crop mosaic classes (either with crop or vegetation dominance) and Rain-fed Crops observed, constituting the majority at all ecoregions but Campo Maior and Araripe-Ibiapaba Complex (Figure 3). Barren and Urban areas were relatively rare and covered less than 0.5% of any ecoregion.
Tables of relative cover and complete nomenclature of observed classes are included in the appendices for precise reference (Tables A1 and A2). within parenthesis indicate cover density within that class ("CO" for Closed to Open or cover density above 15%, "C" for Closed or cover density above 40% and, "O" for Open or cover density below 15%), Flooding regime ("RF" for Regularly Flooded) and phenology ("EvSd" for Evergreen and Semi-deciduous mix and, "De" for Deciduous). For Mosaic classes, the first class named is dominant occupying 50-70% proportion and the second class named occupies a relative proportion between 20-50% of the land thus classified. "For/Shr" stands for the mixed class Forest and Shrubland, denoting vegetation with height below 5 m. Letters on the horizontal axis are in reference to Figure 1 and the name of each ecoregion is written to the right of its respective bar.

SIF and Climate: Seasonality and Trends
The mean seasonal components of the sampled time-series (2007 to 2017) show differences on vegetation responses between Caatinga ecoregions but these differences seem proportional to each ecoregion's particular climate's seasonality ( Figure 4). The variation on timing and intensity of the rainy season shows that most northwestern areas have their dry season from July to December, while at the eastern areas they go from October to March. A clear association between SIF, temperature and precipitation is also visible (Figure 4) but, it is interesting to notice that SIF FR and SIF R respond differently to the seasonal variation in precipitation and temperature. Precipitation and soil moisture presented almost synchronous behaviour with a lag between them that varies among the different areas (results not shown) and, therefore, we did not include soil moisture in our graphs to avoid visual clutter although it was used in subsequent analysis. Results of our correlation tests and time series decomposition analysis demonstrate the synchronicity between fluorescence measured by the GOME-2 instrument and the structural and biochemical phenological processes captured by MODIS MAIAC EVI and NDVI (Appendix A Table A4 and Appendix A Figure A1). Far-red SIF was more correlated to the tested SVIs in the period than red SIF at all ecoregions of the Caatinga (SIF FR to SVIs mean τ = 0.57 and SIF R to SVIs mean τ = 0.42). Monthly values of NDVI and EVI were similarly correlated to SIF from both red and far-red wavelengths (Appendix A Table A4). Observed correlations between SIFs and SVIs were generally high but varied among ecoregions and results from Campo Maior, Minor Southern Sertaneja Depression and Chapada Diamantina were considerably lower than those from other ecoregions (Appendix A  Table A4 regions a, h and i).
The trend components show the long-term variability of climate and its effects on SIF FR and SIF R . The majority of Caatinga ecoregions presented a considerable trend shift in the year 2012 when precipitation decreased and temperature increased at most areas ( Figure 5). SIF at both wavelength peaks show a correspondent decrease that is proportional to the climate variables change, particularly in SIF FR . Red SIF is only available until December 2013 but, it seems to show a relatively weaker response to climate variation than SIF FR throughout the sampled period ( Figure 5, gray dot-dashed lines). Furthermore, most sites show a period of decreased SIF FR emission that continues after 2012, with the exception of Campo Maior-the northwesternmost area, bordering the savannic Cerrado (Figure 1a). Eastern regions like the Minor Southern Sertaneja Depression (Figures 1h and 5h), Borborema Plateau (Figures 1d and 5d) and Raso da Catarina (Figures 1g and 5g) show another noticeable drop in SIF FR output on late 2016. SIF trend responses at all areas is visually proportional to concurrent environmental variation in temperature (simultaneously) and precipitation (with apparent lags).

SIF and Climate: Linear Mixed Model Analysis
The largest fixed continuous effect at all Linear Mixed Models (LMMs) was that of "Soil Moisture" ( Table 2, Estimate and Chisquare values). The categorical factor "Ecoregion" was the most significant variable for all unmodified SIF variables as well as for the "Prod" and "SZA" modifications but, the "Yield" modification reduced its relative importance and increased that of "Soil Moisture" (as encapsulated in Chisquare values). While SIF FR and SIF R models responded in a similar manner, concerning the proportional effects of each factor, SIF R/FR was not significantly affected by any factor ( Table 2). Table 2. Linear Mixed Models' Fixed Effects-"SE" is the Standard Error of each fixed effect estimate, "Chisq" is the chisquare value from each Wald type II test, "df" is the degree of freedom of each effect in its Wald type II test, "p" is the probability value of each Wald test. SIF FR is Sun-Induced Fluorescence at the far-red peak; dSIF FR is the daily average of SIF based on a clear sky PAR proxy; SIF FR -Prod is the product of SIF FR by NDVI; SIF FR -SZA is the quotient of SIF FR by the cosine of the Sun's Zenith Angle (SZA); SIF FR -Yield is the quotient of SIF FR by NDVI; For all models above: n = 1179; Month = 12; Month:Year = 131. SIF R is Sun-Induced Fluorescence at the red peak; SIF R/FR is the ratio between SIF R and SIF FR . For the last two models: n = 747; Month:Year = 83. Significance levels are marked according to the following legend: *** for p < 0.001 and * for p < 0.05.

Model
Fixed The random factor Month explained a similar amount of variance as was ascribed to residuals in all models except those involving SIF R , capturing the seasonality effect as expected (Appendix A Table A3). The nested Month:Year random factor was not so significant to the models but was maintained to constrain the error structure as it was shown to improve model quality during model selection (results not shown).
Model analysis showed that SIF FR -Prod had the best fit among the tested variables (Table 3, R 2 values) however, SIF FR -SZA, dSIF FR and SIF FR were very close by comparison. Nevertheless, SIF FR -SZA was the tested model of highest quality as can be observed by its low RMSE, RMEL and BIC results ( Table 3). The model using SIF FR -Yield as its response variable performed worst than unmodified variables, including SIF R . The model using SIF R had worse fit than most other models and SIF R/FR was not significantly affected by any fixed or random variable and the model employing it as the response variable had the worst results in all model quality and goodness-of-fit metrics (Tables 2 and 3).

Table 3. Linear Mixed Models' Goodness of Fit-R 2 is the coefficient of determination; RMSE is Root Mean Square Error, REML is Restricted Maximum Likelihood and BIC is Bayesian Information
Criterion. SIF FR is Sun-Induced Fluorescence at the far-red peak; dSIF FR is the daily average of SIF based on a clear sky PAR proxy; SIF FR -Prod is the product of SIF FR by NDVI; SIF FR SZA is the quotient of SIF FR by the cosine of the Sun's Zenith Angle (SZA); SIF FR Yield is the quotient of SIF FR by NDVI; SIF R is Sun-Induced Fluorescence at the red peak; SIF R/FR is the ratio between SIF R and SIF FR . For all models: d f = 15. For all SIF FR models: n = 1179. For the two models involving SIF R : n = 747. Since the complete results of all these models would be too extensive to show here (6 models for each of the 9 ecoregions, totaling 54 models). Moreover, since many of the tested response variables show similar results as to what can be observed for the holistic models presented above, we chose to show only the slopes of tested fixed effects in graphical format for the models using SIF FR and SIF FR -Yield (Figures 6).

Metric
For SIF FR ecoregion-specific LMMs, soil moisture was the most significant fixed effect. Results suggest that vegetation at the Northern Sertaneja Depression (Figure 6a) and eastern ecoregions Borborema Plateau (Figure 6a), Minor Southern Sertaneja Depression (Figure 6h) and Raso da Catarina (Figure 6g) were the most sensitive to environmental influences. Temperature had significant negative effects on SIF FR measured from the vegetation of most ecoregions.
Ecoregion-specific models using SIF FR -Yield as their response variable showed a decrease in the significance of temperature and, an increase in temperature and soil moisture interaction ("T : SM") fixed effect slopes (Figure 6), in relation to models using SIF FR .
Comparing the land cover classification results ( Figure 3) and the correspondent ecoregion's LMM slopes ( Figure 6) we noticed that sensitivity to environmental variables increased in areas where land cover had smaller percentages of forest formations. To test this, we added relative cover percentage values of the different forest physiognomies (closed, open, deciduous and evergreen) and performed Kendall rank correlation tests between total forest cover percentage and the correspondent fixed effect slopes of ecoregion-specific SIF FR and SIF FR -Yield models. These tests showed that drought sensitivity (fixed effect slope declivity) was negatively correlated to forest physiognomies cover percentages (Table 4). Table 4. Kendall rank correlation tests between the GlobCover 2009 forest physiognomies cover and Linear Mixed Model fixed effect slopes, per ecoregion. τ (tau) is the correlation rank and p is the probability value of Kendall's test. SIF FR is Sun-induced Fluorescence at the far-red wavelength peak and SIF FR -Yield is the quotient of Sun-induced Fluorescence at the far-red wavelength peak by NDVI; n = 9 per fixed effect. Significance levels are marked according to the following legend: ** for p < 0.01, * for p < 0.05 and ' for p < 0.1.

SIF and Climate: The 2012 Drought
The results of our trend analysis, showing a major drought at the year of 2012 ( Figure 5), can be better understood in a spatial analysis subsetting our data into three periods: a more favorable period from 2007 to 2011, a drought period in the year of 2012 and an intermittent-droughts period from 2013 to 2017. The map of average precipitation shows a clear difference among these three periods and give a spatial perspective of the 2012 drought in the studied area (Figure 7). Spatial averages maps of SIF FR , NDVI and their correlation, illustrate the effects of that environmental variation on vegetation biochemistry, structure and photosynthesis (Figures 8-10).    The changes in studied variables observed in the year of 2012, in relation to the period from 2007 to 2011, were large and SIF FR dropped by over 40% throughout most sites (Figure 11) while NDVI and EVI averages dropped by over 22%. Even SIF R and SIF FR Yield decreased by over 25% while Temperature increased by nearly 10% and Precipitation Rate and Soil Moisture decreased by 54% and 20% respectively in 2012, in relation to the previous year's average. The ratio between red and far-red SIF increased by over 16% in the year of 2012, with large variation between the different ecoregions of the Caatinga (Figure 11 SIF R/FR ).

SIF Responses to Environmental Variation
Fluorescence sampled from Caatinga vegetation seems to vary in similar amplitude as the environmental indicators at the different ecoregions (Figures 4 and 5). The sensitivity of GOME-2 SIF to environmental variation can also be established by observing the results of our Linear Mixed Models (Table 2) and by comparing the relative change observed at the 2012 drought among the various variables ( Figure 11) and between NDVI, SIF FR and SIF FR -Yield at each ecoregion ( Figure 12). The variables SIF FR -Prod, dSIF FR , SIF FR -SZA and standard SIF FR presented a median drop of more than 40% during the year of 2012 in relation to the period from 2007 to 2011 ( Figure 11). While these variables include the phenological influences of biochemistry and structure on SIF radiative transfer, SIF FR -Yield has had most of these influences removed but it also dropped by a median value of over 25% in 2012. Considering that both MODIS EVI and NDVI corrected through the MAIAC algorithm have a finer spatial resolution than GOME-2 SIF, it is interesting to see that SIF FR -Yield mean value still dropped by a higher margin than that of the sampled SVIs ( Figure 11). Results from our study show that the interpretation of chlorophyll fluorescence emissions is complex but very informative. As this study demonstrates and as it has been argued in recent reviews [3,11], the main factors to consider for the interpretation of SIF seem to be the resolution and temporal aspect of measurements and the characteristics of the vegetation being measured. Due to the nature of our study and our choice of GOME-2 SIF data, temporal influences in our results could not be controlled and thus, could not be directly tested. Phenology however, consisting of temporal variation on biochemical and structural characteristics of vegetation, has an apparently strong influence on SIF measured at the Caatinga as we will now discuss.

SIF Seasonality and Phenological Processes
In much of the Caatinga, SIF yearly minimum values were often about a quarter of the yearly maximum values (results not shown), characterizing a large seasonal variation in SIF (Figure 4). For most plants, physiological responses to heat and drought stress generally consist in adaptive measures like: limiting water loss through the closing of stomata and the consequent lowering of the photochemical quenching (qP) and fluorescence emissions [44]; Moreover, dissipating excess-energy through the Non-Photochemical Quenching (NPQ) pathways, which lowers fluorescence emissions, since these processes "compete" for the same energy [4,7,8]. Therefore, the more energy is dissipated through NPQ the less will be left for qP and its associated fluorescence emission. Nonetheless, since we are observing seasonal phenomena, not all variation can be attributed to physiology.
A study testing the effects of the exposure of plants to the combined stresses of drought and high-temperature at the leaf-level reported a five-fold decrease in fluorescence emissions that couldn't be attributed only to photoadaptive processes such as qP and NPQ but, was instead found to be largely the result of chlorophyll degradation [88]. Similarly, the marked decrease in SIF emissions observed concomitantly with the ecoregion-specific dry seasons ( Figure 4) suggests a combined effect of seasonal photosynthesis adaptation (e.g., non-transient lowering of qP and the increasing of NPQ) and phenological processes like chlorophyll degradation and leaf-shedding. These phenological processes are known to play a major role in the Caatinga's vegetation dynamics [39] and it is reasonable to expect that they must account for much of the observed SIF seasonal variation. This is further supported by the sampled SVI seasonality which adheres closely to SIF seasonality (results not shown) and correlates well to SIF data at most ecoregions ( Figure 10 and Appendix A Table A4). Nevertheless, even at the tested monthly scale, these processes are not the exclusive source of SIF output variation as is proven by SIF FR -Yield LMM results (Table 2). While the influences of phenological processes (e.g., leaf flushing, leaf maturation, chlorophyll degradation and leaf shedding) must be considered carefully, our results show that they do not invalidate the interpretation of chlorophyll fluorescence seasonality and stress responses.
Furthermore, the reversal in SIF FR and SIF R emission levels observed at the onset of the dry season ( Figure 4) is consistent with the expected effects of leaf shedding and chlorophyll degradation on the reabsoption of SIF R [14,15] but, this effect is shown here with fluorescence emitted by a large sample of plants instead of that emitted by a single leaf as described by Buschmann [13]. To our knowledge, this is the only study showing the relationship between the SIF R/FR ratio and chlorophyll content at the community level in the seasonal scale, as it is demonstrated by the seasonality of SIF FR and SIF R emission levels (Figure 4), and also demonstrated at an anomalous drought by our trend analysis ( Figure 5, year 2012). However, LMM results show that SIF R/FR variation is mostly driven by the phenological cycle as the effect of seasonality was constrained in our models though random factors and SIF R/FR was not significantly affected by any fixed factor (Table 2).

Apparent Influence of Vegetation Differences in SIF Responses
Land cover analysis through ESA's GlobCover 2009 product agrees with the previously described dominance of shrublands and open forest physiognomies at the Caatinga area, as described by major biogeographic studies [37,89]. The prevalence of crop-related classes (mosaic classes, Figure 3 and Appendix A Tables A1 and A2) is supported by the previously reported presence of subsistence agriculture with little or no irrigation at the Caatinga [36,40]. This simple land cover analysis supports our assumptions that Caatinga vegetation is a mosaic of diverse physiognomies, including widespread human occupation and land cover by subsistence agriculture, and shows that the ecoregions identified within the Caatinga [35] are different concerning not only geomorphology, climate and soil but also in their vegetation (Figure 3 and Appendix A Tables A1 and A2).
Our results suggest that vegetation types and their particular characteristics significantly influence the observed seasonality of SIF and its apparent responses to abiotic stress (Table 4). We suggest that such influences might help to explain why some of our results seem counter-intuitive, or discrepant. For example, the seasonality of SIF FR and SIF R at the Ibiapaba-Araripe Complex ecoregion shows a discrepant pattern with a very small seasonal variation in SIF R that doesn't seem comparable to that of SIF FR at that same region (Figure 4c). While we could speculate on differences in local vegetation composition (e.g., more presence of cacti, differences in deciduousness between vegetation strata) between the local types and those at other regions, we do not know the reasons for this particular discrepancy.
Furthermore, we have also observed considerable differences in the correlation of SIF and SVIs between different areas of the Caatinga (Figure 10 and Appendix A Table A4). Considering the nature of these data we suggest that temporal differences in responses to environmental variation must influence the observed correlation discrepancies since, while SIF can respond near instantly to environmental constraints SVIs will depend on slower biosynthesis and conversion of pigments and on similarly slow structural changes ( Figure 11). Daily cycles of environmental variation and seasonal change in the ranges of such daily environmental variation have been already been shown to influence chlorophyll fluorescence output in temperate vegetation [90] as well as in the tropical semiarid vegetation of Caatinga [91]. Therefore, the time of day when measurements are made is greatly significant for the interpretation of SIF but not for the interpretation of vegetation indices like EVI and NDVI. Thus, we suggest that daily maxima of SIF should be more directly comparable to vegetation indices in the context of seasonal studies. Nevertheless, considering the spatial pattern observed in the correlation between SIF and NDVI ( Figure 10) and how that matches the long term ecoregion-specific correlation discrepancies (Appendix A Table A4), we believe it is likely that vegetation structure and composition are also influencing these results.
Despite such uncertainties, the reliability of satellite measurements of SIF is further illustrated by the coherence of LMM slopes and land cover classification data. As our results have shown, the higher the percentage of the ecoregion's vegetation was classified as forest, the weaker was that ecoregion's response to tested environmental variables (Table 4). This is reasonable considering that the environmental factors modelled in LMMs were soil moisture and surface temperature (related to variations in water availability and vapour pressure deficit effects on vegetation) and that root depths in forest physiognomies are greater and also that moisture is conserved in the system due its structural characteristics [44]. Thus, denser and taller vegetation has greater resistance to drought effects (i.e., weaker response to changes in soil moisture at the measured depth) if compared to shorter physiognomies such as shrublands and savannas: this is an emergent property of forest ecosystems and it is closely related to the concept of ecosystem resilience [92,93]. Furthermore, the observed correlations between GlobCover physiognomies and response to environmental variables were greater with SIF FR than with SIF FR -Yield, in agreement with our assumptions regarding the expected effects of normalizing SIF by a structure and biochemistry vegetation proxy like NDVI. This is, to our knowledge, the first study to show results substantiating an emergent ecosystem property through remotely sensed measurements of chlorophyll fluorescence.
Considering the consistency of SIF responses with observed climate, we believe that our study did not suffer from the limitation of using land cover data from a single year in reference to SIF and other reflectance data from an eleven year period. However, this limitation and the mixed representativeness of GOME-2's large pixels, preclude a more detailed discussion of SIF results in relation to the emitting vegetation. More studies using higher-definition SIF data (i.e., with other instruments like OCO-2, TanSat, TROPOMI, OCO-3 or the upcoming FLEX) accompanied by more precise land cover data, should be employed to advance knowledge of vegetation ecophysiological responses.

Linear Mixed Models and Our Adjustments to SIF
The LMM model using SIF FR -SZA, the normalization of SIF FR values by the cosine of the Solar Zenith Angle (SZA), effectively improved SIF's relationship to environmental data (Table 3, slightly higher marginal R 2 and considerably lower RMSE, REML and BIC). These results contrast with conclusions from previous studies suggesting that SZA normalization could amplify noise if applied to measurements taken from lower intensity emissions (as observed at the Caatinga) and influence perceived sun/shade effects [29]. Furthermore, it is interesting to notice that our study region is under direct influence from the South Atlantic Anomaly (SAA) and it is, therefore, inherently noisy and perhaps a "worst-case scenario" for SZA normalization (see PAR normalization in Joiner et al. [28]).
The SIF FR -Prod adjustment, based on the classic productivity equation of Monteith and Moss [62,63], already shown in a preliminary study to increase the correlation between GOME-2 SIF and FLUXNET GPP across diverse vegetation types [66], yielded significant improvements on LMM goodness-of-fit in relation to standard SIF (Table 3). However, due to the substantial increase it caused in the effect size of the Ecoregion factor on tested LMMs (Table 2) and, considering that this effect encapsulates part of the structural variability between the different ecoregions, we conjecture that SIF FR -Prod may be "double-counting" the influence of emitting vegetation as it is calculated by multiplying SIF by NDVI and both variables are strongly influenced by vegetation structure and biochemistry. This is also suggested by the fact that although the SIF FR -Prod LMM had the highest coefficients of determination, it had relatively worse quality than what was observed with the SIF FR -SZA LMM (Table 3, RMSE, REML and BIC are higher for SIF FR -Prod than for the SIF FR -SZA LMM).
The spectral adjustment here called SIF FR -Yield, consisting on dividing the observed GOME-2 SIF by corresponding NDVI, had the largest effect on SIF among the tested modifications (Tables 2 and 3). The observed decrease in goodness-of-fit of the SIF FR -Yield LMM in relation to models using SIF FR , dSIF FR , SIF FR -SZA and SIF FR -Prod is in agreement with the hypothesis that SIF FR -Yield normalizes SIF data according to biochemistry and structure as sampled by NDVI, removing much of these influences from SIF data in the proportion to which NDVI is able to capture them. Because in our LMMs the Ecoregion factor encapsulates all unmodelled differences between the sampled vegetation at the different ecoregions, the drop in relative effect of that factor in relation to environmental factors Temperature and Soil Moisture in Yield-normalized models ( Table 2) shows that much of these unmodelled differences concern biochemical and structural influences on GOME-2 SIF data. Certainly NDVI would not be an adequate proxy to biochemistry-and structure-related effects on SIF emissions for all types of vegetation but, it is worth noting that NDVI at the Caatinga is not subject to strong saturation effects and it seldom rises above 0.7 in the majority of the region (Figure 9). We suggest that Yield-normalization is necessary for interpreting fluorescence measurements as our results show that SIF-Yield adds another perspective on SIF measurements (Figure 12), separating the long-term effects of phenology, biochemistry and structure (related to photosynthetic capacity) from the transient photosynthetic rate-related effects (related to photosynthetic performance).
Therefore, our findings support previous suggestions that SIF has potential to improve our understanding of plant responses to drought [30][31][32] and exemplify in a case study the premise that SIF can be more informative when combined with vegetation indices.

Conclusions
In this study of the semiarid Caatinga vegetation of northeast Brazil, we attempted to characterize eleven years of Sun-induced chlorophyll fluorescence (SIF) seasonality (from 2007 to 2017) and SIF responses to a major drought in 2012. To achieve these objectives we adjusted SIF through combination with NDVI and also used angular adjustments to SIF provided with original GOME-2 data. Original and modified SIF measures were tested against environmental data using time-series decomposition, rank correlation tests and linear mixed models (LMMs).
Results demonstrated that seasonal variation and drought responses of adjusted SIF emissions at the community/landscape level correspond to previously reported leaf-level physiological responses. These observations are more relevant since they include both red and far-red SIF, as well as their ratio, and they were made in the complex and heterogeneously structured plant communities of the semiarid Caatinga of northeast Brazil. Our results with angular and spectral adjustments to SIF, although partially implemented with geometrically and radiometrically incompatible data (GOME-2 SIF and MODIS MAIAC SVIs), suggest that many of the assumptions behind the tested adjustments are likely correct. Our findings are relevant considering GOME-2 SIF's low spatial resolution and the noise inherent to measurements taken from this area due to the South Atlantic Anomaly. Here we show for the first time that environmental responses measured through adjusted GOME-2 SIF correlate with a known emergent ecosystem property (resilience), illustrating its value for ecological studies of vegetation.
We suggest that more studies testing angular and spectral adjustments to SIF using higher definition SIF data in combination with other reflectance data are necessary to improve our understanding of landscape-level vegetation responses. Furthermore, continuous chlorophyll fluorescence measurements from tower-based instruments and coordinated field campaigns are recommended to increase our understanding of vegetation ecophysiological responses at different scales, to validate our findings and to allow predictions of possible ecological shifts resulting from the ongoing global climate change.