Modeling Species Distribution Using Niche-Based Proxies Derived from Composite Bioclimatic Variables and MODIS NDVI

Vegetation mapping based on niche theory has proven useful in understanding the rules governing species assembly at various spatial scales. Remote-sensing-derived distribution maps depicting occurrences of target species are frequently based on biophysical and biochemical properties of species. However, environmental conditions, such as climatic variables, also affect spectral signals. Further, climatic variables are the major drivers of species distribution at macroscales. Therefore, the objective of this study is to determine whether species distribution can be modeled using an indirect link to climate and remote sensing data (MODIS NDVI time series). We used plant occurrence data in the US states of North Carolina and South Carolina and 19 climatic variables to generate floristic and climatic gradients using principal component analysis; we then modeled the correlations between floristic gradients and NDVI using Partial Least Squares regression. We found a strong statistical relationship between species distribution and NDVI time series in a region where clear floristic and climatic gradients exist. If this precondition is met, the use of niche-based proxies may be suitable for predictive modeling of species distributions at regional scales. This indirect estimation of vegetation patterns may be a viable alternative to mapping approaches using the biochemistry-driven spectral signature of species.


Introduction
Remote-sensing approaches for mapping individual species or plant assemblages frequently rely on the biochemistry of the target species/assemblage [1-3]. Together with biophysical properties, plant biochemistry is directly responsible for the spectral signature of vegetation. For vegetation mapping, this relationship is inverted and the spectral signal is used to draw conclusions about the presence of a species or assemblage (see Figure 1(a)). A unique biochemistry of species, along with phenological and structural characteristics, may enable discrimination of vegetation and is indispensable for a reliable mapping result. Problems may arise if different species feature a similar spectral response, which is a common phenomenon in remote sensing [4,5]. A unique biochemistry and spectral signal is considered neither in the biological definition of a species, nor in the ecological classification of plant assemblages [6]. Further, species-specific biochemistry and spectral signal show a large spatio-temporal variation due to site conditions and other external influences [4]. This hampers a direct inversion of the relationship towards a quantification of the floristic composition and makes the direct mapping or classification of vegetation with remote-sensing approaches very challenging. However, mapping and modeling species distribution using remote sensors is still desirable. In particular at global and regional scales, where ground-based mapping is inefficient, remote sensing may be a practical alternative approach for vegetation mapping. Environmental conditions, such as climate, also affect the biophysical and biochemical properties of species and hence influence their spectral signal [4,7]. Environmental conditions, especially temperature and precipitation, are further major determinants of species distribution at macroscales [8,9]. Indirect relationships, in which species distributions/occurrences are derived from the spectral response to the environmental conditions that define the ecological niche of the species, may hence be promising (Figure 1(b)). This is because the spectral signal is a direct reflection of environmental properties, thus representing a powerful proxy for floristic composition in species distribution mapping and modeling.
Therefore, we address in this paper the question of whether we are able to model the floristic composition based on the ecological requirements of plant species and the spectral response to the corresponding environmental conditions. Our goal is to integrate environmental factors/climatic variables, floristic composition, and spectral information into a causal framework, which has not been investigated before. We expect that the outcome of this study might have significant implications for remote sensing of floristic vegetation patterns.
The fundamental range of tolerance of a species, which is a result of limiting factors of the environment, is a critical determinant of the resultant distribution pattern [10-13]. Limiting factors are typically related to climate properties, such as temperature and water availability, at a broad geographical scale. At a finer scale, resource factors/gradients, including nutrients, amount of light energy for plants, food for animals, and moisture level, driven by topographical variations and habitat types, are the main driving forces for shaping the patterns of species distribution. Additionally, natural and anthropogenic disturbances affect species distribution at various spatial scales.
From a theoretical point of view, species distribution at different spatial scales is closely tied to the fundamental and realized niche concepts proposed by Hutchinson [14]. According to him, the fundamental niche refers to abiotic conditions in which a species can persist and maintain a stable population, whereas the realized niche describes the environmental conditions in which a species is able to survive and reproduce in the presence of biotic interactions, such as competition, predation, and symbiosis. The focal part of both niche concepts suggests that ecological niches function as an ecophysiological constraint on species distribution. It is expected that individuals living under conditions outside the niche will not be able to maintain a stable population under selection pressure, so that a decline in population size becomes inevitable.
It is fairly common in ecological studies that direct or indirect relations between measurable and nearly unmeasurable variables are considered when seeking causal driving factors for species distribution and abundance. For instance, ecological variables based on field sampling are necessary inputs for niche models. However, many of these variables are impossible to measure in reality. To overcome this problem, ecologists usually rely on information derived from available maps, such as digital elevation models, to obtain topographical variables for developing predictive models of species occurrence [15,16]. Climatic data are also collected or interpolated from climate stations for modeling at large spatial scales, as in bioclimatic envelope models [17-19]. Similarly, remotely-sensed reflectance data are employed as a proxy for floristic composition/pattern recognition in vegetation studies [20,21]. However, remote sensing data have not been used frequently in species distribution modeling, even though they can provide greater coverage in space and time.
More recently, there has been an increasing trend in using remote sensing information, based, e.g., on various spectral indices recorded from airborne or space-borne sensors, as predictor variables in species distribution modeling [22-26]. A few studies have shown that niche models developed by incorporating remotely-sensed predictors are more robust; in particular, these data can improve the prediction accuracy and tend to refine the mapped distribution of species and habitats, compared with models based on climatic/topographical variables only [27-29]. Remotely-sensed indices, such as NDVI, may hence provide the opportunity to complement or improve niche models based on climate data alone. This is even more evident considering that many climatic variables are the results of interpolation from potentially sparse weather stations and are derived from different statistical approaches [30]. Therefore, in this paper, we attempt to model the floristic composition of vegetation using an indirect link to climate and remote sensing data, and we further consider the causal linkages between climate, phenology detected by remote sensors, and floristic gradients expressed by principal components.

Study Area
We chose the two states of North Carolina and South Carolina in the United States as our study area, due to the enormous ecological and biological diversity in the region. The two states contain a wide range of land cover types, including coastal lowlands, large river floodplain forests, rolling plains and plateaus, and forested mountains. There are four EPA level III ecoregions (http://www.epa.gov/wed/pages/ecoregions/level_iii.htm#Ecoregions) in the area: Piedmont (55 counties), a part of the temperate hardwood forests found in eastern North America; Middle Atlantic Coastal Plain (35 counties), containing mostly swamps and salt marshes; Southeastern Plains (35 counties), a mosaic of forest woodland and pasture/cropland; and Blue Ridge (20 counties), including Appalachian oak forests, northern hardwood forests and spruce-fir forests. Detailed ecoregion maps including all counties in the two states can be found in He et al. [21].

Floristic Data
The floristic composition of the vegetation in North and South Carolina was extracted on a county basis from the USDA plant database (http://plants.usda.gov/java/). The database listed 3,151 species with an occurrence in at least one county. No data were available for Alamance County in North Carolina. This county was hence excluded from all analyses, reducing the data set to a total of 145 counties. The compiled floristic data consisted of binary records of species presences and absences. No information on species cover fractions, dominances, or abundances was given in this dataset. The spectral signal of vegetation assemblages, however, depends on species cover fractions, since these determine the reflectance captured by remote sensors. A very weak direct relation between these binary floristic data and the actual spectral signal of the assemblages was hence expected.

Bioclimate Variables
At this regional scale, we assumed that climate is a major determinant of floristic patterns in North Carolina and South Carolina. We hence extracted 19 bioclimatic variables for each county from the BIOCLIM dataset (http://www.worldclim.org/) (Table 1). These 19 bioclimatic variables were obtained from more than 4,000 weather stations between 1950 and 2000 [30]. The spatial resolution of the bioclimate data was 30 arc-seconds (~1 km) and we calculated the means of each variable for each county.

MODIS-NDVI Time Series
Reflectance-derived vegetation indices have been widely used among ecologists to study compositional changes of vegetation under the changing climate at a large spatial scale. One of the most commonly used indices is the Normalized Difference Vegetation Index (NDVI), computed as the normalized difference between the near-infrared and red reflectance captured by satellite sensors [31,32]. NDVI metrics have been successfully used to estimate biomass and net primary productivity [33], because NDVI values are associated with the photosynthetically active radiation absorbed by plant canopies [31,32,34].
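For readers less familiar with the index, the standard NDVI definition, (NIR - Red) / (NIR + Red), can be sketched as follows. This is only an illustration: the function name and the reflectance values are hypothetical and are not taken from the MODIS product used in the study.

```python
def ndvi(red, nir):
    """Normalized Difference Vegetation Index for a single pixel.

    red, nir: surface reflectance in the red and near-infrared bands.
    Returns a value in [-1, 1], or None when both bands are zero.
    """
    total = nir + red
    if total == 0:
        return None  # index is undefined without any signal
    return (nir - red) / total


# Hypothetical reflectances for illustration only
dense_canopy = ndvi(red=0.05, nir=0.45)  # strong NIR reflectance, high NDVI
bare_soil = ndvi(red=0.25, nir=0.30)     # similar red and NIR, low NDVI
```

Dense green canopies absorb red light and reflect near-infrared light strongly, which is why their NDVI is close to 1, while bare surfaces yield values near 0.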
The MODIS (Moderate-resolution Imaging Spectroradiometer) NDVI data used in this study have a spatial resolution of 250 meters and a temporal resolution of 16 days. There are 23 time points for a whole year. We downloaded the 23 NDVI images for the year 2005 (http://glcf.umiacs.umd.edu/data/ndvi). For areas covered by vegetation, the NDVI is strongly related to the phenological development and hence to the climatic conditions. The NDVI values of each date were averaged to county-based values that were used for the subsequent analyses.
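The county averaging step can be sketched as a simple zonal mean. The sketch assumes that each pixel has already been assigned a county identifier and that invalid pixels are flagged as None; both are assumptions made for illustration, since the text does not describe how pixels were assigned or masked.

```python
from collections import defaultdict


def county_means(pixel_county_ids, pixel_ndvi):
    """Average NDVI pixels by county for one acquisition date.

    Pixels whose value is None (assumed here to mark invalid data) are skipped.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for county, value in zip(pixel_county_ids, pixel_ndvi):
        if value is None:
            continue
        sums[county] += value
        counts[county] += 1
    return {county: sums[county] / counts[county] for county in sums}


# Hypothetical pixels for two counties "A" and "B"
ids = ["A", "A", "B", "B", "B"]
vals = [0.6, 0.8, 0.2, None, 0.4]
means = county_means(ids, vals)
```

Repeating this for each of the 23 dates would give one 23-value NDVI profile per county, which is the form of input the subsequent analyses operate on.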

Statistical Analyses
We subjected the floristic data to a principal component analysis (PCA) to reduce its dimensionality. Binary records of species' presence and absence (1 or 0) for each county were used as the original dataset for PCA. The resulting principal components (i.e., floristic gradients) capture, in decreasing order, large parts of the information content of the original data. The counties' scores on the main principal components are a quantitative and continuous measure of the changing floristic composition according to Gleason's continuum concept [35,36]. Counties with a similar floristic inventory feature similar scores on the floristic gradients, while counties with dissimilar inventories lie farther apart in the PCA space. We used the spatial distribution of PC scores as the basis for all subsequent analyses.
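A minimal sketch of this step, assuming a covariance-based PCA (column centering without scaling) computed by singular value decomposition; the text does not state which PCA variant was used. The toy presence/absence matrix is simulated and the function name is hypothetical.

```python
import numpy as np


def pca_scores(X, n_components=3):
    """PCA via SVD of the column-centered matrix.

    X: counties x species matrix of 0/1 records.
    Returns (scores, explained_variance_ratio) for the leading components.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                      # center each species column
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    explained = s ** 2 / np.sum(s ** 2)
    return scores, explained[:n_components]


# Simulated data: 12 counties along a west-east gradient, 40 species
rng = np.random.default_rng(0)
gradient = np.linspace(0.0, 1.0, 12)[:, None]
prob = np.hstack([np.repeat(gradient, 20, axis=1),         # species favoring the east
                  np.repeat(1.0 - gradient, 20, axis=1)])  # species favoring the west
presence = (rng.random(prob.shape) < prob).astype(int)
scores, explained = pca_scores(presence, n_components=3)
```

Each row of `scores` places one county on the floristic gradients, so counties with similar species lists receive similar coordinates.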
The 19 climatic parameters used in this study are highly inter-correlated. This inter-correlation hampers any interpretation and analysis. We hence used a PCA of the climate data to eliminate the inter-correlation and to extract independent climatic gradients. Subsequently, we analyzed the loadings of the parameters on the PCs and identified for each PC a corresponding parameter to ease interpretation. These 'master variables/descriptors' were passed to the subsequent analyses.
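One plausible way to pick a master variable for each PC is to take the variable with the largest absolute loading. This selection rule is an assumption made for illustration; the text only states that the loadings were analyzed. The variable names and simulated data below are hypothetical.

```python
import numpy as np


def master_variables(climate, names, n_components=3):
    """Return, for each leading PC, the variable with the largest absolute loading.

    climate: counties x variables matrix; variables are standardized first
    because they are measured in different units.
    """
    Z = np.asarray(climate, dtype=float)
    Z = (Z - Z.mean(axis=0)) / Z.std(axis=0)
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)  # rows of Vt hold the loadings
    picks = []
    for k in range(n_components):
        j = int(np.argmax(np.abs(Vt[k])))
        picks.append(names[j])
    return picks


# Simulated data: three temperature-like and two precipitation-like variables
rng = np.random.default_rng(0)
t, p = rng.normal(size=30), rng.normal(size=30)
climate = np.column_stack([t + 0.1 * rng.normal(size=30),
                           t + 0.1 * rng.normal(size=30),
                           t + 0.1 * rng.normal(size=30),
                           p + 0.1 * rng.normal(size=30),
                           p + 0.1 * rng.normal(size=30)])
names = ["temp_annual", "temp_warmest", "temp_coldest",
         "precip_annual", "precip_wettest"]
picks = master_variables(climate, names, n_components=2)
```

In practice the choice would also weigh interpretability, since several variables often load almost equally on the same component.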
We assumed causal relations between climate and floristic patterns, as well as between climate and NDVI. In order to test whether these assumptions correspond to statistically significant relations, two sets of correlation analyses were used. First, we tested for correlations between the climatic master variables and floristic composition as expressed by the floristic gradients extracted by PCA. These analyses were used to evaluate the assumption that the distribution of floristic vegetation patterns in North Carolina and South Carolina is dependent on climatic conditions. In a second set of analyses, we tested for significant correlations between the climatic variables and the NDVI pattern on different dates.
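The second set of analyses amounts to one correlation coefficient per NDVI date. The sketch below assumes Pearson's correlation, which the text does not name explicitly, and omits the significance test. All values are invented for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlation_by_date(climate_var, ndvi_by_date):
    """ndvi_by_date maps a day of year to the list of county NDVI values."""
    return {day: pearson_r(climate_var, values)
            for day, values in ndvi_by_date.items()}


# Hypothetical values for five counties
temperature = [10.0, 12.0, 14.0, 16.0, 18.0]
ndvi_by_date = {
    17: [0.30, 0.35, 0.40, 0.45, 0.50],   # a winter date
    193: [0.90, 0.88, 0.86, 0.84, 0.82],  # a summer date
}
r = correlation_by_date(temperature, ndvi_by_date)
```

Plotting the resulting coefficients against the day of year gives a seasonal correlation profile for each climatic variable.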
Finally, we tested the statistical relationship between the floristic gradients and the NDVI time series with regression analyses. To cope with the inter-correlation inherent to the NDVI data, we used Partial Least Squares regression (PLSR, [37]). This regression approach was originally developed in the field of chemometrics to analyze spectral data. It was subsequently adopted in remote sensing and has since been used successfully in numerous studies targeting different vegetation properties (e.g., [6,38-43]).
PLSR is basically a multivariate regression, including so-called latent vectors (LVs) as independent variables. These LVs are statistically independent linear combinations of the original variables (i.e., the inter-correlated NDVI time series). Contrary to the PCs in PCA, the LVs are generated under simultaneous consideration of both the independent and the dependent (i.e., the floristic gradients) variables. Therefore, the LVs are optimized towards the explanation of the response variable and the model parsimony can be increased [37]. Model validation took place by 10-fold cross-validation. For each floristic gradient we built multiple models that included step-wise increasing numbers of LVs. To reduce the possibility of over-fitting and to optimize the trade-off between model fit and model parsimony, we selected the model with the number of LVs resulting in the smallest root mean squared error in validation (RMSEval). A backward selection approach (following [44]) was used to further refine this model and to reduce the set of NDVI dates to the time steps showing a stable and significant relation to the floristic pattern. The residuals of the final model for each floristic gradient were tested for spatial autocorrelation by calculating Moran's I for increasing distance classes between county centroids.
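The selection of the number of LVs by cross-validated RMSE can be sketched as follows. This is a minimal single-response PLS (NIPALS) written for illustration, not the software used in the study; the backward selection of NDVI dates is omitted, and the data are simulated.

```python
import numpy as np


def pls1_fit(X, y, n_lv):
    """Fit a single-response PLS model with NIPALS; returns (x_mean, y_mean, coef)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_lv):
        w = Xk.T @ yk                      # weights: covariance of X with y
        norm = np.linalg.norm(w)
        if norm < 1e-12:
            break                          # nothing left to explain
        w = w / norm
        t = Xk @ w                         # scores of this latent vector
        tt = t @ t
        p = Xk.T @ t / tt                  # X loadings
        c = (yk @ t) / tt                  # y loading
        Xk = Xk - np.outer(t, p)           # deflate X and y
        yk = yk - c * t
        W.append(w); P.append(p); q.append(c)
    if not W:
        return x_mean, y_mean, np.zeros(X.shape[1])
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)
    return x_mean, y_mean, coef


def cv_rmse(X, y, n_lv, n_folds=10, seed=0):
    """Root mean squared error of prediction under k-fold cross-validation."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    sq_err = 0.0
    for test_idx in folds:
        train = np.ones(len(y), dtype=bool)
        train[test_idx] = False
        x_mean, y_mean, coef = pls1_fit(X[train], y[train], n_lv)
        pred = y_mean + (X[test_idx] - x_mean) @ coef
        sq_err += np.sum((y[test_idx] - pred) ** 2)
    return float(np.sqrt(sq_err / len(y)))


def select_n_lv(X, y, max_lv=10):
    """Return the LV count with the smallest cross-validated RMSE, plus all RMSEs."""
    rmse = [cv_rmse(X, y, k) for k in range(1, max_lv + 1)]
    return int(np.argmin(rmse)) + 1, rmse


# Simulated data: 60 units, 23 inter-correlated NDVI dates driven by 2 latent factors
rng = np.random.default_rng(1)
latent = rng.normal(size=(60, 2))
X = latent @ rng.normal(size=(2, 23)) + 0.05 * rng.normal(size=(60, 23))
y = latent[:, 0] - 0.5 * latent[:, 1] + 0.1 * rng.normal(size=60)
best_k, rmse = select_n_lv(X, y, max_lv=6)
```

Each floristic gradient would be modeled separately in this way, with its county scores as `y` and the county-level NDVI profiles as `X`.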
Unfortunately, all data used in this study are on a per-county basis. Apart from not being ecologically meaningful, this basis is likely to affect the results of the study, since the area of the counties is unequal (ranging from 450 to 3,180 km²). The unequal area may in particular affect the number of observed species per county, which generally grows with increasing area. To test if such an area effect exists, we plotted the species-area curve and formally tested the relation using the Mantel test. Resemblance matrices were generated using Euclidean distance for both county area and species richness per county as the input for the Mantel test.
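A simple permutation-based Mantel test on two per-county vectors can be sketched as below. The one-sided test and the number of permutations are assumptions made for illustration, and the area and richness values are simulated.

```python
import numpy as np


def mantel(x, y, n_perm=999, seed=0):
    """Mantel test between distance matrices built from two per-county vectors.

    For a single variable, the Euclidean distance is the absolute difference.
    Returns (r, p) with a one-sided permutation p-value.
    """
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    dx = np.abs(x[:, None] - x[None, :])
    dy = np.abs(y[:, None] - y[None, :])
    iu = np.triu_indices(len(x), k=1)          # use each county pair once
    r_obs = np.corrcoef(dx[iu], dy[iu])[0, 1]
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_perm):
        perm = rng.permutation(len(x))
        dy_perm = dy[perm][:, perm]            # permute rows and columns together
        if np.corrcoef(dx[iu], dy_perm[iu])[0, 1] >= r_obs:
            hits += 1
    return float(r_obs), (hits + 1) / (n_perm + 1)


# Simulated counties: richness increases with area plus noise
rng = np.random.default_rng(2)
area = np.linspace(450.0, 3180.0, 20)
richness = 800.0 + 0.2 * area + rng.normal(scale=50.0, size=20)
r, p = mantel(area, richness)
```

Rows and columns are permuted jointly because the entries of a distance matrix are not independent observations, which is the reason for using a Mantel test rather than an ordinary correlation test.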

Results
The PCA revealed three prominent floristic gradients in the vegetation data (Figure 2(a)). The spatial distribution of the PC scores (Figure 2(b)) showed that these gradients formed a clear spatial pattern, and floristic composition gradually changed from the coastline in the east to the mountains in the west. The cumulative variance explained by the first three PCs of floristic data was 55%. A weak relationship (Mantel test: r = 0.13, p = 0.006 from Monte-Carlo permutation test) was found between county area and species richness per county (Figure 3).
The PCA of the county-based bioclimate data revealed three strong climatic gradients (Figure 4(a)). The cumulative variance explained by the first three PCs was 50%. The analysis of the loadings showed that the main PCs represent a temperature gradient, a precipitation gradient, and a gradient from marine to continental climate. Three variables, mean annual temperature (Tma), annual precipitation, and mean diurnal temperature range, were identified to represent these climatic gradients (Figure 4(c-e)). We summarized the correlations between these three climatic variables and the floristic gradients. The first and second floristic gradients showed a highly significant correlation with the climatic variables. The 3rd PC of the floristic data was almost independent of climate (Table 2).
We also analyzed the correlations between the three climatic variables/composite PCs and the NDVI pattern on different dates (Figure 5). Strong correlations between Tma and the NDVI were observed throughout the year. However, the sign of the correlation coefficients changed between summer and winter. In summer, colder temperatures were related to a high NDVI signal; in winter, the NDVI values increased with warmer temperatures. Further, we observed positive correlations between NDVI and both precipitation and diurnal temperature range.
Lastly, we analyzed the relationship between floristic gradients and NDVI time series using PLSR (Figure 6). The model for the main floristic gradient (PC1) resulted in R² = 0.9, the model for the 2nd PC in R² = 0.73, and the model for the 3rd PC in R² = 0.33 in cross-validation. The residuals of all three models showed positive spatial autocorrelation for shorter distances based on Moran's I analysis (Figure 7). No global autocorrelation was observed in the residuals.

Floristic Patterns along the Climatic Gradients
Both PCAs revealed prominent floristic and climatic gradients across the two states of South Carolina and North Carolina from the eastern coastline to the mountainous regions in the west. There is a clear parallel trend when visually comparing floristic compositional changes and climatic variations in the PCA space (Figures 2(b) and 4(b)). The floristic pattern shows a smooth transition from the Appalachian oak forests and northern hardwood forests in the Blue Ridge ecoregion, to the successional pine forests and hardwood forests in the Piedmont ecoregion, to a mosaic of forest woodland and cropland in the Southeastern Plains ecoregion, and to the swamps and salt marshes in the Middle Atlantic Coastal Plain ecoregion.
Changes in floristic composition corresponded well with changes in temperature (the first PC) and precipitation (the second PC). The mean annual temperature increased gradually from the west (Appalachian Mountains) to the eastern coastal regions (Figure 4(c)), while precipitation first decreased towards the central part of the two states and then increased gradually towards the coastline (Figure 4(d)). The results of the correlation analysis showed a strong relationship between the floristic gradients and the climate variables (Table 2). This suggests that climate is the major determinant of vegetation distribution patterns at the regional scale. This finding is not new; various studies analyzing the vegetation of North and South Carolina and other geographical locations at a broad spatial scale came to similar conclusions [45,46]. Still, here we showed that this relationship allows for an indirect modeling of vegetation patterns via climatic conditions, which determine the fundamental niches of plant species.

Climatic Influences on NDVI
We found strong correlations between climatic variables and NDVI time series throughout the year (Figure 5). The strength and sign (±) of these correlations were related to the seasons of the year. This is particularly evident with temperature (Figure 5(a)). Mean annual temperature (Tma) and the NDVI were positively correlated during winter months (from late fall to early spring) and negatively correlated during summer months (with the strongest correlation from July to August). The change of sign of the correlation coefficients is due to changes in the developmental stages of vegetation. The negative correlation is a direct result of reaching the peak growth period in midsummer (a decline of vegetative production after reaching the maximum growth during summer months), while the positive correlation is related to the increase in plant growth at the beginning and the end of the growing season. Similar trends were also observed by Goward et al. [45] for NDVI time series of South Carolina. They were further found in studies carried out by Wang et al. [47] for determining the temporal response of NDVI to climatic variables in the central Great Plains in Kansas and by Sun and Kafatos [48] for relating NDVI to land surface temperature over North America. Our results indicate that temperature is an important limiting factor for species assembly and distribution. Further, phenological events captured by remote sensors can be effectively linked to the seasonal patterns of plant growth on the ground [49].
We observed positive correlations between NDVI and mean annual precipitation for most time periods of the year (Figure 5(b)). The highest correlation coefficients are found in the summer months, reflecting the basic water requirements of plant assemblages for growth. A very weak negative correlation was found at the end (late fall and winter) and the beginning of the growing season (part of the spring months). Previous studies have concluded that temporal variations of NDVI are closely related to precipitation and that a strong linear relationship between NDVI and precipitation exists on regional [47-52] and global scales [53]. Wang et al. [47] observed that NDVI curves and precipitation curves are parallel, displaying an increasing trend during warm months and a decreasing pattern in cold months.
Further, we found that NDVI and mean diurnal temperature range (DRm) are positively correlated during the whole year. The DRm for a month is defined as the long-term mean of the daily difference between the maximum and minimum surface-air temperature for that month [53]. Like the mean temperature, DRm is a critical limiting factor for plant species' survival and reproduction. Studies based on historical climatic data have shown that a decrease in DRm has been observed almost globally [54]. Further, connections between decreased DRm and increased urbanization have been made in multiple studies at different geographical locations [55-61]. We also found a consistent trend in the two states of South Carolina and North Carolina, where a lower DRm occurs in the counties (far from the coast) with higher populations in the urban areas (Figure 4(e)). Therefore, DRm can serve as a proxy for anthropogenic disturbances to natural ecosystems. In this view, it further confirms that the spatial and temporal trends of NDVI are consistent with land use patterns. On the other hand, the lowest DRm values are associated with counties along the coastline due to sea breezes, onshore advection, and low-level cloud cover.

Relationship between Floristic Patterns and NDVI Time Series
We found strong statistical relationships between floristic gradients and NDVI time series, based on the PLSR results (Figure 6). Three predictive models were built using NDVI time series as explanatory variables and floristic gradients (PC1-3) as response variables. The strongest model fit was achieved for PC1, with R² = 0.9 in cross-validation. PC3 had a rather weak fit between observed and predicted data, with an R² = 0.33 in validation. This supports our assumptions that (a) plant distributions with a relationship to climate can be mapped and modeled via remotely-sensed canopy reflectance, (b) the causal relationship between vegetation and reflectance can be inferred through an indirect link to climate, and (c) remotely-sensed indices can function as powerful inputs for species distribution modeling. We also found that PC3 featured a lower variance than PC1 and PC2. A weaker model fit can hence be expected. This relationship between information load and model fit has been observed in other studies that modeled floristic gradients with remote-sensing data (e.g., [6,62]) and was also confirmed in an experimental study using reflectance data collected in planted grassland stands with artificial gradients [63].

Spatial Autocorrelation and Area Effect
We found spatial autocorrelation for shorter distances based on Moran's I analysis in the residuals of the PLSR models (Figure 7). This is not surprising, since most species distribution data are spatially structured and observations made at shorter distances are closely related [64,65]. We consider that the spatial autocorrelation observed in this study is generated by endogenous processes in which biotic interactions, such as dispersal, competition, and reproduction, are the underlying mechanisms [66-68]. Further, local spatial autocorrelation may indicate the influence of other factors (e.g., anthropogenic disturbance) on species distributions that are not expressed by climatic variables. We did not attempt to remove or neutralize the effect of spatial autocorrelation in the present study, since no spatial predictions on species distribution are made based on the PLSR models. Still, the presence of spatial autocorrelation may have affected the significance of all correlation and regression coefficients (Type I error might have been inflated); thus, spatial autocorrelation needs to be considered in the interpretation of the results. If spatial predictions are desired, the use of adapted spatial models may be recommended (see, e.g., [68] for a review).
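Moran's I for a single distance class can be sketched as follows. The binary weighting of point pairs within a distance band is an assumption, since the exact weighting scheme is not stated, and the grid data are synthetic stand-ins for model residuals at county centroids.

```python
import numpy as np


def morans_i(values, coords, d_min, d_max):
    """Moran's I using binary weights for pairs with d_min < distance <= d_max."""
    z = np.asarray(values, dtype=float)
    z = z - z.mean()
    xy = np.asarray(coords, dtype=float)
    diff = xy[:, None, :] - xy[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    w = ((dist > d_min) & (dist <= d_max)).astype(float)  # self-pairs excluded (distance 0)
    s0 = w.sum()
    if s0 == 0:
        return float("nan")                               # no pairs in this class
    return float((len(z) / s0) * (z @ w @ z) / (z @ z))


# Synthetic 6 x 6 grid whose values increase smoothly from west to east
coords = [(x, y) for y in range(6) for x in range(6)]
values = [x for x, _ in coords]
i_short = morans_i(values, coords, d_min=0.0, d_max=1.0)
```

Evaluating the function over successive distance bands yields a correlogram of the kind shown in Figure 7; a smooth gradient, as in this toy grid, produces strongly positive values at short distances.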
We tested for area effects and found a weak but significant relationship between county area and species richness. However, large counties are frequently located near the coastline, while mountainous regions are organized in small counties. This unequal distribution of county areas impairs the interpretation of related effects. In particular, the question of whether coastal regions generally feature more species than the mountains or whether this observation is an effect of county size cannot be answered with the given data. The literature, however, reports increasing species richness from the coast to the inland [69,70]. Area effects can hence be assumed, but are difficult to correct when dealing with presence/absence data. We thus recognize the potential bias in our results due to area effects originating from unequal county areas. These effects may be responsible for a considerable amount of interference in the models.

Model Output Deviates from Realized Niche
Remote-sensing-derived distribution maps aim to show actual occurrences of the target species. Following Hutchinson's niche concept, the depicted pattern corresponds in theory closely to the geographical representation of the realized niche, i.e., to all locations with favorable environmental conditions that were accessible to the species, despite dispersal barriers and competition. In reality, the mapped pattern may differ from the realized niche for several reasons. First, the realized niche may be unable to accurately describe the actual species distribution (see, e.g., [71] for a review). This may be the case if species manage to survive despite unfavorable conditions, as in sink populations. Second, remote sensing approaches may fail to detect all species occurrences. This happens frequently because, for example, species are not present in the canopy and hence hidden from the sensor's view, or the species do not feature a unique spectral signature that allows for accurate detection (e.g., [72,73]). These limitations may affect the ecological information of the map and need to be carefully considered in the interpretation.
In the present study, we used remote-sensing data as a proxy for climatic variables to model the distribution of plant species in a region where clear floristic and climatic gradients exist. Climatic variables are frequently used to describe the envelope of a species and to predict its fundamental niche, i.e., the potential distribution with respect to favorable environmental conditions [74]. Dispersal limits and competition are not considered in such predictions for most species distribution models due to the spatial scales used and limited information on biotic interactions. Such predictions hence correspond to the geographic representation of Hutchinson's fundamental niche and may differ considerably from the actual distribution pattern or the realized niche. It stands to reason that the climate-proxy-based prediction in the present study also differs from the actual distribution pattern. However, the models we developed are both calibrated and validated with actual field observations of species presences and absences. The models hence target the actual distribution pattern and were fitted for that purpose, although a small deviation from the realized niche may remain.

Theoretical Limitations of the Approach
In the present study we tested the potential of an indirect relationship between remote-sensing data and species distributions for vegetation modeling. This relationship resulted in sufficient model fit for the main floristic gradients of North Carolina and South Carolina. Despite these promising results, the approach features some fundamental limitations that did not apply to the present study but require further attention: (1) A fundamental prerequisite for reliable results based on indirect relationships is the presence of prominent gradients in vegetation that are related to strong environmental gradients at the investigated spatial scale. While this precondition was fulfilled for the study area and may similarly apply to many temperate regions, it may not hold for other areas. For example, species distribution in tropical rain forests may show a weak relationship to environmental gradients at local and regional scales due to stable climatic conditions. Remote sensing of vegetation patterns in such areas hence requires approaches based on the biochemistry of the target species using advanced sensor technologies (e.g., [7]). (2) Indirect relationships as used in this study are complex and hardly transferable. Even if another area features a species composition similar to North and South Carolina, other environmental gradients (and hence other remote-sensing products) may be required to describe their distributions. This missing transferability may be considered a disadvantage compared to the convenient use of spectral libraries, thus hampering operational utility. The rigid need for recalibration may, however, be familiar to many ecological modelers. (3) Indirect relationships as used in the present study are more suitable for describing general vegetation patterns than for delineating the distribution of a single species. Although floristic gradients enable conclusions about occurrences of individual species [62], the generalization of the gradient analysis (i.e., the PCA) and increasing noise along the causal chain of the indirect link may blur the actual pattern at local scales. Despite these limitations, an indirect estimation of floristic patterns may be a viable alternative to direct remote-sensing approaches for large-scale vegetation mapping and modeling.

Conclusions
Using remote sensing to map and model vegetation patterns is a difficult task in many ways. In the present study, we tested whether relationships between plant species and their environment as expressed in remote-sensing imagery/MODIS NDVI allow for an accurate description of species distribution patterns. Despite the success of the model fit as demonstrated in the regression analyses, this approach is limited to areas with prominent floristic and environmental gradients. If this precondition is met, the use of remotely-sensed proxies, such as data acquired by sensors with moderate spectral resolution, may be suitable for predictive modeling of species distributions at regional scales. This indirect estimation of vegetation patterns based on proxy variables may be a viable alternative to mapping approaches based on the biochemistry-driven spectral signature of species. However, in areas that lack clear floristic and environmental gradients, hyperspectral sensors are recommended for mapping and modeling vegetation patterns.

Figure 1. Direct (a) vs. indirect (b) relationships between floristic composition and spectral responses of vegetation stands.

Figure 2. Results of the PCA on the floristic data of North Carolina and South Carolina: (a) distribution of the counties in the floristic PCA space and the variances explained by the PCs, and (b) spatial distribution of PC scores across these two states.

Figure 3. Relationship between county area and number of species per county.

Figure 4. Results of the PCA of the bioclimate data (a), spatial distribution of the PC-scores (b), and spatial distribution of mean annual temperature (c), mean annual precipitation (d), and mean diurnal temperature range (e).

Figure 5. Correlation coefficients for correlations between the NDVI on different dates (Julian days) and three climatic variables (composite PCs).

Figure 6. Results of PLS regressions between floristic gradients (PC1-3) and NDVI time series. Bar plots show the RMSE in model calibration and validation for models based on increasing numbers of LVs. Arrows in the RMSE plots indicate the number of LVs considered in the final model. Scatterplots illustrate the relationship between actual (i.e., observed) PC scores and the model predictions. The influence of different dates throughout the year in the models is indicated by the regression coefficients. Numbers in these plots correspond to the Julian day of the respective NDVI time series.

Figure 7. Spatial autocorrelation (Moran's I) in the residuals of PLSR models at different centroid distances between counties.

Table 1. Bioclimatic variables used to describe the climatic variation across counties in the two states of North Carolina and South Carolina.