Prediction of Macronutrients at the Canopy Level Using Spaceborne Imaging Spectroscopy and LiDAR Data in a Mixedwood Boreal Forest

Information on foliar macronutrients is required in order to understand plant physiological and ecosystem processes such as photosynthesis, nutrient cycling, respiration and cell wall formation. The ability to measure, model and map foliar macronutrients (nitrogen (N), phosphorus (P), potassium (K), calcium (Ca) and magnesium (Mg)) at the forest canopy level provides information on the spatial patterns of ecosystem processes (e.g., carbon exchange) and provides insight on forest condition and stress. Imaging spectroscopy (IS) has been used particularly for modeling N, using airborne and satellite imagery mostly in temperate and tropical forests. However, there has been very little research conducted at these scales to model P, K, Ca, and Mg and few studies have focused on boreal forests. We report results of a study of macronutrient modeling using spaceborne IS and airborne light OPEN ACCESS Remote Sens. 2015, 7 9046 detection and ranging (LiDAR) data for a mixedwood boreal forest canopy in northern Ontario, Canada. Models incorporating Hyperion data explained approximately 90% of the variation in canopy concentrations of N, P, and Mg; whereas the inclusion of LiDAR data significantly improved the prediction of canopy concentration of Ca (R2 = 0.80). The combined used of IS and LiDAR data significantly improved the prediction accuracy of canopy Ca and K concentration but decreased the prediction accuracy of canopy P concentration. The results indicate that the variability of macronutrient concentration due to interspecific and functional type differences at the site provides the basis for the relationship observed between the remote sensing measurements (i.e., IS and LiDAR) and macronutrient concentration. Crown closure and canopy height are the structural metrics that establish the connection between macronutrient concentration and IS and LiDAR data, respectively. The spatial distribution of macronutrient concentration at the canopy scale mimics functional type distribution at the site. The ability to predict canopy N, P, K, Ca and Mg in this study using only IS, only LiDAR or their combination demonstrates the excellent potential for mapping these macronutrients at canopy scales across larger geographic areas into the next decade with the launch of new IS satellite missions and by using spaceborne LiDAR data.


Introduction
Macronutrients (nitrogen (N), phosphorus (P), potassium (K), calcium (Ca) and magnesium (Mg)) are required for plant physiological and ecosystem processes.For example, N is an important constituent of the chlorophyll molecule and the carbon-fixing enzyme ribulose-1, 5-bis-phosphate carboxylase/oxygenase and is thus directly related to photosynthesis [1].Foliar N is also related to primary production and decomposition [2,3].P is a component of nucleic acids, lipid membranes, sugar phosphates and ATP, which all have important roles in photosynthesis and respiration [1].N and P are the major growth-limiting nutrients for plants worldwide [4].Mg, like N, is also a constituent of the chlorophyll molecule and is directly related to photosynthesis [1].K has a number of important roles in photosynthesis and respiration, including translocation of photosynthates into sink organs, maintenance of turgor pressure, activation of enzymes, N metabolism and reducing excess uptake of ions such as Na and Fe in saline and flooded soils [5,6].Ca is required for cell division and cell wall synthesis, and light stimulates its uptake into chloroplasts where it is critical to the function of photosystem I [1,7].
The ability to measure, model and map foliar macronutrients at the canopy level allows us to examine these as proxies for assessing forest condition and stress, as well as ecosystem processes such as forest nutrient cycling and carbon (C) exchange (photosynthesis and net primary production (NPP)).Photosynthesis and NPP are of critical interest for the assessment of C fixation where forests function as sinks for C in the biosphere, given that global atmospheric carbon dioxide concentrations have been increasing at accelerating rates over the last several decades [8].Macronutrients, in particular N, have been used as proxies for C exchange, or as parameters in ecosystem models.For example, canopy N concentration was used to estimate the NPP of temperate forest ecosystems coupled with field measurements and IS data by Smith et al. [3] and as a parameter in the PnET-II forest process model [9].Thus, spatially-explicit landscape and regional inputs of foliar N concentration have the potential to improve the accuracy of ecosystem models.
A substantial portion of the published work regarding the prediction of nutrients at the canopy scale has focused on N.Although most canopy level studies have been conducted using airborne IS data, a few have examined satellite IS data.For example, Coops et al. [10] examined the utility of Hyperion IS data to predict N in eucalyptus forest canopies while Smith et al. [11] and Townsend et al. [12] compared the effectiveness of AVIRIS and Hyperion IS data to predict N concentration in temperate forest canopies.Canopy scale analysis to predict N from airborne IS data has been conducted in temperate [13][14][15][16] and tropical [17][18][19][20] forest ecosystems.Work in the boreal forest has been more limited.Ollinger et al. [21] and Knyazikhin et al. [22] did include a boreal coniferous forest site (i.e., Howland Forest, ME) in their studies examining N concentration across multiple ecosystems in the US, but more work is clearly needed in this biome.
Research examining the detection of P, K, Mg and Ca using IS data also has been more limited.The majority of studies have utilized field spectroradiometers to predict P in agricultural fields [23,24] and P and K in the foliage of giant sequoia and eucalyptus trees [25,26].P, K, Mg and Ca concentrations have been examined in savannah grasses [27] as well as for willow, olive, grass and heather foliage [28].At airborne scales, AVIRIS data were used to estimate area-integrated P concentrations in tropical forests of Hawaii [29] and African savannah using HyMap data [30].Mirik et al. [31] found a statistically significant relationship between a simple ratio vegetation reflectivity index (1129 nm/469 nm) and P concentration on an area basis in the forage vegetation of Yellowstone National Park using PROBE-1 data.The small number of studies demonstrates that there is a clear need to investigate the utility of IS data for the estimation of P, K, Mg and Ca in forest canopies.To our knowledge, there is no study that investigates the utility of spaceborne IS data for the prediction of these macronutrients; hence there is a need to test the performance of spaceborne IS data for predicting these macronutrients at the canopy level.This is of interest to the scientific community because of the upcoming IS satellite missions such as the Hyperspectral Infrared Imager (HyspIRI) and the Environmental Mapping and Analysis Program (EnMAP), which will provide continuous global coverage with high fidelity IS data [32,33].
Conventionally, prediction of foliar chemicals using IS data has relied on the presence of absorption features associated with chemical constituents (e.g., pigments, N, cellulose, lignin, proteins) and water along the electromagnetic spectrum [34].However, foliar biochemical estimation at the canopy level is plagued with several challenges.First, reflectance is attenuated and absorption features are masked due to presence of water in fresh foliage especially in the shortwave infrared (SWIR) [35].Second, interior air-cell interfaces of fresh foliage randomly scatter light, which interferes with the signal from the subtle absorption features [36].Third, reflectance data collected over vegetation canopies from airborne and spaceborne platforms are contaminated because of absorption and scattering of the atmosphere as radiation travels through it.Fourth, signal-to-noise ratios (SNR) for current spaceborne IS sensors is low.Fifth, the recorded signal is a mixture of background reflectance composed of soil, water and/or other cover types in environments with incomplete canopy cover.As a result, the spectral response of the vegetation canopy is a combination of the interactions between radiation and canopy structure (e.g., leaf area index, foliage clumping and orientation), atmosphere, background reflectance, sun-target-sensor geometry and absorption features associated with foliar chemicals, which constitute only a small fraction of the spectral response [22,37,38].Therefore, it is difficult to relate and clearly identify mechanisms linking spectral data, in particular spaceborne spectral data, to nutrient concentration at the canopy level.
Progress has been made to address some of these problems.For example, atmospheric correction algorithms significantly reduce the impact of absorption and scattering.Transformations of the spectrum such as taking the first and second derivatives have been proposed to account for illumination and background effects [39,40].Continuum removal was developed to enhance the absorption features along the spectrum by Kokaly and Clark (1999).This method has been used to estimate foliar chemical concentrations using spectra collected in the laboratory, field and from the canopy using airborne sensors [30,[41][42][43].
Partial least squares (PLS) regression has become a commonly used method to relate IS data to foliar chemistry.PLS regression reduces the full spectrum to a smaller number of independent factors and takes into account the covariance between the response variable and the predictor variables during the dimensionality reduction process, which makes it superior to other methods like principal components analysis, which only takes into account the variance in the predictor variables [44,45].This reduction process also helps avoid the problem of overfitting due to the presence of many more predictor variables (wavelengths) than dependent variables (chemical concentration), which is the case with IS data.In addition, it allows the full spectrum available after preprocessing to be utilized as opposed to specific bands or spectral regions to estimate chemical concentration.PLS regression has been used with IS data to estimate biochemical concentrations in plant canopies [18,46,47].
The species composition and deciduous and coniferous forest types, i.e., functional types, of the mixedwood boreal forest may provide an indirect method to model macronutrients at the canopy scale.Species composition exerts a major control on spatial patterns of foliar and canopy chemistry and has been shown across tropical, Mediterranean and temperate forests [18,46,[48][49][50].The mechanism that links species composition and canopy chemistry is based on the concept of the 'global leaf economics spectrum' [51]; i.e., plant traits that control key chemical, structural and physiological properties such as leaf mass per area (LMA), specific leaf area, leaf lifespan, photosynthesis, leaf N and P fall onto a spectrum across plant species, and species globally converge towards these functional traits [51,52].With particular reference to K, Ca and Mg, 55% to 66% of the variation in their concentrations were accounted for by species in lowland Bornean forests [50] and approximately, 23%, 50% and 47% and 45%, 70% and 71% of the variance their concentrations were explained by species on lower fertility and higher fertility soils in the western Amazonian forests, respectively [53].This is also likely to be the case in the mixedwood boreal forest, which contains deciduous and coniferous species and forest types.The canopy N:P ratio has been predicted and mapped based on the spatial variation generated by deciduous and coniferous forest types in a mixedwood boreal forest of northern Ontario [54].
LiDAR data have been used extensively to study boreal forest structure and physiology [55][56][57][58][59] as well as species identification [60].The mixedwood boreal forest is structurally complex because the coniferous and deciduous species have different heights and canopy shapes.LiDAR can characterize this structural complexity since it provides information about the vertical and horizontal structure of the canopy [61].If a covariate that establishes a link between the macronutrient concentration and LiDAR data can be identified, then it would be possible to predict macronutrients using LiDAR data.For example, crown closure was identified as the canopy structural metric that made it possible to relate LiDAR data to canopy N:P ratio in the mixedwood boreal forest [54].
Given the importance of species and forest types in controlling forest structure and foliar/canopy chemistry, we hypothesize that canopy macronutrient concentrations for a mixedwood boreal forest in northern Ontario can be modeled using IS and LiDAR data.To test this hypothesis, we examine spaceborne IS and airborne LiDAR data.Our objectives are: (i) to evaluate the utility of spaceborne IS data to estimate canopy macronutrient (N, P, K, Ca and Mg) concentrations; and (ii) to determine the potential for LiDAR data to improve the estimation of canopy nutrients by incorporating canopy structural information.

Study Site
This research was conducted at the mature mixedwood site at the Groundhog River Flux Station (GRFS); one of the stations of the Canadian Carbon Program (formerly the Fluxnet-Canada Research Network).The site is located approximately 80 km southwest of Timmins, Ontario, Canada (Figure 1a).A flux tower, 41 m tall with a 1 km footprint, is located at the site (Figure 1b,c).The site is representative of a mature boreal mixedwood forest with a patchy mixture of five primary tree species including trembling aspen (Po-Populus tremuloides Michx.), white birch (Bw-Betula papyrifera Marsh.), white spruce (Sw-Picea glauca [Moench] Voss), black spruce (Sb-Picea mariana [Mill.]B.S.P.), balsam fir (Fb-Abies balsamea [L.] Mill.), and distinct patches of northern white cedar (Cw-Thuja occidentalis [L.]).A total of 34 circular plots (11.3 m radius) were established using a stratified sampling scheme within the 1 km radius of the tower to capture the major species associations (Figure 1c).Field and remote sensing data used in this study were collected as part of the Canadian Carbon Program and the remote sensing and forest mensuration data have been used in previous studies [54,55,62].

Field Data
Multiple foliage samples were collected from the upper sunlit portions of the canopy from five trees in each of six different sampling plots for all species except black spruce for which 13 trees were sampled in July 2005 (Figure 1c).The number of leaves collected from each tree ranged from 10 to 40 leaves for deciduous species, and 400-1500 needles for coniferous species.Samples were analyzed at the inorganic laboratory of the Ontario Forest Research Institute (Sault Ste.Marie, Ontario) to determine the foliar concentration of N, P, K, Ca and Mg.Total N in the foliage was determined through the conversion of all forms of N into N2 by dry combustion.The Kjeldahl method was used to extract the cations of P, K, Ca and Mg, which were measured using inductively coupled plasma-atomic emission spectroscopy.Average nutrient concentration for each species was calculated by averaging the results of the foliage samples.To scale the macronutrient concentration from leaf level to canopy level in each plot, the average nutrient concentration of a given species was multiplied by its species presence weighted by its biomass according to the formula below: .= . ( where i represents a given species in plot, avg.nuti represents the average nutrient concentration for species i, and fi is the biomass fraction of species i within the plot which is calculated by dividing the total biomass of species i by the total biomass of all species within the plot.Forest mensuration data were collected in 2003 and 2004 in accordance with Fluxnet-Canada protocols [63] (now known as the Canadian Carbon Program).Diameter at breast height (dbh), height to the top of the crown, crown width taken along two axes assuming an elliptical shape, and species were recorded for each tree with a dbh greater than 9 cm within each plot.Structural metrics describing canopy shape including crown closure (percentage of ground covered by tree crowns as viewed from above) and height including average height (arithmetic mean of all tree heights), dominant height (maximum tree height), mean dominant height (mean height of the 100 largest trees ha −1 ), Lorey's height (tree height weighted by basal area) and basal area were calculated for each plot from these measurements.Total aboveground biomass (kg•ha −1 ) for each plot was calculated summing the aboveground biomass of trees in the plot using the allometric equations developed for Ontario hardwoods and softwoods that incorporated dbh and tree height [64,65].

Remote Sensing Data
Hyperion EO-1 data (400-2500 nm with 10 nm spectral resolution and 30 m spatial resolution) were collected for the GRFS in July 2005 to coincide with the foliar sample collection.Hyperion data have a lower SNR than airborne sensors; hence, they require significant preprocessing.First, bands that had no data or had very low SNR were omitted from the analysis (149 spectral bands across 477-1346, 1487-1790 and 2032-2335 nm range were suitable for analysis).Second, scan lines that caused image striping were restored by taking the average of the adjacent lines.Subsequently, the data were atmospherically corrected using the fast line-of-sight atmospheric analysis of spectral hypercubes (FLAASH) program.FLAASH uses the MODTRAN4 radiation transfer code to convert radiance at the top of the atmosphere to reflectance at the surface pixel by pixel.Following atmospheric correction, geometric correction was completed using a 0.5 m resolution digital elevation model (DEM) generated from high-density (3-8 pulses m −2 ) discrete return LiDAR data with first and last returns.LiDAR data were collected at a flying altitude of 244 m in August, 2003 by Airborne 1 Corp. (El Segundo, CA, USA) using the ALTM 2050 (Optech Inc., Toronto, Ontario, Canada) with positional error of 13 cm in the x and y directions and less than 15 cm in the z direction [66] (Optech Inc, 2002).The final root-meansquare error (RMSE) of the georectification was less than one third of a Hyperion pixel size, i.e., less than 10 m.Continuum-removed spectrum of Hyperion reflectance data were obtained for each plot.
Metrics that characterize the vertical and horizontal canopy structure (e.g., percentiles of height, maximum height, mean, standard deviation (SD), and coefficient of variation (CV = SD/mean)) were derived from the same LiDAR data described above.Since raw LiDAR data are referenced to an ellipsoid or datum, they need to be converted to heights above the ground before relating them to forest mensuration data such as tree/canopy height.A DEM with 0.5 m resolution was generated from ground returns using spline interpolation and was validated using 29 independent survey-grade GPS z-coordinates (slope = 1.02,R 2 = 0.99) [67].Ground elevations for each LiDAR point derived from the DEM was subtracted from the corresponding first return LiDAR points to obtain height above ground of the canopy, i.e., canopy height model (CHM), from which the LiDAR metrics were calculated for each plot.

Statistical Analysis
Analysis of variance (ANOVA) was conducted to calculate the amount of variance explained by species and functional types in macronutrient concentrations.Multiple mean comparisons between species for each macronutrient were carried out by Tukey's honest significance test.Principal component analysis (PCA) was used to observe the distribution of macronutrient concentrations across species.PCA results were displayed using principal components 1 and 2.
We used PLS regression to relate Hyperion continuum-removed spectrum to canopy macronutrient concentration at the plot level.PLS regression reduces the full spectrum to a smaller number of independent factors and takes into account the covariance between the response variable and the predictor variables during the dimensionality reduction process, which makes it superior to other methods like principal components analysis, which only takes into account the variance in the predictor variables [44,45].It is an appropriate method to analyze IS data, which usually contains many more predictor variables (wavelengths) than dependent variables (macronutrient concentration).PLS regression has been used with IS data to estimate biochemical concentrations in plant canopies [18,42,47].We first conducted PLS regression analysis only using Hyperion continuumremoved spectrum.To test the contribution of LiDAR data to explain the variance in macronutrient concentration, we then carried out PLS regression analysis combining Hyperion and LiDAR data.Variables were standardized (i.e., mean-centered and scaled) prior to PLS regression analysis.We used the minimum predicted residual sum of squares (PRESS) RMSE criterion to decide on the number of factors selected for a given PLS model.PRESS RMSE is calculated by performing the leave-one-out cross-validation method where residuals are calculated by iteratively leaving one plot out of the analysis.PRESS RMSEs (hereafter called prediction RMSE) were reported on the plots showing model fits.For each macronutrient predictive model, standardized regression coefficients and variable importance of projection (VIP) values were calculated.The regression coefficients indicate the importance of each predictor variable for the prediction of the response variable.VIP shows the contribution of each independent variable to the model for both predictor and response variables.VIP values less than 0.8 indicate insignificance in terms of variable contribution according to Wold's criterion [68].Thus, a small coefficient and VIP value for a predictor variable suggests that the given variable is not contributing to the PLS model.
Canopy macronutrients and structure relationships were examined at the plot level to help identify suitable LiDAR metrics.Since most of these metrics were not normally distributed, non-parametric correlation test (i.e., Spearman's test) was used.Predictive models of canopy level macronutrient concentration were generated from LiDAR metrics using multiple linear regression.Models containing a maximum of two variables were generated to avoid overfitting and multicollinearity was avoided by allowing a maximum variance inflation factor (VIF) of 5.The residuals were tested for normality using the Shapiro-Wilk test to satisfy the normality assumption of the F distribution, which is used to test for model significance.As in PLS models, model validation was performed by leave-one-out cross-validation and was reported as percent of the mean.

Generation of Canopy Macronutrient Maps
We generated maps to be able to visualize the spatial distribution of macronutrient concentration at the canopy scale at GRFS.The maps were generated by applying the standardized coefficients derived from the PLS regression models to the respective bands of the Hyperion image for each macronutrient.

Variation of Macronutrient Concentration by Species and Functional Type
Foliar concentrations of N, P and Mg are higher for the deciduous species.Trembling aspen and white birch have significantly greater concentrations of these macronutrients than coniferous species.Black and white spruces have the lowest concentrations of these macronutrients.There is more overlap between the functional types for K and Ca foliar concentration and the difference in Ca concentration between functional types is insignificant (Figure 2).The PCA results showed that 87% of the variance in macronutrient concentration could be explained by two principal components (Figure 3).The first component is an average of all macronutrients except Ca, which only has small contribution to the first component.The second component is dominated by Ca.It can be seen that coniferous species are clustered together at the lower end of principal component 1 indicating lower concentrations of N, P, K and Mg.Deciduous species are grouped together at the higher end of component 1 indicating that they have greater concentrations of these four macronutrients.However, deciduous species are distinctly separated from each other along component 2. This indicates that Ca concentration of trembling aspen is greater than white birch.Indeed, white birch has the lowest Ca concentration among all species.Coniferous species tend to have a gradient of Ca concentration.Species and functional types explain most of the variance in macronutrient concentration at the foliar level.Variance ranging from 67% to 97% was explained by species; the lowest for Ca and the highest for N. Functional type explained lower macronutrient variance than species.In particular, only 4% and 33% of the variance were explained by functional type for Ca (p = 0.23) and K concentration (Figure 4).

Relationships between Canopy Structure and Macronutrients
All of the macronutrients, with the exception of Ca, have significant correlation with crown closure, whereas Ca exhibits significant correlation with height, as do K and Mg (Table 1).The relationship of macronutrients with basal area is stronger for K, Ca and Mg compared to N and P and there is no significant correlation between biomass and N and P (Table 1).When N and Ca concentration are plotted against crown closure and dominant height, the relationship of species composition and functional type with macronutrient concentration becomes clear (Figure 5).Low N levels are characteristic of plots dominated by black spruce and other coniferous species, whereas high N levels are characteristic of plots dominated by white birch, trembling aspen and their mixtures.On the other hand, white birch dominated plots represent the lowest end of the range for Ca concentration.The relationships of P, K and Mg with crown closure were very similar to that of N and crown closure.

Predictive Models with IS Data
PLS models derived using Hyperion data explained 91%, 93% and 86% of the variation in canopy N, P and Mg concentrations with four, five and four factors, respectively.On the other hand, only 8% and 31% of the variation in canopy Ca and K concentrations were accounted for IS data with large prediction RMSEs (Figure 6).These graphs reflect the transition between functional types and species composition at GRFS.Basically, PLS analysis is capturing this functional type variation.Plots dominated by trembling aspen and mixtures of aspen and birch tend to cluster at the high end while plots dominated by black spruce and mixtures of the coniferous species tend to cluster at the low end of the range for N, P, K and Mg for the PLS models (Figure 6).Examination of the coefficients and VIP plots reveal that all portions of the spectrum, although with greater contributions by the visible and SWIR regions, contributed to explain the variance in concentrations of N, P and Mg (Figures 7 and 8).Only visible and SWIR regions of the spectrum contributed for predicting K and Ca concentrations and the coefficients for their predictive models were two orders of magnitude smaller than the coefficients of other macronutrients' models (Figures 7 and 8).

Predictive Models with LiDAR Data
LiDAR models did not result in any improvement in terms of variance explained compared to the PLS models derived using IS data for N, P and Mg.However, there was significant improvement for the prediction of K and Ca using LiDAR data.The variance explained in Ca concentration by the LiDAR model was 10-fold greater than the PLS model and the prediction RMSE was the lowest among all of the macronutrients (Table 2).The reason Ca concentration is much better predicted with the LiDAR data is because dominant height follows a similar distribution across species to the foliar concentration of Ca.For instance, unlike the other macronutrients, white birch is at the lowest end of the Ca range instead of black spruce (Figure 5b).The amount of variance explained for K concentration more than doubled compared to the IS model (from 31% to 68%) with a single variable LiDAR model.Fiftieth percentile of height was consistently selected for all macronutrient predictive models and maximum height was selected in three of the models (Table 2).

Predictive Models with IS and LiDAR Data
Combination of IS and LiDAR in the PLS regression analysis further improved the prediction accuracy for N, K, Ca and Mg compared to models derived from only using IS and only using LiDAR data.For these macronutrients, the amount of variance explained increased and the prediction RMSEs decreased (Figure 6).However, prediction accuracy for P decreased when IS and LiDAR data were combined in the PLS regression analysis compared to models derived from only using IS and only using LiDAR data.Amount of variance decreased from 95% to 43% with increase in prediction RMSE to 1.01 from 0.84 compared to the IS models (Figure 6).Most of the LiDAR metrics contribute to explain variance in macronutrient concentrations though it is most evident in Ca and K predictions where model coefficients increased by one-to two-fold compared to the IS only models and there are sharp increases in the VIP values (Figures 7 and 8).On the other hand, model coefficients decreased by two-fold and the contribution of NIR portion decreased for P prediction (Figures 7 and 8).

Spatial Distribution of Canopy Macronutrient Concentration
The general pattern of spatial distribution of macronutrients at the canopy of GRFS mixedwood forest is very similar (Figure 9).This pattern basically reflects the functional type distribution at the site.The areas that are low in the concentration of macronutrients are dominated by black spruce and mix of other coniferous species.Likewise, the areas that have high concentrations of macronutrients are dominated by deciduous species.The pattern appears to be slightly less evident for Ca and K for which the very low concentration areas to the north and west are diminished in size (Figure 9).

Discussion
Foliar macronutrient concentrations vary by species composition and functional types at GRFS, which agrees well with similar studies conducted in different ecosystems.McNeil et al. [48] showed that 93% of the spatial variation in foliar N was explained by species composition in temperate mixedwood forests of the Adirondack Park, New York.Similarly, 58% of the variation in foliar N was accounted for by plant community type in a Mediterranean ecosystem in California [46].Species composition explained 66%, 60%, 63%, 57% and 60% of the variation in foliar N, P, K, Ca and Mg, respectively, in a lowland tropical forest in Borneo [50].In Amazonian tropical forests, species composition explained 78%, 38% and 28% of the variation in foliar N, P and Ca, respectively [18].Asner et al. [49] also determined that species were the major factor causing chemical variance in Australian tropical forests.Site characteristics including soil quality, slope, aspect, and elevation also contribute to chemical variation, which we did not account for in our study.However, from our observations we know that pure black spruce patches are found on the boggy, low-nutrient sections of the site and thus reflect the soil quality indirectly.Since the GRFS site has minimal elevation gradients, it is unlikely that site characteristics will contribute much to the variation in foliar macronutrients at the site.
Our estimates of N, P and Mg using spaceborne Hyperion data are better than the few other studies that were conducted using spectroradiometer and airborne data whereas our estimates Ca and K are worse.For example, canopy P concentration in African savannah grass was predicted with an R 2 of 0.63 and RMSE of 28% using airborne IS data [30].The simple ratio vegetation index (1129 nm/462 nm) calculated from airborne IS data was correlated with P (R 2 = 0.65) in the forage vegetation canopy composed of grasses, sedges, forbs, sagebrush and willow in Yellowstone National Park [31].Predictive models with R 2 values of 0.73, 0.67, 0.77, 0.17 and 0.33 for N, Ca, Mg, P and K, respectively, were reported for greenhouse-grown tropical grass canopies using the continuum removal technique in the 550-750 nm range using spectroradiometer data [43].One reason for the difference between the amount of variance explained in Ca concentration between our study and this study may be related to the low correlations of Ca with other macronutrients in our dataset compared to theirs where Ca exhibited very significant correlations with N and Mg.In a previous study using spectroradiometer data and the continuum removal technique, the authors successfully modeled N, Ca, Mg, P and K concentrations (i.e., R 2 values of 0.70, 0.50, 0.68, 0.80 and 0.64, respectively) for a canopy of five grass species [27].Meanwhile, Ferwerda and Skidmore [28] modeled N, Ca, Mg, P and K (R 2 values of 0.76, 0.65, 0.56, 0.85 and 0.73, respectively) from reflectance collected from olive, heather, willow and mopane leaf samples using stepwise regression where up to four bands were included in the models.In the studies cited so far, the prediction accuracy for Ca is consistently better than reported here.In addition to the correlation between macronutrients, this may be a result of the macronutrients having significant relationships with crown closure at GRFS.Crown closure reflects the amount of green foliage in the canopy and directly controls the spectral response collected over the canopy.Crown closure was identified as the biophysical variable linking spectral data to canopy N:P ratio over two seasons at the same site [54] and it was also found to be controlling the spectral response in jack pine stands of northern Ontario, Canada [69].
Observed vs. predicted graphs of PLS models reflect the transition between functional types and species composition at GRFS.Basically, PLS analysis is capturing this functional type variation.The relationship between LiDAR and macronutrients is also based on the canopy biophysical variables, which display a gradient across functional types.The reason Ca concentration is much better predicted with the LiDAR data is because dominant height follows a similar distribution across species to the foliar concentration of Ca.This is clearly visible in the observed Ca vs. dominant height graph across plots.For instance, unlike the other macronutrients, white birch is at the lowest end of the Ca range instead of black spruce (Figure 5b).Spatial distribution of canopy macronutrient concentration at GRFS mimics the functional type variation and thus, the spatial patterns of macronutrients are driven by the functional types' distribution at the site.Forest type variation was also found to be the underlying factor in explaining the observed spatial patterns of canopy N:P ratio at this site [54].
The combination of IS and LiDAR data improved the prediction accuracy of N, K, Ca and Mg with very significant improvements especially for Ca and K.This is very important because information on the spatial distribution of these macronutrients at the canopy scale will allow researchers and forest managers to assess forest condition and stress and use this information to relate it to ecosystem processes such as forest nutrient cycling and C exchange.
The ability to predict N, P and Mg concentration at the canopy scale using spaceborne IS data is very encouraging as it demonstrates the potential for mapping these macronutrients at the canopy scale across large geographic areas at lower cost.This study is a precursor to the feasibility of canopy scale macronutrient mapping across large regions of forest harvesting in the mixedwood boreal forest, which is not too far given the planned satellite IS missions to be launched within this decade such as NASA's HyspIRI and ESA's EnMAP.The data from these sensors will have higher SNR, and temporal and spatial resolution than currently available making it more likely to have success at predicting macronutrients.
We used airborne LiDAR and spaceborne IS data (i.e., Hyperion) in this study.Both have limited geographic coverage since availability is based on user request.LiDAR data are costly to acquire.Hyperion data have low SNR, require significant preprocessing.These factors limit their widespread use by researchers.Given the cost disadvantage and geographic limitation of airborne LiDAR data, spaceborne LiDAR data such as Geoscience Laser Altimeter System (GLAS) should be tested to predict macronutrients.This study was conducted at one site representing the mixedwood boreal forest.It remains to be seen whether some canopy structural metrics can be identified linking foliar chemistry and remote sensing data in other ecosystems as well.Despite being collected two years prior to foliage sampling, the LiDAR data to macronutrients relationship is still strong because the canopy structure and height do not change substantially over that time period.
One unexpected result of the study was the decrease in the prediction accuracy of P when IS and LiDAR data were combined.This is perplexing given the fact that P can actually be predicted with moderate accuracy (R 2 = 0.63, prediction RMSE = 13%) using LiDAR data and with very high accuracy with IS data alone (R 2 = 0.93, standardized prediction RMSE = 0.84).This needs to be tested in other studies to see whether similar results are obtained.
Currently, scientific community is limited by the quality and continuous availability of IS data.There will be big strides in the field of canopy chemistry research if high quality and continuous IS data become available along with improvements in the methods to analyze spaceborne LiDAR data.

Conclusions
This study has examined the utility of spaceborne IS and airborne LiDAR data for modeling canopy macronutrient (N, P, K, Mg and Ca) concentrations in a mixedwood boreal forest.The mechanism that underlies the relationship between IS and LiDAR data and macronutrient concentration is the variation in macronutrient concentrations generated by interspecific and functional type differences at the site.Specifically, crown closure and canopy height are the structural metrics that determine the relationship between macronutrient concentration and IS and LiDAR data, respectively.Spaceborne IS (i.e., Hyperion) data accounted for 91%, 93% and 86% of the variation in canopy N, P and Mg concentration, respectively.Combination of IS and airborne LiDAR data significantly improved the prediction accuracy of canopy Ca and K concentration but decreased the prediction accuracy of canopy P concentration.Using only LiDAR data, variance ranging from 65% to 80% in macronutrient concentration could be explained.The spatial distribution of macronutrient concentration at the canopy scale mimics functional type distribution at the site.The ability to predict canopy macronutrients using only IS, only LiDAR or their combination demonstrates the potential for mapping these macronutrients at the canopy scale across larger geographic areas with the availability of high quality, spatially and temporally continuous IS data such as HyspIRI and EnMAP planned to be launched within this decade and by using spaceborne LiDAR data such as GLAS.Studies testing spaceborne IS and LiDAR data should be conducted to predict macronutrients and other biochemicals in other ecosystems.

Figure 4 .
Figure 4. Amount of variance explained by species and functional type in foliar macronutrient concentration.ns indicates insignificant difference (p > 0.5).

Figure 7 .
Figure 7. Standardized coefficients for the partial least squares (PLS) regression predictive models.Left column shows the coefficients for the models derived using only IS data and right column shows the coefficients for the models using IS and LiDAR data together for N, P, K, Ca and Mg, respectively.The black line in the right column graphs represents LiDAR data.The x-axis in the right column graphs represents the same 149 wavelengths in the left column graphs and additionally the LiDAR data.The red line indicates zero.

Figure 8 .
Figure 8. Variable importance of projection (VIP) values for the partial least squares (PLS) regression predictive models.Left column shows the values derived using only IS data and right column shows the values using IS and LiDAR data together for N, P, K, Ca and Mg, respectively.The black line in the right column graphs represents LiDAR data.The x-axis in the right column graphs represents the same 149 wavelengths in the left column graphs and additionally the LiDAR data.The red line indicates Wold's criterion value of 0.8.

Figure 9 .
Figure 9. Spatial distribution of macronutrient concentration at canopy level at Groundhog River Flux Station (GRFS).Units of concentration are %.Maps are generated from partial least squares (PLS) regression models using Hyperion data.

Table 2 .
Predictive models generated from LiDAR structural metrics using multiple linear regression.RMSEs are reported as percent of the mean.