Forest Canopy Heights in the Pacific Northwest Based on InSAR Phase Discontinuities across Short Spatial Scales

Rapid land use changes are substantially altering the global carbon budget, yet quantifying the impact of these changes, or assessing efforts to mitigate them, remains challenging. Methods for assessing forest carbon range from precise ground surveys to remote-sensing approaches that provide proxies for canopy height and structure. We introduce a method for extracting a proxy for canopy heights from Interferometric Synthetic Aperture Radar (InSAR) data. Our method focuses on short-spatial scale differences between forested and cleared regions, reducing the impact of errors from variations in atmospheric water vapor or satellite orbital positions. We generate time-varying, Landsat-based maps of land use and perform our analysis on the original wrapped (modulo-2π) data to avoid errors introduce by phase unwrapping and to allow assessment of the confidence of our results (within 3–4 m in many cases). We apply our approach to the Pacific Northwest, which contains some of the world’s tallest trees and has experienced extensive clearcutting. We use SAR imagery acquired at L-band by the PALSAR instrument on the Japanese Aerospace Exploration Agency’s (JAXA) Advanced Land Observation Satellite (ALOS). As SAR data archives expand, our approach can complement other remote-sensing methods and allow time-variable assessment of forest carbon budgets worldwide.


Introduction
The contribution of ongoing global deforestation to climate change has been of increasing concern over the past few decades (e.g., [1][2][3]).An estimated 650 billion metric tons of carbon are stored in forests globally, more than is present in the atmosphere [4].Approximately 12% of anthropogenic greenhouse gas emissions can be attributed to CO 2 emitted as a result of deforestation [5].Any legislation that seeks to curb deforestation rates must be accompanied by a means for verifying compliance and taking inventories of current forest stocks.Such inventories also allow assessment of the efficacy of forest management practices and monitoring of overall forest health.
Currently, many of the statistics on global forests depend on the self-reporting of nations, making them subject to the ability and desire of a nation to accurately report on its forest inventory [4].Vegetation structure and canopy height are two parameters of interest when assessing a forest inventory, and can be combined with allometric relationships to determine the biomass and, in turn, carbon stock of a forested region (e.g., [6][7][8][9]).These inventories are typically undertaken through comprehensive ground surveys or remote sensing methods that rely on LiDAR (e.g., [10][11][12][13][14][15][16][17][18][19]), optical or radar imagery (both the amplitude and interferometric products) (e.g., [11,12,[15][16][17][19][20][21][22][23][24][25]).Ground surveys on a local scale are cost-effective and allow for measurements and observations of forest characteristics such as species diversity, overall health, and forest density to be made with relative ease; however, surveys with the desired short repeat times would require a great deal of time and human labor and, for logistical and political reasons, are difficult to perform in many critical forested regions (e.g., [14,26]).In light of these challenges, remote sensing methods that can be conducted at a global scale with frequent repeat intervals can serve as a powerful complement to ground surveys [9].
Numerous studies have investigated the capabilities of Light Detection And Ranging (LiDAR) to determine forest characteristics such as vegetation density, structure, and canopy height (e.g., [16,[27][28][29]).These studies suggest that, under optimal conditions, canopy height within the footprint of the LiDAR beam can be measured with centimeter-scale accuracy, at least in areas where slopes are <5 degrees [18].A global canopy height map produced using Geoscience Laser Altimeter System (GLAS) data combined with Moderate Resolution Imaging Spectroradiometer (MODIS) data indicates that canopy heights within our study region in the Pacific Northwest (Figure 1) are among the tallest in the world, with many regions containing 35-65 m tall trees [16].Both satellite-and airplane-based platforms are capable of frequent repeat LiDAR measurements, potentially on global scales.However, the interpretation of LiDAR data is complicated by cloud cover and steep slopes, both of which are frequent conditions in many forested regions of interest (e.g., [14,[29][30][31][32]).
Satellite-based synthetic aperture radar (SAR) complements the approaches listed above, since active imaging at microwave wavelengths is possible at night (when optical imagery is not available) and in regions with dense cloud cover (which presents challenges for optical imagery and LiDAR systems).SAR systems generate images that are ~100 km × 100 km in scale with pixels on the order of a few tens of meters or smaller (e.g., [33]).The repeat intervals for SAR satellites range from days to months, potentially allowing for frequent, global repeat observations of forest properties [34].Previous studies have investigated the feasibility of using the backscattered amplitude from SAR as well as the coherence information from pairs of images as a proxy for biomass and vegetation structure, particularly when data from multiple polarizations is available (e.g., [20][21][22][23]27,32,[35][36][37][38][39]).Comparisons of bare-earth digital elevation models (DEMs) with the elevation product produced by the Shuttle Radar Topography Mission's (SRTM) or other SAR platforms have also enabled determination of canopy heights (e.g., [12,15,28]).The use of interferometric phase values themselves, rather than just coherence, has also been explored (e.g., [22,35,36]), particularly for systems where the repeat times are short relative to the timescales of temporal decorrelation (e.g., [33]).Here, we present an approach that uses the phase information from time series of InSAR data acquired on satellite platforms to constrain tree heights over the Pacific Northwest.We focus on a set of SAR imagery acquired at L-band by the ALOS satellite, which results in high coherence in forests relative to the C-band imagery that dominated the available satellite-based SAR data catalogs before the ALOS mission.We focus on differences in interferometric phase averaged between forested and cleared regions (e.g., [35]) assimilated with satellite-based maps of land cover that allow us to apply the phase differencing over large regions affected by time-varying land use.This method, which does not require simultaneous imaging or polarimetric data, capitalizes on the growing sets of SAR time series and related optical imagery datasets that are available worldwide, providing a proxy for canopy height that differs from the approaches described above.

Method: Extracting Tree Heights from Short-Spatial Scale InSAR Phase Variations
Interferometric phase in repeat-track InSAR depends on a combination of the geometry of the satellite or airborne platforms at the time of image acquisition, topographic relief, the reflective properties of the Earth's surface, and changes in the travel time of radar waves between the ground and the imaging platform (e.g., [33]).The latter can include variations in atmospheric water vapor or the ionosphere, as well as deformation of the ground surface due to processes such as earthquakes, landslides or anthropogenic activities (e.g., [33,40,41]).Because the contribution from radar path delays and errors in our knowledge of satellite locations often dominates interferograms (e.g., [42][43][44]), particularly at longer spatial scales (>10 s of km), the value of the interferometric phase itself (rather than just the coherence) has so far been of limited utility in forest studies.In this paper, we present an algorithm for determining forest canopy heights that focuses on short-spatial-scale (<few km) interferometric phase variations between adjacent cleared and forested regions in the Pacific Northwest.Most of the error sources for InSAR that result in coherent biases to the phase (e.g., satellite orbital errors, variations in atmospheric water vapor) cancel out over these short distances.
As described above, interferometric phase is affected by many variables, including errors in the digital elevation model (DEM) used to remove the effects of topography (e.g., [45]).Trees present within a particular pixel will act as sources of apparent -DEM error‖, since they raise the height of the effective phase scatterers within that pixel above the bare ground surface.Differencing of SAR-based DEMs such as SRTM and -bare-earth‖ DEMs (available throughout much of the United States) have resulted in maps of canopy height (e.g., [12,15]).The simultaneous acquisition of images that occurred as part of SRTM resulted in negligible contributions from atmospheric noise and temporal decorrelation, whereas SAR acquisitions from typical satellite platforms are separated by days to months.
In the absence of frequent repeats of missions such as SRTM, we present an approach for extracting information about tree heights using many interferograms and focusing on short spatial scales.The magnitude of the topographic effect on interferometric phase depends linearly on the spatial separation between the satellite's positions at the time of image acquisition (also known as the perpendicular baseline, B ┴ , (e.g., [46]).The contribution from all other factors affecting the interferometric phase is expected to be random with respect to B ┴ .The difference in phase between a cleared region and the immediately adjacent forest for sets of interferograms with a range of B ┴ has been shown to be directly related to the height of the effective scattering phase center for that group of trees (e.g., [35]).Differencing over such a short spatial scale cancels out errors from atmospheric effects and orbital configurations.There are other significant sources of error, such as temporal decorrelation and changes in the dielectric constant of the surface due to factors such as variations in soil moisture within the cleared areas, but none of those results in a change in phase that correlates positively with baseline.
An example interferogram covering a 46-day time interval is shown in Figure 2a.The coherence is typical for interferograms of this time span, with most of the decorrelation occurring in regions of open water or the higher elevations that sometimes experience snowfall.The signals at large spatial scales (10's of km) in Figure 2 can be attributed to variations in atmospheric water vapor and/or inaccuracies in our knowledge of the satellite location at the two acquisition times-the phase value at any particular point provides essentially no information about canopy heights.Also apparent are small (~1 km 2 area), quasi-rectangular features with an interferometric phase value that differs abruptly from surrounding pixels.We attribute these features to the differences in phase scattering height relative to the bare earth in forested vs. cleared regions.Comparisons with optical imagery verify the relationship to logged regions (Figure 2b,c).As expected, the difference in phase between adjacent cleared and forested regions increases linearly with B ┴ when we examine multiple interferograms covering the same area (Figure 3).This result agrees with previous work from [35] using ERS data (C-band) with repeat passes between 3-12 days.Note that in Figure 3 we have chosen the location of forested and cleared regions by hand for clarity-our automated approach skips this time-consuming step.

Data
We use Phased Array type L-band Synthetic Aperture Radar (PALSAR) data acquired by the ALOS satellite between January 2007 and March 2011.The number of acquisitions varies across our study area (Figure 1).We generate interferograms using the freely available Repeat Orbit Interferometry PACkage (ROI_PAC, [47]).Interferograms are downsampled (multilooked) 12 times in the azimuth direction and 4 times in the range direction.We remove topography with the 1 arc-second National Elevation Dataset (NED) product.For each frame, we generate all interferograms with B ┴ < 2500 m and temporal baselines <1 year.We rectify all interferograms to a common grid in radar range and along-track coordinates.

Automated Identification of Cleared Regions
The first critical step during our analysis is the process of determining where forests exist across the entire regions covered by our interferograms.Isolating the cleared regions used in our analysis by hand for all interferograms spanning a region as large as the Pacific Northwest would be prohibitively expensive.Instead, we use a combination of publicly available remote sensing datasets to automate classification of regions within each interferogram as -forested‖, -bare‖ or -unclassified‖.Timber harvesting occurred throughout the timespan of the available SAR data, so we include a time-variable component that ensures that we only average tree heights across time interval when no clearcutting occurred.
In order to characterize the temporal variability of land cover in our study area during the 2007-2011 time frame of our study, we also use yearly Landsat 5 Thematic Mapper (LTM5) acquisitions.Cloud cover was restricted to less than 20%, and for each frame we used the least cloud-and ice-covered acquisition available for that year.We generate our augmented classification map for each year using the ratio of LTM5 band 2 (visible, 0.52-0.60μm) and band 7 (mid-infrared, 2.08-2.35μm).This band ratio allows identification of areas where logging occurs during the time span of our interferograms (Figure 4).However, within a few years after logging, some cleared areas regrow enough vegetation so that they are no longer flagged as -bare‖ by this measure.Therefore, we discard points within interferograms if they are flagged as -forested‖ after previously being categorized as -bare‖ and do not use them for the remainder of the time series analysis (black region within red box, Figure 4).Visual inspection of the data shows that this approach is very successful at identifying the state of, and changes to, the landscape, and that areas where the metric is unreliable are marked as -unclassified‖ (black areas, Figure 4) rather than being incorrectly flagged as forested or bare.

Estimation of Canopy Height and Associated Errors
We estimate canopy height with a running window of 40 × 40 pixels across each frame.At the resolution of the multilooked interferograms used in this study, this box corresponds to an area of approximately 1.5 km × 800 m.At each location, we extract the average phase of pixels flagged as forested or bare for each interferogram.As described above, this set of pixels is time dependent, capturing the progression of logging and regrowth at each site.Any pixels that are flagged differently for the two dates spanned by the interferogram are discarded.If the number of either forested or bare pixels for a given interferogram within the running window is below a set threshold (50 pixels), that particular interferogram will not be included in the height estimation at that location.We only generate height estimates at a given location if more than 10 interferograms meet our criteria.
Phase unwrapping-the process of converting the interferometric observations, which are -wrapped‖ from −π to π (Figure 5), to the total amount of range change-carries with it many potential sources of error.This is particularly true for clearcut signals, since they have a sharp boundary across which it is not clear how many cycles of phase are represented.Therefore, we estimate the average phase in the 40 × 40 window at each location within the interferograms using the full complex phase values at each pixel.We illustrate the process described below in Figure 6, but focus on the difference in phase between two small boxed regions (solid and dashed boxes in Figure 6) for clarity.The phase differences calculated using the 40 × 40 window centered on that particular set of boxes are nearly identical to those shown in Figure 6, but with smaller error bounds because of the larger number of points used.Each pixel in an interferogram is associated with a complex phase value,   .Because of phase wrapping, we cannot simply average the angular values of   -a distribution that is truly centered near a value of π would result in an average value near zero, since there would be equal numbers of values near π and −π (Figure 5a,b, gray circle).However, if we treat each phase value as a complex unit vector, their average is an output vector that does capture the true mean phase of the distribution (Figure 5a,b, black circle).Conveniently, if all pixels within the averaging region have approximately the same phase value, the magnitude of   will be near unity.If the phase values are random, this magnitude will approach zero.For a wrapped Gaussian circular distribution of unit vectors, the variance of the mean, σ 2 , is given by (e.g., [49,50]): At each location, we find the mean complex phase (   ) of all pixels within the averaging region.We estimate the mean phase for both forested (   f ), and bare (   b ) regions and take their difference,   d , also in the complex plane.We determine   f 2 and   b 2 for forested and bare pixel groups, respectively, with the total variance for the phase difference,   d , being the sum of the two variances.The variance of each estimate is limited by the fact that phase values are restricted to the range −π to π. Above a level of σ 2 ≈ 0.6 × 2π, the phase values are essentially random (Figure 5c).We use a more conservative threshold and remove all phase values for interferograms where σ 2 ≥ 0.45 × 2π from our analysis at each location.
The relationship between scattering phase center height and the slope, m, of the line (which goes through zero) relating phase change   d to B ┴ is related to scattering phase center height, z, as follows: where R is range between the satellite and ground, θ is the satellite look angle and λ is the wavelength of the sensor (e.g., [42]).Since   d is -wrapped‖ and can only take values between −π and π, the predicted dependence of inferred phase on baseline takes a sawtooth form.Therefore, the determination of canopy height at each location is nonlinear.
To address this nonlinearity, we perform a grid search through heights ranging from 0 to 100 m.At each height we compute the weighted root-mean-square error between the predicted and observed complex phase differences using the data variances described above, and select the height associated with lowest error at each location (Figure 7).By using the complex phase values, we avoid issues with unwrapping.

Results and Discussion
Our map of inferred height of the L-band phase scattering centers determined for the Pacific Northwest (Figure 7) contains several key features.The Olympic Peninsula, a region known for dense forests and tall trees, is associated with a low density of observations.This is due to forest conservation efforts over much of the Olympic Peninsula.Without cleared areas to use as a local reference for the Earth's surface beneath the forests, height estimates cannot be made using our method.The Tillamook and Clatsop state forests in NW Oregon are similarly associated with low densities of observations.In general, our height estimates are largest within the mountain ranges, and decrease towards the eastern part of Washington and Oregon.This is not surprising, as the eastern parts of Oregon and Washington are very arid compared to the coasts.The Willamette Valley Basin, home to approximately 70% of Oregon's population and a great deal of its agricultural production, is associated with relatively low estimated heights.
We validate our results in two ways-through comparison of adjacent satellite paths and with LiDAR data.The adjacent satellite paths used in this study overlap by ~30 km, allowing us to assess the consistency of our results between independent sets of SAR imagery (Figure 8).The 1σ error bounds for nearby height estimates from different tracks overlap in most instances.In Figure 8, we illustrate the errors on height for both a pixel where the estimates from overlapping tracks agree well (point B) and where they differ by a significant amount (point C).Examination of the error bounds show that the discrepancy at point C results from a larger uncertainty on the heights at that location.This could be due to several factors-the two tracks will likely contain a different number of interferograms that sample different dates, and one set may have experienced more decorrelation or changes in land use than the other.The L-band (λ = 23.6 cm) radar data provided by ALOS does not interact with the very top of the canopy, so we expect the apparent heights to be lower than those obtained in the field, from LiDAR data, or by using C-or X-band SAR sensors [28].The scattering phase center for a given sensor and location will vary depending on factors such as forest density, canopy morphology, wavelength, and incident angle of the sensor (e.g., [51][52][53]).We find that our inferred canopy heights are consistently ~50% smaller than LiDAR-based height estimates using data obtained by the Oregon Department of Geology and Mineral Industries (DOGAMI), available online through the NSF OpenTopography Facility (Figure 9), a result that is consistent with the expected depth of interaction for L-band SAR [28].Application of our approach to global climate studies would require further calibration of the relationship between estimates of the L-band based phase scattering height and stored biomass, as is necessary for most metrics.While our approach compares pixels using a binary -cleared‖ or -forested‖ approach, many regions in the Pacific Northwest have been harvested sequentially at different times in the past decades.Adjacent plots of land often contain trees with substantially different heights (Figure 9), resulting in sharp phase boundaries between regions that are all treated as -forested‖ by our algorithm.Currently, our method extracts an average height for these regions of multi-stage regrowth.As data quality improves and the number of available SAR acquisitions in the archive increases, it may be possible to assess these effects by examining the confidence intervals on our height estimates.Given a sufficiently long time span of SAR acquisitions, determination of changing canopy height as trees regrow after clearcutting should be possible.

Conclusions
Monitoring of forest stock globally and at frequent time intervals is critical to efforts to study and mitigate anthropogenic effects on climate change.We have presented a method for determining regional the height of the L-band phase scattering center using phase differences between adjacent cleared and forested regions observed in L-band interferograms.Our method is restricted to regions that have undergone clearing within the past few years, providing no canopy height estimates in other regions of importance, such as protected forests.We combine the available SAR imagery with time-variable maps of landcover, allowing us to automatically extract differences in phase over large areas without the time-consuming process of hand-picking targets.As described above, in cases where we benchmark the automated approach against hand-chosen forested and cleared regions, the results are indistinguishable.
Our approach also uses the full complex interferometric phase values (instead of unwrapped values) to estimate the error on the average phase estimates in each region.This avoids phase-unwrapping errors that would occur frequently due to the high phase gradient that occurs between forested and clearcut areas.The errors on our height estimates depend on the data coverage at each location, but are typically around 5 m.We compare our results between overlapping InSAR tracks with completely independent sets of SAR imagery and demonstrate that the inferred heights are consistent within our stated errors.Comparisons with LiDAR-based tree heights (difference between first and ground returns) suggest that the L-band height is ~50% of the total height, a result that is consistent with previous work.Further calibration work would be needed before this method could be used to estimate characteristics of carbon stocks, including examination of the effects of tree spacing, species, and identification of any seasonal variation.The existing data catalog is not optimized for this sort of work-unfortunately, areas dominated by different tree species, such as the eastern United States, were only observed by the ALOS satellite around 5 times.We anticipate that the upcoming ALOS-2 mission, set to launch in early 2014, will extend this catalog.A strength of this approach is that it can be applied at any time of day or under any cloud cover conditions, and in regions with high biomass or steep slopes, where application of other approaches that rely on remote sensing can be challenging.
The steadily growing catalog of publicly available SAR imagery will allow further improvement to this and other SAR-based approaches.The assimilation of C-or X-band data would allow better constraints on canopy height than would be possible using L-band-based proxies alone.Shorter repeat intervals such as those proposed for satellites set to launch in upcoming years will result in improved coherence and time series that can better identify changes in forest coverage and heights.

Figure 1 .
Figure 1.Study region and data coverage.Colored boxes indicate the extent of coverage and data density of the five tracks of L-band synthetic aperture radar (SAR) data used in this study.A: Olympic Peninsula; B: Tillamook and Clatsop State Forests; C: Willamette Valley Basin.Area shown in Figure 2 is shaded gray.

Figure 2 .
Figure 2. Example wrapped interferogram (a) from our study region (track 222, frame 890), spanning 18 July-2 September 2009 with perpendicular baseline, B ┴ = 470 m, displayed in radar coordinates (Location in Figure 1, distance scale approximate).Phase values over water (Pacific Ocean) are decorrelated/random.Box indicates location of comparison between phase (b) and optical imagery (c).

Figure 3 .
Figure 3. Two cleared regions (top and bottom rows, respectively) observed in three interferograms.Dates and B ┴ for each: (a,e) 14 April-30 May 2008; −139 m (b,f) 28 November 2007-13 January 2008; 517 m (c,g) 30 November 2008-15 January 2009; 566 m.Each panel is 3 km × 3 km; (d,h) Phase difference between hand-picked bare (dashed box) and forested (solid box) regions for all interferograms (black dots) vs. B ┴ , with the three interferograms shown here highlighted in yellow.Red and blue bars indicate the 1σ values of phase for bare and forested pixels, respectively.Note larger errors on forested pixels, due to volumetric decorrelation.Solid and dashed lines indicate best-fit height and 1σ error bounds, respectively.

Figure 4 .
Figure 4. Example of temporal variability of Landsat-based classification as bare (white), forested (gray) and unclassified (black).Red box indicates region marked as -bare‖ at the beginning of the time series, but that had grown enough vegetation by the end to be classified as -forested‖.We remove pixels that change from -bare‖ to -forested‖ during the time series.Yellow box indicates a region that was harvested between 2009 and 2010.

Figure 5 .
Figure 5. (a) Simulated histogram of wrapped phase values in radians, with σ = 0.3π (heavy black line).Average value estimated from real-valued phase (gray circle) is offset from the true value (black circle); (b) Data from (a) shown on complex plane (rose diagram, scaled to max = 1).Amplitude of complex-averaged value (black circle) increases as σ approaches 0. Gray circle as in (b); (c) Inferred σ (Equation (1)) using simulated noise with a range of input σ, wrapped at −π to π. Inferred σ saturates above a threshold of ~0.6 × 2π.Black dotted line indicates 1:1 relationship and horizontal gray dashed line indicates the cutoff we use in our analysis.

Figure 6 .
Figure 6.Comparison of variance in phase (a,d) between interferograms of varying quality (top row, 18 October 2009-18 January 2010, B ┴ = 603, bottom row, 30 November 2008-15 January 2009, B ┴ = 566) for the same 2.5 km × 2.5 km region.Note that they have similar baselines, so we expect the phase difference between hand-picked bare (dashed line) and forested (solid line) regions to be similar.Phase values for all pixels within the cleared (b,e) and forested (c,f) regions (dots, rose diagram) on the complex plane show different amounts of spread, with cleared regions having less noise on average.Yellow stars in each plot indicate the average complex phase, (

Figure 7 .
Figure 7. Study region and inferred phase center heights.Black boxes indicate the five tracks of L-band SAR data used in this study.A: Olympic Peninsula; B: Tillamook and Clatsop State Forests; C: Willamette Valley Basin.Colored dots indicate the inferred height of the L-band phase scattering center at each location.

Figure 8 .
Figure 8.Comparison of inferred phase scattering center height (a) between two overlapping tracks, T221 (circles) and T222 (stars), overlain on digital orthophoto quad (grayscale).Color indicates height in meters; The error bounds (shaded) on heights (solid red, black lines) for two points where nearby estimates are similar (b) and dissimilar (c) overlap at the 1σ level.

Figure 9 .
Figure 9.Comparison of InSAR-and LiDAR-based results.(a,b) LiDAR-based canopy height map (difference between first and ground return) covering two different areas (color) and location of profiles X-X' (white dashed lines, box shows total area sampled) used in (c,d).Profiles show elevation of ground return (red), individual first returns (trees, green dots) from the LiDAR data.Blue curve indicates the InSAR-based height, offset from the LiDAR-based ground surface.Note that the InSAR height is an average estimated over the entire area shown in panels (a,b).