Estimation of Boreal Forest Growing Stock Volume in Russia from Sentinel-2 MSI and Land Cover Classification

: Growing stock volume (GSV) is a fundamental parameter of forests, closely related to the above-ground biomass and hence to carbon storage. Estimation of GSV at regional to global scales depends on the use of satellite remote sensing data, although accuracies are generally lower over the sparse boreal forest. This is especially true of boreal forest in Russia, for which knowledge of GSV is currently poor despite its global importance. Here we develop a new empirical method in which the primary remote sensing data source is a single summer Sentinel-2 MSI image, augmented by land-cover classification based on the same MSI image trained using MODIS-derived data. In our work the method is calibrated and validated using an extensive set of field measurements from two contrasting regions of the Russian arctic. Results show that GSV can be estimated with an RMS uncertainty of approximately 35–55%, comparable to other spaceborne estimates of low-GSV forest areas, with 70% spatial correspondence between our GSV maps and existing products derived from MODIS data. Our empirical approach requires somewhat laborious data collection when used for upscaling from field data, but could also be used to downscale global data.


Introduction
Growing stock volume (GSV), defined as the total volume of all living tree stems (excluding branches, including bark) in an area of interest or unit area such as a hectare [1], is an essential structural parameter describing a forest. Its use in assessing commercial forestry is well established [2]. It is also of direct ecological and climatological significance, being closely related to the concept of above-ground biomass (AGB) and hence carbon storage [3]. Remote sensing methods have a long history of development for estimation of GSV [1,[4][5][6][7][8][9][10], and a number of approaches have evolved since the 1990s, exploiting spaceborne visible and near-infrared (VNIR) imagery, radar, and more recently the incorporation of airborne measurements from airborne laser scanners and UAV (unmanned aerial vehicle, commonly referred to as a 'drone') observations. The simplest approaches are based on multispectral analysis of freely-available VNIR imagery having a spatial resolution of the order of 10 m or coarser [11][12][13][14][15][16][17]. Useful enrichment of the available feature space has been demonstrated using multitemporal datasets [18][19][20][21], incorporating texture measures [14,22] and field-derived or satellite-derived threedimensional information [23][24][25][26][27][28][29][30]. Other approaches are based on the use of ultra-highresolution VNIR imagery (usually not free of cost) [31,32], radar imagery [1,[33][34][35][36][37][38][39][40][41][42][43][44][45][46], or combinations of VNIR and radar imagery [47][48][49][50][51][52][53]. We should also note approaches based on the direct use of spaceborne laser profiling [54] and those that explicitly incorporate a landscape characterisation, derived from satellite data, into a VNIR [55] or radar [56] analysis. Finally, Zharko et al. [57] have demonstrated the utility of winter VNIR imagery in sparsely forested areas subject to snow cover, where the optical contrast between snow and vegetation can be exploited. Figure 1 attempts to give a simple overview of the current situation regarding remote sensing estimation of GSV. It has been compiled from quantitative data abstracted from many publications [13,14,16,[21][22][23]25,27,32,33,37,40,47,48,50,[56][57][58][59][60]. As Figure 1 shows, typical accuracies for spaceborne methods are approximately 20 to 40% RMSE, becoming somewhat poorer at lower values of GSV. Horizontal axis shows the mean GSV of the plot or plots included in the studies, and the vertical axis shows the estimated error in the calculated GSV expressed as a percentage of the mean. Methods are roughly classified as airborne (using data from ALS (Airborne Laser Scanning) or UAVs), optical (satellite data from Sentinel-2 and Landsat), ultra-high resolution satellite data (with spatial resolution of 1 m or finer), radar (satellite radar data), or 'multiple', using more than one data type. 'Present' summarises the performance of the method developed in the present work. The dashed line is an empirical fit to the data except for the 'airborne' class, and has the formula RMSE = 436 (GSV) −1/2 , where RMSE is in % and GSV in m 3 ha −1 .
The work presented in this paper is focused specifically on estimation of GSV in the Russian boreal forest. The Boreal forest generally, and the Russian part in particular, is poorly inventoried [1,61] and yet its importance in our understanding of the global climate system is high [62]. Globally, the boreal forest accounts for 31% of the area, 20% of the GSV, and 13% of the above-ground biomass of all forest whereas Russia, which is dominated by boreal forest, accounts for 20% of the global forest area [63]. However, a recent study based on remote sensing data has shown that the growing stock of Russian forests is 39% higher than the value of official statistics in the State Forest Register [61]. Multispectral VNIR remote sensing-based GSV estimation for boreal forest is particularly challenging because of the low canopy coverage, combined with the fact the field layer is often composed of dwarf shrubs that are spectrally not very different from the forest canopy [64]. Although airborne methods have undoubted potential to improve our ability to estimate GSV for boreal forests, it will often be impractical to obtain airborne data, especially over large areas. There is thus an incentive to develop simple optimised estimation algorithms that exploit freely and frequently available satellite data. In the present work, we develop one such algorithm that uses an empirical estimation function combining both Sentinel-2 MSI imagery and a land-cover classification derived from this imagery and trained using MODIS (Moderate Resolution Imaging Spectrometer) data. The novel feature of this new method is thus the inclusion of land-cover as a potentially informative source of information on GSV, in addition to the multispectral bands of the MSI imagery itself. The explicit aim of this approach is to minimize its dependence on multiple data sources that may be difficult to acquire on a routine basis. The algorithm is locally tuned, and developed for two contrasting areas of the Russian boreal forest. Although we do not assert that these represent the complete range of types of boreal forest in Russia, their contrasting natures allow the method's potential for generalisation to be assessed to some extent.

Materials and Methods
Field data were collected as part of the project "Multiplatform remote sensing of the impact of climate change on northern forests of Russia", a Russian-UK collaboration that ran from 2018 to 2021. The broad aim of this project was to develop a systematic understanding of the distribution of biomass in the Russian boreal forest, its changes during the first 20 years of the present century, and the climatic influences on it. Fieldwork took place in July-August 2018 in and around the Khibiny mountains in north-western Russia, and in July-August 2019 in the vicinity of Yakutsk, Sakha Republic, in northeastern Russia ( Figure 2). The boreal forest in the Khibiny region is dominated by pine, spruce, and birch species, and lies within the area of Russian forest classed as 'accessible' [65]; whereas around Yakutsk it is dominated by larch, pine, and birch species and is classed as 'hard-to-reach'. Logistical support for the Khibiny fieldwork was provided by the Khibiny Educational and Scientific Station (67°38′ N, 33°44′ E) [66]  Location of study areas within the Russian boreal forest. Background map shows dominant forest types and is simplified from [67] using original data from [68]. Map prepared by the authors.
Within each of the two main study areas, a number of sample plots of 20 m × 20 m area were established, geolocated using non-differential GPS. Plots were accessed by road vehicle and on foot, and were selected to span, as far as practicable, the range of forest type and condition characteristic of the areas. The plots were judged by eye to be homogeneous in tree density. Where the number density of trees was particularly high, smaller plots were occasionally chosen.
Data suitable for estimating the GSV per plot were collected by measuring all stem diameters d (diameter at breast height-DBH-at 1.3 m) and tree heights h, together with tree genera, in each 20 m × 20 m plot. Trees were counted and measured only if their heights exceeded 2 m. (A height threshold was quicker to implement in the field than the equivalent DBH threshold of approximately 3 cm. The data collection protocol in this study also complied with previous studies of forests near the treeline, where the tree was defined as woody vegetation over 2 m tall. The resulting data are more complete in the number of stems per site, which is important in comparing to previous data.) Diameters were measured using a Haglöfs Mantax 40 cm tree caliper or by passing a flexible measuring tape around the stem, and heights were measured using an optical clinometer (Suunto PM5, Silva Clino Master CM and similar) with a measuring tape to determine the distance from the observer to the base of the tree. The accuracy of both methods was estimated to be 5%. In total, 1858 trees were measured across a total of 33 field plots (total area sampled was 8675 m 2 ).
Stem volumes V of individual trees were calculated from their geometrical measurements making use of the collection of allometric formulae compiled by Zianis et al. [69]. Although this compilation is explicitly for European trees, we propose that the shapes of individual trees in Siberia will not differ substantially from those of similar species in Europe. We justify this assumption on the basis of, first, our own informal observations and, second, the fact that the published formulae do not suggest large variations in stem volume between different tree species of the same genus, height, and DBH. Specifically, we proceeded as follows. For a given tree genus (e.g., Betula), all the allometric formulae contained within the compilation of Zianis et al. [69] for trees of the same genus were applied using the measured tree dimensions, and the median value of the calculated stem volumes was adopted as the value for that particular tree. The formulae were transcribed into the programming language GNU Octave [70] in order to facilitate the huge number of calculations required by this method. Some obvious errors in the published formulae (wrong units specified for the measurement of tree stem volume, height, or DBH, which produced errors of more than one order of magnitude) were corrected. By comparing the within-species and between-species variations of GSV within each genus, we estimate the uncertainty arising from this approach as 0.007 m 3 . On this basis, we estimate the uncertainty arising from the application of the Zianis et al. formulae to be of the order of 10%, so including the uncertainty in the measurements themselves, we estimate the fractional uncertainty in a calculated value of V to be 25%. By extrapolating the height-volume and diameter-volume relationships established from our measurements, we estimate that the proportion of total tree GSV that we did not sample was less than 0.1%.
Four Sentinel 2 Multispectral Imager (MSI) images were used in this study, consisting of a summer image and a winter image for each of the two main study areas (Table 1). Selection criteria were that the images should provide good coverage of all the field plots, should have no identifiable cloud or smoke cover over the field plots and be generally as free of cloud as possible, and that they should correspond to the year in which the corresponding field data were collected. In practice, application of these criteria required that three of the four images were supplied at level 1C (top-of-atmosphere reflectance) whereas one was available at level 2A (atmospherically corrected). The Sakha summer image exhibited several areas of cloud and smoke that were not adequately removed using the supplied cloud mask layer, and these were masked out manually. Following these steps, the images were clipped to rectangular areas large enough to include all the training plots. The resulting areas after masking (shown in Figure 3) were 7795 km 2 (Khibiny) and 22,042 km 2 (Sakha). The rather large difference in area (factor 2.8) was mainly a consequence of the spatial distribution of the field plots, controlled by the practical difficulties of access in the study areas.   Table 1.
Land-cover classifications were produced for the two study areas from the available MSI images using the Semi-Automatic Classification plugin (version 6.4.5: [71]) for QGIS version 3.14 [72]. Training data for these classifications were produced by generating classified point objects from the MODIS-based Russian land cover map [73,74] available through the VEGA system of the Space Research Institute of the Russian Academy of Sciences [75]. The 'Point object' tool in the VEGA system was used to generate the classified point objects. VEGA superimposes a grid over the MODIS land-cover map for a chosen extent, and if a node of this grid falls into a pixel that is surrounded by pixels of the same land-cover class, then the grid node becomes a point object of this land-cover class. The points are then filtered to provide a similar number of points for each landcover class. Square buffers of 30 m × 30 m were applied to these point objects, and spectral signatures for any of the VEGA-defined land-cover classes were calculated over the eight available 10-m resolution bands (i.e., bands 2, 3, 4, and 8 of the summer and winter images) using the pixels which intersected with the square buffers.
Some manual intervention was required to remove obviously incorrect point training data. In all cases, these were observed to be a consequence of the mismatch in spatial resolution between the MSI images and the MODIS data used to generate the training points (for example, a point classified as needleleaf forest that actually lay within a small lake). In total, 7578 classified points in 13 land-cover classes, and 6524 classified points in 10 land-cover classes were generated for Khibiny and Sakha, respectively. The two combined 8 band Sentinel 2 summer and winter images, one for Khibiny and one for Sakha, were then classified using the relevant spectral signatures and the maximum likelihood algorithm. The sets of image classes were then reduced to include only those occurring within 3 × 3 pixel neighbourhoods of the centre locations of the field plots (8 and 5 remaining classes in Khibiny and Sakha respectively2). These image classes were further generalised, as described in Section 3, to four and three in number, respectively, as our method of estimating GSV depends on limiting the number of potential explanatory variables.
Modelling of GSV per unit area, G, was based on individual summer Sentinel-2 MSI band values and land-cover classifications. To better accommodate the dynamic range in the calibration data, and to ensure that the model could not generate non-positive values of G, the empirical model was defined such that the natural logarithm of G, ln G, was a linear function of the variables. The generic model had the form where Bi is the pixel value in band i of the Sentinel-2 MSI image, and Cj is the number of pixels (out of a maximum of nine) in the 3 × 3 neighbourhood of the pixel to be modelled that are assigned to merged land-cover class j. Because the number of available MSI bands was 4 (bands 2, 3, 4, and 8), the value of n could have been anything from 0 to 4. And as the number of generalised land-cover classes was four (Khibiny) and three (Sakha), the value of m could have been anything from 0 to these values. Thus the total number of parameters in the model defined by Equation (1) could be as many as 7 or 8. We chose, however, to limit the actual number (i.e., the sum of n + m) to three in each case. The choice of MSI bands i to include in the model, and the number and merging of land-cover classes j, were determined experimentally using the field-based estimates of G as training data. The values of the coefficients ai and bj were determined through linear least-squares regression analysis, and performance was assessed using leave-one-out error analysis. Separate modelling exercises were performed for the two study areas, and the optimal models (i.e., those that resulted in the smallest RMSE errors in lnG) were applied to the entire MSI image area. Non-forest areas, as defined by the land-cover classifications, were masked out, as were water bodies, identified by applying a threshold of 0.3 to calculated values of the normalised difference water index (NDWI: [76]). Small water bodies are particularly abundant in the Sakha study area. A 10-m buffer was applied to all detected water bodies to remove marginal pixels. A graphical summary of the processing chain by which the GSV G was estimated is given in Figure 4.  Table 4, then quantified (4) to generate a multi-band image in which each 'band' represents the number of pixels within a 3 × 3 neighbourhood corresponding to a particular class in the GLCC. The GSV estimator function is produced using equation (1), with the summer MSI image and the 3 × 3 class counts image as inputs, trained using field estimates of GSV (5). Finally, the GSV estimates for the whole study area are produced using the estimator function, class counts, and MSI summer image (6). Figure 3) in the Khibiny and Sakha study areas. Tables 2 and 3 include an indication of the composition of each plot as a percentage GSV represented by each of the main tree genera (i.e., last four columns of Table 4), and these were used to guide a process of generalisation of the land cover maps to maximise the correspondence (by minimising the Kramér V-statistic on contingency tables: [77]) between the land cover and the composition. These generalisations are shown in Table 4 and the resulting generalised maps are shown in Figure 5. The names allocated to these generalised classes are arbitrary and have no ecological significance, and in particular the use of the same generalised names between the two study areas does not imply any ecological equivalence between them. We emphasise that no inferences are drawn from the names of these classes.    The optimum model for the Khibiny area employed a single band (band 3: green) of Sentinel-2 MSI data, together with the 'low vegetation' and 'needleleaf forest' generalised classes. The optimum model for the Sakha area employed two MSI bands (2: blue and 3: green), together with a single generalised land-cover class 'Needleleaf forest'. Coefficients of the optimum models are shown in Table 5, together with their accuracy estimates. Figures 5 and 6 show the results of applying these models to the entire area represented by the MSI images. The logarithmic values generated by applying Equation (1) were transformed to linear values by exponentiation. Mean, standard deviation, and median GSV estimated for forest areas in the Khibiny area were 102, 34, and 98 m 3 ha −1 . The corresponding values for the Sakha area were 118, 91, and 99 m 3 ha −1 . These values are comparable to the mean value of 72 m 3 ha −1 deduced for boreal forest globally [63]. The pseudocolour scales of Figures 6 and 7 are different, corresponding to the different distributions of estimated GSV in the two study areas. Table 5. Parameters and coefficients of the optimum GSV models defined by Equation (1), together with r 2 coefficient of the fit to the field data and an estimate of the uncertainty ∆ ln G in fitting the natural logarithm of GSV from leave-one-out estimation.

Discussion
The premise of this study is that the inclusion of a land-cover classification, suitably converted into quantitative data, can provide useful ancillary input to an empirical model to estimate forest GSV from summer Sentinel-2 MSI imagery. This has proved to be the case, at least in the two study areas investigated, as the optimum models in both cases selected at least one of the land-cover classes as input. Some experimentation was performed to include winter imagery but this did not materially improve the performance of the method.
Inspection of the GSV model output showed some anomalously high GSV values, especially in areas that may be partly in shadow. This evidently points to the empirical, nonphysical basis of the algorithm and suggests that the incorporation of topographic data would have scope for improving its performance. However, we note that coverage of both study areas in the ArcticDEM product (https://www.pgc.umn.edu/data/arcticdem/, accessed 01 November 2021) is at present incomplete, and that either this or the ASTER GDEM-which does offer complete coverage-would require filtering for artefacts, thus increasing the complexity of the algorithm. Some small shadow-affected areas were removed by the median filtering noted earlier, and some further improvement was made by truncating the predicted GSV values at an upper bound of 500 m 3 ha −1 , to limit the extent to which values were extrapolated beyond the range of the calibration data. This removed 1.7% of the pixels in the Sakha image. Far fewer anomalies (approximately 0.001%) were noted in the Khibiny image, and GSV truncation was not applied. The distribution of estimated GSV values was narrower than that for the Sakha image (e.g., a standard deviation of 34 m 3 ha −1 compared to 91 m 3 ha −1 ) and no truncation was deemed necessary.
In contrast, the spatial correspondence between the modelled GSV and ultra-highresolution satellite imagery ( Figure 8) and a large-scale MODIS-based GSV product ( Figure 9) is evidently good at both small and large spatial scales. Visual comparison (Figures 8 and 9) is convincing. We also quantify the correspondence by constructing 2 × 2 contingency tables between above-and below-median GSV values as classified using our method and using the MODIS GSV product. These show accuracies (proportion of pixels agreeing whether the GSV is above or below the median value) of 70.5% for the Khibiny study area and 68.0% for the Sakha area. The relationship between our estimated GSV values and those derived in the MODIS-based product is shown in Figure 10, demonstrating a monotonic (if not linear) correspondence. We recall that the present algorithm was not calibrated to the MODIS product, but only against field data. These observations lend confidence in the present method.
The accuracy of this method is summarised in Table 5, whose results are interpreted as implying that the RMS error in GSV estimation is approximately 35% for the Sakha study site and approximately 55% for the Khibiny site. These values are included in Figure  1, where they imply that the method is not obviously inferior to other approaches to GSV estimation for sparse forests based on spaceborne optical or multispectral data. However, we also note that up to approximately 25% uncertainty in GSV may be contributed by the allometric estimation, so the algorithm's performance may be considerably better than these values imply. We thus propose that it is worth developing this approach. Its principal disadvantage is that it is constructed on an empirical rather than a physical relationship. This is compensated for by the fact that it is derived from a large number of field measurements which are labour-intensive to acquire, although spaceborne laser altimetry from GLAS and ICESat-2 could offer some scope for acquiring GSV estimates for calibration [78]. Additionally, the strong correspondence noted in Figure 9 suggests that its most useful application may be as a downscaling tool from large-scale GSV estimates, where its requirement for just two Sentinel-2 images (or similar) would be a major advantage. Obvious future developments would be to attempt to derive the calibration data themselves from potentially less time-consuming data collection methods, such as UAV surveys, or from published databases of field measurements over a wider range of locations. Forest presence data could be obtained at higher spatial resolution from Landsat-derived products [79,80].

Conclusions
We have developed a simple, empirically-based algorithm for spatial extrapolation of GSV based on one summer and one winter Sentinel-2 MSI image, a large-scale Russian land-cover classification, and field-plot scale GSV data used for parameter selection and calibration of the algorithm. It has been applied to two contrasting regions of the Russian boreal forest and produces convincing patterns of spatial variation as well as mean GSV values consistent with what is expected for boreal forest in general. Over the limited range of situations to which it has been applied, it appears that its accuracy is comparable to, and perhaps better than, other local or regional-scale methods used to estimate GSV on the basis of satellite imagery. The essence of the method is to optionally include a simple description of land-cover, which is converted into a set of quantitative variables, along with the values (reflectances, radiances, or digital numbers) from the available bands of the MSI images. This approach is relatively undemanding of data availability. As it has been implemented in the present work, it is trained using field data that are laborious to acquire; but as a downscaling method for large-scale GSV products such as those generated from MODIS imagery this requirement for field data would not be necessary.  We also gratefully acknowledge financial support and much encouragement from the UK Science and Innovation Network through the British Embassy in Moscow. The EU Transnational Access Interact scheme provided financial and logistical support for access to and use of facilities at the Khibiny and Spasskaya Pad field stations. Data Availability Statement: All data used in this research are either included in the manuscript or publicly available.