Deep Learning Model Transfer in Forest Mapping using Multi-source Satellite SAR and Optical Images

Deep learning (DL) models are gaining popularity in forest variable prediction using Earth Observation (EO) images. However, in practical forest inventories, reference datasets are often represented by plot- or stand-level measurements, while high-quality representative wall-to-wall reference data for end-to-end training of DL models are rarely available. Transfer learning extends the use of DL models to areas with sub-optimal training data by allowing the model to be pretrained in areas where high-quality training data are available. In this study, we perform a "model transfer" (or domain adaptation) of a pretrained DL model into a target area using plot-level measurements and compare its performance with that of other machine learning models. We use an earlier developed UNet-based model (SeUNet) to demonstrate the approach on two distinct taiga sites with varying forest structure and composition. Multisource EO data are represented by a combination of Copernicus Sentinel-1 C-band SAR and Sentinel-2 multispectral images, a JAXA ALOS-2 PALSAR-2 SAR mosaic and TanDEM-X bistatic interferometric radar data. The training study site is located in Finnish Lapland, while the target site is located in Southern Finland. By leveraging transfer learning, the SeUNet prediction achieved a root mean squared error (RMSE) of 2.70 m and R$^2$ of 0.882, considerably more accurate than traditional benchmark methods. We expect that such forest-specific DL model transfer can also be suitable for other forest variables and other EO data sources that are sensitive to forest structure.


I. INTRODUCTION
FORESTS cover approximately one-third of Earth's landmass (FAO 2022) and play a key role in mitigating the effects of climate change by reducing the concentration of carbon dioxide in the atmosphere. Forests are fundamental for preserving biodiversity as they are the natural habitat of a myriad of plant and animal species. Several international initiatives, such as the Framework Convention on Climate Change (https://unfccc.int/) and the Convention on Biological Diversity (https://www.cbd.int/) by the United Nations, and the new Forest (EU 2021) and Biodiversity (EU 2020) strategies by the European Union, induce increased forest monitoring requirements at national and local levels by increasing the reporting requirements.

Shaojia Ge is with the Department of Electronic Engineering, School of Electronic and Optical Engineering, Nanjing University of Science and Technology, Nanjing 210094, China (e-mail: geshaojia@njust.edu.cn). Oleg Antropov, Tuomas Häme and Jukka Miettinen are with VTT Technical Research Centre of Finland, 00076 Espoo, Finland (e-mail: name.surname@vtt.fi). Ronald E. McRoberts is with the Department of Forest Resources, University of Minnesota, Saint Paul, MN 55108, USA (e-mail: mcrob001@umn.edu). Correspondence: oleg.antropov@vtt.fi.

Interest in voluntary certification schemes and diversification of forest uses (e.g., carbon or other ecosystem services) is also growing among forestry stakeholders, which typically requires improved forest monitoring approaches for verification purposes, further increasing the need for reliable high-frequency information on forests. The use of Earth Observation (EO) data has become an integral part of forest monitoring due to numerous satellite sensor missions designed for environmental monitoring during the last decades [1], [2]. However, as satellite sensors typically cannot directly measure many forest attributes, indirect modelling approaches have been developed that combine EO data with ground-based observations to provide a set of predictions presented in the form of a map. In wide-area forest mapping, statistical, physics-based, and machine learning (ML) methods have been used for modeling and prediction purposes [3]. Satellite remotely sensed data combined with in-situ forest measurements can be considered a cost-effective means for producing forest attribute maps and forest estimates on various areal levels, but such predictions and estimates can have considerable uncertainty [2].
EO datasets used for mapping of boreal forest resources include both optical and SAR datasets [3], [4]. Optical datasets were widely used historically and are particularly suitable for mapping forest cover and tree species [5]. The SAR signal, especially at lower frequencies, is sensitive to forest biomass and structure [3]. Multitemporal and polarimetric features provide improved accuracies [6], [7]. Interferometric SAR datasets are particularly sensitive to the vertical structure of forest, which makes them very suitable predictors for mapping forest structure and height [8], [9].
Deep learning (DL) methods are widely adopted for various image classification and semantic segmentation tasks [10]-[12]. To date, several fully convolutional and recurrent neural networks have been demonstrated in forest remote sensing [13]-[20]. These models often provide improved accuracy in forest classification or in predicting forest variables, as well as in forest change mapping. However, training of DL models often requires a fully segmented reference label, such as airborne laser scanner (ALS) based forest attribute maps, which are costly and not available over wide areas.
Instead, reference data from forests are typically available as field sample plot measurements. The sample plots have traditionally been used to derive sample-based areal estimates of forest resources, but in recent decades also as in-situ data for training and evaluating remote sensing based models. Within the DL context, such reference data can be considered weak labels [18] and they are not optimal for training a DL model. In addition to the typically insufficient number of sample plots, they lack information on the spatial context.
Although there is an increasing amount of ALS data available over boreal forests, current airborne campaigns do not provide the spatial and temporal coverage required to meet the needs of modern forest inventories. For example, annual information would be needed to support forest management decisions and monitoring requirements in an operational context. Therefore, the lack of appropriate ALS data greatly reduces the utility of DL models for operational forest monitoring. With transfer learning techniques, it may be possible to fine-tune DL models to the area of interest using data from field sample plots, which are generally more widely available in managed forest areas. This approach would open possibilities for both geographic (i.e., from an area that has ALS data to an area that has not) and temporal (i.e., from a year that has ALS data to a year that has not) transfer of DL models, facilitating wider usability of DL-based approaches in operational forest monitoring.
In this study, we aim to demonstrate the potential of transfer learning in the context of forest monitoring with deep learning models. We pretrain an earlier developed SeUNet deep learning model [15] with ALS-based data in a training site situated in Lapland in the Northern part of Finland. Further, we apply the pretrained model in the target site in the Southern part of Finland, with and without fine-tuning the model with field sample plots from the target area. We highlight the effects of the model fine-tuning with field plot data and compare the resulting accuracy to traditional machine learning methods trained in the target area using the same field plots. We also investigate scenarios where reference forest plots in the target site are very scarce or measurements for certain forest strata are completely missing. Additionally, we compare various EO sensors in terms of prediction accuracies.

II. MATERIALS AND METHODS
A. UNet model and its derivative/improved models
One of the main and most frequently used CNN architectures for semantic segmentation of satellite images, including forest mask segmentation, is the UNet architecture. UNet is a variant of convolutional network originally introduced for biomedical image segmentation [21] and is presently often used in various semantic segmentation and regression tasks [15], [19], [22]. The basic UNet (also known as Vanilla UNet) uses a convolutional network to extract image features. The UNet model consists of an encoder and a decoder, which are connected by skip connections. The encoder is responsible for feature extraction and the decoder is used to restore the feature map to its original size. The model is symmetrical in structure and has a double-convolution structure at its core, which is made up of a 2-D convolution, batch normalization, and ReLU activation. This structure allows UNet to extract deeper features of the input data and maintain a good fusion ability at all levels, while keeping the feature map size unchanged. The overall architecture of UNet makes it well-suited for pixel-level classification and regression tasks.
Here, we use an improved version called SeUNet, suitable for producing spatially explicit pixel-level forest inventory using EO data [15] when trained with fully segmented (spatially explicit) image patches, such as ALS-based forest inventory data. Within SeUNet, a Squeeze-and-Excitation attention module is used to recalibrate the multi-source features using channel self-attention to improve the accuracy of predictions with limited reference data. SeUNet was superior to the basic UNet model and was shown to be particularly effective in boreal forest height mapping using Sentinel-1 time series and Sentinel-2 datasets [15]. The structure of the model is shown in Figure 1.
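
The channel recalibration performed by a Squeeze-and-Excitation module can be illustrated with a minimal pure-Python sketch. It is not the SeUNet implementation: the function name, the weight matrices `w1` and `w2`, the omission of biases, and the list-based tensor layout are all assumptions made for illustration, and where such modules sit inside SeUNet is not specified in this section.

```python
import math

def se_recalibrate(feature_maps, w1, w2):
    """Rescale each channel by a gate in (0, 1) computed from all channels.

    feature_maps: list of C channels, each a 2-D list (H x W) of floats.
    w1: (C/r) x C weights of the reduction layer (hypothetical name).
    w2: C x (C/r) weights of the expansion layer (hypothetical name).
    Biases are omitted for brevity.
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    z = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in feature_maps]
    # Excitation: bottleneck layer with ReLU, then sigmoid gating.
    hidden = [max(0.0, sum(w * v for w, v in zip(row, z))) for row in w1]
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w2]
    # Scale: multiply every pixel of a channel by that channel's gate.
    scaled = [[[g * px for px in row] for row in ch]
              for ch, g in zip(feature_maps, gates)]
    return scaled, gates
```

Because the gates depend on the pooled statistics of every input channel, an informative layer (for example an interferometric height layer) can be emphasized relative to a noisier one on a per-patch basis.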

B. How transfer learning is organized
The concept of transfer learning in deep learning involves using a model pretrained on a large dataset as a starting point for training on a new, smaller dataset [23]-[25]. Since both study areas (initial and target sites) are represented by boreal forests, we assume that their latent representations strongly overlap in EO feature space, although the specific forest characteristics may differ (the forest in Lapland is considerably sparser). This means that the prior knowledge learned from the source site can mitigate the negative influence of limited reference data, e.g., plot-level forest reference data that limit the inference of spatial context (neighbourhood features). Our hypothesis is that by leveraging the knowledge gained from pretraining on a spatially explicit dataset, we can achieve better results than training from scratch with conventional statistical and machine learning approaches using the limited reference forest plot data collected at the new target site.

C. Our approach
Our suggested approach follows the study logic shown in Figure 2. First, a SeUNet model is pretrained using a multisource EO dataset and spatially explicit reference based on ALS data. This is done over the pretraining site where such reference data are available. Both spectral and spatial forest signatures are learned as parameters of the pretrained UNet model. Various combinations of input satellite EO data are tested, resulting in a set of DL models.
In the second stage, the learned SeUNet parameters are used as the initial weights for model training over the target site. In contrast to the pretraining site, only a very sparse reference dataset is available from the target site, represented by forest plots. The model is fine-tuned by including only pixels that have a known reference value in the loss computation. Both the pretraining phase and DL model training over the target site are illustrated in Figure 3. We also investigate scenarios for which only a fraction (5-10%) of the originally available forest plots are used, and when several forest strata are underrepresented.
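
Restricting the loss to labeled pixels can be sketched as a masked mean squared error. The sketch below is a plain-Python illustration only: the choice of squared error, the use of NaN as the no-data marker, and the function name are assumptions, since the loss function itself is not named in this section.

```python
import math

def masked_mse(pred, target):
    """Mean squared error over pixels that carry a reference value.

    pred, target: 2-D lists of equal shape. target holds NaN wherever no
    plot measurement exists (NaN as no-data marker is an assumption).
    """
    total, count = 0.0, 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if math.isnan(t):
                continue  # unlabeled pixel: contributes nothing to the loss
            total += (p - t) ** 2
            count += 1
    if count == 0:
        raise ValueError("patch contains no labeled pixels")
    return total / count
```

Unlabeled pixels still pass through the network and shape the convolutional features of their labeled neighbours, but they produce no gradient signal of their own.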
Lastly, our predictions are compared to predictions obtained using traditional EO-based forest inventory methods, including multiple linear regression (MLR) and the popular k-Nearest Neighbours (k-NN) technique [7], [26]-[28]. MLR is a basic regression approach often used for modeling the relationship between response variables, such as growing stock volume (GSV) or forest biomass, and SAR and optical image features [29]-[31]. k-NN is an established non-parametric and distribution-free method widely used for forest variable prediction [32]. Predictions are obtained as weighted linear combinations of attribute values in a set of k nearest units selected from a reference set of units with known values. The choice of these units is determined by a distance metric defined in the auxiliary variable space.
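
The k-NN predictor described above can be sketched in a few lines. Euclidean distance and inverse-distance weights are assumptions chosen for illustration; the distance metric, the weighting scheme and the value of k used in the study are not stated here, and the function name is hypothetical.

```python
import math

def knn_predict(query, ref_features, ref_values, k=5):
    """Inverse-distance-weighted k-NN prediction for one target unit.

    query: feature vector of the unit to predict.
    ref_features, ref_values: features and known attribute values
    (e.g. plot heights) of the reference units.
    """
    # Distances in the auxiliary (EO feature) space, nearest first.
    nearest = sorted(
        (math.dist(query, f), v) for f, v in zip(ref_features, ref_values)
    )[:k]
    # A small epsilon avoids division by zero for exact feature matches.
    weights = [1.0 / (d + 1e-9) for d, _ in nearest]
    return sum(w * v for w, (_, v) in zip(weights, nearest)) / sum(weights)
```

With normalized non-negative weights, each prediction is a convex combination of observed reference values, a property that matters when the reference sample does not span the full range of the target variable.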

D. Study sites, satellite SAR and optical data
The study is performed in Finland over two geographically distinct areas, separated by around 700 km (Figure 4). Both sites feature boreal forests, one in the Northern boreal zone and the other in the Southern boreal zone. The forests in Northern Finland differ in many respects from the forests in Southern Finland, which makes this pair of study sites particularly suitable for demonstrating the potential of DL model transfer. The Northern (pretraining) site is the Salla site in Lapland, and the target site is the Kouvola site in Southern Finland.
• C-band SAR imagery was represented by 39 Sentinel-1 images acquired during 2018 in the same geometry. The original dual-polarization Sentinel-1A images, available as GRD (ground range detected) products, were radiometrically terrain-flattened and orthorectified with VTT in-house software using a local digital elevation model available from the National Land Survey of Finland [34], [35]. Final preprocessed images were in gamma-naught format [36].
• L-band SAR imagery was represented by a JAXA mosaic produced from dual-pol ALOS-2 PALSAR-2 images acquired during 2018.
• Interferometric SAR layers were represented by TanDEM-X images collected during summer 2018. The TanDEM-X canopy height model is calculated by subtracting the topographic phase (calculated from a local topographic map) from the TanDEM-X interferometric phase in slant range, followed by phase-to-height conversion and geocoding of the obtained height product. It is referred to below as the interferometric canopy height model (ICHM). Additionally, TanDEM-X coherence magnitude was used as an image feature layer. ESA SNAP software was used for calculating the TanDEM-X image layers.
2) Reference data: Over the initial pretraining site, ALS-based heights were used. ALS data were collected by the National Land Survey of Finland during summer 2018. Forest heights were estimated from ALS point clouds as the average elevation of forest-classified points over the ground layer within 20 m × 20 m pixel cells. In this way, wall-to-wall coverage of the pretraining study site with reference height information was obtained.
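
The gridding step can be illustrated with a short sketch that averages height-above-ground values of vegetation returns per 20 m cell. The input format (tuples of easting, northing and height above ground, already classified and normalized to the ground layer) and the function name are assumptions; the actual ALS processing chain is not described beyond the sentence above.

```python
from collections import defaultdict

def grid_mean_height(points, cell=20.0):
    """Average height above ground per square grid cell.

    points: iterable of (x, y, h) with x, y in metres and h the height of a
    forest-classified return above the ground layer (assumed precomputed).
    Returns {(col, row): mean height}; cells without returns are absent.
    """
    sums = defaultdict(lambda: [0.0, 0])
    for x, y, h in points:
        key = (int(x // cell), int(y // cell))
        sums[key][0] += h
        sums[key][1] += 1
    return {key: total / n for key, (total, n) in sums.items()}
```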
Reference data over the target Kouvola site were represented by data for a sample of plots measured by the Finnish Forest Centre in 2018. The plots were circular with three different radii: 9 m in young and advanced managed forests with a relatively large tree density; 12.62 m in forests with a small stem density but usually large volume due to the mature development stage; and 5.64 m in seedling stands. Altogether 1064 field plots were used. Two thirds of the plots (709) were used for model training (model transfer), while the remaining 355 (selected as every third plot after ordering the plots by volume) were used for the uncertainty assessment.
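
The systematic split can be sketched as follows. Starting the count at the first ordered plot is an assumption; with 1064 plots it yields the 355/709 partition reported above. The function and argument names are hypothetical.

```python
def split_every_third(plots, key):
    """Order plots by `key` and hold out every third one for testing."""
    ordered = sorted(plots, key=key)
    test = ordered[0::3]  # assumption: counting starts at the first plot
    train = [p for i, p in enumerate(ordered) if i % 3 != 0]
    return train, test
```

Selecting test plots systematically along the ordered volume axis keeps the training and test subsets similarly distributed over the range of forest development stages.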

E. Implementation details
In model pretraining, the wall-to-wall reference and EO data were first cropped into non-overlapping image patches of 256 × 256 pixels in the spatial dimension. The non-forested regions were removed by masking out the corresponding areas in both the EO data and the reference data. Additionally, patches with a forest cover proportion of less than 20% were removed. In total, 614 image patches were prepared, half of which (307 patches) were randomly assigned to the testing subset, 10% were assigned to the validation subset and the remaining patches were used in the model training. Later, after data augmentations that included spatial shifting and rotations, 1433 augmented training patches were generated.
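
The patch preparation can be sketched as a scan over non-overlapping windows with a forest-cover threshold. The sketch operates on a binary forest mask stored as nested lists; skipping incomplete patches at the right and bottom edges is an assumption, as is the function name.

```python
def forest_patches(forest_mask, size=256, min_forest=0.2):
    """Yield (row, col) offsets of non-overlapping patches to keep.

    forest_mask: 2-D list of 0/1 values (1 = forest).
    Patches with a forest proportion below `min_forest` are dropped.
    Incomplete edge patches are skipped (assumption).
    """
    n_rows, n_cols = len(forest_mask), len(forest_mask[0])
    for r in range(0, n_rows - size + 1, size):
        for c in range(0, n_cols - size + 1, size):
            forest = sum(sum(row[c:c + size])
                         for row in forest_mask[r:r + size])
            if forest / (size * size) >= min_forest:
                yield r, c
```

The returned offsets can then be used to crop the EO layers and the reference raster identically, so that every retained patch stays co-registered across all inputs.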
In model transfer, forest field plots were converted to rasters. We used an image patch cropping approach similar to that described above, keeping only patches that have at least one plot-level reference within the patch. In total, there were 138 testing patches, 32 validation patches and 524 augmented training patches. We used the weak-labeled training and validation patches to fine-tune the pretrained model.
In both the pretraining and fine-tuning processes, we used Adam as the optimizer and OneCycleLR as the learning rate scheduler; the maximum learning rate was set to $10^{-2}$.
For conventional methods, plot-level EO features were calculated using the described datasets and data splits over the Kouvola target site and used in the model training.
The experiments were performed on a Windows Server machine with an Intel Xeon E5-2697 v4 CPU and an NVIDIA RTX A5000 GPU accelerated by the CUDA 11.7 toolkit. The SeUNet model was built with the PyTorch 1.11.0 neural network library. MLR and RF were implemented with the scikit-learn machine learning toolbox.

F. Accuracy metrics
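The prediction accuracy of the various regression models was calculated using the following accuracy metrics: root mean squared error (RMSE), relative root mean squared error (rRMSE), and the coefficient of determination (R$^2$):

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2},$$

$$\mathrm{rRMSE} = \frac{\mathrm{RMSE}}{\bar{y}} \times 100\%,$$

$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2},$$

where $y_i$ and $\hat{y}_i$ are the reference and predicted values of forest height for the $i$-th plot, $\bar{y}$ is the mean value of all plots and $n$ is the total number of plots.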
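
The three metrics named in this section can be computed as in the following sketch. It assumes the common definitions, with rRMSE taken as RMSE relative to the mean reference value and expressed in percent; the function name and the dictionary keys are hypothetical.

```python
import math

def accuracy_metrics(reference, predicted):
    """Return RMSE, rRMSE (in percent) and R^2 for paired plot values."""
    n = len(reference)
    mean_ref = sum(reference) / n
    ss_res = sum((y - p) ** 2 for y, p in zip(reference, predicted))
    ss_tot = sum((y - mean_ref) ** 2 for y in reference)
    rmse = math.sqrt(ss_res / n)
    return {
        "rmse": rmse,
        "rrmse": 100.0 * rmse / mean_ref,
        "r2": 1.0 - ss_res / ss_tot,
    }
```

For example, `accuracy_metrics([10, 20, 30], [12, 20, 28])` gives an RMSE of about 1.63 m, an rRMSE of about 8.2% and R$^2$ of 0.96.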

III. RESULTS
A. Prediction Performance over Pretraining (Salla) Site
Figure 5 shows scatterplots after the model pretraining on the initial dataset over the Salla site in Lapland. Three combinations of input EO datasets resulted in three distinct DL models: 1) S2-Lapland, with only Sentinel-2 data; 2) S1S2-Lapland, where both Sentinel-2 and Sentinel-1 data were included; and 3) MS-Lapland, where the multi-source data additionally included ALOS-2 PALSAR-2 and TanDEM-X data layers.
Prediction accuracy of the SeUNet model (calculated on testing image patches not involved in the training process) varied depending on the input EO dataset. The scenario that included only Sentinel-2 bands had somewhat lower prediction accuracy (RMSE = 1.79 m, rRMSE = 18.8%, R$^2$ = 0.79), while adding the Sentinel-1 layer slightly improved the prediction performance (RMSE = 1.74 m, rRMSE = 18.2%, R$^2$ = 0.80). Importantly, these results can be achieved using freely available Copernicus datasets. When adding the ALOS-2 PALSAR-2 mosaic (with well-known L-band sensitivity to forest growing stock volume) and TanDEM-X data (with well-known sensitivity to vertical forest structure), the prediction accuracies increased to RMSE = 1.53 m, rRMSE = 16.1% and R$^2$ = 0.84. Importantly, those predictions are made at 10 m pixel resolution, and accuracy estimates increase when aggregating to coarser mapping units.

B. Prediction Performance over Target (Kouvola) Site
Results of "blindly" applying pretrained models (without fine-tuning with in-situ forest plots) over the target Kouvola site are shown in the upper row of Figure 6. Prediction performance is not satisfactory, with RMSE in the range of 5-7 m (40-50% rRMSE), a strong negative systematic prediction error of 2.3-2.8 m present in all model predictions, and signal-saturation effects clearly visible for taller trees. Such performance can be attributed to differences in both forest structure and EO images (spectral, calibration, seasonal changes). Similar, albeit smaller, effects could be expected if the target site were the same as the pretraining site and only EO images acquired at another time were used in the prediction (i.e., at least the forest structure would be the same). Prediction performance was problematic for all non-finetuned DL models and EO data inputs, with slightly more accurate predictions for the MS-Lapland model.

Fig. 6. Scatterplots illustrating prediction performance for the SeUNet model. Upper row: pretrained non-finetuned models; bottom row: fine-tuned models; 1st column: Sentinel-2 data; 2nd column: combined Sentinel-2 and Sentinel-1 images; 3rd column: all available images (Sentinel-2, Sentinel-1, ALOS-2 PALSAR-2, TanDEM-X).

After the SeUNet model fine-tuning, accuracies strongly increased. Scatterplots for the corresponding models are shown in Figure 6, bottom row. Detailed results are collected in Table I for all models as well as for the benchmark MLR and kNN methods. Corresponding scatterplots for the benchmark models are shown in Figure 7.
Using additional explanatory EO variables improved prediction accuracy in all cases, for both the baseline methods and the developed models. The multi-source dataset produced the most accurate predictions for all methods, including the benchmark methods, similar to model pretraining with ALS data.
Achieved prediction accuracies are considerably improved compared to applying non-finetuned methods and compared to traditional machine learning approaches.

C. Model stability with scarce or missing reference data in the Kouvola site
Additionally, we checked the stability of the suggested model and baseline approaches with respect to scarce reference data (only 5-10% of all plots available over the target site for SeUNet model transfer or for training the traditional baseline model) and completely missing reference data (e.g., forest plots with tall trees or forest plots with short trees entirely absent from the target site). The baseline method for comparison was kNN, which is widely used in forest inventory mapping [2] and which also demonstrated superior performance compared with the other benchmark method. As illustrated in Figure 8, kNN starts to fail when reference data are scarce or non-representative (e.g., missing the smaller or larger end of the height range). The difference in R$^2$ and RMSE between the fine-tuned SeUNet and kNN is substantial, but most importantly the range of kNN-predicted heights is distorted: understandably, predictions cannot reach (small or large) values that fall outside the range of the forest plot measurements. For example, when young forest plots are missing from the training set (Figure 8, 2nd row), kNN fails to produce any estimates less than 10 m. The same effect can be observed for forest plots with tall trees, with the tallest predictions of ∼25 m for kNN, while the SeUNet predictions reach 30 m. When a small number of plots is used, kNN predictions start to approach the average height. The effect would become even more pronounced if the number of plots were decreased further, e.g., to 10 reference plots. The model transfer is still expected to work in that situation, provided that the reference dataset for the target site includes sample units with short and tall forests.
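
The limited range of kNN predictions follows from the form of the predictor itself. As a short sketch, assuming non-negative weights $w_j$ normalized to sum to one (a common choice; the weighting used in this study is not specified here), a prediction $\hat{y}$ built from the $k$ nearest reference values $y_j$ satisfies:

```latex
\hat{y} = \sum_{j=1}^{k} w_j \, y_j ,
\qquad w_j \ge 0 , \quad \sum_{j=1}^{k} w_j = 1
\quad \Longrightarrow \quad
\min_{j} y_j \;\le\; \hat{y} \;\le\; \max_{j} y_j .
```

Under this assumption no prediction can fall below the shortest or above the tallest reference plot, which is consistent with the truncated prediction ranges described above when short or tall plots are withheld.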
IV. DISCUSSION
A. Overall discussion on performance across various models and EO datasets
Figure 6 clearly demonstrates increased accuracy of the SeUNet model prediction when using the transfer learning mechanism. As visible from the scatterplots in the upper row, the prediction without transfer learning does not follow the 1:1 line very well, showing a strong systematic negative prediction error and saturation effects. The heights of forests taller than 15 m are underestimated, while height predictions for forests shorter than 10 m tend to be overestimated. Even in the best case, when multisource satellite data are used, the mean prediction error exceeds 2 m.
Within transfer learning, the fine-tuning process compensates for the disadvantage of insufficient or erroneous (because of different forest structure and EO data) representation information. The representation features themselves, particularly the spatial context learned from the source site, are transferred to the target site. As shown in the bottom row of Figure 6, when using multisource EO data, the rRMSE is reduced by 20% compared to the scenario in which the non-finetuned model is applied, and the mean prediction error is accordingly reduced to less than 0.3 m.
Additionally, we compare the performance of SeUNet to traditional machine learning methods, such as kNN and MLR. The deep learning model generated more precise forest height predictions, especially when multi-satellite data are used. The achieved RMSE of 2.70 m, with rRMSE of 18.07%, is several percentage units smaller than that obtained with kNN and MLR. From the scatterplots, we can clearly observe a closer linear relationship between predictions and reference. The improvement can be attributed to the spatial representations learned from the pretraining site. This is in line with the common observation that the use of textural information can help improve predictions of forest variables [6].
Regarding the role of different EO datasets, combining SAR and optical datasets provided more accurate predictions than the use of optical or SAR datasets alone, even though the gain from adding Sentinel-1 to Sentinel-2 was limited compared to using Sentinel-2 data only. In Figure 6 and Figure 7, the bias is slightly reduced when adding Sentinel-1 data. One possible reason for the limited improvement is that Sentinel-1 data are not the optimal dataset for forest height estimation. In contrast, adding L-band ALOS-2 PALSAR-2 and especially interferometric TanDEM-X image layers strongly increased the accuracies; such effects were observed over both the pretraining site with ALS data and the target site. With multi-source data, a considerable reduction in rRMSE was observed not only for SeUNet but also for kNN and MLR, which can be explained by the high sensitivity of TanDEM-X to the vertical structure of forests [8], [9].

B. Comparison with Prior Studies
To date, numerous methods and remotely sensed data combinations have been used for forest height estimation in boreal and temperate forests [5], [33], [37]-[39]. Reported accuracies for boreal forest height mapping are typically on the order of 30-40% rRMSE in these studies. The results achieved in the present study can therefore be considered quite comparable to earlier methods, although in this study a model transfer between two highly different study sites was performed.
In the boreal region, reported forest height accuracies with Sentinel-2 and Landsat data have been 35-60% rRMSE at forest plot level [5], [13], [33], while the proposed model transfer approach reached 30% rRMSE at plot level with Sentinel-2 data only, and 18% rRMSE with the multi-source EO dataset. Predictions obtained with ML models and Sentinel-2 data are within the same accuracy range as in recently published studies using Sentinel-2 and Landsat [5], while our predictions using DL models are more accurate for similar EO data combinations. The literature on using SAR data for forest height prediction is limited, with most studies conducted at forest-stand level, but our tree height predictions are at the same accuracy level as, or even more accurate than, those reported for retrievals with TanDEM-X interferometric SAR data, which are considered very suitable for vertical forest structure retrieval [8], [9], [40]. Regarding the coupling of ALS data with satellite EO data in our pretraining in Lapland, our results are in line with other similar studies [41]. Use of recurrent and fully convolutional neural networks with fully segmented labels and Sentinel-1 time series or combined SAR and optical data provided accuracies on the order of 17-30% rRMSE, similar to the results in our work over the pretraining site in Lapland [15], [16], [20], [42]. Inversion of TanDEM-X images acquired over Estonian hemiboreal and Canadian boreal forests provided accuracies with RMSE in the range of 3-4 m and R$^2$ values larger than 0.5 [40], [43]-[46]. Such results were achieved using various sets of TanDEM-X data and simplified semi-empirical parametric models. Other reports indicate typical error levels around 4.8 m RMSE for TanDEM-X datasets in comprehensive studies using those models [47].
Our results with the MS-Lapland SeUNet model fine-tuned using forest plots in Kouvola are more precise, indicating a useful synergy of SAR and optical datasets in boreal forest parameter mapping and thus the benefits of using combined multisource datasets.

C. Outlook
The first demonstration of the model transfer of a deep learning model for forest variable estimation, performed in this study, indicates substantial potential of such approaches for operational forest monitoring tasks. The demonstrated study can be considered a "geographic transfer", while a "temporal transfer" is also possible when a model is pretrained on one year's EO data and then later used over the same site but with EO data collected in another year. A combination of the two is possible as well. Both types of model transfer hold great potential for forest monitoring.
The geographic model transfer, such as the one used in this study, extends the utility of deep learning models into areas that do not have suitable training data. Given the sporadic availability of ALS data over boreal and temperate forests, this would greatly expand the usability of deep learning models. It must be remembered that in this study the transfer was performed within the same boreal zone, albeit between two forest areas with different structural characteristics. Further studies are needed to investigate the limitations of transferability of models between different biomes with greater structural differences (e.g., in species composition). Similarly, the transferability of models between different remotely sensed data combinations needs to be investigated. This would further broaden the benefit of model transfer, by allowing flexible use of the best possible dataset combinations in the target area, regardless of the dataset used in the model development.
Temporal model transfers, which were not tested in this study, would improve the efficiency of high-frequency forest monitoring even in areas that have ALS data coverage for some years. With a limited number of field plots measured in the following (or preceding) years, the availability of a pretrained DL model and a model transfer technique would allow use of the fine-tuned model in the years that do not have any ALS data available from the area. This would improve the frequency and consistency of forest monitoring in the area, which is in high demand as the monitoring and reporting requirements for forest owners are constantly increasing.
The results of this study also indicate that model transfer can be performed with sub-optimal, non-representative field plot reference data (e.g., with very few plots or a limited range of observations). Limitations of existing field datasets can be overcome by transferring a model that is trained with optimal datasets into the target area (or year) by fine-tuning it with existing sub-optimal field plots. Alternatively, limited field campaigns can be conducted to collect sufficient data for fine-tuning. This increases the efficiency and reduces the costs of forest inventories, enabling an increase in temporal frequency in an economical manner. Thus, the approach presented in this study has the potential to support forest owners in meeting the increasing forest monitoring and reporting requirements.
In our opinion, further improvement from an operational viewpoint, and also in terms of prediction accuracy, can be gained by improving the "initial" pretrained deep learning model that is subsequently fine-tuned. The Lapland model was limited in this regard, as it did not feature the whole range of forest height and biomass values. Reference data from a whole country, such as Finland, can be used to establish such a baseline deep learning model for taiga forests. Our further work will focus on scaling the demonstrated approach and establishing such baseline multi-source EO models for various forest biomes. Another direction is incorporating other UNet+ models, as well as other semi-supervised convolutional and recurrent models, to form an extended set of deep learning modeling approaches.
V. CONCLUSIONS
This is the first demonstration of deep learning model transfer in the context of EO-based forest inventory using multisource optical and SAR data, for which ALS data were used in the model pretraining, and only a limited sample of forest plots was used in the target area to fine-tune the model. This approach facilitated production of more accurate predictions compared to more traditional modeling and machine learning approaches, particularly when reference data were incomplete, very sparse or underrepresented.
The proposed approach offers new perspectives in multisource EO-based forest mapping using pretrained deep learning models and a sparse set of forest plots. We demonstrated that such an approach can deliver greater accuracies compared to traditional machine learning methods and, importantly, that it is also quite robust to the underrepresented or scarce forest plot data used in fine-tuning, a situation in which other machine learning models completely fail. This opens new perspectives in operational forest management and in producing timely and updated forest inventories using EO datasets.

Fig. 8. Scatterplots illustrating prediction performance for nonparametric models. 1st column: kNN; 2nd column: SeUNet; 1st row: scarce training sample (35 plots used); 2nd row: plots with forest shorter than 10 m were removed during model training/fine-tuning in the target site; 3rd row: plots with forest taller than 25 m were absent in model training/fine-tuning over the target site.

Fig. 9. Predicted forest height map over the Kouvola (target) site and a zoomed-in fragment. The SeUNet model has been used in the prediction.

TABLE I. PREDICTION ACCURACY STATISTICS FOR KOUVOLA (TARGET) SITE

• Small-biomass plots with forest heights less than 10 m are completely removed from the training dataset.
• Tall forest plots with heights exceeding 25 m are completely removed from the training dataset.
It can be seen that in all the "extreme" cases, illustrated in Figure 8, classical nonparametric approaches such as kNN start