Ranking Importance of Topographical Surface and Subsurface Parameters on Paludiﬁcation in Northern Boreal Forests Using Very High Resolution Remotely Sensed Datasets

: The accumulation of organic material on top of the mineral soil over time (a process called paludiﬁcation) is common in Northern Boreal coniferous forests. This natural process leads to a marked decrease in forest productivity overtime. Topography both at the surface of the forest ﬂoor (i.e., above ground) and the subsurface (i.e., top of mineral soil which is underground) is known to play a critical role in the paludiﬁcation process. Until recently, the availability of more accurate topographic information regarding the surface and subsurface was a limiting factor for land management and modeling of spatial organic layer thickness (OLT) variability, a proxy for paludiﬁcation. However so far, no research has assessed which of these two topographic variables has the greatest inﬂuence on paludiﬁcation. This study aims to assess which topographic variable (surface or subsurface) better explains paludiﬁcation, using high-resolution remote sensing technology (i.e., Light Detection and Ranging: LiDAR and Ground Penetrating Radar: GPR). To this end, ﬁeld soil measurements were made in over 1614 sites distributed throughout the reference Valrennes Experimental site in Canadian northern coniferous forests. Then, a machine learning model (i.e., Random Forest, RF) was implemented to rank a set of selected predictor topographic variables (i.e., slope, aspect, mean curvature, plan curvature, proﬁle curvature, and topographic wetness index) using the Mean Decrease Gini (MDG) index as an indicator of importance. Results showed that overall 83% of the overall variance was explained by the RF selected model, while the derived subsurface topography predictors had the lowest MDGs for predicting paludiﬁcation. On the other hand, the surface slope predictor had the highest MDGs and better explained paludiﬁcation. This ﬁnding would be particularly useful for implanting sustainable management strategies based on the surface variables of paludiﬁed northern boreal forests. This study has also highlighted the potential of LiDAR data to provide surface topographic spatial detail information for planning and optimizing forest management activities in paludiﬁed boreal forests. This is even of great importance when we know that LiDAR variables are easier to obtain compared to GPR derived variables (subsurface topographic variables).


Introduction
Canada's boreal forest covers 27% (equivalent of 270 10 6 ha) of the country's terrestrial area and has considerable economic and ecological importance. In addition to acting as a reservoir for maintaining biological and genetic diversity and providing a habitat for diverse wildlife, storing carbon, purifying air and water, and regulating regional and global climates, boreal forests constitute a source of numerous resources for industry in Canada. Black spruce forests (Picea mariana (Mill.) BSP.) are the dominant forest type occupying a large extent of these forests and provide a source of raw material for the wood and paper products industry [1]. However, some of the areas of these northern boreal forests are prone to paludification, which can be defined as a natural process where a thick organic material accumulates on the forest floor surface over time, resulting in higher soil moisture and higher water tables levels [2,3]. From the perspective of sustainable forest management, these conditions can favor the invasion of Sphagnum moss species on the forest floor [4][5][6] and the conversion of potentially forested areas to large bog landscapes, largely resistant to forest establishment and growth [2]. Consequently, this leads to a marked decrease in forest productivity [7,8]. Factors such as time since the last fire and topography play important roles in the occurrence of paludification in these Northern boreal forests. Some parts of the world where paludification can be observed include the interior of Alaska, Labrador, and the Canadian Hudson Bay-James Bay lowlands.
The Clay Belt a vast region (i.e., 125,000 km 2 ) of boreal eastern Canada and the Hudson Bay-James Bay Lowlands, is mainly disposed to paludification where thick organic layers usually observed on the forest floor of mature boreal black spruce forests. Simard et al. [8] have defined this type of paludification as a dynamic process driven by forest succession between fire events that leads to a thick organic soil layer, and the formation of waterlogged conditions on a formerly dry mineral soil [8]. Relatively long-time intervals since the last fire (i.e., 400 years), a flat topography, cold climate, and surficial deposits resistant to water penetration (i.e., compacted clay) have made the Clay Belt especially favorable to paludification [9]. In addition to successional paludification, there is another type of paludification, called permanent paludification, which dominates in natural depressions, which have wetter soil conditions favoring organic layer (OL) build-up. This study focuses only on topographic factors as there is an increasing practical demand for knowledge concerning the influence of topography (i.e., Surface and subsurface) on the variation in organic layer thickness (OLT) from these Northern boreal-forested areas. Surface and subsurface topographies refer to above-ground (i.e., top of forest floor) and underground (i.e., beneath the OL; 5-150 cm depth) topographies, respectively. The effect of allogenic factors (fire and climate) on paludification are not discussed in this study.
Within the Clay Belt paludified forests, there have been few studies that compare topographic factors (from surface and subsurface) that influence the spatial distribution and accumulation of OL at plots and landscape scales [7][8][9][10][11]. Until lately, the availability of accurate topographic data beneath the ground surface (i.e., at top of mineral layer or subsurface topography) was a restrictive factor for both land management and modeling of topsoil spatial variability. Recent studies by Laamrani et al. [12,13] have benefited from advances in remote sensing technologies (i.e., Light Detection and Ranging: LiDAR and Ground Penetrating Radar: GPR) which permitted the generation of high-resolution topographic data (i.e., slope, aspect, elevation, curvature or topographic indices) at the surface or subsurface and related these variables to OLT. Even though LiDAR sensors cannot deliver direct OLT measurements, they can provide accurate mapping topographical features and digital elevation maps (DEM) [12]. Such datasets can be used as supporting information in mapping OL in boreal forests [13]. When combined, LiDAR and GPR are considered to be some of the most effective and reliable high-resolution remote sensing techniques for evaluating topography at both the plot and landscape scales in boreal forested environments [11,[14][15][16][17][18]. Even though these studies proved the importance of topography on the occurrence of paludification, thus far, no study has assessed which set of topographic variables (surface, subsurface, or a combination of both) have a greater influence on this natural process. It is expected that the different topographic variables (predictors) are likely to affect the OLT distribution; and consequently to affect paludification. For instance, Topographic Wetness Index (TWI) was found to play a role on the paludification distribution and therefore should be taken into consideration.
The aim of this study was to assess and compare the relative ability of surface and subsurface topographic variables in predicting OLT in Canadian northern boreal forests with different degrees of paludification using LiDAR and GPR derivatives variables coupled with a machine learning modeling approach. More specifically, the objectives of this study were to quantify the relative importance of topographic parameters at both surfaces and subsurface and to determine how these factors influence spatial paludification process within the Clay Belt region. Given the relatively flat topography of paludified areas (i.e., the Clay Belt region), we hypothesize that subsurface topography more accurately influence paludification occurrence than does the surface topography. To our knowledge, this is the first study to determine the relative importance of topographies at both the surface of organic layer and at the subsurface beneath OLT. This would allow ordering topographic predictor variables according to their predictive performance for paludification in boreal-forested environments.

Study Area
The study area is located in the James Bay Lowlands physiographic region of Quebec, Canada, more specifically within the Clay Belt region ( Figure 1A). The study was conducted on an established pilot site centered at 49 • 27 30" N, 78 • 31 5" W ( Figure 1B) and named the "Valrennes Site". This site covers approximately 720 ha of boreal forest and has been used in various forestry-remote sensing studies [10,11,15,19,20]. The studied area contains natural and managed forests (mainly experimental clear cutting followed by mechanical site preparation and planting conducted in 2011). Since 2009, the Valrennes site has been a subset of a larger group of long-term forest growth and yield study plots used in various scientific research studies to monitor the effects of forest management practices through time on forest productivity. The reference data in this study were measured in 2009 and 2010 prior to the 2011 harvesting activities.
A large part of the study area was covered by black spruce (Picea mariana (Mill.) BSP.) and jack pine (Pinus banksiana Lamb.). The forest floor was composed of Sphagnum spp., feather mosses (principally Pleurozium schreberi (Brid.) Mitten), and shrubs, (mainly dwarf ericaceous species), with variable coverage across the landscape [19]. The Valrennes site has low topographic relief, as the Canadian Shield was overlaid by an extensive clay deposits laid down by pro-glacial Lakes Barlow-Ojibway [21]. In the study area, surface slope ranged from 0.3% to 15.7%; A large part of the study area (about 60%) has a slope superior to 2%, and the elevation levels average 303 m above sea level. Local complex topographical patterns are shaped within the landscape as results of stream channels. Mineral soil (MS) nomenclature used in this study referred to surficial deposit materials (i.e., clay, till) underlying the OL. The depth of the mineral layer across the landscape, varied from 100 cm [12] to 60 m [22]). Detailed information of mineral layer composition present in the study area are described in Veillette et al., 2005 [22]. In the study area landscape, a complex mixture of Precambrian granitic rock types outcrops (i.e., bedrock) appears occasionally at the surface and forms scattered gentle hills. The mean annual temperature is 0 • C and total annual precipitation is 909 mm (Joutel weather station, about 15 km east of the study area [23]).

Sampling Design and Field Datasets Collection
Thirteen transects, totaling 15 km in length, were established in the investigated site following the methods described in Laamrani et al., [11]. Transects were designed to cover a maximum variation of tree species compositions, slopes, and OL thicknesses of the study site as well as for their proximity to main road to facilitate access, sampling, and monitoring. These 13 transects enclosed a range of sites that have varied OLT, paludification levels, drainage networks, vegetation cover, and ground surface moisture conditions. Such transect configuration allowed us to produce a spatially continuous cross-sectional profiles of the subsurface topography ( Figure 2A). A 20 m minimum distance was well-maintained between transects in order to optimize spatially-lateral interpolation between transects. more specifically within the Clay Belt region ( Figure 1A). The study was conducted on an established pilot site centered at 49°27′30″ N, 78°31′5″ W ( Figure 1B) and named the "Valrennes Site". This site covers approximately 720 ha of boreal forest and has been used in various forestry-remote sensing studies [10,11,15,19,20]. The studied area contains natural and managed forests (mainly experimental clear cutting followed by mechanical site preparation and planting conducted in 2011). Since 2009, the Valrennes site has been a subset of a larger group of long-term forest growth and yield study plots used in various scientific research studies to monitor the effects of forest management practices through time on forest productivity. The reference data in this study were measured in 2009 and 2010 prior to the 2011 harvesting activities.  Permanent OLT measurement locations (n = 1614) were established in the studied area along the thirteen transects. Systematic OLT measurements were collected at 10 m intervals along each transect through probing with a manual auger ( Figure 2B). At each sampling point. The auger bored through the organic layer until the MS was encountered and OLT was accurately measured ( Figure 2B). The OLT was taken as the orthogonal distance between the OL surface and the MS interface ( Figure 2). In nearly all cases, the transition between the humic horizon (OL) and mineral soil was visibly noticeable by an obvious change in soil characteristics (i.e., color, compaction, and texture ( Figure 2A,B).

Mineral Layer Identification and Digital Mineral Soil Elevation Model Generation
Ground Penetrating Radar (GPR) survey was conducted along each of the 13 transects of the study area to measure the depth to MS using Pulse EKKO Pro (Sensors & Software Inc. Mississauga, ON, Canada) in the winter of 2010. The survey was conducted in the winter because (i) the presence of snow cover on the forest floor facilitates data acquisition (i.e., surveys have proven to be difficult to perform on snow-free forest floor) and (ii) previous studies have demonstrated that the frozen soil can facilitate the spatially continuous mapping of the interface between the OL and the MS (i.e., [14]). During the survey, GPR sent short pulses of high frequency (i.e., 200 MHz) electromagnetic (EM) energy into the ground from an antenna. When the emitted EM wave reached the interface between the OL and MS (i.e., materials with contrasting dielectric properties), part of the EM energy was reflected back to a surface receiving antenna and was registered as a single radar trace. The low electrical conductivity of the organic layer in these paludified forests allows large penetration depths, and the moisture content changes that occur at various interfaces caused multiple GPR reflections [14]. Juxtaposition of the recorded traces (or reflections) was used to exhibit a 2-D scan of the surveyed subsurface. By knowing the propagation velocity of the pulse through the substrate and the time Sustainability 2020, 12, 577 5 of 14 required for a pulse to travel from the transmitter to the reflector and return back to the receiver, it was possible to determine the location, depth, presence, and spatial continuity of underground MS. Thus, we determined that the GPR survey generated spatially continuous profiles of the MS contour over the study area (i.e., elevation of the MS layer) and could represent an independent measure of OLT). All the collected GPR profiles were post-processed following the methodology described in Laamrani et al. [14] (Figure 3). Depth to mineral layer data were then determined at all of the 1614 sampling field locations. The new generated subsurface elevations dataset was then used to create a digital representation of the three-dimensional subsurface using the TIN procedure (Triangulated Irregular Networks; [24]). Afterwards, a 3D model of subsurface (i.e., at the top of the MS) was created by transforming the TIN to a new subsurface raster. The later has an optimal resolution of 10 m, represents the subsurface topography and is referred to hereafter as the SUB-DEM.

Mineral Layer Identification and Digital Mineral Soil Elevation Model Generation
Ground Penetrating Radar (GPR) survey was conducted along each of the 13 transects of the study area to measure the depth to MS using Pulse EKKO Pro (Sensors & Software Inc. Mississauga, ON, Canada) in the winter of 2010. The survey was conducted in the winter because (i) the presence of snow cover on the forest floor facilitates data acquisition (i.e., surveys have proven to be difficult to perform on snow-free forest floor) and (ii) previous studies have demonstrated that the frozen soil can facilitate the spatially continuous mapping of the interface between the OL and the MS (i.e., [14]). During the survey, GPR sent short pulses of high frequency (i.e., 200 MHz) electromagnetic (EM) energy into the ground from an antenna. When the emitted EM wave reached the interface between the OL and MS (i.e., materials with contrasting dielectric properties), part of the EM energy was reflected back to a surface receiving antenna and was registered as a single radar trace. The low electrical conductivity of the organic layer in these paludified forests allows large penetration depths, and the moisture content changes that occur at various interfaces caused multiple GPR reflections [14]. Juxtaposition of the recorded traces (or reflections) was used to exhibit a 2-D scan of the surveyed

Surface Digital Elevation Model Generation
In May 2010, LiDAR data were collected over an area that covers approximately 10,000 ha (100 km 2 ) centered on the sampling area. LiDAR data acquisition was conducted using a Multipulse Leica ALS50 phase II airborne laser scanner with an average sampling of 2.8 points/m 2 , an absolute vertical accuracy of RMSE = 0.065 m. The collected LiDAR data (LAS files) were preprocessed by separating canopy pulse returns from ground pulse returns; and predicted z values were obtained within the study area using Inverse distance weighting as the grid interpolating model. The resulting model was used to produce a 0.5 m cell resolution (cell size) digital terrain model (DTM), which was rescaled-up to produce 10 m resolution DTM using ArcGIS 10.2 [25]. The 10 m resolution (cell size) Sustainability 2020, 12, 577 6 of 14 was chosen to match the same scale as the mineral layer and field measurements datasets. Accuracy measures (i.e., R 2 and RMSE) between the 50-cm basic and the 10-m upscaled LiDAR products and showed negligible positioning georeferencing difference (RMSE) less than 10 cm (results not shown here). The LiDAR-derived 10 m resolution DTM was used to represent the surface topography and is referred to hereafter as the LiDAR-DTM.  [24]). Afterwards, a 3D model of subsurface (i.e., at the top of the MS) was created by transforming the TIN to a new subsurface raster. The later has an optimal resolution of 10 m, represents the subsurface topography and is referred to hereafter as the SUB-DEM.

Surface Digital Elevation Model Generation
In May 2010, LiDAR data were collected over an area that covers approximately 10,000 ha (100 km 2 ) centered on the sampling area. LiDAR data acquisition was conducted using a Multipulse Leica ALS50 phase II airborne laser scanner with an average sampling of 2.8 points/m 2 , an absolute vertical accuracy of RMSE = 0.065 m. The collected LiDAR data (LAS files) were preprocessed by separating canopy pulse returns from ground pulse returns; and predicted z values were obtained within the study area using Inverse distance weighting as the grid interpolating model. The resulting model was used to produce a 0.5 m cell resolution (cell size) digital terrain model (DTM), which was rescaled-up to produce 10 m resolution DTM using ArcGIS 10.2 [25]. The 10 m resolution (cell size) was chosen to match the same scale as the mineral layer and field measurements datasets. Accuracy measures (i.e., R 2 and RMSE) between the 50-cm basic and the 10-m upscaled LiDAR products and showed negligible positioning georeferencing difference (RMSE) less than 10 cm (results not shown here). The LiDAR-derived 10 m resolution DTM was used to represent the surface topography and is referred to hereafter as the LiDAR-DTM.

Datasets Georeferencing Accuracy
To obtain highest possible positioning accuracy of all the available datasets (i.e., OLT field measurement locations and corresponding LiDAR-DTM, SUB-DEM and GPR data) georeferencing procedure were achieved to allow reliable comparison among these datasets. To this end, (i) the airborne ALS50 LiDAR sensor produced point clouds using an onboard Global Navigation Satellite system (GNSS) and an Inertial Systems. For the purpose of calibrating and validating the accuracy of the collected the LiDAR-derived point clouds, GNSS-georeferenced ground control points dataset was also acquired. A RMSE computing between the two datasets has resulted in LiDAR data that has a horizontal accuracy in the order of RMSE = 0.26 cm. (ii) Positions of all OLT field measurements (n = 1614) were directly georeferenced using a Trimble GeoXT handheld GPS which provided 50 cmlevel positioning accuracy. (ii) For matching validation purposes, the 1614 field sampling OLT measurement were superimposed upon the LiDAR-DEM raster, then a visual paired representation of georeferenced data was conducted to address any data unreliability or misplacement. (iii) The position of all the collected GPR profiles were geo-referenced using a Trimble R8-GNSS. To do so, the GPR system, used in this study, was coupled to a Trimble R8-GNSS and allowed to determine the x, y, z position of all the OLT sampling field locations with a horizontal accuracy of less than 0.03 m. The collected georeferenced (i.e., x, y, z) position was further used to replace all profile in their georeferenced location and to enable visualization of profile topography. In this study, the three

Datasets Georeferencing Accuracy
To obtain highest possible positioning accuracy of all the available datasets (i.e., OLT field measurement locations and corresponding LiDAR-DTM, SUB-DEM and GPR data) georeferencing procedure were achieved to allow reliable comparison among these datasets. To this end, (i) the airborne ALS50 LiDAR sensor produced point clouds using an onboard Global Navigation Satellite system (GNSS) and an Inertial Systems. For the purpose of calibrating and validating the accuracy of the collected the LiDAR-derived point clouds, GNSS-georeferenced ground control points dataset was also acquired. A RMSE computing between the two datasets has resulted in LiDAR data that has a horizontal accuracy in the order of RMSE = 0.26 cm. (ii) Positions of all OLT field measurements (n = 1614) were directly georeferenced using a Trimble GeoXT handheld GPS which provided 50 cm-level positioning accuracy. (ii) For matching validation purposes, the 1614 field sampling OLT measurement were superimposed upon the LiDAR-DEM raster, then a visual paired representation of georeferenced data was conducted to address any data unreliability or misplacement. (iii) The position of all the collected GPR profiles were geo-referenced using a Trimble R8-GNSS. To do so, the GPR system, used in this study, was coupled to a Trimble R8-GNSS and allowed to determine the x, y, z position of all the OLT sampling field locations with a horizontal accuracy of less than 0.03 m. The collected georeferenced (i.e., x, y, z) position was further used to replace all profile in their georeferenced location and to enable visualization of profile topography. In this study, the three datasets (i.e., LiDAR, GPR, and field) have achieved high positioning performance in terms of achievable accuracy (within a 50 cm); and therefore, this has allowed reliable, and consequentially valid comparison among these datasets.

Explanayory Topographic Variabled
LiDAR-DTM and SUB-DEM were used to generate six topographic rasters each (i.e., slope, aspect, mean curvature, plan curvature, profile curvature, and the topographic wetness index (TWI) for a total of 12 raster variables (i.e., six surface and six subsurface). Surface and subsurface topographic variables (predictor variables) derived from the generated raster (i.e., LiDAR-DTM and SUB-DEM) were linked to field-measured OLT (response variable). In this study, the selected topographic variables were chosen because of their link to paludified areas. Indeed, the topography was presumed to have influence on OL accumulation because depressions are believed to be associated with an accumulation of organic matter and a concomitant rise in the water table. The TWI, for instance, has been found to play a significant role in estimating different soil features that are related to paludified areas such as local soil moisture [26,27] and the spatial distribution of groundwater flow along forest-peatland complexes within the boreal forest [28]. It is important to mention that other parameters such as soil type and texture could affect the water level of groundwater, but they are not dealt with in this study.
In this study, the 12 rasters corresponding to each topographic variable were computed using Spatial Analyst tools (ArcGIS 10.2; ESRI, 2011). Conceptually, the topographic variable tool fits a plane to the z-values of a 3 × 3 cell neighborhood around the central cell of a specific raster (i.e., slope). When a cell location within this nine-cell neighborhood has a "NoData" z-value, the z value of the central cell was assigned to the location, after which the topographic variable was then computed. A detailed description of each of the selected topographic variables (i.e., slope, aspect, mean curvature, plan curvature, profile curvature, and TWI) is provided in Table 1. Table 1. The different topographic variables (predictors) likely to impact OLT distribution. They were measured at the surface and subsurface to characterize respective topographies (modified after Laamrani et al., [15]).

Slope
Calculated for each grid cell as the maximum rate of change in z-value from that cell to its neighbors. Slope affects the overall rate of movement downslope.

Aspect
Direction of the maximum rate of change in the z-value from each cell to its neighbors. Aspect defines the direction of flow and was classified into eight major classes (i.e., N, NE, E, SE, S, SW, W, NW).

Mean curvature
A general measure of the convexity of the landscape, where negative values represent sinks and valleys are considered to be concave, and positive values are associated with peaks and highs are considered to be convex.

Plan curvature
Curvature of the surface perpendicular to the slope direction.
Positive values indicate that water flow would diverge (convex surface), whereas negative values indicate that water flow would converge (concave surface).

Profile curvature
Curvature of the surface in the direction of a slope. Positive values indicate that water flow would decelerate (concave surface), whereas a negative value will indicate that water flow would accelerate (convex surface).

Topographic wetness index (TWI)
TWI = ln (A s /tan β) (Eq. in [28]) where A s is the local upslope contributing area and β is the local slope [29][30][31]. The higher the value of the TWI in a cell, the higher the soil moisture and water accumulation that can be found on it.

Machine Learning Classification and Accuracy Assesment
We used the Random Forest (RF) classification method [32] to assess the importance of the predictor topographic variables on the variability of OLT as an indicator of paludification. RF is a machine-learning method [32] that builds a multitude of decision trees by randomly dividing the dataset into a training set (2/3 of the entire dataset) and an out-of-bag dataset (or validation data set = 1/3 of the dataset). This has the advantage of enabling model validation without a secondary independent dataset [33]. Each node was split with a subset of predictors randomly chosen and the final prediction is made using all of the decision trees. The RF out-of-bag (OOB) error, representing the model error across all runs, was used to describe the RF model fit in terms of R 2 . The RF approach provides several advantages. First, its capability to determine the relative importance of predictor variables in the resulting model, which allows ordering of variables according to their predictive performance. In this study, variable importance was computed for each topographic variable (i.e., surface and subsurface topographical variables) used in the model. Second, it is a simple and user-friendly approach that requires fewer decisions on the model parameterization than other methods. Third, it can deal with Sustainability 2020, 12, 577 8 of 14 non-linearity and allows both continuous and categorical explanatory variables to be included in the model. Third, we could obtain the RF by calculating the mean decrease (also known as mean square error of a variable) in random permutation accuracy with each variable considered in turn. The greater the decrease in average accuracy following variable removal, the more important the variable [34]. In this study, RF was implemented using the randomForest package and optimized with three parameters: mtry = 3 and ntree = 500. mtry represented the number of predictor variables randomly tested at each node and ntree represented the number of trees grown from a bootstrapped sample.

Results of OLT Class Distribution
The 1614 selected OLT samples used in this current study were divided into classes: paludified (OLT > 25 cm) or not paludified (OLT ≤ 25 cm). Table 2 and Figure 4 provide the summary of the occurrence of these two classes over the study area. OLT values ranged from 5 to 150 cm. The majority of the samples (72%) had an OLT between 26 and 150 cm (paludified class) with an average of 58 ± 0.7 cm (mean ± SE), whereas the non-paludified class represented 28% of the samples and their OLT ranged from 5 to 25 cm with an average of 19 ± 0.2 cm.  Figure 5 shows an overview of the 12 topographic variables selected to build the RF model. These variables were ordered according to their ability to predict OLT. Overall the RF model based on the 12 selected variables explained 83% of the variance with a 17% OOB estimated error rate. Based on the Mean Decrease Gini (MDG) values, the surface slope had the best predictive performance for OLT, followed by subsurface slope; suggesting that both factors are conditioning the presence of paludification. Aspect and TWI (for both surface and subsurface) also had an effect on OLT, but to a lesser extents. Based on the relative importance of predictor variables in the RF model ( Figure 5), the three surface topographic variables (i.e., slope, aspect, and TWI) were found to have higher predictive power on OLT as compared to the three subsurface variables (i.e., slope, aspect, and TWI). Previous research on paludification showed a good relationship between these three variables and OLT [11,14,15] which is consistent with this finding.

Results of Random Forest Modeling and Variable Importance
To further investigate the superiority of surface variables compared to subsurface variables, we ran separate models with only the top six variables shown in Figure 5. As results, the amount of variation explained by the surface topography model was only 6% greater than the variation explained by the subsurface topography (83% vs. 77%, respectively). A multi-collinearity test [34] was used to examine if this small but significant difference between the performances of the two models (i.e., 6%) could be interpreted as an effect of collinearity between surface and subsurface variables. Consequently, the two slopes variables (i.e., surface and subsurface) were found to have Pearson correlation coefficients of 0.81 (R 2 = 0.64; Figure 6). It is important to mention that aspects obtained at the surface and subsurface were not tested for multi-collinearity because aspect values are circularly disturbed (meaning they are not measured using a linear scale).  Figure 5 shows an overview of the 12 topographic variables selected to build the RF model. These variables were ordered according to their ability to predict OLT. Overall the RF model based on the 12 selected variables explained 83% of the variance with a 17% OOB estimated error rate. Based on the Mean Decrease Gini (MDG) values, the surface slope had the best predictive performance for OLT, followed by subsurface slope; suggesting that both factors are conditioning the presence of paludification. Aspect and TWI (for both surface and subsurface) also had an effect on OLT, but to a lesser extents. Based on the relative importance of predictor variables in the RF model ( Figure 5), the three surface topographic variables (i.e., slope, aspect, and TWI) were found to have higher predictive power on OLT as compared to the three subsurface variables (i.e., slope, aspect, and TWI). Previous research on paludification showed a good relationship between these three variables and OLT [11,14,15] which is consistent with this finding. To further investigate the superiority of surface variables compared to subsurface variables, we ran separate models with only the top six variables shown in Figure 5. As results, the amount of variation explained by the surface topography model was only 6% greater than the variation explained by the subsurface topography (83% vs. 77%, respectively). A multi-collinearity test [34] was used to examine if this small but significant difference between the performances of the two models (i.e., 6%) could be interpreted as an effect of collinearity between surface and subsurface variables. Consequently, the two slopes variables (i.e., surface and subsurface) were found to have Pearson correlation coefficients of 0.81 (R 2 = 0.64; Figure 6). It is important to mention that aspects obtained at the surface and subsurface were not tested for multi-collinearity because aspect values are circularly disturbed (meaning they are not measured using a linear scale).   To further investigate the superiority of surface variables compared to subsurface variables, we ran separate models with only the top six variables shown in Figure 5. As results, the amount of variation explained by the surface topography model was only 6% greater than the variation explained by the subsurface topography (83% vs. 77%, respectively). A multi-collinearity test [34] was used to examine if this small but significant difference between the performances of the two models (i.e., 6%) could be interpreted as an effect of collinearity between surface and subsurface variables. Consequently, the two slopes variables (i.e., surface and subsurface) were found to have Pearson correlation coefficients of 0.81 (R 2 = 0.64; Figure 6). It is important to mention that aspects obtained at the surface and subsurface were not tested for multi-collinearity because aspect values are circularly disturbed (meaning they are not measured using a linear scale). Figure 6. Relationship between the subsurface slope and surface slope. Figure 6. Relationship between the subsurface slope and surface slope. Figure 5 also shows that profile, plan, and curvature variables derived from both surface and subsurface had low MDGs (i.e., 11 to 13, closer to zero), and consequently these variables had very low predictive power for OLT. This is in agreement with our previous studies [11,14,15]. When profile, plan, and curvature were excluded from the model, 84% of the variance was explained by slope, aspect, and TWI and their respective MDG values for predicting paludification were higher (Figure 7) compared to the model that had analyzed the 12 variables together. Figure 5 also shows that profile, plan, and curvature variables derived from both surface and subsurface had low MDGs (i.e., 11 to 13, closer to zero), and consequently these variables had very low predictive power for OLT. This is in agreement with our previous studies [11,14,15]. When profile, plan, and curvature were excluded from the model, 84% of the variance was explained by slope, aspect, and TWI and their respective MDG values for predicting paludification were higher (Figure 7) compared to the model that had analyzed the 12 variables together. Figure 7. Results of the RF modeling using only the six topographic variables that displayed the highest predictive power of paludification. The closer the MDG is to zero, the less important the variable is for predicting paludification. _Sur and _Sub on the Y-axis refer to surface and subsurface topographies, respectively.

Discussion
This study has found that among all the twelve topographic variables that were studied, slope had the highest effect on OLT. This finding is consistent with earlier studies within the Clay Belt, which found that paludification is affected by the surface slope [7,9,11] and subsurface slope [15]. However, unlike this study, Simard et al. and Lavoie et al. [7,9] found that only surface slope had an effect on paludification. This could be explained by the fact that these two studies did not have access to high resolution remote sensing technology (i.e., LIDAR; GPR) derived-data to generate appropriate topographic variables (i.e., elevation, slope, aspect, and TWI) at both the surface and the subsurface.
This study has provided a new insight that surface topography matters more to paludification than subsurface topography. This finding would be particularly useful for implanting sustainable management strategies in paludified northern boreal forests that take into account the importance of surface topographic variables. For instance, forest managers can make decisions based on surface topographical variables to (i) adopt the appropriate forest management practices (i.e., field preparation treatments and replanting), and (ii) adopt the none-use of any mechanical site preparation in highly paludified areas (i.e., OLT ≥ 60 cm). Such forest management of the highly paludified areas would not require any practices that lead to soil disturbing (i.e., field preparation), and therefore could enhance soil organic carbon storage. Indeed, from a climate change perspective, A Figure 7. Results of the RF modeling using only the six topographic variables that displayed the highest predictive power of paludification. The closer the MDG is to zero, the less important the variable is for predicting paludification. _Sur and _Sub on the Y-axis refer to surface and subsurface topographies, respectively.

Discussion
This study has found that among all the twelve topographic variables that were studied, slope had the highest effect on OLT. This finding is consistent with earlier studies within the Clay Belt, which found that paludification is affected by the surface slope [7,9,11] and subsurface slope [15]. However, unlike this study, Simard et al. and Lavoie et al. [7,9] found that only surface slope had an effect on paludification. This could be explained by the fact that these two studies did not have access to high resolution remote sensing technology (i.e., LIDAR; GPR) derived-data to generate appropriate topographic variables (i.e., elevation, slope, aspect, and TWI) at both the surface and the subsurface.
This study has provided a new insight that surface topography matters more to paludification than subsurface topography. This finding would be particularly useful for implanting sustainable management strategies in paludified northern boreal forests that take into account the importance of surface topographic variables. For instance, forest managers can make decisions based on surface topographical variables to (i) adopt the appropriate forest management practices (i.e., field preparation treatments and replanting), and (ii) adopt the none-use of any mechanical site preparation in highly paludified areas (i.e., OLT ≥ 60 cm). Such forest management of the highly paludified areas would not require any practices that lead to soil disturbing (i.e., field preparation), and therefore could enhance soil organic carbon storage. Indeed, from a climate change perspective, one of the advantages of the use of management solutions with no or little soil disturbance (i.e., partial and selection cuts) is that they would have a positive effect on the amount of carbon stored in the soil OL (i.e., soil carbon stocks) in paludified forest soil [35]. From management implications perspective, the results of this study are important for landscape management for several reasons: (i) we have demonstrated the potential of LiDAR data to provide spatially detailed topographic information that can be used as sources of data in managing, planning and optimization of forest management activities in paludified boreal forests; (ii) we have increased understanding that both surface and subsurface topographies are related to OL accumulation, which is an important step towards predicting and mapping productivity across paludified landscapes; (iii) we have determined that maintain or improve forest productivity in the low/medium paludified boreal forests (40-60 cm), management strategies (i.e., partial and selection cuts) should focus more on the surface of the forest floor than on the subsurface (i.e., [36]).
Finally, it is worth mentioning that (i) even though the study site "Valrennes Site" has low topography, it still has an unusual enhanced relief compared to most conventional paludified landscapes (i.e., almost flat) and might be not very representative of the classical paludified landscapes. We believe that the application of the methodology used in this study on a very large classical paludified landscapes in future work will prove invaluable for testing the validation of our results, and (ii) the RF model used in this study has provided only a current snapshot of the paludified landscape, since paludification is a dynamic process that changes with time [20].

Conclusions
To our knowledge, this study was the first to use very high resolution remotely sensed data to determine which of the surface or subsurface topographies matters to paludification. This study has shown that surface topographies need to be taken into consideration first when managing paludification. This study also provided a new insight that the surface slope has a greater effect on paludification than the subsurface slope variable. Knowing that surface topography matter more to the natural process of paludification, we believe that this information is of great importance to forest managers as it can help them to adopt appropriate sustainable forest management practices (i.e., selecting harvest stands, mechanical site preparation and proper tree planting techniques). This study also has highlighted the potential of LiDAR data to provide topographic spatial detail information for planning and optimizing forest management activities in paludified boreal forests. Finally, future work conducted on larger paludified area is suggested for validating the importance of surface and subsurface topographies on paludified black spruce forests of the Clay Belt.