Terrestrial CDOM in lakes of Yamal Peninsula: Connection to lake and lake catchment properties

© 2018 by the authors. In this study, we analyze interactions in lake and lake catchment systems of a continuous permafrost area. We assessed colored dissolved organic matter (CDOM) absorption at 440 nm (a(440)CDOM) and absorption slope (S300-500) in lakes using field sampling and optical remote sensing data for an area of 350 km² in Central Yamal, Siberia. Applying a CDOM algorithm (ratio of green and red band reflectance) to two high spatial resolution multispectral GeoEye-1 and WorldView-2 satellite images, we were able to extrapolate the a(λ)CDOM data from 18 lakes sampled in the field to 356 lakes in the study area (model R² = 0.79). Values of a(440)CDOM in 356 lakes varied from 0.48 to 8.35 m⁻¹ with a median of 1.43 m⁻¹. This a(λ)CDOM dataset was used to relate lake CDOM to 17 lake and lake catchment parameters derived from optical and radar remote sensing data and from digital elevation model analysis in order to establish the parameters controlling CDOM in lakes on the Yamal Peninsula. Regression tree model and boosted regression tree analysis showed that the activity of cryogenic processes (thermocirques) in the lake shores and lake water level were the two most important controls, explaining 48.4% and 28.4% of lake CDOM, respectively (R² = 0.61). Activation of thermocirques led to a large input of terrestrial organic matter and sediments from catchments and thawed permafrost to lakes (n = 15, mean a(440)CDOM = 5.3 m⁻¹). Large lakes on the floodplain with a connection to the Mordy-Yakha River received more CDOM (n = 7, mean a(440)CDOM = 3.8 m⁻¹) compared to lakes located on higher terraces. http://dx.doi.org/10.3390/rs10020167


Introduction
The high-latitude Arctic lowlands underlain by permafrost are particularly rich in lakes [1-3]. Most of these lakes are of thermokarst origin [4] and represent carbon turnover hotspots [5]. Both the photodegradation of lake organic matter [6] and subsequent biodegradation processes lead to the formation and outgassing of methane [7,8]. Compared with the allochthonous component of Arctic lakes, the autochthonous component, i.e., plankton, is assumed to play a small role in the total organic matter content [9,10].
The dissolved organic content of lake water influences the penetration depth of ultraviolet and visible sunlight [11-14]. A high organic matter concentration can increase the absorption of light in lakes, which may warm the water body and affect the energy balance of high-latitude landscapes [13]. Organic matter in lakes is also assumed to influence lake biodiversity [12].
The dissolved organic content in lake water is known to decrease along a vegetation biomass gradient from south to north, from the boreal forest to the Arctic tundra [12,15-17]. This reflects the fact that the surrounding lake catchments are the main source of the dissolved organic matter in inland waters [9,10,12,15,16]. However, the large-scale trend over the vegetation biomass/latitude gradient is overlain by high small-scale spatial variability, the origin of which is not yet fully understood [18].
Ongoing Arctic warming is changing Arctic landscapes in various ways that potentially alter the organic matter supply to lakes. Warming may increase vegetation density in lake catchments and in turn the organic matter supply to lakes. Furthermore, warming may cause an increase of ground temperatures [19] and a deepening of the active layer (the annually thawed upper soil layer) [20] in permafrost soils and thus activate various cryogenic processes, including thermodenudation [21,22] in the lake shores, leading to changes in the geochemistry and organic matter content of those lakes [23,24].
The total organic matter in lake water can be separated into particulate organic matter (POM) and dissolved organic matter (DOM). POM is the organic fraction of suspended particulate matter, including phytoplankton and detritus. Colored dissolved organic matter (CDOM) is the colored fraction of dissolved organic carbon (DOC) and is highly correlated with DOC concentration over a wide range of aquatic ecosystems [25,26]. CDOM mainly consists of fulvic and humic acids [9], which absorb in the short wavelength ranges. CDOM is typically measured photometrically after filtration. The parameters obtained are the absorption at a specific wavelength (λ) and the spectral slope coefficient (S), which denotes the steepness of the absorption spectrum; they provide information on CDOM concentration and CDOM type, respectively [27-30].
Here, we study CDOM in lakes of Central Yamal in the Russian Arctic as derived from field sampling and calculated with a CDOM algorithm using optical satellite data. We relate these data to a number of environmental variables characterizing the lakes' internal characteristics and catchment properties. We address the following research questions: (1) What is the range of CDOM in lakes of Central Yamal? (2) Which catchment characteristics and lake properties explain the variations in CDOM absorption and slope values?

Study Area
The study region comprises a 350 km² area in the central part of the Yamal Peninsula (West Siberia) located around the long-term Russian research station Vaskiny Dachi (70°20′ N, 68°51′ E) (Figure 1) [31]. Vaskiny Dachi was established in 1988 as a permafrost monitoring site, and since 1996 it has been operated by the Earth Cryosphere Institute, Siberian Branch of the Russian Academy of Sciences. The area is also an established site for the international Global Terrestrial Network for Permafrost (GTN-P) program [31].
More than 350 water bodies with an area greater than 1000 m² are located within the study area, resulting in approximately 12% limnicity. The central part of the studied region is characterized by elevated terraces which are extensively dissected by narrow valleys of rivers, small streams, gullies, and ravines. There are several geomorphological terraces consisting of Quaternary marine sediments comprised of sands, loam, and clays overlain by an organic layer. The large floodplains of the Se-Yakha and Mordy-Yakha Rivers are located in the southwest and northwest of the study area, respectively. On the floodplains, lakes undergo considerable flooding in early summer and are much larger than lakes within the topographic depressions on the higher terraces. The altitude ranges from 2 m below sea level on the floodplains up to 53 m above sea level (a.s.l.) on the high terraces.
Typical tundra of the bioclimatic subzone D covers the gently sloping upland surfaces and terraces [33,34]. Dense shrublands dominate on valley bottoms and gentle hill slopes representing old landslides [35]. Wetland vegetation and emerging macrophytes occupy the littoral zone of many lakes on the terraces.
The characteristics of lake shores within the study area vary from flat beaches to high cliffs. The presence of tabular ground ice in connection with active layer dynamics is responsible for massive cryogenic landsliding and thermodenudation in this area [22]. In 2012 and 2013 a noticeable activation of thermal denudation was observed in the area due to above-average summer air temperatures [31]. These processes resulted in thermocirque development at several lake cliffs [32], which caused transport of large portions of thawed material into the lake waters [24].

Materials and Methods

Sample Collection and Measurements
A total of 24 lakes within a 20 km distance from the Vaskiny Dachi research station were selected for sampling in order to cover a wide range of CDOM values. The pre-selection was based on the inspection of satellite images in quasi-true color mode, visualizing the apparent lake color and various lake types as well as catchment types. Lake color reflects the spectral properties of the water and thus is a proxy for CDOM concentration.
Water samples were collected in calm weather conditions with bottle sampling from the upper 30 cm of the water column, close to the shore or in the center of lakes from a boat, in August and September 2014 [24]. Samples from 24 lakes were collected (Figure 1, Table 1); 18 of them were located on high terraces and six were located on the floodplain of the Mordy-Yakha River. The samples were filtered directly in the field after sampling. Filtrates for CDOM were prepared by filtering through Whatman© (Maidstone, UK) 0.7 µm pore size glass fiber filters and were stored in cold and dark conditions to avoid photodegradation.
The filtrates were measured in the Otto Schmidt Laboratory (Saint-Petersburg, Russia) using a dual-beam spectrophotometer, Specord 200 (Jena Analytic©, Jena, Germany). CDOM absorption spectra (a(λ)CDOM) were calculated from absorbance (A) measurements in the range of 200-750 nm with 1 nm optical resolution. Logarithmic spectral slopes between 350 and 500 nm (S350-500, nm⁻¹) were calculated with 440 nm as the reference wavelength λ0. The absorption a(λ)CDOM decreases exponentially with increasing wavelength, starting from the reference value a(λ0)CDOM at a rate given by the slope S (Equation (1), [36]):

$$a(\lambda)_{CDOM} = a(\lambda_0)_{CDOM} \, e^{-S(\lambda - \lambda_0)} \qquad (1)$$

To calculate the absorption by CDOM per meter (m⁻¹), the measured absorbance (A) is transformed to a natural logarithm and corrected for the length of the cuvette (Equation (2)):

$$a(\lambda)_{CDOM} = \frac{2.303 \, A(\lambda)}{L} \qquad (2)$$

where A is absorbance (unitless) and L is the length of the cuvette in meters. In this study, Suprasil quartz cuvettes of 0.05 and 0.10 m length were used. Offset-corrected values of a(λ)CDOM were obtained by subtracting the absorption value at 700 nm (where no CDOM absorption is assumed), because values not equal to 0 at this wavelength can occur due to scattering [27,30]. The sample from the lake with thermocirque TC-06 (Figure 1) showed unrealistically high offset values of a(700)CDOM. We restricted a(440)CDOM values from 2014 to a maximum value of 8 m⁻¹, which has frequently been measured as the maximum value in samples of other years.
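The processing chain described above (Equation (2), the 700 nm offset correction, and the spectral slope of Equation (1)) can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, the spectrum is synthetic, and the slope is estimated here by an ordinary least-squares fit of ln a(λ) against λ, which may differ from the fitting procedure actually used in the laboratory.

```python
import math

def absorption_coefficient(absorbance, path_length_m):
    """Equation (2): decadic absorbance (unitless) to absorption in m^-1."""
    return math.log(10) * absorbance / path_length_m

def offset_correct(spectrum, offset_nm=700):
    """Subtract the absorption at the offset wavelength from every value.

    spectrum maps wavelength (nm) to absorption (m^-1).
    """
    offset = spectrum[offset_nm]
    return {wl: a - offset for wl, a in spectrum.items()}

def spectral_slope(spectrum, lo_nm=350, hi_nm=500):
    """Slope S (nm^-1) from a log-linear least-squares fit (an assumption)."""
    points = [(wl, math.log(a)) for wl, a in spectrum.items()
              if lo_nm <= wl <= hi_nm and a > 0]
    n = len(points)
    mean_wl = sum(wl for wl, _ in points) / n
    mean_ln = sum(ln_a for _, ln_a in points) / n
    sxy = sum((wl - mean_wl) * (ln_a - mean_ln) for wl, ln_a in points)
    sxx = sum((wl - mean_wl) ** 2 for wl, _ in points)
    return -sxy / sxx  # sign flipped so that S is positive

# Synthetic absorbance readings (hypothetical values, not field data):
# a(440) = 2.0 m^-1, S = 0.016 nm^-1, measured in a 0.05 m cuvette.
true_a440, true_s, cuvette_m = 2.0, 0.016, 0.05
absorbance = {
    wl: true_a440 * math.exp(-true_s * (wl - 440)) * cuvette_m / math.log(10)
    for wl in range(350, 701, 10)
}
spectrum = {wl: absorption_coefficient(A, cuvette_m)
            for wl, A in absorbance.items()}
print(round(spectrum[440], 3), round(spectral_slope(spectrum), 4))
```

Because the synthetic spectrum is a pure exponential, the fit recovers the slope that was used to generate it; real spectra with a scattering offset would first be passed through `offset_correct`.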

Remote Sensing And GIS Analyses to Derive Lake and Lake Catchment Characteristics
We used a wide spectrum of geodata sources to derive lake and lake catchment characteristics including high-spatial resolution optical satellite data, synthetic aperture radar (SAR) satellite data and a SAR-derived digital elevation model (DEM).

Geometrical and Radiometrical Pre-Processing of the Geodata
The Central Yamal region was one of the first Arctic regions where TanDEM-X DEM data with 12 m resolution (Table 2) were made available (Deutsches Zentrum für Luft- und Raumfahrt, DLR). To improve the DEM for further processing, the raster data were converted to a point data model and re-interpolated using linear features such as digitized streams, water bodies, structure lines, and the railway line. TopoToRaster was used as the interpolation method in ArcGIS 10.2.2 (ESRI Inc.©, Redlands, CA, USA).
Table 2 notes: … [37]); ** orthorectification (OR) applied for all optical images using ground control points (GCPs) collected in the field with a Trimble 5700 differential global positioning system (DGPS) and the corrected 12 m TanDEM-X digital elevation model (DEM); atmospheric correction (AC) applied for all MS images using the ATCOR© algorithm [38]. NDVI: normalized difference vegetation index; CHL: chlorophyll-related vegetation index.
TerraSAR-X SAR satellite data (Table 2) were delivered in Level 1B SSC (Single-look Slant range Complex) processing with horizontal-horizontal (HH) polarization. For data processing, the open source software NEST© from the European Space Agency (ESA) was used. The data were terrain-corrected using range Doppler terrain correction and the TanDEM-X DEM product. The output pixel spacing was set to 2 m and the data were projected into UTM Zone 42, WGS-1984. Radiometric normalization was applied and the resulting Sigma0 band was converted to decibel (dB) using the IDL© 8.6 software.
ALOS PALSAR (Table 2) SAR satellite data were delivered in Level 1.1 processing with horizontal-vertical (HV) polarization for two dates (14 August 2008 and 19 September 2008). The multi-looked images were ellipsoid-corrected using the geolocation-grid method in NEST©. The resulting intensity images were converted into dB. The mean of the two dates was calculated and gamma-filtered.
Ground control points (GCPs) at recognizable and fixed objects in the satellite images were collected in August 2014 using a Trimble© 5700 differential global positioning system (DGPS) to geometrically correct the very high spatial resolution optical images (Table 2). Pan-sharpened images for the optical satellite data from GeoEye-1 and WorldView-2 (Table 2) were obtained by applying the PANSHARP2 fusion algorithm developed by Zhang [37] for multispectral and panchromatic images using PCI Geomatica 2014 software (PCI Geomatics©). The orthorectification procedure was performed within the OrthoEngine© module in PCI Geomatica. The prepared DEM was used to correct the images for relief distortions, producing orthoimages. The ATCOR© ground reflectance atmospheric correction module [38] was used to correct the multispectral image data of GeoEye-1 and WorldView-2 with PCI Geomatica software. We used "rural" as the aerosol type and "subarctic summer" as the atmospheric conditions suitable for the tundra environment. The steep spectral attenuation derived from dark object subtraction (DOS) for both GeoEye-1 and WorldView-2 images indicated high atmospheric transparency, and therefore we set the visibility to 80 km for the atmospheric correction. The resulting ground reflectance data of different tundra surfaces were consistent with the bi-directional surface reflectance measured at the Greening of the Arctic Monitoring sites in the Vaskiny Dachi region in summer 2011 [39]. A seamless mosaic of the two images was produced using ENVI© 5.0 software.
Atmospherically corrected SPOT-5 data with 10 m spatial resolution (Take 5 product, ESA) were used to extract vegetation information with large areal coverage.

Remote Sensing Data Processing
The seamless mosaic of multispectral atmospherically corrected GeoEye-1 and WorldView-2 satellite data was used to retrieve CDOM concentration in 356 lakes in the area (Table 2). We used the green-to-red spectral band ratio method, $a(440)_{CDOM} = a\,(G/R)^{b}$, developed by Kutser et al. [40,41]. The band ratio values were calculated by dividing the reflectance values in the green band (green reflectance, G, 0-100%) by those in the red band (red reflectance, R, 0-100%). To extract the band ratio values, regions of interest (ROIs) were placed in the center of the lakes, avoiding the shallow littoral areas. ROIs were 80-15,000 m² depending on the lake size. We related ROI values to in-situ a(440)CDOM from summer 2014 (CDOM, Table 3). The overall time difference between satellite acquisitions and field sampling was approximately 11 months. Satellite-derived ratio values of cloud-covered lakes were excluded from this empirical model (LK-001, LK-002, LK-006, LK-008, LK-010, LK-014).
Atmospherically corrected multispectral SPOT-5 data were used to calculate the normalized difference vegetation index (NDVI, Table 3) and a chlorophyll-related vegetation index (CHL, Table 3). Median NDVI and CHL per lake catchment were extracted for 356 catchments. Vegetation indices are based on the assumption that the main photosynthetically active pigments of vegetation absorb in the red wavelength region (650-700 nm). The greener the vegetation, the higher the chlorophyll absorption and the lower the reflectance in the red band. More vegetation biomass results in multiple scattering in the near-infrared (NIR) and hence higher reflectance in the NIR. The commonly used NDVI [42] is calculated using Equation (3):

$$NDVI = \frac{NIR - R}{NIR + R} \qquad (3)$$

where NIR and R are the reflectance values in the NIR and red bands, respectively. The chlorophyll vegetation index (CHL) is calculated as a relative absorption depth following [38] (Equation (4)), where G is the reflectance value in the green band.
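As a sketch of how the band-ratio model and the NDVI can be evaluated, the snippet below fits the power law a(440)CDOM = a(G/R)^b by linear regression in log-log space and computes NDVI from Equation (3). The function names, the fitting approach, and all numbers are illustrative assumptions; the coefficients used here are not the ones obtained in this study.

```python
import math

def ndvi(nir, red):
    """Equation (3): normalized difference vegetation index."""
    return (nir - red) / (nir + red)

def fit_power_law(ratios, cdom):
    """Fit cdom = a * ratio**b by least squares on log-transformed values."""
    xs = [math.log(r) for r in ratios]
    ys = [math.log(c) for c in cdom]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    a = math.exp(mean_y - b * mean_x)
    return a, b

def cdom_from_ratio(green, red, a, b):
    """Apply the fitted power law to a green/red reflectance pair."""
    return a * (green / red) ** b

# Hypothetical calibration pairs generated from a = 3.0, b = -2.0.
ratios = [0.8, 1.0, 1.2, 1.5]
in_situ = [3.0 * r ** -2.0 for r in ratios]
a_fit, b_fit = fit_power_law(ratios, in_situ)
print(round(a_fit, 3), round(b_fit, 3))
print(round(cdom_from_ratio(green=6.0, red=5.0, a=a_fit, b=b_fit), 3))
```

Fitting in log-log space keeps the example free of external dependencies; a nonlinear least-squares fit on the untransformed values would weight the high-CDOM lakes more strongly and can give slightly different coefficients.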
The pan-sharpened optical satellite data were used to derive geomorphological features such as thermocirques (TC, Table 3) and lake inlets and outlets (INLET and OUTLET, Table 3). The lake shores were inspected for active thermocirques (i.e., presence or absence). In addition, the area impacted by the thermocirques was calculated. All lakes were grouped according to their inlet-outlet regime within the stream network. This divided the lakes into four groups [43,44]: (1) seepage (no apparent surface inlets or outlets); (2) seepage-inlet (inlets, no apparent surface outlets); (3) headwater (outlets, no apparent surface inlets); and (4) drainage lakes (inlets and outlets).
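The four-group scheme amounts to a lookup on two presence/absence flags. A minimal sketch, with a hypothetical function name:

```python
def classify_lake(has_inlet: bool, has_outlet: bool) -> str:
    """Assign a lake to one of the four inlet-outlet groups."""
    if has_inlet and has_outlet:
        return "drainage"
    if has_inlet:
        return "seepage-inlet"
    if has_outlet:
        return "headwater"
    return "seepage"

# Example: a lake with an outlet stream but no visible inlet.
print(classify_lake(has_inlet=False, has_outlet=True))
```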
The TanDEM-X DEM was used to derive topography-related metrics (Table 2), including the elevation of lake shorelines, lake catchments, topographic slope values, the topographic wetness index (TWI), and snow water equivalent (SWE). Lake catchments were calculated using the flow direction raster model in ArcHydro [45]. All automatically delineated catchment areas were manually corrected if there were surface outlets from the lake. Mean slope values were calculated for each catchment by averaging the values of all pixels within the catchment. The TWI was calculated as described in Sørensen et al. [46]. SWE was calculated for all catchments using snow survey data and Geographic Information System (GIS) based modelling [47]. A snow survey in the study area was carried out in March 2013. Snow depth was measured using a metal ruler, and snow density with the VS-43 snow sampler designed for snow density measurements during snow surveys.
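For orientation, the TWI is commonly defined as ln(a / tan β), where a is the specific upslope area and β is the local slope. The sketch below evaluates this definition for single cells; the function name, the sample values, and the minimum-slope floor used to avoid division by zero on flat cells are assumptions of this illustration, not settings reported for this study.

```python
import math

def topographic_wetness_index(specific_area_m, slope_deg, min_slope_deg=0.1):
    """TWI = ln(a / tan(beta)) for one DEM cell.

    specific_area_m: upslope contributing area per unit contour length (m).
    slope_deg: local slope in degrees; floored at min_slope_deg (assumed).
    """
    slope = max(slope_deg, min_slope_deg)
    return math.log(specific_area_m / math.tan(math.radians(slope)))

# Hypothetical cells: equal upslope area, gentle versus steeper slope.
print(round(topographic_wetness_index(500.0, 2.7), 2))
print(round(topographic_wetness_index(500.0, 10.0), 2))
```

For the same contributing area, flatter cells yield higher TWI values, which is why the index is read as a proxy for potential soil wetness.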
The TerraSAR-X satellite data (Table 2) were used to derive the areal extent of the lake water bodies (LK_AR, Table 3). Due to specular reflection on open water bodies, the resulting lower backscatter values from water can be used to differentiate between land and open water bodies [48]. Water body areas were extracted from the TerraSAR-X images acquired in summer 2014 using the backscatter threshold method. The ALOS PALSAR satellite data were used to create a shrub map for the study area and to derive the percentage of shrub coverage within the catchments (SHR and SHR_PERC, Table 3). Processed ALOS PALSAR data were used to extract the areas of high shrubs by applying a backscatter threshold (dB > −25) and then converting the result to vector data. The shrub percentage was calculated as the ratio of the area covered by high shrubs to the area of the catchment.
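The shrub percentage can be illustrated on a small raster clipped to one catchment. The sketch assumes equally sized pixels, so that the pixel-count ratio equals the area ratio; the function name and the backscatter values are hypothetical, and the vector conversion step is omitted.

```python
def shrub_percentage(backscatter_db, threshold_db=-25.0):
    """Share of catchment pixels (in %) with HV backscatter above the threshold.

    backscatter_db: 2D list of backscatter values (dB) for one catchment.
    """
    pixels = [value for row in backscatter_db for value in row]
    shrub_pixels = sum(1 for value in pixels if value > threshold_db)
    return 100.0 * shrub_pixels / len(pixels)

# Hypothetical 2 x 2 catchment: two of four pixels exceed -25 dB.
catchment = [[-30.0, -20.0],
             [-24.0, -26.0]]
print(shrub_percentage(catchment))
```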

Statistical Processing
Regression tree model (RTM, rpart package [49]) and boosted regression tree (BRT, dismo package [50]) analyses were performed to investigate the relationship between satellite-derived a(440)CDOM values and 17 lake and lake catchment parameters (environmental variables) of 356 lakes (Table 3). RTM supports the analysis of the importance of multiple explanatory variables (lake and lake catchment characteristics) with respect to a specific response variable (CDOM concentration). RTM can also help to assess the way in which different variables behave in explaining the response variable, and thereby clarify environmental processes. The aim of BRT is a better regression analysis, especially to explore non-linear relationships between independent and dependent variables [50].
To reduce the number of independent variables in the final RTM and BRT setup, all the environmental variables were cross-correlated using the corrplot package [51] in R (R version 3.3.1, [52]). Variables characterized by a non-normal distribution were log10-transformed prior to analysis [43,44]. If several variables were significantly correlated with each other (R > 0.5), the variable with the highest correlation with CDOM was retained for further analysis.
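To make the idea of a regression tree split concrete, the sketch below searches one explanatory variable for the threshold that minimizes the summed within-group squared error of the response, which is the criterion used for continuous responses in regression trees. It is not the rpart implementation (no pruning, cross-validation, or multiple variables), and the data are hypothetical.

```python
def sum_squared_error(values):
    """Sum of squared deviations from the group mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_split(x, y):
    """Return (threshold, left_mean, right_mean) for the best binary split.

    Candidate thresholds are midpoints between consecutive distinct x values.
    """
    levels = sorted(set(x))
    best = None
    for lo, hi in zip(levels, levels[1:]):
        threshold = (lo + hi) / 2
        left = [yi for xi, yi in zip(x, y) if xi < threshold]
        right = [yi for xi, yi in zip(x, y) if xi >= threshold]
        error = sum_squared_error(left) + sum_squared_error(right)
        if best is None or error < best[0]:
            best = (error, threshold,
                    sum(left) / len(left), sum(right) / len(right))
    if best is None:
        raise ValueError("x must contain at least two distinct values")
    return best[1:]

# Hypothetical lakes: thermocirque flag (0/1) and a(440)CDOM in m^-1.
tc = [0, 0, 0, 1, 1]
cdom = [1.5, 1.7, 1.9, 5.0, 5.6]
print(best_split(tc, cdom))
```

A full tree repeats this search over all variables at every node and keeps the split with the largest error reduction; boosting then combines many such shallow trees.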
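One way to read the variable reduction rule is as a greedy filter: rank the variables by their absolute correlation with CDOM, then keep a variable only if it is not correlated above the threshold with any variable already kept. The sketch below follows that reading, which is an assumption, with hypothetical variable names and values.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def select_variables(variables, response, threshold=0.5):
    """Greedy filter: strongest predictors first, drop inter-correlated ones."""
    ranked = sorted(variables,
                    key=lambda name: abs(pearson(variables[name], response)),
                    reverse=True)
    kept = []
    for name in ranked:
        if all(abs(pearson(variables[name], variables[k])) <= threshold
               for k in kept):
            kept.append(name)
    return kept

# Hypothetical data: var_b duplicates most of the information in var_a.
cdom = [1, 2, 3, 4, 5, 6]
candidates = {
    "var_a": [1, 2, 3, 4, 5, 6],
    "var_b": [1, 2, 3, 4, 6, 5],
    "var_c": [1, -1, -1, 1, 1, -1],
}
print(select_variables(candidates, cdom))
```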

CDOM of Central Yamal Lakes
The biogeochemistry of the Central Yamal lakes is highly variable, with a wide value range for CDOM indicating high heterogeneity of lake types and influencing processes. In-situ measured a(440)CDOM values of 24 lakes from Central Yamal in 2014 ranged from 0.58 to 8 m⁻¹ with a median value of 2.13 m⁻¹ (Table 4). Values of S350-500 ranged from 0.012 to 0.018 (median 0.0161, Table 3). Regression of in-situ a(440)CDOM data and the satellite G/R ratio (n = 18) showed a significant (R² = 0.79) correlation (Figure 2). This relationship allowed us to calculate a(440)CDOM values for 356 lakes in the region, ranging from 0.48 to 8.35 m⁻¹ (median value 1.43 m⁻¹). The final Equation (5) for calculating a(440)CDOM from the reflectance ratio is: a(440)CDOM = … (5)

Lake and Lake Catchment Parameters
GIS and remote sensing data processing allowed us to extract 17 lake and lake catchment parameters from 356 lakes (Table 3). Remote sensing analyses showed that 15 lakes in the study area had shorelines with active thermocirques (4.2%). A total of 172 lakes were located on high terraces (48.3%), with an average area of 8.47 ha, and 184 lakes were located on the floodplains of the Se-Yakha and Mordy-Yakha Rivers (51.7%); the latter were generally larger (average area 11.05 ha).
The median lake area was 2.45 ha, which indicates that lakes were generally small, with several very large lakes (area > 100 ha, n = 7). Most of the lakes (n = 227, 63.8%) were seepage lakes (without apparent inlets or outlets in the form of a river or stream), while 55 (15.4%) were drainage lakes having an inlet and an outlet. Other lakes were seepage-inlet lakes (n = 11) with an apparent inlet and no outlet, and headwater lakes (n = 62) with an apparent outlet and no inlet (3.1% and 24.2%, respectively).
Catchment area varied from 0.25 to 713.93 ha with a median value of 20.49 ha. Drainage ratio varied from 0.25 to 181.3 with a median value of 5.93. Five lakes had a water level below 0 m a.s.l., with a minimum of −1.7 m. The highest lake level was 39.3 m, with a median of 6.3 m a.s.l. Floodplain lakes had much lower levels than lakes on high terraces (medians of 4.2 and 15.8 m, respectively). The floodplain lakes had flatter catchments than lakes on high terraces (median slope values of 2.7° and 3.2°, respectively). The area covered by high willow shrubs was as high as 76.5% of the catchment area, with a median of 8.53%. The median NDVI value for all catchments was relatively high for the area at 0.65.

Correlation Matrix
The strongest positive relationship was found between CDOM and the presence of thermocirques on the lake shores (R = 0.49, n = 356). The geomorphological position (floodplain or high terrace, R = 0.27, n = 356) and the mean slope value of the catchment (R = 0.22, n = 356) showed weak positive relationships. Using the correlation matrix, interdependent variables can be detected and removed from the final dataset used for the RTM. Comparing the Pearson's correlation coefficients (Figure 3), 10 of 17 variables were chosen according to their association with CDOM and with each other (Table 4). They represent groups of variables: lake and lake catchment morphometry (lake perimeter, drainage ratio), hydrology (inlets and outlets), vegetation of the catchment (NDVI index, area covered by high shrubs), snow regime (SWE in the catchment), and geomorphology of the catchment (geomorphological position, slope). Although POS was more strongly correlated with CDOM than LK_WL (R = 0.22 and R = 0.18, respectively), we decided to use LK_WL in the final RTM and BRT analyses because this variable explains more of the difference among lakes.

Regression Tree Model (RTM)
Ten environmental variables were used in the final RTM dataset (Figure 3). The RTM (Figure 4) shows the most important variables explaining CDOM concentration in Central Yamal lakes. The analysis divided the whole dataset into groups according to the importance of parameter values, with R² = 0.61.

The analysis yielded seven significant splits and eight terminal nodes (Figure 4). Presence/absence of thermocirques (TC) divides the whole dataset into two classes with mean a(440)CDOM of 5.3 m⁻¹ (Node 1) and 1.7 m⁻¹ for lakes with and without thermocirques, respectively. The group of lakes without thermocirques on the lake shores (n = 341) is further divided according to lake water level (LK_WL), producing four terminal nodes (Figure 4).

Figure 4 (regression tree model). The tree contains seven splits (boxes with a blue outline) and eight terminal nodes (boxes with a red outline). The upper value in each box is the mean a(440)CDOM before splitting; the lower value is the number of cases in the node before splitting (356 cases = 100%). Text between boxes gives the environmental driver (codes are given in Table 3) and a threshold value. Bold text gives the number of the terminal node. The color of the boxes represents the mean value of a(440)CDOM within splits and terminal nodes.

Boosted Regression Tree (BRT) Analysis
Ten environmental variables were used in the final BRT dataset (Figure 3). Two variables, presence/absence of thermocirques (TC) and the lake water level (LK_WL), explained most of the variation in lake CDOM (46.4% and 28.4%, respectively, Figure 5). BRT clearly defined two groups of lakes according to TC presence or absence, and several groups according to the LK_WL parameter. Two groups of LK_WL represented lakes with a higher concentration of CDOM. These were the lakes on the lowest level in the floodplain with LK_WL < 1 m and lakes on the terraces at altitudes of 16 m ≤ LK_WL < 19 m (Figure 5), which also corresponded to the results obtained from the RTM (terminal nodes 2 and 4, Figure 4). One group represented lakes with moderate CDOM concentration (LK_WL > 19 m) and one group those with the lowest CDOM (1 m ≤ LK_WL < 16 m).
Figure 5. The percentage represents the degree of variable impact for the explanation of CDOM differences. The behavior of the line creates peaks and groups of lakes with higher and lower concentration of CDOM.

Organic Matter Content In Arctic Lakes
In-situ and remote sensing derived a(440)CDOM values show moderate lake CDOM values in the Central Yamal Peninsula. A mean of 3.1 m⁻¹ and a median of 2.1 m⁻¹ were observed for the dataset of 24 lakes sampled in situ (Table 3), and a mean of 1.9 m⁻¹ and a median of 1.4 m⁻¹ for the dataset of satellite-derived CDOM from 356 lakes. These results indicate that Central Yamal lakes are comparable with other Arctic lakes, including more southern lakes, in terms of CDOM concentration (Table 5). To our knowledge, few studies on CDOM in lakes are available for Siberian tundra permafrost landscapes. Considering only in-situ data of Central Yamal including different types of lakes, the mean regional value of 3.2 m⁻¹ [24] is higher than the CDOM concentration sampled in lakes of other tundra landscapes such as the Lena River Delta [53] and coastal areas of the Yamal and Gydan Peninsulas [54,55], and lower than the CDOM measured in lakes around the Tazovskiy settlement on the Gydan Peninsula (6.1 m⁻¹, [56], Table 5).
Abnizova et al. and Manasypov et al. [57,58] investigated DOC in lake-rich tundra landscapes in the Lena River Delta and Western Siberia. Both studies found lower DOC values than those we found for the lakes in Central Yamal (unpublished data). It has also been shown that CDOM values in the lakes of the Lena River Delta are lower than those in Central Yamal lakes [53]. Lake catchments in Abnizova et al. and Manasypov et al. [57,58] are small and therefore relatively little influenced by snow melt in spring, and they are not characterized by the large coverage of dense shrub biomass that is abundant in Central Yamal.
Lakes in the boreal biome show higher CDOM than those in the Arctic biome. For example, Kutser et al.
Table 5. CDOM absorption in Arctic lakes and water bodies found in the literature. In some cases the CDOM absorption was reported at wavelengths other than 440 nm, and we applied Equation (1) [36] to retrieve the absorption value at 440 nm: a(440)CDOM = a(λg)CDOM × exp[−S(440 − λg)], where a(λg)CDOM is the CDOM absorption at the wavelength λg given in the literature; S was set to 0.015 nm⁻¹, as often measured in both fresh waters and marine saline waters [27].

Organic Matter Content In Arctic Lakes
Results of in-situ and remote sensing derived a(440) CDOM values show moderate lake CDOM values in the Central Yamal Peninsula.A mean of 3.1, and median of 2.1 m −1 for a dataset of 24 lakes sampled in-situ (Table 3) and a mean of 1.9 m −1 and median of 1.4 m −1 for a dataset of satellite-derived CDOM from 356 lakes was observed.These results indicate that Central Yamal lakes are comparable with other Arctic lakes, including more southern lakes in terms of CDOM concentration (Table 5).To our knowledge, few studies on CDOM in lakes are available for Siberian tundra permafrost landscapes.Considering only in-situ data of the Central Yamal including different types of lakes, the mean regional value of 3.2 m −1 [24] is higher than the CDOM concentration sampled in lakes of other tundra landscapes such as the Lena River Delta [53], coastal areas of the Yamal and Gydan Peninsulas [54,55] and lower than CDOM measured in lakes around the Tazovskiy settlement on the Gydan peninsula (6.1 m −1 , [56], Table 5).
Abnizova et al. and Manasypov et al. [57,58] investigated DOC in lake-rich tundra landscapes in the Lena River Delta and Western Siberia.Both studies found lower DOC values than what we found for the lakes in the Central Yamal (unpublished data).It has also been shown that CDOM values in the lakes of the Lena River Delta are lower than those in Central Yamal lakes [53].Lake catchments in Abnizova et al. and Manasypov et al. [57,58] are small and therefore relatively low influenced by snow melt in spring and are not characterized by large coverage of dense shrub biomass that is abundant in the Central Yamal.
Table 5. CDOM absorption in Arctic lakes and water bodies reported in the literature. Where CDOM absorption was reported at a wavelength other than 440 nm, we applied Equation 1 [36] to retrieve the absorption at 440 nm: $a(440)_{\mathrm{CDOM}} = a(\lambda_g)_{\mathrm{CDOM}} \times \exp[-S(440 - \lambda_g)]$, where $a(\lambda_g)_{\mathrm{CDOM}}$ is the CDOM absorption at the wavelength $\lambda_g$ given in the literature; S was set to 0.015 nm⁻¹, as often measured in both fresh waters and marine saline waters ([27] and references therein).

Lakes in the boreal biome show higher CDOM than lakes in the Arctic biome. For example, Kutser et al. Moderate CDOM concentrations in Central Yamal lakes are well in line with the overview of Vincent et al. [15], who describe inland freshwater bodies north of the tree line as characterized by low to moderate CDOM concentrations that decrease with increasing distance from the tree line [12]. Several studies [5,63,64] that investigated water bodies in the taiga, close to the tree line, and in discontinuous permafrost found mainly acidic and DOC-rich shallow thaw lakes.
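The wavelength conversion described in the Table 5 caption can be written as a short function. This is a minimal sketch of that exponential relation with the stated S = 0.015 nm⁻¹; the function name and the example input are hypothetical and are not values from Table 5.

```python
import math


def cdom_at_440(a_ref, wavelength_ref_nm, slope_per_nm=0.015):
    """Convert CDOM absorption reported at wavelength_ref_nm to 440 nm.

    Implements a(440) = a(lambda_g) * exp(-S * (440 - lambda_g)),
    with S in 1/nm and absorption in 1/m.
    """
    return a_ref * math.exp(-slope_per_nm * (440.0 - wavelength_ref_nm))


# Hypothetical literature value: 2.0 1/m reported at 420 nm.
a440 = cdom_at_440(2.0, 420.0)
print(f"a(440)CDOM = {a440:.2f} 1/m")
```

Because absorption decreases with wavelength, a value reported below 440 nm is scaled down and a value reported above 440 nm is scaled up.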

Impact of Landscape and Landscape Processes on CDOM of Yamal Lakes
River catchments have been found to determine the organic matter concentration of surface waters [65]. In particular, the catchment properties and organic matter transport pathways have been of importance [44,66]. Correlations between lake DOC and GIS-extractable environmental variables were studied at a regional scale in different landscape types [44,67], and it was found that lake and catchment areas are not the best predictors of lake DOC. Instead of lake and catchment morphometry, region-specific features such as the proportion of peatlands or wetlands in the catchment [43,44] and soil carbon density or soil C:N ratio [67] were the most important predictors of lake DOC. However, small-scale regional models will likely not be applicable to explain lake DOC in other regions [44]. Soil organic matter concentration can be an important control of lake CDOM in Central Yamal; however, it requires additional soil sampling, and this variable has not yet been included in the analysis.
Our direct correlation analysis between CDOM and both lake area and catchment area also showed weak correlations (R ≤ 0.13). Our statistical analysis allowed the development of a regional theoretical model of organic matter transport for Central Yamal (Figure 6), with landscape processes as determinants of lake CDOM. Thermocirques on lake shores led to increased organic matter input [24] and high CDOM in lakes (see Section 5.2.1). Lake water level also influences CDOM (see Section 5.2.2). We assume that flooding of the lowest-lying lakes in spring, due to snowmelt in the catchments and very high water levels of streams and rivers in the floodplains, is the other landscape process impacting the tundra lakes of Central Yamal. For the Mackenzie Delta in the northwestern Canadian low Arctic, it was shown that variable flooding controlled the limnology of floodplain lakes over an area of 10,000 km² [68].

Impact of Thermocirques on Lake CDOM
An extremely warm summer in 2012 activated thermocirques in Central Yamal [31,32], which were responsible for large organic matter and sediment input into lakes [24,62]. Within the larger area analyzed in this study, this small group of lakes with thermocirques (n = 15, mean a(440)CDOM = 5.3 m⁻¹) was identified in the first split of the regression tree (node 1, Figure 4), indicating that lakes with thermocirques form a separate group with high CDOM concentration. We have recently shown that CDOM concentration in lakes can increase by 300-600% after the formation of thermocirques on their shores [24].
In contrast to our results for Central Yamal, Kokelj et al. [23,69] found a decrease in DOC in lakes in the Mackenzie Delta impacted by retrogressive thaw slumps (RTS). Kokelj et al. [24] and Thompson et al. [70] explain this decrease in lake DOC concentration by the removal of DOC through coagulation and enhanced sedimentation with RTS-derived fine-grained clay particles.
Thermocirques of Central Yamal frequently expose peat layers, organic-rich sediments and slightly decomposed root layers that have accumulated on the paleorelief of the terraces of marine sediment sequences [24]. The thawing of exposed ground ice after thermocirque formation results in meltwater transport with high organic matter input to the lakes (measured DOC concentration in pore water from peat layers was 243 mg/L [24]). Spectral slope (S) may provide information on the source of CDOM [27][28][29]. The steepness of the absorption spectra [27] is a proxy for CDOM composition, sources, and the ratio of humic and fulvic acids [28][29][30]71]. It has been shown that the input of allochthonous CDOM is responsible for lower S values [13], which is mainly attributed to a considerable proportion of humic acids with high molecular weight [72]. Our field data (Table 3) show that thermocirque-impacted lakes are characterized by lower S (mean value 0.0133 nm⁻¹) compared to non-impacted ones (mean value 0.0165 nm⁻¹). In this study, we did not investigate the organic matter quality in detail, but the presence of ancient organic matter in the form of peat in the geological sections of thermocirques can likely explain the differences between CDOM slope values in impacted and non-impacted lakes [24].
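The spectral slope S compared above can be estimated from an absorption spectrum by fitting the exponential model a(λ) = a(λ₀) exp[−S(λ − λ₀)]. The sketch below uses a log-linear least-squares fit over 300-500 nm, which is an assumption made for illustration: this section does not state the fitting procedure, and nonlinear fitting is a common alternative. The function name and the synthetic spectrum are hypothetical.

```python
import math


def spectral_slope(wavelengths_nm, absorption_per_m, lo=300.0, hi=500.0):
    """Estimate S (1/nm) from ln(a) versus wavelength within [lo, hi] nm.

    Assumes a(lambda) decays exponentially; non-positive absorption
    values are skipped because their logarithm is undefined.
    """
    pts = [
        (w, math.log(a))
        for w, a in zip(wavelengths_nm, absorption_per_m)
        if lo <= w <= hi and a > 0
    ]
    if len(pts) < 2:
        raise ValueError("need at least two valid points in the fit range")
    n = len(pts)
    mean_w = sum(w for w, _ in pts) / n
    mean_l = sum(l for _, l in pts) / n
    sxx = sum((w - mean_w) ** 2 for w, _ in pts)
    if sxx == 0:
        raise ValueError("wavelengths must not all be identical")
    sxy = sum((w - mean_w) * (l - mean_l) for w, l in pts)
    return -sxy / sxx  # the regression slope of ln(a) is -S


# Synthetic spectrum built with S = 0.0133 1/nm and a(440) = 5.3 1/m.
wl = list(range(300, 501, 10))
spec = [5.3 * math.exp(-0.0133 * (w - 440)) for w in wl]
print(f"S = {spectral_slope(wl, spec):.4f} 1/nm")
```

On real spectra the log-linear fit gives more weight to the low-absorption, long-wavelength end than a nonlinear fit does, so the two approaches can return slightly different S values.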

Impact of Lake Water Level on Lake CDOM
The lakes located in the lowest positions on the floodplain of the Mordy-Yakha River are characterized by high CDOM concentrations (mean a(440)CDOM value 3.8 m⁻¹, n = 7, terminal node 4, Figures 4 and 7), especially large lakes, which are likely connected to the river and are inundated in the springtime. Our results also identify a small group of lakes (n = 19) that are characterized by higher concentrations of CDOM (mean a(440)CDOM value 3.2 m⁻¹, node 2, Figures 4, 5 and 7). Four of these lakes had a(440)CDOM values greater than 7 m⁻¹ (LK-145, LK-189, LK-288, LK-262) as well as inlets (streams) connecting them with lake LK-387, the shore of which is impacted by a cryogenic landslide. In order to validate this system of connected lakes, we collected samples in lakes LK-387 and LK-262 in October 2016. The a(440)CDOM values were 4.0 m⁻¹ and 5.0 m⁻¹, respectively. These were lower than the values measured in 2013 (8.3 and 8.2 m⁻¹, respectively) but still high compared to other lakes. We have also recently shown that after thermocirque activation the elevated CDOM concentration in lakes starts to decrease [24]. The absolute height of the water for lakes of this group (n = 19) was between 16 and 19 m a.s.l. In our study area, this elevation corresponds to the II fluvial geomorphological terrace. We assume that these lake catchment areas include portions of higher terraces (III-V), and that their topographic gradient is higher than for lakes located at higher elevations, although the catchment slope parameter did not show high importance in the analysis. Catchment slope has been shown to be a relatively good predictor of DOC in lakes of the northern USA and Canada, but it is sometimes inversely correlated with lake DOC [73]. The inverse correlation is explained by thinner soil organic horizons on steeper slopes and more exposed mineral B horizons, which remove DOC from water percolating through them [73], and by a higher proportion of organic-rich waterlogged soil in catchments with flat topography [74]. In the area of continuous permafrost, fresh allochthonous organic material from vegetation is more likely to reach lakes with steeper catchment slopes as long as the active layer is shallow. This likely explains the slight positive correlation of catchment slope with lake CDOM (Figure 3). The topographic gradient implies that snowmelt water and rainwater significantly erode the tundra of Yamal [75] and, consequently, flush organic material from the catchments into surface waters. The mean snow depth and snow density for the region are 30 cm and 0.33 g/cm³ [47], respectively, which allows the catchments to accumulate up to 1 million m³ of water, depending on catchment size and morphology. Given the known depth of the studied lakes and the possible volume of water, the snow storage in catchments in the form of snow water equivalent (SWE) can account for up to 13% of lake water volume in the summer period [56].

The catchment vegetation and soils are a source of fresh terrestrial organic material. The Central Yamal tundra landscape differs from a zonal typical tundra landscape [76] in having considerably higher vegetation biomass, which provides organic detritus along lake shores and within the lake catchments. The most productive vegetation in this area comprises high shrubs (Salix glauca, S. lanata) [35] on ancient cryogenic landslide surfaces [22] and in stream valleys, and grasses on young and old cryogenic landslide surfaces [22], young drained lake basins, and areas disturbed by off-road vehicle tracks [77]. Shrubs up to 2 m tall [34] may grow in depressions, on concave slopes and in valleys, protected from strong winds and filled with snow in wintertime [47]. The presence of this azonal vegetation is due to the high degree of mineralization of marine clayey soils [78] that is made available by active thermal denudation and cryogenic landslides [35]. Although this parameter did not show a significant influence on CDOM among lakes, the presence of shrubs in catchments most likely controls the overall higher concentration of CDOM in lakes compared to other tundra landscapes of Siberia (Table 5).
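The snow storage estimate quoted above follows from simple arithmetic: snow water equivalent is snow depth multiplied by the ratio of snow density to water density, and the stored volume is SWE multiplied by catchment area. The sketch below reproduces that calculation with the reported regional means (30 cm, 0.33 g/cm³); the 10 km² catchment area is a hypothetical value chosen only to show the order of magnitude, and the function names are illustrative.

```python
WATER_DENSITY_KG_M3 = 1000.0


def swe_depth_m(snow_depth_m, snow_density_g_cm3):
    """Snow water equivalent as a water column depth in metres."""
    snow_density_kg_m3 = snow_density_g_cm3 * 1000.0  # g/cm3 -> kg/m3
    return snow_depth_m * snow_density_kg_m3 / WATER_DENSITY_KG_M3


def snow_water_volume_m3(snow_depth_m, snow_density_g_cm3, catchment_area_km2):
    """Water volume stored as snow over a catchment, in cubic metres."""
    area_m2 = catchment_area_km2 * 1.0e6  # km2 -> m2
    return swe_depth_m(snow_depth_m, snow_density_g_cm3) * area_m2


swe = swe_depth_m(0.30, 0.33)  # regional means reported in the text
volume = snow_water_volume_m3(0.30, 0.33, 10.0)  # hypothetical 10 km2 catchment
print(f"SWE = {swe * 1000:.0f} mm, stored water = {volume:,.0f} m3")
```

With these inputs the SWE is about 99 mm, so a catchment of roughly 10 km² stores close to 1 million m³ of water, consistent with the upper bound given in the text.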

Potential of Remote Sensing and GIS for Mapping Lake CDOM, Lake Catchments and Parameterization of Lakes
Optical remote sensing with very high, high and medium spatial resolution data permits the extraction of DOC concentration for inland waters [40]. In the Arctic, this is critical, since many lakes and ponds are relatively small in size but are very important for the carbon cycle [79,80]. However, a major challenge for deriving CDOM and DOC in lakes from optical remote sensing is the need for concomitant in-situ data and the requirement of high-quality satellite surface reflectance data for retrieval of CDOM [40,41,81,82]. The green/red band ratio algorithm for the derivation of lake CDOM [40,41] uses an empirical relationship between water reflectance and in-situ CDOM that is related to the spectral properties of CDOM. Organic-rich surface water is characterized by higher absorption at shorter wavelengths (i.e., the green reflectance band) than at longer wavelengths such as the red reflectance band. In our experience, errors can occur when applying this algorithm to acquisitions taken during periods of high lake turbidity due to wind. The WorldView-2 and GeoEye-1 images used in this study were acquired during relatively calm wind conditions with no indication of enhanced lake turbidity due to sediment re-suspension. This is also supported by inspection of the panchromatic band of the images (0.5 m spatial resolution), where no waves can be recognized on the water bodies.
Within our field campaigns, we were able to sample a wide range of lake types, from CDOM-rich to low-CDOM lakes. This wide range of CDOM concentrations supported the application of the green/red band ratio. The in-situ CDOM measured in 2014 and the band ratio from optical imagery acquired in summer 2013 showed a significant correlation (R² = 0.79, n = 18). Before we fitted in-situ CDOM to the satellite data, we investigated the variability of CDOM concentration between years (data available from 2011 [55,83]) in order to avoid fitting lakes with strong CDOM dynamics. Although satellite acquisition closer to the date of sampling is preferred [40], other studies have also correlated non-contemporaneous in-situ CDOM data with satellite acquisitions [84]. It was found that the temporal variability of CDOM in lakes in northern Canada appears to be relatively small, and that large time lags (of several years) between ground data and satellite data produced only slightly more scatter than standard empirical CDOM models based on timed field campaigns [84]. Cardille et al. [84] found no indication of a bias toward an incorrect model when using CDOM field samples from a variety of years, and concluded that the important factor for creating a CDOM algorithm is a sufficiently large range of CDOM [84]. We show that, using the absorption properties of CDOM, we can map the CDOM regimes of the lakes in Central Yamal across a wide range of CDOM concentrations. Further studies with more field data from ongoing CDOM sampling in Yamal lakes will allow further evaluation and accuracy assessment of satellite-derived CDOM.
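An empirical band-ratio model of the kind discussed above can be sketched as a regression of in-situ a(440)CDOM on the green/red reflectance ratio. The functional form is not given in this section, so the power law a(440) = A (G/R)^B fitted in log-log space below is an assumption for illustration only, and the function names and input values are hypothetical rather than the study's data or coefficients.

```python
import math


def fit_power_law(ratio, a440):
    """Fit a440 = A * ratio**B by least squares on log-transformed data.

    Returns (A, B, r2), where r2 is the coefficient of determination
    in log-log space.
    """
    lx = [math.log(r) for r in ratio]
    ly = [math.log(a) for a in a440]
    n = len(lx)
    mx = sum(lx) / n
    my = sum(ly) / n
    sxx = sum((x - mx) ** 2 for x in lx)
    syy = sum((y - my) ** 2 for y in ly)
    sxy = sum((x - mx) * (y - my) for x, y in zip(lx, ly))
    b = sxy / sxx
    a = math.exp(my - b * mx)
    r2 = (sxy * sxy) / (sxx * syy)
    return a, b, r2


def predict_a440(ratio, a, b):
    """Apply the fitted model to a green/red ratio."""
    return a * ratio ** b


# Hypothetical G/R ratios and in-situ a(440)CDOM values (1/m).
gr = [0.8, 1.0, 1.2, 1.5]
obs = [2.0 * r ** -3.0 for r in gr]  # synthetic, noise-free
A, B, R2 = fit_power_law(gr, obs)
print(f"A = {A:.2f}, B = {B:.2f}, R2 = {R2:.2f}")
```

The negative exponent in this synthetic example reflects the behaviour described above: CDOM-rich water absorbs more strongly in the green band than in the red band, so the G/R ratio falls as CDOM rises. Once fitted on sampled lakes, such a model can be applied to the ratio of every unsampled lake in the imagery.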

Conclusions
Our analysis shows that the landscape processes in lake catchments are important controls of CDOM in Central Yamal lakes. Specific landscape features of Central Yamal are (1) a wide distribution of tabular ground ice and the occurrence of thermocirques on the lake shores; (2) complex topography with highly dissected terrain; and (3) the presence of high shrubs in valleys and topographical depressions. All these features make the lakes of the region relatively rich in CDOM compared to other Arctic regions.
We found that the occurrence of thermocirques on the lake shores is the most important control on lake CDOM in the region. Terrestrial organic matter input from thawed permafrost resulted in lakes with thermocirques being significantly different from lakes without. It remains unclear how the CDOM of these impacted lakes will develop after thermocirque development has stabilized. Our analysis also shows that large floodplain lakes connected to the Mordy-Yakha River receive more organic matter, which should be studied in further detail. We did not find a significant correlation between catchment slope and lake CDOM. However, it is likely that lakes with catchment areas that incorporate all geomorphological levels can also receive more organic matter due to higher topographical gradients.
Remote sensing and GIS are important techniques for the retrieval of lake and lake catchment characteristics, and these geospatial data are important for investigating the statistical relationships within "lake-lake catchment" systems. In this study, we used high-resolution GeoEye-1 and WorldView-2 multispectral satellite images to derive a(440)CDOM values. Freely available Landsat-8 and Sentinel-2 images with sufficient radiometric and spatial resolution could further be used to assess terrestrial and freshwater ecosystem interactions over a larger spatial extent. They can also be a source of lake parameters for lake models.

Figure 1. Map of the Vaskiny Dachi study region, showing the sampling locations (red points) of the 2014 field campaign. Thermocirques visited in the field are marked by yellow labels. Background: near-infrared band image of the Landsat 8 satellite (19 July 2013) under a half-transparent color-coded TanDEM-X digital elevation model (UTM Zone 42 North, WGS84).

Figure 2. Comparison of measured CDOM absorption at 440 nm (2014) from the water samples versus the green reflectance/red reflectance (G/R) ratio retrieved from the mosaicked GeoEye-1 (5 July 2013) and WorldView-2 (21 July 2013) surface reflectance dataset.
Correlation Matrix
The strongest positive relationship was found between CDOM and the presence of thermocirques on the lake shores (R = 0.49, n = 356). The geomorphological position (floodplain or high terrace, R = 0.27, n = 356) and mean slope value of the catchment (R = 0.22, n = 356) showed weak positive relationships. Using the correlation matrix, interdependent variables can be detected and removed from the final dataset used for the RTM. Comparing the Pearson's correlation coefficients (Figure 3), 10 of 17 variables were chosen according to their association with CDOM and with each other (Table

Figure 3. Correlation matrix of CDOM: lake and lake catchment dataset. Red color indicates a positive relation of the parameters to CDOM, blue color an inverse correlation. White color indicates correlations with p > 0.05. See Table 3 for variable codes. Variables used in the final statistical processing are shown in bold.

Figure 4. Regression tree explaining the environmental drivers of CDOM in Central Yamal lakes. The tree contains seven splits (boxes with a blue outline) and eight terminal nodes (boxes with a red outline). The upper value in each box is the mean a(440)CDOM before splitting; the lower value is the number of cases in the node before splitting (356 cases = 100%). Text between boxes gives the environmental driver and a threshold value. Codes are given in Table 3. Bold text gives the number of the terminal node. The color of the boxes represents the mean value of a(440)CDOM within splits and terminal nodes.

Figure 5. Results of boosted regression tree (BRT) analysis explaining the environmental drivers of CDOM in Central Yamal lakes. The number of plots was reduced to two, showing the most important drivers. The percentage represents the degree of variable impact for the explanation of CDOM differences. The behavior of the line creates peaks and groups of lakes with higher and lower concentration of CDOM.

Figure 6. Theoretical scheme of organic matter transport in Central Yamal lakes. Lakes are shown in different colors: thermocirque-impacted and floodplain lakes are more strongly colored (brownish), representing higher CDOM concentration. Red arrows represent the pathways of organic matter transport to lakes from the surrounding catchments.

Figure 7. CDOM concentration of eight groups of lakes obtained from the regression tree model (RTM) (Figure 4). Node numbers are given in Figure 4.

Table 1. Central Yamal lakes from which the samples were taken for colored dissolved organic matter (CDOM).

Table 2. List of remote sensing data used. * Different types of data used (MS: multispectral, PS: pan-sharpened (PANSHARP2 model,

Table 3. List of lake and lake catchment characteristics used in the CDOM statistical model. Variables in bold type were used in the final statistical processing.