Spatial Analysis of Temperate Forest Structure : A Geostatistical Approach to Natural Forest Potential

Forest ecosystems represent an important means of ecosystem services; they are key as carbon sinks, water collectors, soil stabilizers, suppliers of great biological diversity, among other benefits. In addition, regionalization based on forest conditions provides a valuable approach to understanding and analyzing spatial patterns, which is useful as a tool for the implementation of forest ecosystem protection and conservation programs. In this research, the structure of a temperate forest in the western Sierra Madre region of Mexico was analyzed and characterized. The study unit was the watershed and the analysis used a geospatial approach combined with multivariate techniques such as: Principal Component Analysis, Cluster Analysis (CA), Discriminant Analysis (DA) and Multivariate Analysis of Variance. We evaluated the relationships among spectral satellite data, thematic maps and structural forest variables. A total of 345 watersheds were grouped based on these variables. The grouping of watersheds under low, medium and high production conditions was carried out with CA, defining 3 groups. The validation of the grouping was performed through DA, estimating errors with the restitution method, as well as with the cross-validation method. Significant differences were found among the groups. The grouping of watersheds provides observable evidence of the variability of the forest condition throughout the area. This study allows identifying forest areas with different levels of productivity and can help to detect levels of vulnerability and ecological fragility in natural forests in temperate zones.


Introduction
Forest ecosystems provide essential benefits for humanity, including the protection of biodiversity, climate regulation, carbon sinks, among others [1,2].The assessment of changes in forest ecosystems and the understanding of their causes are of great concern.Factors that can alter the forest structure such as fire, pests, human activities related to the settlement or opening of agricultural land and the extraction of resources, generate loss of biomass that influences biogeochemical cycles [3].Climatic characteristics determine the health of the world´s forest ecosystems [4].With a continental area of approximately two million square kilometers, Mexico is one of the richest countries in biodiversity [5].The temperate forests of Mexico cover approximately 32 million hectares, equivalent to 18% of the territory.In these ecosystems we can find a great diversity of associations between pines and oaks that are not present in another part of the world [6].These forests are mainly composed of conifers and broad-leaved trees, where pine trees (Pinus sp.), Oyameles (Abies sp.), Pinabetes (Picea sp., Pseudotsuga sp.) and Oak (Quercus sp.) are the dominant genres in their composition and structure, predominating in the temperate zones of the main mountainous regions of the country [7].
However, Mexico is also the country with the highest rates of deforestation worldwide [8].Many of its native ecosystems are gradually being reduced to small remnants of the original vegetation [9].To ensure the sustainability of forests, we need to characterize their structure [10].This implies the understanding of the different variables associated with the structure of forests [11].Forest structural variables, such as volume or biomass, average diameter and height, are important data needed to assess forest resources [12,13].Due to the extensive surface of temperate forests that are distributed in Mexico, together with a very rugged orography, only a portion of the forest can be sampled [14].Under these conditions, it is important to determine the spatial and temporal distribution of the forest structure.These actions allow classifying the forest coverage according to its natural potential [15].
The forest condition is defined in this study as the related set of forest characteristics determined by the number of trees per hectare, their spectral response, their distance from anthropogenic activities, among others.A good condition would present a high number of trees per hectare, a low spectral response and a high distance from anthropogenic activities.At the regional level, a feasible procedure to monitor the forest condition on a regular and continuous basis is based on satellite data, as well as field information [16].Satellite image data represent a source of key information in the monitoring of the forest condition [17].The measurement of the forest structure through remote sensing reduces costs by covering large areas of land [18].In addition, the analyzed data serve to determine the health status of the forests [19].There is a great variety of studies where satellite images, as well as field data, have been used to characterize the structure of the forest [20][21][22][23].Many of them explain the relationship of the forest structure with the spectral satellite data.Traditionally, the measurement of the forest structure considers the forest's dasometric variables.However, there are environmental variables that influence the productivity of the forest and represent an important element in forest management [24,25].A large number of biophysical, biological, topographic and anthropogenic variables are associated with forest productivity [26].
To identify these different forest conditions and variability at the regional level, regionalization or spatial grouping of data is applied.The process of delimitation and classification of areas with homogeneous characteristics according to their environmental properties distributed in the forest landscape allows the definition of environmental units on a spatial scale.In relation to the above, the classification, modeling and interpretation of the monitored data are the most important steps in the evaluation of the forest structure.The spatial identification of regions with homogeneous characteristics often lacks an accepted and clearly articulated theoretical basis.Regions are typically delineated by expert criteria, which is sometimes subjective.The spatial continuity of the resulting regions is rarely managed quantitatively.In this sense, geostatistical multivariate techniques allow us to study the spatial variability of forests, forest structure, topographic and biophysical factors, as well as the relationships that may exist among these variables.Principal Component Analysis (PCA), Cluster Analysis (CA) and Discriminant Analysis (DA) have been used in recent years to analyze environmental variables and extract significant information [27][28][29][30].The use of these techniques allows defining new variables or groups that provide information about the spatial variability of forests [31].There are few studies that integrate variables of the forest structure by using remote sensing, biophysical, topographic or proximity (roads, communities) data and their interaction [32][33][34].Such studies have been carried out mainly in developing countries like Mexico, where there are large gaps of information that can be reinforced with the integration of different sources.The objective of this study was to analyze the forest condition through environmental, topographic, proximity and spectral variables, employing multivariate techniques with a geospatial approach in a forest area located in the municipality of 'Guadalupe y Calvo', Chihuahua, Mexico.

Study Area
The study area is located in the municipality of 'Guadalupe y Calvo', in the state of Chihuahua, Mexico, between parallels 107 • 00 W-26 • 30 N and 106 • 30 W-26 • 00 N (Figure 1).The area comprehends 113.73 ha, of which 95.13 ha are temperate forests (15% of the total forest area of the municipality); 42.60 ha are exploitable pine and oak woods [35].The region is characterized by its high number of endemic species, estimated at around 4000 plant species [36].The area is one of the forest regions with the highest biodiversity, which supports various environmental services for the region.It has a unique and wide system of deep canyons, resulting in a mixture of temperate and tropical ecosystems [37].The main land uses of the study area are coniferous and hardwood forests.Forestry is the main productive activity, representing 42% of the income for the inhabitants.It includes the extraction of wood and the use of dead wood [38].The second productive activity is livestock for self-consumption, which represents 24% of the economy [39].

Study Area
The study area is located in the municipality of 'Guadalupe y Calvo', in the state of Chihuahua, Mexico, between parallels 107°00' W-26°30' N and 106°30' W-26°00' N (Figure 1).The area comprehends 113.73 ha, of which 95.13 ha are temperate forests (15% of the total forest area of the municipality); 42.60 ha are exploitable pine and oak woods [35].The region is characterized by its high number of endemic species, estimated at around 4000 plant species [36].The area is one of the forest regions with the highest biodiversity, which supports various environmental services for the region.It has a unique and wide system of deep canyons, resulting in a mixture of temperate and tropical ecosystems [37].The main land uses of the study area are coniferous and hardwood forests.Forestry is the main productive activity, representing 42% of the income for the inhabitants.It includes the extraction of wood and the use of dead wood [38].The second productive activity is livestock for self-consumption, which represents 24% of the economy [39].

Forest Data
The data were collected between January and May of 2014 and comprised four thousand 0.1-ha circular ground survey plots (Figure 2).Sample measurements included tree height recorded using digital hypsometer and tree diameter at the breast height (dbh) using a tape.In all plots, the position was determined with a GPS.Canopy closure status was determined by recording the type and proportion of shrub stratum vegetation present in each sample plot using a 1-meter square quadrat split into 4 equal quadrants.Plots with more than 50% dead vegetation on the forest floor were classified as closed canopy.Conversely, those with less than 50% of under growth vegetation were classified as open canopy.

Forest Data
The data were collected between January and May of 2014 and comprised four thousand 0.1-ha circular ground survey plots (Figure 2).Sample measurements included tree height recorded using digital hypsometer and tree diameter at the breast height (dbh) using a tape.In all plots, the position was determined with a GPS.Canopy closure status was determined by recording the type and proportion of shrub stratum vegetation present in each sample plot using a 1-m square quadrat split Forests 2019, 10, 168 into 4 equal quadrants.Plots with more than 50% dead vegetation on the forest floor were classified as closed canopy.Conversely, those with less than 50% of under growth vegetation were classified as open canopy.

Geospatial Data
We used data from the Landsat 8 OLI satellite.The original bands were transformed into physical reflectance tools.The spectral value was extracted using a 3×3 window to minimize the error [40].The Normalized Difference Vegetation Index (NDVI) and the Modified Soil Adjusted Vegetation Index 2 (MSAVI2) were generated (Equations ( 1) and ( 2)).
where: p N IR = near infrared, p red = red band.
where: p N IR = near infrared, p red = red band.The variables considered for the analysis were recorded in the same reference system (WGS84) and were converted to raster format with the same number of rows and columns by using ArcGis 10.3 © software (Environmental Systems Research Institute, Redlands, CA, USA.; https://www.esri.com/en-us/home).We used 13 variables for the analysis (Table 1).They were selected based on the relationships found in other studies [41][42][43].The variables were measured in 345 watershed, delimited within the study area (Figure 2).The size of the watershed varied from 100 to 2100 ha, with an average of 424.02 ha and a standard deviation of 274.07 ha.The geographic information system (GIS) was used to delineate watershed.A digital terrain model [44] was used for the generation of the watersheds [45].The details of the procedure for the delimitation of watersheds in GIS can be found in the study by Hamdy et al. [46].

Statistical Analyses
To identify the relationships among the set of variables (forest and geospatial), a correlation analysis was used in the SAS © software, version 9.1.3[47].Subsequently, a PCA was performed through the routine PRINCOMP with the SAS software.This analysis helped to reduce the amount of data; it was carried out using Equation 3.
where:  represents the component coefficient,  represents the weight of the component,  represents the measured value of the variable,  corresponds to the number of the component,  represents the number of the sample and  represents the total number of variables.The watersheds studied were classified by Ascending Group Analysis (AHC) by using the Ward criterion (SAS program, CLUSTER rutine).Ward's criterion was applied to the AHC by using the square Euclidean distance to explore the clustering patterns of the 345 watersheds.The group definition was determined based on the coefficient of determination (R 2 ), the pseudo T 2 , the Cubic Clustering Criterion (CCC) and the pseudo F statistic [48].
Group formation was evaluated based on the original variables by using Discriminant Analysis (DA) (Equation 4).Data was analyzed to determine whether the variables contribute to discrimination among the groups.Through this analysis, we were able to determine the association of each watershed with a group.The classification error of the generated DA was calculated with the restitution method as well as with the cross-validation method.Finally, the variance was analyzed to verify if there were significant differences between the classification and the independent variables as a whole (MANOVA, [49]).

𝑓 (𝐺𝑖) = 𝑘 + 𝑤 𝑃 (4)
where:  represents the number of groups,  = represent an inherent constant for each group,  number of parameters, and  weight of the coefficient assigned by the DA for a given parameter  .

Statistical Analyses
To identify the relationships among the set of variables (forest and geospatial), a correlation analysis was used in the SAS © software, version 9.1.3[47].Subsequently, a PCA was performed through the routine PRINCOMP with the SAS software.This analysis helped to reduce the amount of data; it was carried out using Equation (3).
where: z represents the component coefficient, a represents the weight of the component, x represents the measured value of the variable, i corresponds to the number of the component, j represents the number of the sample and m represents the total number of variables.The watersheds studied were classified by Ascending Group Analysis (AHC) by using the Ward criterion (SAS program, CLUSTER rutine).Ward's criterion was applied to the AHC by using the square Euclidean distance to explore the clustering patterns of the 345 watersheds.The group definition was determined based on the coefficient of determination (R 2 ), the pseudo T 2 , the Cubic Clustering Criterion (CCC) and the pseudo F statistic [48].
Group formation was evaluated based on the original variables by using Discriminant Analysis (DA) (Equation ( 4)).Data was analyzed to determine whether the variables contribute to discrimination among the groups.Through this analysis, we were able to determine the association of each watershed with a group.The classification error of the generated DA was calculated with the restitution method as well as with the cross-validation method.Finally, the variance was analyzed to verify if there were significant differences between the classification and the independent variables as a whole (MANOVA, [49]).
where: Gi represents the number of groups, k i = represent an inherent constant for each group, n number of parameters, and w ij weight of the coefficient assigned by the DA for a given parameter P ij .

Results
In the correlation analysis, we found significant positive correlations among the variables (Table 2).The evaluation shows different relationship conditions for the forest in the study area; this condition allowed us to perform a differential characterization as a measure to explain the forest heterogeneity.The PCA showed that the first four components explain 71% of the total variability provided by the 13 original variables.The values of the eigenvalue and the accumulated variability explain the proportion of the original variables (Figure 3).

Results
In the correlation analysis, we found significant positive correlations among the variables (Table 2).The evaluation shows different relationship conditions for the forest in the study area; this condition allowed us to perform a differential characterization as a measure to explain the forest heterogeneity.The PCA showed that the first four components explain 71% of the total variability provided by the 13 original variables.The values of the eigenvalue and the accumulated variability explain the proportion of the original variables (Figure 2).Table 3 shows the characteristics of the first four components and their eigenvectors.The variables that contribute the most to the PC1 (weak coefficients) are Average total volume per tree, Average quadratic diameter, Distance to roads, Slope and Mean annual temperature that represent the forest structural and biophysical variables.The moderate coefficients of PC2 correspond to Number of trees per hectare, Basal area per hectare and Total volume of trees per hectare, which are structural variables of the forest.PC3 has moderate-high coefficients for Average total volume per tree, Average quadratic diameter, NDVI and MSAVI2, which are part of the structural and spectral forest variables.Finally, PC4 has the highest coefficients for the spectral variables Spectral band 3, Table 3 shows the characteristics of the first four components and their eigenvectors.The variables that contribute the most to the PC1 (weak coefficients) are Average total volume per tree, Average quadratic diameter, Distance to roads, Slope and Mean annual temperature that represent the forest structural and biophysical variables.The moderate coefficients of PC2 correspond to Number of trees per hectare, Basal area per hectare and Total volume of trees per hectare, which are structural variables of the forest.PC3 has moderate-high coefficients for Average total volume per tree, Average quadratic diameter, NDVI and MSAVI2, which are part of the structural and spectral forest variables.Finally, PC4 has the highest coefficients for the spectral variables Spectral band 3, Spectral band 7 and NDVI.The grouping of the sites based on the variables is presented in Figure 4.The variables can be grouped into 3: Group 1: Mean annual temperature, Distance to water bodies, Average quadratic diameter, Average total volume per tree, Slope; Group 2: Spectral band 7, MSAVI2, Distance to water bodies, Spectral band 3; and Group 3: Total volume of trees per hectare, Basal area per hectare and Number of trees per hectare.The PCA results indicate that most of the variations in the forest structure can be explained by Average total volume per tree, Average quadratic diameter, Distance to roads, Slope and Mean annual temperature.The CA determined three groups; 85.2% of the total variability is explained through the three determined groups.Other statistical criteria that reinforced this decision were the CCC, the pseudo T 2 and the pseudo F (Figure 5).The PCA results indicate that most of the variations in the forest structure can be explained by Average total volume per tree, Average quadratic diameter, Distance to roads, Slope and Mean annual temperature.The CA determined three groups; 85.2% of the total variability is explained through the three determined groups.Other statistical criteria that reinforced this decision were the CCC, the pseudo T 2 and the pseudo F (Figure 5).The groups were evaluated and were statistically different (p ≤ 0.0001).Group 1 comprises 133 watersheds, Group 2 comprises 140 watersheds and Group 3 comprises a total of 72 watersheds (Figure 7).   Figure 6 presents the grouping of the watersheds.A line was drawn around the value of 0.83 to illustrate the cut-off point that delimits the number of groups established according to the Euclidean distance.The line layout can vary according to the statistical and practical criteria defined, increasing or decreasing the number of groups.The groups were evaluated and were statistically different (p ≤ 0.0001).Group 1 comprises 133 watersheds, Group 2 comprises 140 watersheds and Group 3 comprises a total of 72 watersheds (Figure 7).The groups were evaluated and were statistically different (p ≤ 0.0001).Group 1 comprises 133 watersheds, Group 2 comprises 140 watersheds and Group 3 comprises a total of 72 watersheds (Figure 7).Group 1 (Bilbao group, n = 133).Group 1 had the lowest average of the Average total volume per tree, a moderate value for the number of trees per hectare, the smallest value for the basal area per hectare and for the total volume of trees per hectare.Regarding the Average quadratic diameter, this group has an intermediate average value of 21.85 cm.Bands 3 and 7 in this group showed intermediate values between the other groups (0.036 and 0.08).The vegetation indexes showed higher values than Group 1 and 2 for NDVI with 0.55 and 0.18 for MSAVI2, indicating good coverage with respect to the photosynthetic activity of the vegetation.There are great distances to water bodies.The distance to the roads is shorter than distances of Group 2 and 3. Watersheds belonging to group 1 present the lowest slope averages of the three groups.The mean annual temperature is also the lowest of the groups.Group 1 (Bilbao group, n = 133).Group 1 had the lowest average of the Average total volume per tree, a moderate value for the number of trees per hectare, the smallest value for the basal area per hectare and for the total volume of trees per hectare.Regarding the Average quadratic diameter, this group has an intermediate average value of 21.85 cm.Bands 3 and 7 in this group showed intermediate values between the other groups (0.036 and 0.08).The vegetation indexes showed higher values than Group 1 and 2 for NDVI with 0.55 and 0.18 for MSAVI2, indicating good coverage with respect to the photosynthetic activity of the vegetation.There are great distances to water bodies.The distance to the roads is shorter than distances of Group 2 and 3. Watersheds belonging to group 1 present the lowest slope averages of the three groups.The mean annual temperature is also the lowest of the groups.an abundant vegetation cover.In these watersheds, there are trees with large diameters.The NDVI showed the highest value for this group, which can be attributed to the high photosynthetic activity.This group of watersheds has a considerable Distance to water bodies, indicating that they are remote areas without alteration, supported by the Distance to the roads.The slope shows the highest values, which may be an indicator of remote areas with rugged topography.The Mean annual temperature was the highest of the groups.
The classification of the watersheds in the groups was evaluated through the methods of restitution and cross-validation of the DA.A total of 13 watersheds were poorly classified, which resulted in a total error of 0.03.With the method of cross-validation, 18 watersheds were reported as poorly classified with a total error of 0.04.The group watershed map generated in the CA was modified based on the DA (Figure 8).Using the results of the cross-validation, the misclassified observations were changed to those suggested.
Forests 2019, 10, x FOR PEER REVIEW 10 of 16 have an abundant vegetation cover.In these watersheds, there are trees with large diameters.The NDVI showed the highest value for this group, which can be attributed to the high photosynthetic activity.This group of watersheds has a considerable Distance to water bodies, indicating that they are remote areas without alteration, supported by the Distance to the roads.The slope shows the highest values, which may be an indicator of remote areas with rugged topography.The Mean annual temperature was the highest of the groups.
The classification of the watersheds in the groups was evaluated through the methods of restitution and cross-validation of the DA.A total of 13 watersheds were poorly classified, which resulted in a total error of 0.03.With the method of cross-validation, 18 watersheds were reported as poorly classified with a total error of 0.04.The group watershed map generated in the CA was modified based on the DA (Figure 8).Using the results of the cross-validation, the misclassified observations were changed to those suggested.Finally, the multivariate analysis of variance (MANOVA) showed that there are significant differences among watershed groups with respect to the original variables (Wilk's Lambda = 0.1, Pr> F <0.001), (Table 3).Finally, the multivariate analysis of variance (MANOVA) showed that there are significant differences among watershed groups with respect to the original variables (Wilk's Lambda = 0.1, Pr > F < 0.001), (Table 4).

Discussion
The multivariate techniques and spatial information maps were useful to interpret the relationships among the multiple variables.The PCA has been previously used to examine and interpret the spatial behavior of the forest [50,51].The relationship between variables from field sampling, thematic maps, and spectral maps provided significant information on the condition of the forest.In this study, four components were needed to explain the original data set.The four components showed the behavior of the variables in the watersheds.
According to Yidina et al. [52], in the analysis of dimensionality reduction of variables, PC1 usually represents the most important mixture of processes.The PCA was useful in the interpretation of qualitative variables.The dominant variables in the first main component were Average total volume per tree, Average quadratic diameter, corresponding to forest variables.Meanwhile, Distance to roads, Slope and Mean annual temperature, presented a strong relationship in the variation of the forest structure in the 'Ejido Chinatu' community.The above agrees with the study by Castillo-Rodríguez et al. [53].The results of PC1 are consistent with the distribution of vegetation through the altitudinal gradient, where temperature and topography play an important role in the presence of certain species of panaceas [54,55].This is also consistent with the mentioned by Helman et al. [56], where they found that mean annual temperatures influenced forest productivity in the Mediterranean Mount Carmel forests in Israel.The values of the coefficients within the PC1 present similarities that imply the variables have a similar influence on the variation of the forest structure.PC2 consisted of the variables of Number of trees per hectare, Basal area per hectare and Total volume of trees per hectare, which had weak and moderate coefficients.PC2 can be interpreted as a component that describes the structure of the forest in relation to the Number of trees, the Basal area and the volume measured per hectare.PC3 was represented by the Average total volume per tree, Average quadratic diameter and the spectral indices NDVI and MSAVI2.This is consistent with the mentioned by Kumar et al. [57], who related structural variables of the forest such as diameter and height with the NDVI, obtaining strong relationships of R 2 = 0.90.This component, similar to PC1, presents information on the forest structure based on the MSAVI2 spectral indices.As shown in the variable displacement plot (Figure 4), as the Average quadratic diameter and the Average total volume per tree decrease, the MSAVI2 increases.The MSAVI2 spectral index helps us to understand the reflectance emitted by bare soil.In this case, lower Average quadratic diameter and higher MSAVI2 indicate that a large number of trees are small.
In agreement with the spatial distribution of the groups shown in Figures 7 and 8, the grouping by CA presented a homogeneous geographical behavior.Group 1 represents the watersheds that correspond to the central corridor, where forest conditions for exploitation could be more accessible.The proximity to the roads is an indicator of anthropic presence [58], and the moderate slopes for this group of watersheds make wood extraction more feasible.Espinosa et al. [59] found that topography is related to species richness in dry forests in southern Ecuador.This is consistent with the distribution of forest variables across the 'Ejido Chinatu', where the increase in slopes ranges from flat to very steep slopes.The moderate spectral values of bands 3 and 7 indicate there is moderate absorption by vegetation and areas that are in a state of recovery.This group shows geographical continuity without being very isolated.Group 2 corresponds to watersheds that have average values with respect to almost all the variables, considering it as a group of watersheds that can be under a moderate anthropic intervention (transition zone) [60].It is divided into several homogeneous groups and does not present cases of isolated watersheds.Finally, Group 3 contains watersheds that, due to their spatial distribution, are further away from anthropogenic activities and may not have forest management.The low reflectivity in bands 3 and 7, and the high photosynthetic activity represented by the NDVI values indicate good vegetation conditions [61].These watersheds are also divided into groups and present an isolated watershed.
The PCA detects the factors that control the spatial variation of the forest structure.Although the PCA is generally considered as a statistical exploratory technique [62], it is capable of being incorporated into the results that explain the distribution of a particular landscape [63].GIS-based maps have the ability to visualize the spatial relationships between environmental data and other attributes, as reported by Facchinelli et al. [64].The CA was useful to identify those watersheds that are similar in terms of their multiple environmental characteristics.Riitters and Coulston [65] used spatial statistics to identify the geographical concentrations of forests located near deforested or clear areas.Wang et al. [66] used the multivariate techniques PCA and CA for the delimitation of environmental units.Kupfer et al. [67] regionalized 2100 watersheds based on landscape metrics.Trakhtenbrot and Kadmon [68] used CA for the identification of sites that represent the diversity of species.Ramachandra et al. [69] analyzed information on landscape metrics and social variables using multivariate techniques in forests in Uttara Kannada District, India.
Based on the results of PCA, CA and DA it was possible to understand the multivariate relationship of the set of variables.At the same time, with the combination of geospatial data and multivariate techniques, it was possible to analyze the spatial variability from a point of view of reducing the dimensionality of the information.The combination of both was useful to examine patterns in common group of variables, allowing us to summarize the multiple relationships of variables in geographic regions to use in forest management analysis.
The identification and delimitation of geographic regions in forests is an active area of ongoing research.Technological and methodological advances allow analyzing data from free sources and field information.This study provides a perspective for the analysis of this information, helping us to enhance its interpretation with a more quantitative approach.Traditional grouping and classification methods are widely used, leading to the delimitation of coherent forest region classifications from a geospatial point of view.
From a methodological point of view, the study provides the identification of forest regions that are possibly more vulnerable to the effects of anthropogenic activities, such as changes in land use/ land cover, deforestation and degradation.In addition, the methodology serves to identify the forest geographical areas with potential for conservation.Such areas are of great importance for the development of conservation plans and protection of forest areas, promoted mainly by government agencies.
In particular, the increasingly availability of free and georeferenced data sources, such as those from remote sensing, spectral indicators, digital elevation models and information derived from them, represents a valuable resource for quantitative approaches to zoning.The presentation of geospatial information from forest areas allows users to examine the characteristics of each area, their variability and their level of productivity.However, the detailed presentation of these variations in a broader regional context, where transition zones could be detected, is complex but promising for future studies.

Conclusions
The use of GIS associated with geostatistical techniques represents a solid scientific tool for regionalization and grouping of landscape features.The features and regionalization of hydrological watersheds based on environmental attributes and vegetation structures are key to the planning and environmental management of the territories.The application of this methodology allows the rapid integration of several environmental and biological attributes, which can be grouped according to their characteristics.In this way, we can define the productive potentials through the regionalization analysis.Forest regionalization mapping can benefit a wide variety of management, conservation and protection activities.Thus, if a forest pattern has been identified in one region as favorable or problematic, it can be replicated in another region with similar conditions.
Multivariate statistical methods, forest structure, topographies and satellites data can be useful to group and regionalize forest areas.The multivariate geographic information system approach showed the spatial relationships between the different variables.Although, the results of the present study provided preliminary conclusions about the characteristics of the watershed groups, more studies such as multicriteria evaluation techniques, interpolation methods, among others, are needed to obtain a better understanding of the source of variation in the forest structure.

Figure 2 .
Figure 2. Sampling of structural variables of the forest (a), study unit: watershed (b).

Figure 2 .
Figure 2. Sampling of structural variables of the forest (a), study unit: watershed (b).

Figure 3 .
Figure 3. Plot indicating the eigenvalues and their contribution to the total variance (a) and explained variance (b).

Figure 3 .
Figure 3. Plot indicating the eigenvalues and their contribution to the total variance (a) and explained variance (b).
ATVT = Average total volume per tree, NTPH = Number of trees per hectare, BAPH = Basal area per hectare, TVTPH = Total volume of trees per hectare, AQD = Average quadratic diameter, SB3 = Spectral band 3, SB7 = Spectral band 7, NDVI = Normalized difference vegetation index, MSAVI2 = Modified soil-adjusted vegetation index 2, DR = Distance to roads, DWB = Distance to water bodies, MAT = Mean annual temperature.Bold letters indicate the dominant coefficients.Forests 2019, 10, x FOR PEER REVIEW 7 of 16Spectral band 7 and NDVI.The grouping of the sites based on the variables is presented in Figure4.The variables can be grouped into 3: Group 1: Mean annual temperature, Distance to water bodies, Average quadratic diameter, Average total volume per tree, Slope; Group 2: Spectral band 7, MSAVI2, Distance to water bodies, Spectral band 3; and Group 3: Total volume of trees per hectare, Basal area per hectare and Number of trees per hectare.

Figure 4 .
Figure 4. Loading plot of the first two principal components (PC) showing the contribution of each variable.

Figure 4 .
Figure 4. Loading plot of the first two principal components (PC) showing the contribution of each variable.

Figure 5 .
Figure 5. Statistical criteria for determining the number of groups to be established.CCC = Cubic Clustering Criterion.NC = Number of clusters.

Figure 6
Figure6presents the grouping of the watersheds.A line was drawn around the value of 0.83 to illustrate the cut-off point that delimits the number of groups established according to the Euclidean distance.The line layout can vary according to the statistical and practical criteria defined, increasing or decreasing the number of groups.

Figure 6 .
Figure 6.Group tree of watersheds using cluster analysis based on the coefficient of determination (R 2 ).

Figure 5 .
Figure 5. Statistical criteria for determining the number of groups to be established.CCC = Cubic Clustering Criterion.NC = Number of clusters.

Figure 6
Figure6presents the grouping of the watersheds.A line was drawn around the value of 0.83 to illustrate the cut-off point that delimits the number of groups established according to the Euclidean distance.The line layout can vary according to the statistical and practical criteria defined, increasing or decreasing the number of groups.

Forests 2019 , 16 Figure 5 .
Figure 5. Statistical criteria for determining the number of groups to be established.CCC = Cubic Clustering Criterion.NC = Number of clusters.

Figure 6 .
Figure 6.Group tree of watersheds using cluster analysis based on the coefficient of determination (R 2 ).

Figure 6 .
Figure 6.Group tree of watersheds using cluster analysis based on the coefficient of determination (R 2 ).

Group 2 (
Limeade group, n = 140).Watersheds of this group are characterized by having the lowest average value of the Average total volume per tree (0.28) and Average quadratic diameter (0.2157).The highest values are for the Number of trees per hectare, the Basal area per hectare and the Total volume of trees per hectare.Bands 3 and 7 had the highest values (0.039 and 0.081).The NDVI and the MSAVI2 had the smallest values.These watersheds have intermediate values towards the Distance to water bodies and the Distance to roads, as well as with the Slope and the Mean annual temperature.Due to the characteristic values of the Average total volume per tree and the Average quadratic diameter, a large number of trees with small trunks are distributed in these watersheds.This group can be characterized as a group with medium forest productive potential, possibly a young forest.Group 3 (Inch Worm group, n = 72).This group is characterized by having the highest average value of the Average total volume per tree (0.33) and Average quadratic diameter (0.23).Low values were observed for the Number of trees per hectare, the Basal area per hectare and the Total volume of trees per hectare.Band 3 and band 7 showed the lowest reflectance values, indicating that they

Group 2 (
Limeade group, n = 140).Watersheds of this group are characterized by having the lowest average value of the Average total volume per tree (0.28) and Average quadratic diameter (0.2157).The highest values are for the Number of trees per hectare, the Basal area per hectare and the Total volume of trees per hectare.Bands 3 and 7 had the highest values (0.039 and 0.081).The NDVI and the MSAVI2 had the smallest values.These watersheds have intermediate values towards the Distance to water bodies and the Distance to roads, as well as with the Slope and the Mean annual temperature.Due to the characteristic values of the Average total volume per tree and the Average quadratic diameter, a large number of trees with small trunks are distributed in these watersheds.This group can be characterized as a group with medium forest productive potential, possibly a young forest.Group 3 (Inch Worm group, n = 72).This group is characterized by having the highest average value of the Average total volume per tree (0.33) and Average quadratic diameter (0.23).Low values were observed for the Number of trees per hectare, the Basal area per hectare and the Total volume of trees per hectare.Band 3 and band 7 showed the lowest reflectance values, indicating that they have Forests 2019, 10, 168 10 of 16

Table 1 .
Variables used in the data analysis.

Table 2 .
Matrix of different variable correlations considered in the analysis.

Table 2 .
Matrix of different variable correlations considered in the analysis.

Table 3 .
Eigenvectors of the correlation matrix.

Table 3 .
Eigenvectors of the correlation matrix.

Table 3 .
MANOVA test criteria and approximations of F.
DF = degree of freedom, Pr > F= p-value of F statics

Table 4 .
MANOVA test criteria and approximations of F.