Bioclimatic Classification of Northeast Asia Reflecting Social Factors : Development and Characterization

Biodiversity is rapidly declining globally and efforts are needed to mitigate this continually increasing loss of species. Clustering of areas with similar habitats can be used to prioritize protected areas and distribute resources for the conservation of species, selection of representative sample areas for research, and evaluation of impacts due to environmental changes. In this study, Northeast Asia (NEA) was classified into 27 bioclimatic zones using statistical techniques, and it was re-clustered into 14 groups to identify the environmental characteristics of these zones. In particular, we added land cover variables into the clustering to reflect not only simple climate but also social factors influencing biological habitats. In total, 53 bioclimatic variables were constructed, and principal components were generated using correlation analysis and principal component analysis (PCA). The iterative self-organizing data analysis technique algorithm (ISODATA) was used to cluster the principal components and land cover variable. The constructed bioclimatic zones were assigned codes and descriptive names based on aridity, seasonality, and naturality, and their environmental characteristics were identified. This study is significant in that it improved the understanding of biological habitats of NEA and established a basis for monitoring the distribution of species and for efficient and systematic management of biodiversity.


Introduction
Biodiversity around the world is rapidly decreasing and will continue to deteriorate if no action is taken [1].High biodiversity is the most important basis of ecosystem resilience and leads to stable water and food supplies for human society [2,3].Moreover, there is a lot of evidence that biodiversity strongly influences ecosystem services that are essential benefits to humanity [4,5].The major causes of biodiversity decline are habitat loss and degradation, pollution, climate changes, over-exploitation, and invasive species [4,6].To cope with these issues, there is increasing effort worldwide to conserve biodiversity and manage natural resources efficiently [7,8].
For efficient and systematic management of biodiversity, it is necessary to understand the current environment of such.Identification of environmental characteristics provides useful information for conservation of species because similar environments are likely to have similar flora and fauna [9].Therefore, identifying the areas with similar environmental characteristics based on accurate environmental information and spatially expressing them can provide a basis for biodiversity conservation and efficient management [10].For example, it can be used to prioritize allocation of resources for biodiversity conservation, to set the spatial range of protected areas, and to set standards for selecting sample areas for research [11][12][13][14].
Many countries such as the UK, Australia, Canada, and some in Europe have been conducting classification research based on ecological characteristics and actively applying the classifications into Sustainability 2017, 9, 1137; doi:10.3390/su9071137www.mdpi.com/journal/sustainabilitynatural conservation strategies and policies [15][16][17].However, in most cases, only natural factors such as climate, topography, vegetation, etc. were taken into consideration; thus, social factors, which have significant impact on habitats, have not been included.Even though appropriate natural environmental conditions exist, there are areas where organisms cannot live because of anthropogenic factors, such as land use and population density.Several studies have suggested that the human transformation of land cover is a key driver in the loss of biodiversity and ecosystem services [18,19].Therefore, it is necessary to reflect social factors in the classification process.
In addition, the problem of biodiversity loss should be addressed on regional, national, and international scales for effective implementation [20,21].To do this, cooperation is underway to integrate diverse environmental data across these scales and to manage biodiversity efficiently worldwide, including in Europe and North America [16,17].The mid-latitude region (MLR), which is facing drastic environmental change, needs to implement biodiversity management through such cooperation to ensure resilience of ecosystems.In particular, Northeast Asia (NEA) including North Korea, where rapid biodiversity reduction has been reported because of continuous deforestation and expansion of desertification by accelerating development, needs environmental cooperation between countries and active biodiversity management [22].Therefore, developing bioclimatic classification in NEA can provide a framework for such environmental monitoring and management through international collaboration.
However, there are some limitations to acquiring various environmental data and to understanding local situations, particularly in the case of North Korea.For this reason, it is more appropriate to use data provided at the global level and quantitative analytical approaches, which are more objective and consistent [23].Among the several quantitative analytical approaches, Global Environmental Stratification (GEnS) produced by Metzger et al. [24] was developed to support global biodiversity monitoring and ecosystem research based on statistical clustering.GEnS has classified the world's land into 125 strata with 30arc-seconds resolution using current climate data, and the accuracy of this has been verified and used in various ways [25].
Based on this research methodology, the aims of this study were to establish bioclimatic zones in the NEA, reflecting not only climate but also anthropogenic factors, which are the main causes of habitat destruction.In addition, by identifying the environmental characteristics of each bioclimatic zone, we wanted to enhance understanding of the biological climate of NEA to provide a basis for monitoring and managing biodiversity and for MLR collaborative research.

Study Area (Northeast Asia)
In this study, we set up the boundary of NEA with the Korean Peninsula, which is located between the continents and the oceans, as the center.More specifically, the Korean Demilitarized Zone (DMZ), which was established at the end of the Korean War (1953) to control military hostilities between North Korea and South Korea [26] was used and the range of NEA was set to 10 degrees latitude and longitude from the DMZ.This includes the southern part of China, parts of Mongolia, North Korea, and the southern part of Japan.The study area is shown in Figure 1.The latitudes of the plot range from 28 • 58 57 to 47 • 17 15 and the longitudes of the plot range from 121 • 0 56 to 137 • 17 12 .
The eastern part of China is mainly a plain area, and the northwest part contains the Gobi and Taklamakan deserts, which are influenced by the continental climate, which includes monsoons.In Korea and Japan, most of each country is mountainous, with the Chinese continent on the west and the Pacific Ocean on the east, which are affected by the temperate monsoon climate.Because of these climatic and geographical features, the study area includes a variety of land cover types, such as desert, grassland, farmland, and temperate forest [27].

Definition of Bioclimatic Classification
Previous studies have used mixed terminology to identify ecologically homogeneous regions in many countries.For example, in the UK, Hossell et al. [15] used the term "bioclimatic classification," Metzger [16] used "climatic stratification" for Europe, "Interim Biogeographic Regionalization for Australia (IBRA)" was used by the Australian Ministry of Environment, and the US Environmental Protection Agency used "ecoregions".Although several studies have used similar concepts under various terms, the terms and concepts were not clearly defined.Therefore, in this study, we define the term "bioclimatic zone" as the area where environmental factors affecting the habitat and species distribution, such as climate, geology, and land cover, are homogenous.

Bioclimatic Variables and Data
Species are affected by climatic factors, and bioclimatic variables capture various climatic factors that are likely to have relevance to ecological or physiological constraints that determine the distribution of species [28].Bioclimatic variables, which are derived from monthly temperature and precipitation, have biological significance because they contain annual climate information (e.g., annual mean temperature and annual precipitation), seasonality (e.g., annual range in temperature and precipitation), and extreme environmental conditions (e.g., temperature of the coldest and warmest months and precipitation of the wet and dry quarters) [29].The 19 bioclimatic variables provided by WorldClim (climate dataset) at the global scale and high resolution are often used in species distribution models.
Metzger et al. [24] identified relevant bioclimatic variables and provided 42 that can be calculated using the WorldClim global dataset [29] and CGIAR-CSI data [30].In the current study, based on these 42 variables, an additional 11 bioclimatic variables that affect species distributions were constructed based on literature review using the same dataset (Table S1).These bioclimatic variables were established by using the WorldClim global data set v1.4 for 1960-1990 [29], CGIAR-CSI Global Aridity and PET database [30], and CGIAR-CSI SRTM Digital Elevation Model Database v. 4.1 [31].All variables were constructed with the greatest spatial resolution of 30 arcsec (equivalent to 0.86 km 2 at the equator) provided by WorldClim, using ArcGIS 10.2 (ESRI) and R version 3.3.1.
In addition, land cover data were obtained from the Global Land Cover Facility (GLCF) website, (http://www.glcf.umd.edu).This website provides AVHRR Global Land Cover Classification of 1 km resolution, and this land cover map was constructed using the NASA/NOAA Pathfinder Land (PAL) dataset of 14 years (1981-1994).This land cover map is composed of 14 classes, and in the current study, it was reclassified in various ways using ArcGIS and R. Roads and railroads data were taken from Digital Chart of the World (DCW) database and combined as one vector (line) data.The DCW is an Environmental Systems Research Institute, Inc. product originally developed for the US Defense Mapping Agency (DMA).The data contained in the DCW were used to generate a 1:1,000,000-scale vector database covering the entire surface of the earth [32].Population density was obtained from the LandScan 2000 Global Population 30-arc-second Database (LandScan 2000) which was developed by Oak Ridge National Laboratory for the United States Department of Defense [33].The population

Definition of Bioclimatic Classification
Previous studies have used mixed terminology to identify ecologically homogeneous regions in many countries.For example, in the UK, Hossell et al. [15] used the term "bioclimatic classification", Metzger [16] used "climatic stratification" for Europe, "Interim Biogeographic Regionalization for Australia (IBRA)" was used by the Australian Ministry of Environment, and the US Environmental Protection Agency used "ecoregions".Although several studies have used similar concepts under various terms, the terms and concepts were not clearly defined.Therefore, in this study, we define the term "bioclimatic zone" as the area where environmental factors affecting the habitat and species distribution, such as climate, geology, and land cover, are homogenous.

Bioclimatic Variables and Data
Species are affected by climatic factors, and bioclimatic variables capture various climatic factors that are likely to have relevance to ecological or physiological constraints that determine the distribution of species [28].Bioclimatic variables, which are derived from monthly temperature and precipitation, have biological significance because they contain annual climate information (e.g., annual mean temperature and annual precipitation), seasonality (e.g., annual range in temperature and precipitation), and extreme environmental conditions (e.g., temperature of the coldest and warmest months and precipitation of the wet and dry quarters) [29].The 19 bioclimatic variables provided by WorldClim (climate dataset) at the global scale and high resolution are often used in species distribution models.
Metzger et al. [24] identified relevant bioclimatic variables and provided 42 that can be calculated using the WorldClim global dataset [29] and CGIAR-CSI data [30].In the current study, based on these 42 variables, an additional 11 bioclimatic variables that affect species distributions were constructed based on literature review using the same dataset (Table S1).These bioclimatic variables were established by using the WorldClim global data set v1.4 for 1960-1990 [29], CGIAR-CSI Global Aridity and PET database [30], and CGIAR-CSI SRTM Digital Elevation Model Database v. 4.1 [31].All variables were constructed with the greatest spatial resolution of 30 arcsec (equivalent to 0.86 km 2 at the equator) provided by WorldClim, using ArcGIS 10.2 (ESRI) and R version 3.3.1.
In addition, land cover data were obtained from the Global Land Cover Facility (GLCF) website, (http://www.glcf.umd.edu).This website provides AVHRR Global Land Cover Classification of 1 km resolution, and this land cover map was constructed using the NASA/NOAA Pathfinder Land (PAL) dataset of 14 years (1981-1994).This land cover map is composed of 14 classes, and in the current study, it was reclassified in various ways using ArcGIS and R. Roads and railroads data were taken from Digital Chart of the World (DCW) database and combined as one vector (line) data.The DCW is an Environmental Systems Research Institute, Inc. product originally developed for the US Defense Mapping Agency (DMA).The data contained in the DCW were used to generate a 1:1,000,000-scale vector database covering the entire surface of the earth [32].Population density was obtained from the LandScan 2000 Global Population 30-arc-second Database (LandScan 2000) which was developed by Oak Ridge National Laboratory for the United States Department of Defense [33].The population distribution provided by LandScan 2000 is determined by calculating a probability coefficient for each 30-arc-second cell and then applying the coefficient to the best available census counts for each country [34].

Construction Method for Bioclimatic Zones
The classification of bioclimatic zones in NEA proceeded in three major stages (Figure 2).Firstly, principal components, which consisted of the best-suited bioclimatic variables for classification at the NEA level, were created.Secondly, social factors affecting the habitat of species were examined and applied into the bioclimatic classification.Finally, clustering was carried out using the extracted principal components and social factor (land cover) variables to establish the bioclimatic classification.To understand the environmental characteristics of each zone, several environmental data, such as temperature, precipitation, and land cover, were analyzed, and based on this analysis, codes and descriptive names of each zone were assigned.All calculations and analyses were performed using ArcGIS 10.2 (ESRI, Redlands, CA, USA) and R version 3.1.3(R Foundation for Statistical Computing, Vienna, Austria).
Sustainability 2017, 9, 1137 4 of 18 distribution provided by LandScan 2000 is determined by calculating a probability coefficient for each 30-arc-second cell and then applying the coefficient to the best available census counts for each country [34].

Construction Method for Bioclimatic Zones
The classification of bioclimatic zones in NEA proceeded in three major stages (Figure 2).Firstly, principal components, which consisted of the best-suited bioclimatic variables for classification at the NEA level, were created.Secondly, social factors affecting the habitat of species were examined and applied into the bioclimatic classification.Finally, clustering was carried out using the extracted principal components and social factor (land cover) variables to establish the bioclimatic classification.To understand the environmental characteristics of each zone, several environmental data, such as temperature, precipitation, and land cover, were analyzed, and based on this analysis, codes and descriptive names of each zone were assigned.All calculations and analyses were performed using ArcGIS 10.

Selecting Bioclimatic Variables
Correlation analysis (CA) and principal component analysis (PCA) were performed using the statistical program R version 3.1.3to derive the main bioclimatic variables.Firstly, all variables were tested for multicollinearity by examining Pearson correlation coefficients.Only one variable from a set of highly correlated variables (r > 0.95) was included in the first PCA.PCA is the process of creating fewer variables that are not correlated as "principal components" with the aim of explaining the maximum variance using the smallest number of principal components [35].In other words, it is

Selecting Bioclimatic Variables
Correlation analysis (CA) and principal component analysis (PCA) were performed using the statistical program R version 3.1.3to derive the main bioclimatic variables.Firstly, all variables were tested for multicollinearity by examining Pearson correlation coefficients.Only one variable from a set of highly correlated variables (r > 0.95) was included in the first PCA.PCA is the process of creating fewer variables that are not correlated as "principal components" with the aim of explaining the maximum variance using the smallest number of principal components [35].In other words, it is used to reduce the number of variables and to avoid multicollinearity problems when correlations between independent variables occur.
The first PCA was performed on the remaining list to identify those variables that represented the trend of the data.As a result of the first PCA, only variables with eigenvector loadings >0.1 in the first three principal components were selected for the second PCA.The second PCA was performed for the creation of independent principal components through combination of key variables.The principal components were selected based on the cumulative contribution ratio of 80% and were retained for further analysis.

Selecting Socio-Spatial Variables
The social factors, i.e. anthropogenic factors affecting the biological habitat, were examined for inclusion in the bioclimatic classification.Specifically, variables were selected according to the availability of high-resolution spatial data for NEA, the availability in the statistical model, and whether the variable reflects its impact on the biological habitat.Based on these criteria, land cover data were considered suitable for use.Because land cover is nominal data, numerical values were given for each class to be used in the statistical methods.The distance between each class was regarded as equal and the same interval was applied.A method of introducing 14 classes of raw data as it is and a method of using fewer classes by reclassifying have been tried.The reclassification method was sought as a way to reflect the naturality of land cover that can function as organism habitat and adopted a clustering result with high feasibility and usability depending on the extent of fragmentation.

Clustering
Clustering analysis was performed with the principal components that resulted from the second PCA of bioclimatic variables and reclassified land cover.A key principle of clustering analysis is transforming the characteristics of the analysis target into similarity distances and ordering the clusters based on distance [36].In this study, the iterative self-organizing data analysis technique algorithm (ISODATA) that Metzger et al. [16,24] used in making GEnS was used to cluster the principal components of the bioclimatic variables and land cover into bioclimatic zones.
Firstly, image fusion was carried out by arranging the first principal component into the Red band, the second principal component with cumulative contribution >80% into the Green Band, and reclassified land cover into the Blue Band.The fused images were clustered using the ISO Cluster tool of ArcMap 10.2.2.The number of zones was subjectively determined based on literature review and prior experience, so that they were appropriate for the objectives and application, i.e., monitoring and managing biodiversity.

Nomination and Characterization
To understand the characteristics of the classified zones, we calculated the annual mean temperature and precipitation and the altitude of each zone.The dendrogram tool in ArcGIS was used to calculate the Euclidean distance between the zones and then to aggregate zones into larger groups.This regrouping was conducted to efficiently describe and characterize each group.For each group, the environmental characteristics, such as aridity, water use efficiency, and land cover, were analyzed, and based on this analysis, codes and descriptive names were assigned.After that, the characteristics were briefly described for each group.

Selecting Variables of NEA
The results of the correlation coefficient analysis showed that 25 out of 53 variables had Pearson coefficients greater than 0.95 with at least one other variable (Figure S1).Among these 25 highly correlated variables, only eight were readily interpretable and thus included in the PCA.Because of the first PCA with 36 independent variables, the first three components explained all the variation.Moreover, these three principal components were almost determined by the precipitation effectiveness index (PEI), aridity index (AI), and potential evapotranspiration seasonality (PET seasonality).
To generate the principal components used for the cluster analysis, the second PCA was performed, normalizing these three main variables (Table 1).The first two principal components explaining 86% of the total variance were selected and retained for further analysis.Principal component 1 (PC1) was highly correlated with the AI, which shows moisture availability for potential growth of vegetation [34] and PET seasonality, which is a measure of seasonality of moisture available for vegetation [37], and principal component 2 (PC2) was almost determined by PEI, which is important for vegetation growth [38].
To reflect the social factors, we considered land cover, land use, population density, roads and railroads, and economic index as social variables.Among these variables, only land cover, population density, and roads and railroads data were available with high resolution at the NEA level.The clustering results using these three variables showed fragmentation of the population density and roads and railroads data instead of natural clustering (Figure 3b).If a zone is too fragmented, it is difficult to identify the characteristics of that zone and utilize it.For these reasons, land cover was selected as the most appropriate social variable.
the first PCA with 36 independent variables, the first three components explained all the variation.Moreover, these three principal components were almost determined by the precipitation effectiveness index (PEI), aridity index (AI), and potential evapotranspiration seasonality (PET seasonality).
To generate the principal components used for the cluster analysis, the second PCA was performed, normalizing these three main variables (Table 1).The first two principal components explaining 86% of the total variance were selected and retained for further analysis.Principal component 1 (PC1) was highly correlated with the AI, which shows moisture availability for potential growth of vegetation [34] and PET seasonality, which is a measure of seasonality of moisture available for vegetation [37], and principal component 2 (PC2) was almost determined by PEI, which is important for vegetation growth [38].
To reflect the social factors, we considered land cover, land use, population density, roads and railroads, and economic index as social variables.Among these variables, only land cover, population density, and roads and railroads data were available with high resolution at the NEA level.The clustering results using these three variables showed fragmentation of the population density and roads and railroads data instead of natural clustering (Figure 3b).If a zone is too fragmented, it is difficult to identify the characteristics of that zone and utilize it.For these reasons, land cover was selected as the most appropriate social variable.When using the raw data classes as the land cover variables, the bioclimatic zones tended to be too subdivided.For example, in zones that contain different forest types, the difference between each forest type was recognized as the difference between urban and agricultural areas.Therefore, from the viewpoint of biological habitat, the land cover was reclassified into three categories.The reclassification criterion was the ecosystem function of each land cover to support biodiversity.Urbanization and agriculture are two of the major causes of ecosystem destruction, threatening biodiversity through direct habitat conversion [39][40][41] and indirect impacts such as pollutant emissions and disturbance of water and nutrient cycles [42,43].In particular, because cropland and grassland are structurally simplified compared to forest, and susceptible to humans, there are fewer species that can inhabit than forest or water [44,45].In addition, due to the climatic and environmental characteristics of NEA, most grassland is distributed near cropland, rather than the natural grassland.The Inner Mongolia grassland in the northwest of NEA is a natural area, but its biodiversity is not high due to its dry climate [46].In this respect, forest and water are classified as natural area; cropland, grassland and bare land are classified as semi-natural area; and built-up area is classified as artificial area.
Cluster analysis was performed by fusion with PC1 and PC2, which were produced via bioclimatic variables, and the reclassified land cover (Figure 4).When using the raw data classes as the land cover variables, the bioclimatic zones tended to be too subdivided.For example, in zones that contain different forest types, the difference between each forest type was recognized as the difference between urban and agricultural areas.Therefore, from the viewpoint of biological habitat, the land cover was reclassified into three categories.The reclassification criterion was the ecosystem function of each land cover to support biodiversity.Urbanization and agriculture are two of the major causes of ecosystem destruction, threatening biodiversity through direct habitat conversion [39][40][41] and indirect impacts such as pollutant emissions and disturbance of water and nutrient cycles [42,43].In particular, because cropland and grassland are structurally simplified compared to forest, and susceptible to humans, there are fewer species that can inhabit than forest or water [44,45].In addition, due to the climatic and environmental characteristics of NEA, most grassland is distributed near cropland, rather than the natural grassland.The Inner Mongolia grassland in the northwest of NEA is a natural area, but its biodiversity is not high due to its dry climate [46].In this respect, forest and water are classified as natural area; cropland, grassland and bare land are classified as semi-natural area; and built-up area is classified as artificial area.
Cluster analysis was performed by fusion with PC1 and PC2, which were produced via bioclimatic variables, and the reclassified land cover (Figure 4).

Bioclimatic Classification of NEA
From the clustering, NEA could be divided into 27 bioclimatic zones, which were numbered in ascending order of the average values of the first principal component generated for clustering.As the numbers increased, the zone's position tended to shift from southeast to northwest (Figure 5).It was considered that PC1, which is mainly composed of the AI and PET seasonality, reflects the characteristics of becoming dry and continental according to climate and geographical factors in NEA, which is located along the sea.In other words, it represents the transition from oceanic climate to continental climate.

Bioclimatic Classification of NEA
From the clustering, NEA could be divided into 27 bioclimatic zones, which were numbered in ascending order of the average values of the first principal component generated for clustering.As the numbers increased, the zone's position tended to shift from southeast to northwest (Figure 5).It was considered that PC1, which is mainly composed of the AI and PET seasonality, reflects the characteristics of becoming dry and continental according to climate and geographical factors in NEA, which is located along the sea.In other words, it represents the transition from oceanic climate to continental climate.

Bioclimatic Classification of NEA
From the clustering, NEA could be divided into 27 bioclimatic zones, which were numbered in ascending order of the average values of the first principal component generated for clustering.As the numbers increased, the zone's position tended to shift from southeast to northwest (Figure 5).It was considered that PC1, which is mainly composed of the AI and PET seasonality, reflects the characteristics of becoming dry and continental according to climate and geographical factors in NEA, which is located along the sea.In other words, it represents the transition from oceanic climate to continental climate.Mean annual temperatures ranged from 15.8 °C for zone 1, located along the south coast at an average altitude of 44.6 m, to 0.83 °C for zone 14, which is at an average altitude of 860.5 m.The mean Mean annual temperatures ranged from 15.8 • C for zone 1, located along the south coast at an average altitude of 44.6 m, to 0.83 • C for zone 14, which is at an average altitude of 860.5 m.The mean annual temperature decreased overall from south to north, but decreased as the altitude increased.Annual precipitation ranged from 401 mm to 2137 mm and decreased from the ocean to the inland side.Regarding altitude, the deviation between the mountainous areas and the flat areas was severe, but the overall trend was an increase from the ocean to inland areas.
Meanwhile, natural areas (i.e., forest and water) and artificial areas (i.e., cities) were distinguished by reflecting land cover.This is considered meaningful in terms of biological habitat, because the zones where the climate is similar, but land cover is different, were divided.In addition, the boundaries of the zones were distinguished along the river and mountain ranges.However, as the regions tend to be fragmented, additional corrections are needed.
Several regions were geographically dispersed, but assigned into one zone.For example, zone 12 includes the Kuma plateau in the northern part of the Korean Peninsula, the Sikhote-Alin Mountains in Russia, and the mountains of Hotakake and Ontake in Japan, as these areas are biologically similar.In this way, we could understand the distribution of regions with similar bioclimates in NEA.
Of the 27 areas, there were areas that occupied very small areas or severely fragmented areas.Because the fragmented and small-size zones are difficult to characterize and describe, the 27 zones were aggregated into 14 larger groups based on a dendrogram (Figure 6).The environmental characteristics of these 14 groups were identified and the descriptive names of the groups were assigned according to properties, and the naming criterion was applied to the relative standard in NEA through histogram distribution (Table 2).
Sustainability 2017, 9, 1137 8 of 18 annual temperature decreased overall from south to north, but decreased as the altitude increased.Annual precipitation ranged from 401 mm to 2137 mm and decreased from the ocean to the inland side.Regarding altitude, the deviation between the mountainous areas and the flat areas was severe, but the overall trend was an increase from the ocean to inland areas.Meanwhile, natural areas (i.e., forest and water) and artificial areas (i.e., cities) were distinguished by reflecting land cover.This is considered meaningful in terms of biological habitat, because the zones where the climate is similar, but land cover is different, were divided.In addition, the boundaries of the zones were distinguished along the river and mountain ranges.However, as the regions tend to be fragmented, additional corrections are needed.
Several regions were geographically dispersed, but assigned into one zone.For example, zone 12 includes the Kuma plateau in the northern part of the Korean Peninsula, the Sikhote-Alin Mountains in Russia, and the mountains of Hotakake and Ontake in Japan, as these areas are biologically similar.In this way, we could understand the distribution of regions with similar bioclimates in NEA.
Of the 27 areas, there were areas that occupied very small areas or severely fragmented areas.Because the fragmented and small-size zones are difficult to characterize and describe, the 27 zones were aggregated into 14 larger groups based on a dendrogram (Figure 6).The environmental characteristics of these 14 groups were identified and the descriptive names of the groups were assigned according to properties, and the naming criterion was applied to the relative standard in NEA through histogram distribution (Table 2).Firstly, Natural (N) and Artificial (A) were classified based on the natural land cover ratio (>80%) in the zone.After that, zones were classified into semi-arid (a), dry (d), moist (m), and wet (w) based on the AI, which is the main determinant variable of the principal component.Finally, using PET seasonality, which assorted Oceanic, Semi-continental and Continental, and PEI, which divided into Low, Mid, and High precipitation efficiency, seven codes from A to G were assigned based on continental and oceanic climate and precipitation efficiencies (Table 3).Then, each of the divided zones was named according to the degree of aridity and land cover status, which occupies the largest portion for each zone (Table 4).
The color legend was constructed using the annual mean temperature, naturality, and AI to define the RGB scheme.Specifically, annual mean temperature was used to define the amount of red, the AI the blue, and naturality the green colorations (Figure 7).

Environmental Characteristics of Bioclimatic Zones (Response of Land Cover)
To characterize the environmental characteristics of the classified zones, the distribution of basic climate information such as annual mean temperature, precipitation, altitude, and land cover (Table 5) were studied.Furthermore, the AI, PEI, and PET seasonality, which are the main variables that determined the principal components, were analyzed for each zone (Figure 8).In addition, we extracted the land cover information for each zone and confirmed the naturality of the zone, which is the proportion of natural areas such as forest and water within the zone.The environmental characteristics of each zone are presented in Figure S2.

Environmental Characteristics of Bioclimatic Zones (Response of Land Cover)
To characterize the environmental characteristics of the classified zones, the distribution of basic climate information such as annual mean temperature, precipitation, altitude, and land cover (Table 5) were studied.Furthermore, the AI, PEI, and PET seasonality, which are the main variables that determined the principal components, were analyzed for each zone (Figure 8).In addition, we extracted the land cover information for each zone and confirmed the naturality of the zone, which is the proportion of natural areas such as forest and water within the zone.The environmental characteristics of each zone are presented in Figure S2.

Assessment of Bioclimatic Classification by Qualitative Interpretation
To consider as many bioclimatic variables as possible, in this study, 53 bioclimatic variables were constructed and used by integrating the variables used in previous studies.As a result, we extracted the main bioclimatic variables suitable for NEA that differed from the main variables at the global

Assessment of Bioclimatic Classification by Qualitative Interpretation
To consider as many bioclimatic variables as possible, in this study, 53 bioclimatic variables were constructed and used by integrating the variables used in previous studies.As a result, we extracted the main bioclimatic variables suitable for NEA that differed from the main variables at the global level.It was considered that the bioclimatic classification of NEA, which was constructed by adding these variables, better classified the regional biological climate than the bioclimate map of the world did.Specifically, we compared the results with those of Metzger et al. [24] (Figure 9).For convenience of comparison, the results of Metzger et al. [24] were expressed in color scheme similar to that used in this study.
From the aspect of regional climate pattern, it was judged that the result of this study distinguishes oceanic climate well.South Korea shows a clear climatic difference between the Baekdudaegan Mountains and the East Sea [47].These differences are reflected in the results of this study (Figure 9(B1)), whereas the study by Metzger et al. [24] does not distinguish these oceanic climates and classifies the northern part of the Korean peninsula as the same zone (Figure 9(B2)).As can be seen by comparing Figure 9(C1,C2), the two studies clearly differ.In this study, the coastal areas including Shanghai lie in the AmF zone, with higher temperature and precipitation than the surrounding area has, whereas, in Metzger et al. [24], the climate in the coastal region is similar to that in the surrounding area and shows a horizontal climate distribution rather than distinguishing the oceanic climate.Actually, according to the Chinese climate map provided by the Ministry of Construction of the People's Republic of China, the area near Shanghai and Ning Bo is affected by sea breeze and classified as being in a climate zone that has high temperature and high summer precipitation, which is different from that inland.These results are probably because the PEI selected as the main variable reflects the NEA monsoon climate, which has high precipitation seasonality.
What stands out most in this study are the effects of artificial factors on biological habitat along with natural factors (i.e., bioclimate) from application of the land cover to the statistical methods, which is unlike previous studies that used only climatic variables.In this study, land cover was added as one variable and the ratio of area occupied by natural areas (forest and water) was defined as naturality.Figure 9 displays an example in which naturality is expressed in detail.Firstly, the area that was classified into one zone in Figure 9(A2) is divided according to the land cover in Figure 9(A1).To be specific, in the case of Figure 9(A1), the river is classified as the NdA zone, which is divided into two detailed zones according to the climate.In addition, the distribution of the widely distributed AdB zone and the NdB zone are classified according to the distribution of forest, water, and urban areas.
In the case of Figure 9(B1), the forest is classified as the NwD and NmE zones, major cities in South Korea as the AmF zone, cropland around the cities as the AmC zone, and cropland around the mountains as the AwG zone.It can be judged that the environmental factors affecting the biological habitat are well distinguished in the adjacent areas according to the land cover condition.On the other hand, the distribution pattern of mountain ranges is similar in Figure 9(C2), but the details of these regional climates are not reflected.In the case of Figure 9(C1), the water and urban areas are expressed separately.Urban areas including Shanghai in China are assigned into the AmF zone, which includes urban areas on the Korean Peninsula, and the surrounding cropland is classified as AmC.In addition, the water areas are divided into the NmD zone, which were divided into two sub-zones along the river stem.The AwG zone and the NwG zone in the lower part are also classified according to forests and cropland.
In summary, from the results of classifying with the land cover as the third principal component as well as climate, it can be concluded that the environmental factors affecting biological habitat are well classified according to the climatic conditions in the same land cover, or according to the land cover in a similar climatic environment.In this way, the bioclimatic classification reflecting land cover can provide information that is more useful when applied practically, because it is appropriate for different types of land cover to be used in different policies.

Implications and Limitations
The results of this study can be used for monitoring and managing biodiversity if changes of the bioclimatic zones due to environmental variation, such as climate change or land cover change, are simulated.Through the simulation of regional changes in response to the environmental variation, it is possible to distinguish between rapidly changing zones and non-changing zones and to suggest different adaptation strategies for each zone [11][12][13][14].For example, in zones with rapid changes due to climate change, there is a possibility of desertification and a high probability that the survival of species living in these areas will be threatened.Therefore, special adaptation measures are needed to conserve species inhabiting those zones.On the other hand, species in the rapidly changing surrounding zones are likely to move to the zones that are changing less, thus conservation strategies will be needed, such as making the less changing zones core sites of conservation.To use these strategies, it is necessary to identify the vulnerable species of climate change and major species in each region based on the environmental characteristics identified in this study.
Classification using statistical methods is limited in that field verification is not possible [11,16].Thus, the results of previous studies have been verified by comparing them with existing classification maps or by CA with other environmental data [16].However, in the current study, it was difficult to acquire environmental spatial data of NEA including North Korea.Therefore, instead of using those verification methods, we tried to analyze the characteristics of each zone and determined that the bioclimatic zones reflect the climatic characteristics and land cover of NEA quite well.In the future, it will be necessary to verify whether the climate affecting the biological habitats is well distinguished through other methods, such as identifying the distribution of representative flora for each zone.
Although the environmental data (climate and land cover data) used in this study are high-resolution data currently available for NEA, they are not the latest data reflecting current climate and land cover.It is necessary to use more detailed data that can reflect social factors, which include not only the distinction between natural and artificial areas but also the land use intensity and the human intervention.In addition, to reflect the land cover in classification using statistical techniques, the nominal data were converted numerically and reclassified into three categories: natural, semi-natural, and artificial areas.This was an attempt to determine whether the areas were habitable.It will be necessary to apply improved methods such as assigning weights to land cover/use types or developing clustering methods that can include species distribution information.

Conclusions
To preserve biodiversity, it is necessary to identify areas where the environments are similar and to express them spatially.In this study, NEA was classified into 27 bioclimatic zones, which were clustered into 14 larger groups, and the environmental characteristics of each group were identified.
In total, 53 bioclimatic variables affecting the biological habitats were constructed and a reclassified land cover map was used to reflect anthropogenic factors.From the CA and PCA, the AI, PET seasonality, and PEI were selected as the main climatic variables.The clustering was performed with the principal component generated from these main variables and the land cover.The codes and descriptive names of each zone were given based on aridity, seasonality, and naturality.
In this study, we attempted to establish bioclimatic zones of NEA.It is significant that this study enhances understanding of the bioclimatic environments in NEA and provides a basis for the monitoring and management of biodiversity.In addition, it showed the possibility and necessity of inclusion of social factors in bioclimatic classification.However, there is a limitation in that it cannot be verified because of problems with securing the data and it does not reflect the latest climate and land cover environment.Therefore, further studies such as collecting environmental data and upgrading the methodology that reflects the species habitats and social factors should be conducted in order to make practical use of this material in the future.

Moist Natural Area (NmD)
The NmD zone is a region of high naturality that includes water, needleleaf forest, and woodland in similar proportions.This zone includes the northern and southern regions leading from the Maritime Provinces (Russia) to the northeastern part of NEA to the low mountains of the Korean Peninsula and the watersheds of the Yangtze River Basin.It includes several regions distributed over various latitudes, and thus exhibits a moderate level of environmental characteristics such as mean temperature and precipitation.
Moist Shrub-Bare Land (AmB) The AmB zone is composed of about 40% closed shrubland and 40% bare land, so the proportion of natural area is low.This zone is a large shrub-bare land area located between the Liaodong Peninsula and Amnok River and is in a relatively high altitude area.Although the annual mean temperature is lower than that of the surrounding areas, it is relatively wet with high precipitation.

Semi-arid Cropland (AaA)
The AaA zone is a low naturality area, mostly composed of croplands.It is widely distributed in northwest NEA and is mostly affected by continental climate.It is located the most inland of the zones, therefore it has the highest seasonality and the average altitude is relatively high.It is the driest zone, adjacent to Chinese and Mongolian inland deserts, and has less than 500 mm of annual precipitation.

Wet Artificial Area (AwG)
The AwG zone is a region that includes grassland, cropland, and built-up areas, and similar to the AmF zone, its naturality is low.It is sporadically distributed near the southern part of NEA, with the annual mean temperature and precipitation the second highest among the zones.

Wet Evergreen Forest (NwG)
The NwG zone is the region that contains the most evergreen needleleaf forest, and its naturality is high.It includes most of the southern forest area of NEA and has a high average altitude.Because it is located in the southern coastal region, it has the highest annual precipitation and the lowest PET seasonality and is the wettest region among all zones.

Wet Mixed Forest (NwD)
The NwD zone includes the major mountains of the Korean Peninsula and southern coast of the Maritime Provinces, and its naturality is high because of the high proportion of mixed forests.Despite being located at low latitudes, the mean annual temperature is low because it contains many high-altitude mountains.

Figure 1 .
Figure 1.Location of the mid-latitude region including: Northeast Asia (left); and land cover of the study area (right).

Figure 1 .
Figure 1.Location of the mid-latitude region including: Northeast Asia (left); and land cover of the study area (right).

Figure 2 .
Figure 2. Research flow of the development and characterization of the bioclimatic classification for Northeast Asia.CA: correlation analysis; PCA: principle component analysis; ISODATA: iterative self-organizing data analysis technique algorithm.

Figure 2 .
Figure 2. Research flow of the development and characterization of the bioclimatic classification for Northeast Asia.CA: correlation analysis; PCA: principle component analysis; ISODATA: iterative self-organizing data analysis technique algorithm.

Figure 3 .
Figure 3. (a) Selection process of social variables; and (b) an example of fragmentation by applying population density data.

Figure 3 .
Figure 3. (a) Selection process of social variables; and (b) an example of fragmentation by applying population density data.

Figure 4 .
Figure 4. (a) Map of the first principal component; (b) map of the second principal component; and (c) reclassified land cover map of Northeast Asia.

Figure 4 .
Figure 4. (a) Map of the first principal component; (b) map of the second principal component; and (c) reclassified land cover map of Northeast Asia.

Figure 4 .
Figure 4. (a) Map of the first principal component; (b) map of the second principal component; and (c) reclassified land cover map of Northeast Asia.

Figure 6 .
Figure 6.Dendrogram based on Euclidean distance between cluster means explaining how each zone was clustered into 14 groups.

Figure 6 .
Figure 6.Dendrogram based on Euclidean distance between cluster means explaining how each zone was clustered into 14 groups.

Figure 9 .
Figure 9. Map expanding three areas to compare with GEnS of Northeast Asia (NEA) made by Metzger et al. (2013) [24] and to illustrate naturality (A1: the result of this study in Northeast China, A2: GEnS in Northeast China, B1: the result of this study in East China, B2: GEnS in Eest China, C1: the result of this study in central Korean peninsula, C2: GEnS in central Korean peninsula).

Figure 9 .
Figure 9. Map expanding three areas to compare with GEnS of Northeast Asia (NEA) made by Metzger et al. (2013) [24] and to illustrate naturality (A1: the result of this study in Northeast China, A2: GEnS in Northeast China, B1: the result of this study in East China, B2: GEnS in Eest China, C1: the result of this study in central Korean peninsula, C2: GEnS in central Korean peninsula).

Table 1 .
Standard variation and eigenvectors for the three principal components (PCs) of the second principal component analysis.

Table 1 .
Standard variation and eigenvectors for the three principal components (PCs) of the second principal component analysis.

Table 1 .
Standard variation and eigenvectors for the three principal components (PCs) of the second principal component analysis.

Table 2 .
Indices and criteria used for descriptive names.

Table 3 .
Criteria used for A-G classifications.

Table 4 .
Code names, descriptive names, and contained zones by each group.

Table 5 .
Environmental characteristics, including climate and land cover, of each group.

Table 5 .
Environmental characteristics, including climate and land cover, of each group.