Research in Meteorological Modeling Oriented Comprehensive Surface Complexity (CSC)

Abstract: Ground surface characteristics (i.e., topography and landscape patterns) are important factors in geographic dynamics. The complexity of the ground surface is therefore a valuable indicator when designing multiscale models, where the resolution must balance computational cost against simulation accuracy. This study proposes the concept of comprehensive surface complexity (CSC) to quantify the degree of complexity of the ground surface by integrating topographic complexity indices with landscape indices that represent land use and land cover (LULC) complexity. Focusing on meteorological process modeling, this paper computes the CSC by constructing a multiple regression model between the accuracy of meteorological simulation and the surface complexity of topography and LULC. For five widely studied areas of China, this paper shows the distribution of CSC and analyzes the window size effect. The comparison among the study areas shows that the CSC is highest for the Chuanyu region and lowest for the Wuhan region. To investigate the application of CSC in meteorological modeling, taking the Jingjinji region as an example, we conducted Weather Research and Forecasting Model (WRF) simulations and analyzed the relationship between CSC and the mean absolute error (MAE) of the temperature at 2 meters. The results showed that the MAE is higher over the northern and southern areas and lower over the central part of the study area, which is generally positively related to the value of CSC. Thus, CSC is helpful for indicating meteorological modeling capacity and identifying areas where finer scale modeling is preferable.


Introduction
Characterization of surface complexity is essential throughout the earth sciences, since each geographic process operating at the ground surface encounters and affects the ground surface (i.e., topography, land use and land cover, soil) [1]. Given this significance, surface complexity is applied in many different scenarios, such as the identification of active landslides based on topographic complexity [2] and the design of a multiresolution approach to estimate snow water equivalent [3]. Similar to these studies, this paper analyzes surface complexity to describe the spatial heterogeneity of the ground surface in a specific area. A number of techniques for quantifying surface complexity have emerged over the years, although they usually focus on only one variable, such as topography or land use and land cover [4,5].
Jingjinji, Lanxi, Wuhan, Chuanyu, and the PRD, respectively, instead of using acronyms. These urban agglomerations are densely populated, accounting for nearly one-third (27%, in 2014) of the total population of China. Because of the intense impacts induced by human activities, the environment and ecosystem here are vulnerable [20] and meteorological modeling is widely studied. The study area has diverse climate conditions, including temperate monsoon, semi-humid/semi-arid continental monsoon, and subtropical monsoon climates. Given such high populations, fragile environments, and varied climates, this area deserves attention: quantifying its ground surface characteristics provides information for geographic process modeling.

Topographic and LULC Data
The initial DEM data were derived from the Shuttle Radar Topography Mission (SRTM) data with a horizontal grid spacing of 90 m [21] (Figure 1). The DEM data were then aggregated to 1 × 1 km resolution using simple averaging aggregation in ArcGIS version 10.0, so that they have the same resolution as the LULC data for 2015 (Figure 2), which were provided by the Data Center for Resources and Environmental Sciences, Chinese Academy of Sciences (RESDC) (http://www.resdc.cn). In this way, the two variables can be analyzed together to compute the CSC. In processing the LULC data, a unified quality check and data integration were conducted for each data set, and the comprehensive evaluation accuracy of the land use classification exceeded 91.2% [22].
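The averaging aggregation described above can be illustrated with a minimal sketch. It assumes an integer aggregation factor and drops partial blocks at the grid edge; the function name `aggregate_mean` and the toy grid are hypothetical, and the actual ArcGIS processing (90 m to 1 km is not an integer ratio) may treat cell alignment and edges differently.

```python
def aggregate_mean(grid, factor):
    """Average non-overlapping factor x factor blocks of a 2D grid.

    Assumption: partial blocks at the right/bottom edges are dropped.
    """
    n_rows = len(grid) // factor
    n_cols = len(grid[0]) // factor
    coarse = []
    for br in range(n_rows):
        out_row = []
        for bc in range(n_cols):
            # Gather every fine cell that falls inside this coarse cell
            block = [
                grid[br * factor + i][bc * factor + j]
                for i in range(factor)
                for j in range(factor)
            ]
            out_row.append(sum(block) / len(block))
        coarse.append(out_row)
    return coarse


# Toy 4 x 4 "DEM" aggregated by a factor of 2 (illustrative values only)
fine = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
coarse = aggregate_mean(fine, 2)
```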

Research Method
The concept of CSC is proposed to investigate the integrated effect of topography and LULC in geographic process modeling (Figure 3). To quantify the complexity of LULC, we employed Shannon's diversity index (SHDI), a popular measure of diversity borrowed from community ecology. The SHDI increases as the number of different LULC types increases and/or the proportional distribution of area among LULC types becomes more equitable. The SHDI equals 0 when the moving window contains only 1 LULC type (i.e., no diversity) and approaches 0 as the distribution of area among the different LULC types becomes increasingly uneven (i.e., dominated by 1 type); it reaches its maximum (ln m for m LULC types, or 1 when rescaled by ln m) when the distribution of area among LULC types is perfectly even (i.e., proportional abundances are the same). For the complexity of topography, this study uses the local relief (LR), standard deviation of height (SDH), and standard deviation of slope (SDS) to portray multiple aspects of the complexity. The equations for both LULC and topography are listed in Table 1.
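As a rough illustration of the SHDI behaviour described above, the following sketch assumes the common definition SHDI = −Σ p_i ln p_i, where p_i is the proportion of the window occupied by LULC type i. Table 1 is not reproduced in this text, so the exact form used in the study is an assumption, and the function name `shdi` is hypothetical.

```python
import math
from collections import Counter


def shdi(window_classes):
    """Shannon's diversity index for the LULC class labels in one window.

    Assumed definition: -sum(p_i * ln(p_i)) over the classes present.
    """
    counts = Counter(window_classes)
    total = sum(counts.values())
    # p_i = share of the window covered by class i
    return -sum((n / total) * math.log(n / total) for n in counts.values())


single_type = shdi([1] * 9)            # one LULC type: no diversity
even_two = shdi([1, 1, 2, 2])          # two types, perfectly even
uneven_two = shdi([1] * 8 + [2])       # two types, dominated by one
```

With this definition, a single-type window gives 0, an even split between two types gives ln 2, and a window dominated by one type falls between the two.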
In Table 1, θ_i is the slope of any point in a moving window and N is the number of points in the moving window. The values of each equation in Table 1 are calculated in a moving window over the LULC and DEM data, and the values are reported in the central cell. Among them, the SDS value is based on the slope map, which is calculated first from the DEM data. The main steps to compute these values are shown in Figure 4; the LULC complexity index is implemented in the Interactive Data Language (IDL) and the topographic complexity indices in ArcGIS 10.2. To bring the values of these indices, which are measured on different scales, to a notionally common scale, normalization across all five study regions is applied prior to finding the weighted average. The normalization divides the value of each cell by the maximum value over all five regions.
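The moving-window calculation and the max normalization can be sketched as follows. This is an illustration under stated assumptions, not the IDL/ArcGIS workflow of the study: windows are clipped at the grid edge, LR is taken as the maximum minus the minimum height in the window, SDH as the population standard deviation, and all function names and the toy DEM are hypothetical.

```python
import statistics


def window_values(grid, row, col, size=3):
    """Cells of the size x size window centred on (row, col), clipped at edges."""
    half = size // 2
    return [
        grid[r][c]
        for r in range(max(0, row - half), min(len(grid), row + half + 1))
        for c in range(max(0, col - half), min(len(grid[0]), col + half + 1))
    ]


def moving_window(grid, stat, size=3):
    """Apply `stat` to the window around each cell; report it at the central cell."""
    return [
        [stat(window_values(grid, r, c, size)) for c in range(len(grid[0]))]
        for r in range(len(grid))
    ]


def local_relief(values):
    """Assumed LR definition: highest minus lowest elevation in the window."""
    return max(values) - min(values)


def max_normalize(grid, global_max):
    """Divide every cell by a maximum taken over all regions being compared."""
    return [[v / global_max for v in row] for row in grid]


# Toy 3 x 3 elevation grid (illustrative values only)
dem = [[10, 12, 15], [11, 20, 18], [9, 14, 30]]
lr = moving_window(dem, local_relief)
sdh = moving_window(dem, statistics.pstdev)
# Here the maximum comes from one grid; in the study it spans all five regions
lr_norm = max_normalize(lr, max(max(row) for row in lr))
```

Passing the maximum as an argument keeps the normalization consistent when several regions are processed separately but scaled by one shared value.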
To construct the model for CSC, this paper applies multiple regression analysis based on the complexity of LULC and topography and the station-based modeling performance in the Beijing region (Figure 5). Using the coordinates of each station and its observations (an hourly series), we extracted the corresponding simulation results from the Weather Research and Forecasting Model (WRF). Then, the mean absolute error (MAE), mean bias error (MBE), and root mean square error (RMSE) are used to quantify the modeling performance of WRF [18].
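The three performance metrics named above can be computed as in the following sketch. The sign convention for MBE (simulated minus observed) is an assumption, since the text does not state it, and the sample values are made up for illustration.

```python
import math


def mae(sim, obs):
    """Mean absolute error between simulated and observed series."""
    return sum(abs(s - o) for s, o in zip(sim, obs)) / len(obs)


def mbe(sim, obs):
    """Mean bias error; assumed convention: simulated minus observed."""
    return sum(s - o for s, o in zip(sim, obs)) / len(obs)


def rmse(sim, obs):
    """Root mean square error between simulated and observed series."""
    return math.sqrt(sum((s - o) ** 2 for s, o in zip(sim, obs)) / len(obs))


# Made-up hourly 2 m temperatures (deg C) at one station
simulated = [21.0, 23.0, 26.0]
observed = [20.0, 24.0, 24.0]
station_mae = mae(simulated, observed)
station_mbe = mbe(simulated, observed)
station_rmse = rmse(simulated, observed)
```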
Figure 5. Station based observations for multiple regression analysis to construct the CSC model.
The modeling is run for July 2012; the detailed experiment setup and modeling results are described in [23]. Based on the coordinates of these stations, we extracted the SHDI and the complexity of topography, computed as the mean of the LR, SDH, and SDS, with a moving window size of 3 × 3 km. For 2 m temperature, one typical meteorological variable, we obtain the regressions between MAE, MBE, and RMSE and the complexity of topography and LULC (Table 2). In each regression, the MAE, MBE, or RMSE is the dependent variable. The Multiple R is approximately 0.45, and the fitted error, and hence the CSC, is negatively correlated with the SHDI and positively correlated with the complexity of topography; these relationships are generally stable across the MAE, MBE, and RMSE. Thus, this paper applies the MAE in the CSC model, which is shown in Equation (1). This regression model gives the relationship between modeling performance and surface complexity.

$$\mathrm{CSC} = 3.15 - 1.99\,\mathrm{SHDI}_{\mathrm{normalization}} + 6.88 \cdot \frac{1}{3}\left(\mathrm{LR}_{\mathrm{normalization}} + \mathrm{SDH}_{\mathrm{normalization}} + \mathrm{SDS}_{\mathrm{normalization}}\right) \qquad (1)$$
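Equation (1) can be evaluated per cell with a few lines of code. The sketch below uses the coefficients reported in the text (3.15, −1.99, and 6.88, with the topographic term taken as the mean of the three normalized indices); the function name `csc` and the sample inputs are hypothetical.

```python
def csc(shdi_norm, lr_norm, sdh_norm, sds_norm):
    """Evaluate Equation (1) for one cell from its normalized indices."""
    # Topographic complexity is the mean of the three normalized indices
    topo = (lr_norm + sdh_norm + sds_norm) / 3.0
    return 3.15 - 1.99 * shdi_norm + 6.88 * topo


flat_uniform = csc(0.0, 0.0, 0.0, 0.0)   # no LULC diversity, flat terrain
sample_cell = csc(0.5, 0.2, 0.1, 0.3)    # made-up normalized inputs
```

Because the inputs are normalized to at most 1, the result under these coefficients lies between 1.16 (maximum SHDI, flat terrain) and 10.03 (single LULC type, maximum topographic complexity).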

CSC Metrics
Based on the CSC model, the LULC complexity, topographic complexity, and CSC maps with a moving window size of 3 × 3 km are shown in Figure 6. For each region in Figure 6, panels a1 to e1 show the complexity of LULC, panels a2 to e2 the topographic complexity, and panels a3 to e3 the CSC. Together with the data in Figure 7, the maps show that both the Chuanyu and Lanxi regions have relatively high CSC. As the locations in Figure 1 indicate, these two regions lie in the western part of China, where the topography is high and complex. By contrast, the Jingjinji region has the lowest CSC, which indicates that its ground surface is relatively simple. Beyond the differences among regions in the mean value of CSC, Figure 6 also shows clear spatial heterogeneity within every study area.

Window Size Effect Analysis
Since CSC is scale dependent like other ground surface indices [24], the width of the moving window used for each equation in Table 1 was varied between 3 and 15 km for each region. Figures 8-10 show the statistical values of the complexity of topography, the complexity of LULC, and the CSC computed for the five study areas (Figure 1). The different window sizes provide broadly similar results. In detail, the mean and minimum values of CSC increase slightly, and the maximum and standard deviation values decrease or remain the same as the window size increases. Furthermore, among the five study regions, Jingjinji has the lowest CSC but the highest standard deviation, which means that CSC in this region is more spatially heterogeneous. Based on this analysis, the results from a window size of 3 × 3 km comprehensively describe the complexity of the ground surface.

Application of CSC in Meteorological Modeling
With geographic data representing the ground surface characteristics, the WRF model is widely applied to simulate meteorological conditions at the mesoscale. Despite its acknowledged performance, the modeling accuracy within each simulation domain is spatially heterogeneous. Liu et al. [25] conducted six simulation scenarios with different parameters, but the modeling result was always poor for the site of Shijingshan in Beijing. This result may be affected by a poor representation of topography and LULC [26]. As mentioned in Section 3, the Jingjinji region has a relatively low mean value of CSC but a high standard deviation of CSC. We use the Jingjinji region as an example to study the relationship between the spatial heterogeneity of modeling performance for 2 m temperature (T2) and the features of the ground surface, as expressed by CSC.
In this study, we use WRF 3.5 to simulate the meteorology in the Jingjinji region for July 2010. In addition to the U.S. Geological Survey (USGS) data, topographic and LULC data are preprocessed and reclassified to make them compatible with WRF. The T2 accuracy of the WRF simulation is verified against observational data from the Meteorological Data Center of the China Meteorological Administration. The MAE of T2 was used to describe the modeling performance. For detailed information concerning the modeling setup and validation, please refer to Li et al. [19].
The CSC and the MAE of T2, expressed as the size of the circle around each station, are shown in Figure 11. The figure shows that the MAE is higher over the northern and southern areas and lower over the central part of the study area, which is generally positively related to the value of CSC. Thus, it is reasonable to suggest that CSC helps to indicate meteorological modeling capacity and to identify areas where finer scale modeling is preferable.
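The comparison above is qualitative. One way such a relationship could be checked numerically is a Pearson correlation between the CSC at each station and the station MAE; the sketch below is only an illustration of that idea, not an analysis performed in the study, and the station values are made up.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Made-up CSC values and T2 MAE (deg C) at four hypothetical stations
csc_at_stations = [3.2, 3.8, 4.5, 5.1]
mae_at_stations = [1.1, 1.4, 1.3, 2.0]
r = pearson(csc_at_stations, mae_at_stations)
```

A coefficient near +1 would support the positive relationship read from the map, while a value near 0 would suggest that other factors dominate the error.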

Complexity and roughness
For topography, complexity is usually treated as equivalent to roughness. For LULC, however, the two terms are used differently in the literature. In this work, complexity describes the degree of complexity in the spatial patterns of LULC patches within a specific window. Therefore, we use a landscape index to compute the complexity of LULC.

Multiple regression model of CSC
In this work, we compute the CSC from a multiple regression model between meteorological modeling performance and the topographic and landscape indices. The regression parameters show that meteorological modeling performance is quantitatively related to surface characteristics, although the Multiple R is low. Meteorological modeling performance may be affected by many factors, such as boundary and initial conditions and the physical settings of the model; surface condition is just one of them, which may explain the low Multiple R. The parameters of the model may still be improved with more observations or with evaluation in other regions. Nevertheless, the positive relationship found in the Jingjinji region supports the application of the CSC computed here. For more detailed applications in different study regions, the model can be refined accordingly. Furthermore, different indices may also be considered to quantify the complexity of topography and LULC [27,28].

Concept of CSC
The value of this work lies in the new concept of CSC, which offers a new perspective for forecasting and explaining the performance of geographic process modeling from the ground surface. The research method used in this paper can be extended to geographic process modeling beyond meteorological processes. For instance, for surface hydrological processes, surface variables such as topography, LULC type, and soil type can be considered together to quantify the CSC and facilitate process modeling. The specific parameters in the multiple regression model of CSC may vary between processes, which deserves further evaluation in practice.

Conclusions
Given the close relationship between surface complexity and the performance of geographic process modeling, this paper proposed the concept of CSC, which was quantified comprehensively from topographic and LULC data. The CSC is helpful in geographic modeling in two respects. On the one hand, for multiscale modeling, finer scale modeling is preferable in areas with higher CSC, and vice versa. On the other hand, within one modeling domain, areas with higher CSC usually show relatively poor modeling performance and deserve focused attention.
This paper applied the topographic index and the landscape index to compute the CSC. Based on the multiple regression model between these indices and the modeling performance, we constructed the CSC model and computed the CSC for the five study areas. Among the study regions, the Chuanyu region had the highest CSC and the Wuhan region had the lowest, which indicates that, given the characteristics of its ground surface, a finer modeling setup is better suited to meteorological modeling of the Chuanyu region.
The scale effect of the moving window size was also discussed. Using window sizes of 3 × 3, 5 × 5, 9 × 9, and 15 × 15 km, we found an increasing trend for the mean and minimum values of CSC, and a mostly decreasing trend for the maximum and standard deviation values, as the window size increases.
To evaluate the capacity of CSC to indicate modeling performance, we analyzed the relationship between the CSC and the modeling performance of T2 from the WRF model in the Jingjinji region. The result qualitatively showed a positive relationship between CSC and modeling error. In detail, the northern and southern parts have high CSC with relatively worse modeling performance, and the middle part has relatively low CSC with better modeling performance.
Author Contributions: C.Z. proposed the CSC concept and computed the CSC maps. X.Z. improved the CSC maps. J.L. conducted the WRF modeling. S.W. applied the CSC maps in meteorological modeling. W.X. wrote the draft and improved it.