Expert Opinion Dimensions of Rural Landscape Quality in Xiangxi, Hunan, China: Principal Component Analysis and Factor Analysis

Abstract: Scholars and planning/design professionals are interested in the quantitative, metric properties that influence the quality and assessment of rural landscape space. These metrics are important for guiding the planning, design, and construction of cultural rural environments. Responses and metrics for four sampled villages (Qixin, Hangsha, Yanpai Xi, and Lvdong) in the Xiangxi District of Hunan Province, China, were examined with principal component analysis and factor analysis to identify the properties that characterize the planning and design features of these rural mountain village landscape spaces. The two approaches reveal different aspects of the same variables. Factor analysis with rotation revealed four general dimensions explaining approximately 62% of the variance: a settlement and environmental axis, an intangible culture axis, a productive landscape axis, and a transportation and public space axis. This supports the standing notion that the variables were ordinated across four dimensions in these mountain villages and occupied an elliptical plane different from the predicted space occupied by nearby cities. In contrast, principal component analysis revealed that the variables could be grouped into one latent dimension explaining 48% of the variance, offering an alternative interpretation and spatial plot of the sites.


Introduction
As an agricultural country with over 5000 years of "farming culture", China's agricultural area accounts for nearly 56% of the land area [1]. These rural environments contain agricultural production areas, villages, mountains, rivers, and other natural landscape features reflecting the local history and civilization. Rural landscape evaluation is the core of rural landscape theory as expressed by Liu and Wang [2], and it is also an important means of protecting and developing rural landscapes with regional characteristics and of directing the future planning and development of rural areas. The United States enacted the Wilderness Act (1964), which initiated the evaluation and protection of rural landscape resources, while the United Kingdom began to conduct qualitative analysis of rural landscape quality in the 1980s, emphasizing public participation and sensory evaluation [3,4]. Towards the end of the 20th century, Australia, the Netherlands, Russia, Canada, and other countries explored planning and management metrics [5]. In the Netherlands, an expert scoring method was employed to evaluate the quality of rural landscape [6]. To understand the public's preferences, direct surveys were initiated to achieve a consistent approach to rural landscape quality evaluation. In the past, evaluation methods for rural landscape quality have been diverse, mainly including the analytic hierarchy process (AHP), questionnaire survey methods, and visual assessment methods [28,29]. Recently, there has been an emphasis upon landscape and design metrics to measure and ordinate environments [30][31][32][33][34][35]. In addition, there have been approaches that employ fractals to understand and replicate spaces modified by humans [36][37][38].
Although rural landscape evaluation has interested some scholars, multivariate evaluation systems for rural landscapes have not been extensively examined, investigations are often quantitatively weak, and a comprehensive evaluation system has not been fully explored; there is much to learn [10]. Over the last 50 years, investigators have focused upon the perceptions of citizens, with the understanding that experts (academics and experienced professionals within China from the planning and design arena) view the environment differently [10]. Few studies have examined the perceptions of experts; Kongjian Yu is one of the few to conduct such studies [39]. Our study examined the evaluation of the rural landscape of the Xiangxi District in Hunan Province, China, by surveying planning and design experts to gain their perception of the environmental qualities that characterize the setting. Such studies often generate many variables to consider and may rely upon multivariate statistical analysis to clarify the results [40][41][42][43][44][45][46][47][48][49][50][51][52].
Factor analysis (FA) and principal component analysis (PCA) offer a more quantitative approach. These two methods of multivariate statistical analysis have been widely used in soil science, water quality science, climatology, medicine, urban geography, and other fields and have yielded insights into the relationships amongst a larger set of variables [40][41][42][43][44][45][46][47][48][49][50][51][52]. Both approaches attempt to reduce and group the number of dimensions/variables to glean a clearer understanding of the underlying relationships amongst the variables. In this investigation, the team examined 24 spatial variables addressing a somewhat culturally distinct rural environment in the Xiangxi District of Hunan Province in China (Figure 1), generating results from four villages (Qixin, Hangsha, Yanpai Xi, and Lvdong). The team employed both FA and PCA to extract descriptions of the data and to suggest implications for the planning, design, and management of these rural cultural areas.

Xiangxi Study Area and Photographic Images
Xiangxi Tujia and Miao Autonomous Prefecture of Hunan Province (Xiangxi for short) is located in the Wuling Mountain Area. Due to the unique karst landform and isolated traffic patterns, there are a large number of intact traditional villages with a long history and rich cultural relics. In order to promote the protection and development of traditional villages, the Ministry of Housing and Urban-Rural Development of China and other departments established a list of traditional Chinese villages according to the evaluation and identification index system of traditional villages, totaling 6799 villages. The study team (including scholars from the College of Landscape Architecture and Art, Hunan Agricultural University, Changsha, Hunan Province, China, and the College of Landscape Architecture, Beijing Forestry University, Beijing, China) chose four of the most representative villages for a more in-depth study: Lvdong, Qixin, Hangsha, and Yanpai Xi, located in the Xiangxi area shown in Figure 1. The selection criteria were village remoteness, traditional buildings, preserved farmland, maintained traditional culture, elevation between 478 and 750 m, and the presence of traditional streets. The four traditional mountain villages comprised the basis for the questionnaire. The four villages are located at the eastern end of the Yunnan-Guizhou plateau and in the middle part of the Wuling mountain range, with an average elevation of 478-750 m, vegetation coverage, and no large-scale tourism development. Because they are located in remote mountains, they are less disturbed by modern culture, and their historical and cultural values are more evident. Their traditional residential buildings, streets, and farmlands are well preserved. The four traditional villages in this study appeared to have great similarity.
For each village, 8 to 9 photographs were chosen to obtain respondent impressions/opinions: 2-3 images of the village's overall condition and the surrounding environment, two close-distance examples of the settlement and residential environment landscape, one showing the village water, and 2-3 images of customs or landmarks.

Methodology
To derive a response instrument from the images, a series of variables needed to be employed in a respondent survey. In determining the variables, 30 graduate students in landscape architecture from Beijing Forestry University and Hunan Agricultural University and 10 experts on rural tourism, planning, and design were interviewed concerning their opinions about assessment impact variables in rural landscapes. The results identified, potentially, 45 different items. Then, 40 Asian tourists who had experienced rural tourism were asked, "In your opinion, what are the variables affecting rural landscape evaluation?" This approach generated approximately 50 potential variables. In the end, this led to a total of 24 variables (Table 1). The variables could be measured in a respondent survey employing the Likert scale, an ordinal data approach (Table A1).
Respondents were selected from individuals who were engaged in or experienced in rural tourism, and the survey was conducted with an online questionnaire. A total of 164 questionnaires were distributed, and 164 effective questionnaires were received, an effective rate of 100%. The entire questionnaire survey process was completed within a continuous period of time, ensuring the randomness and representativeness of data samples. The gender ratio was 52% female and 48% male. Most respondents were aged 20 to 39 (51%) or 40 to 59 (29%), with the remainder in other age groups. In terms of education level, respondents mainly held undergraduate and master's degrees (46% and 29%, respectively), and those with a doctoral degree or above accounted for 9%. The remaining respondents had no higher education degrees.
Before responding, the respondents were told the purpose and method of the research. Then, the group of slides for a village was presented. The respondents had 1 min to view each image. The responses were then assessed with FA and PCA. The data captured were ordinal in nature, and such data are best suited to non-parametric statistical approaches; however, for multivariate analysis, a reliable, widely accepted non-parametric approach to FA and PCA has not been widely adopted. Since this study is exploratory in nature, parametric FA and PCA were employed.

Results
The overall reliability of the 164 questionnaires was tested with Cronbach's alpha coefficient, which had a value of 0.952, indicating that the questionnaire had high reliability. The Kaiser-Meyer-Olkin measure (KMO) and Bartlett's test of sphericity were also used in the study. The KMO value was 0.958, much greater than the minimum of 0.5, indicating that the data samples were sufficient and suitable for factor analysis and that the results of principal component analysis would be practicable. At the same time, the reported significance value for Bartlett's test of sphericity was 0.000, less than an alpha of 0.01, indicating that there was correlation among the variables and that these data were suitable for factor analysis. The correlation coefficient matrix of the 24 variables was obtained. For a large proportion of the variables, approximately over 90%, the correlation coefficients with other variables were greater than 0.3, meaning there was a substantial linear correlation among many of the variables; in other words, the variables could be grouped or associated.
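The three diagnostics named above can be sketched with NumPy alone. This is an illustrative sketch under stated assumptions, not the study's computation: the data are synthetic stand-ins (164 simulated respondents, 24 items on a 1-5 scale), the function names are hypothetical, and the values printed will not match the 0.952 and 0.958 reported in the text.

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for an (n_respondents, n_items) matrix of item scores."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    item_variances = scores.var(axis=0, ddof=1).sum()
    total_variance = scores.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1.0 - item_variances / total_variance)

def kmo_overall(scores):
    """Overall KMO: squared correlations relative to squared partial correlations."""
    corr = np.corrcoef(scores, rowvar=False)
    inv = np.linalg.inv(corr)
    scale = np.sqrt(np.diag(inv))
    partial = -inv / np.outer(scale, scale)      # partial correlation matrix
    off_diag = ~np.eye(corr.shape[0], dtype=bool)
    r2 = np.sum(corr[off_diag] ** 2)
    q2 = np.sum(partial[off_diag] ** 2)
    return r2 / (r2 + q2)

def bartlett_sphericity(scores):
    """Chi-square statistic and degrees of freedom for Bartlett's test of sphericity."""
    n, p = np.asarray(scores).shape
    corr = np.corrcoef(scores, rowvar=False)
    _, logdet = np.linalg.slogdet(corr)
    chi2 = -(n - 1 - (2 * p + 5) / 6.0) * logdet
    return chi2, p * (p - 1) // 2

# Synthetic stand-in data (NOT the study's data): one shared latent trait plus noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(164, 1))
synthetic = np.clip(np.rint(3 + latent + 0.8 * rng.normal(size=(164, 24))), 1, 5)

alpha = cronbach_alpha(synthetic)
kmo = kmo_overall(synthetic)
chi2, dof = bartlett_sphericity(synthetic)
print(f"alpha={alpha:.3f}  KMO={kmo:.3f}  Bartlett chi2={chi2:.1f} (dof={dof})")
```

The Bartlett statistic would be compared against a chi-square distribution with the returned degrees of freedom (276 for 24 variables) to obtain a significance value.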
In this study, principal component analysis (PCA) was used to identify the general number of dimensions. In PCA, dimensions with eigenvalues below one are often considered to have low explanatory value. The eigenvalue (characteristic root) of the first principal component is 11.513, which explains 47.969% of the information across all variables (Table 2). The eigenvalue of the second principal component is 1.355, which explains 5.647% of the information in all variables. The eigenvalue of the third principal component is 1.171, which explains 4.881% of the information in all variables. However, the cumulative contribution rate of the three principal components was only 58.497%. According to the related literature, the retained principal components should cumulatively explain 60%-70% of the variation of the data; thus, a fourth principal component was extracted, and the cumulative variance contribution of the first four principal components reached 62.220% of the total variance. The PCA suggests that there were, at most, four meaningful dimensions. In standard principal component analysis, the principal components are not rotated, and each orthogonal dimension explains the maximum amount of remaining variance. The eigenvector of the first dimension contained large coefficients for the variables, ranging from 0.553 to 0.776. Usually, such values indicate a strong association with the first dimension [40][41][42][43][44][45][46][47][48][49][50][51][52]. In other words, it would be possible to explain 47.969% of the variance in one dimension, with all of the variables strongly associated with that dimension. However, the study team was interested in exploring the results with rotations. For the factor analysis, the study team employed the maximum variance (varimax) rotation method with Kaiser normalization; the rotation converged in eight iterations.
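The retention logic in this paragraph (eigenvalues above one, then a cumulative 60% threshold) can be sketched as follows. The sketch assumes PCA on the correlation matrix, so that the eigenvalues of 24 standardized variables sum to 24; the data are synthetic and the function name is hypothetical. The last lines only redo the arithmetic on the percentages quoted in the text.

```python
import numpy as np

def pca_from_correlation(data):
    """Eigenvalues (descending), eigenvectors, and variance shares of the correlation matrix."""
    corr = np.corrcoef(data, rowvar=False)
    eigenvalues, eigenvectors = np.linalg.eigh(corr)
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]
    return eigenvalues, eigenvectors, eigenvalues / eigenvalues.sum()

# Synthetic stand-in data (NOT the study's data).
rng = np.random.default_rng(1)
latent = rng.normal(size=(164, 1))
synthetic = latent + 0.9 * rng.normal(size=(164, 24))

eigenvalues, eigenvectors, explained = pca_from_correlation(synthetic)
cumulative = np.cumsum(explained)
n_kaiser = int(np.sum(eigenvalues > 1.0))                 # eigenvalue-greater-than-one rule
n_for_60 = int(np.searchsorted(cumulative, 0.60)) + 1     # components needed to reach 60%

# Arithmetic on the percentages reported in the text.
reported = [47.969, 5.647, 4.881]
three_component_total = round(sum(reported), 3)           # 58.497
fourth_share = round(62.220 - three_component_total, 3)   # 3.723
# Implied fourth eigenvalue, assuming 24 standardized variables (total variance 24).
implied_fourth_eigenvalue = fourth_share / 100 * 24
print(n_kaiser, n_for_60, three_component_total, fourth_share, round(implied_fourth_eigenvalue, 3))
```

Under that assumption, the fourth component's implied eigenvalue is about 0.89, which is consistent with the text's account that it was retained to reach the 60% threshold rather than because its eigenvalue exceeded one.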
The new factor loadings of all variables on the four principal components were obtained by rotation. The composition of each factor axis can be observed in Table 3. The four common factors carried the loadings and main characteristics of the 24 evaluation variables. It should be noted that dimension names were heuristically derived by trying to identify a general character of the grouped variables. In addition, six of the variables (listed at the bottom of Table 3) were not associated with any of the latent dimensions. In the factor analysis results, the general qualitative characteristics of these villages included a strong sense of isolation from the outside world, a strong orientation towards nature (biospheric, not noospheric), an abundance of traditionally styled Chinese buildings (little modernism and postmodernism), good air quality (not polluted), strong evidence of cultural relics (stone carvings, stelae, etc.), strong evidence of quality traditional construction methods, strong use of local materials (wood and stone), a strong absence of modern technology (no highways, little sign of electrical lines, towers, rail lines, no neon lighting), and a strong sense of care in appearance (no rubble, no litter). Other characteristics included strong evidence of local traditions/customs, strong evidence of Miao folk art (silver work, embroidery, batik, etc.), evidence of authentic traditional lifestyles, good external traffic connections, a variety of traditional open spaces (well site, sun drying space, etc.), a strong sense of order in the transportation spaces, strong spatial separation of land uses (clear distinction between farmland, urban space, woodland), an overwhelming abundance of farmland in contrast to urban land, and diverse agricultural land.
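A maximum variance (varimax) rotation with Kaiser normalization can be sketched in a few lines. This is a generic textbook-style implementation applied to synthetic data with a built-in four-block structure; it is not the statistical package the study team used, the function names are hypothetical, and the 0.6 cutoff is taken from the threshold the text applies to Table 3.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal varimax rotation; returns rotated loadings and the rotation matrix."""
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        target = rotated ** 3 - rotated @ np.diag((rotated ** 2).sum(axis=0)) / p
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt
        new_criterion = s.sum()
        if criterion != 0.0 and new_criterion / criterion < 1.0 + tol:
            break
        criterion = new_criterion
    return loadings @ rotation, rotation

def varimax_kaiser(loadings):
    """Kaiser normalization: scale rows to unit length, rotate, then rescale."""
    communality_root = np.sqrt((loadings ** 2).sum(axis=1, keepdims=True))
    rotated, rotation = varimax(loadings / communality_root)
    return rotated * communality_root, rotation

# Synthetic stand-in data (NOT the study's data): 4 latent factors, 6 variables each.
rng = np.random.default_rng(2)
factors = rng.normal(size=(164, 4))
pattern = np.kron(np.eye(4), np.ones((1, 6)))          # shape (4, 24)
synthetic = factors @ pattern + 0.7 * rng.normal(size=(164, 24))

corr = np.corrcoef(synthetic, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(corr)
top = np.argsort(eigenvalues)[::-1][:4]
unrotated = eigenvectors[:, top] * np.sqrt(eigenvalues[top])   # component loadings

rotated, rotation = varimax_kaiser(unrotated)
strong = np.abs(rotated) > 0.6      # loadings treated as a strong association
print(strong.sum(axis=0))           # number of strongly loading variables per factor
```

Because the rotation is orthogonal, each variable's communality and the total variance explained by the four dimensions are unchanged; only the distribution of that variance across the dimensions differs.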
The PCA results included a similar character but also included the properties of the remaining six variables: abundant landmarks, abundant natural scenery, associated legends and stories for the area, the presence of water, the presence of forested lands, and a village size of 500 to 1000 residents.
Often, the concept behind factor analysis is that there is a predetermined or imagined structure (factors) that is suggested or has evolved in the literature, as illustrated in a classic study by Dunn [54], whereas principal component analysis does not assume a preordained structure [53,54]. The rotations and clusters associated with factor analysis attempt to ascertain the strength of the expected structure. The factor loading coefficients in Table 3 with values greater than 0.6 represent a strong association with the factor. These values are shown in bold in Table 3. Each of the eighteen variables with a strong association is affiliated with one of the predefined factors. The remaining six variables have only a weak association with the four predetermined factors. The four predetermined factors (settlements and environmental factors; intangible cultural factors; transport and public space factors; and productive landscape factors) are orthogonal (meaning independent) dimensions and demonstrate evidence of structural/statistical/numerical existence. In other words, the villages in the study can be defined by the four factors and eighteen variables, explaining 62.220% of the variance (Tables 4, A1, and A2). This means there is still almost 38% of the variance (100 minus 62 leaves 38) that the factor analysis does not explain and that is open to further study. A study area can be evaluated with four linear equations that define the characteristics of a village, town, or city, as illustrated in Equation (1), which generates a numerical score for the first factor. Equations for the other three factors can be similarly constructed. Numerical scores for the PCA dimensions can also be computed with coefficients from each eigenvector (see Tables 5 and A2).
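The scoring idea described here, one linear equation per dimension, can be sketched for the PCA case. The sketch assumes standardized variables weighted by eigenvector coefficients; the ratings are synthetic, the "new site" is a hypothetical example, and the weights are computed from the synthetic data rather than taken from Tables 4, 5, or A2.

```python
import numpy as np

# Synthetic stand-in ratings (NOT the study's data): 164 respondents, 24 variables, 1-5 scale.
rng = np.random.default_rng(3)
latent = rng.normal(size=(164, 1))
ratings = np.clip(np.rint(3 + latent + 0.8 * rng.normal(size=(164, 24))), 1, 5)

mean = ratings.mean(axis=0)
std = ratings.std(axis=0, ddof=1)
z = (ratings - mean) / std                      # standardized variables

corr = np.corrcoef(ratings, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(corr)
order = np.argsort(eigenvalues)[::-1][:4]       # keep the first four dimensions
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

def component_scores(z_values, weights):
    """Score_k = sum over j of weights[j, k] * z_j: one linear equation per dimension."""
    return z_values @ weights

respondent_scores = component_scores(z, eigenvectors)

# A hypothetical new site described by 24 mean ratings (illustrative values only).
new_site = np.full(24, 4.0)
new_site_scores = component_scores((new_site - mean) / std, eigenvectors)
print(np.round(new_site_scores, 3))
```

With this scaling, the variance of each score column equals the corresponding eigenvalue, so the first dimension spreads the sites the most. Factor scores would follow the same pattern with factor score coefficients in place of the eigenvector weights.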

Discussion
Both factor analysis and principal component analysis can be implemented with statistical software to study the relationships amongst variables. The procedures may show that relationships are weak or non-existent, or they may reveal meaningful clusters or groupings of the variables. The results depend upon how the variables relate to each other. When there are many variables, it can be difficult to interpret them collectively without multivariate analysis. Principal component analysis and factor analysis can reveal the collective relationships of the variables, as illustrated in past studies of Indian and Canadian cities [55,56]. These factors and dimensions can be employed to make numerical comparisons of various sites, to plot the dimensions, and to study the variations and characteristics amongst the cities. While geographers and urban planners have studied cities at a national level, special cultural spatial characteristics have so far been only modestly examined.
In this study, the characteristics of the villages can be described by the variables and plotted. The new coefficient loadings for the variables in the factor analysis are obtained by rotation. For common factor 1 (the settlements and environments factor), nine variables had coefficients above 0.508: V18 degree of isolation, V19 landscape view toward the environment, V20/V9 historic building air quality/sight visibility villages overall integrity protection, V8 and V10 architectural style and technology level, V13 building material characteristics, V17 surrounding environment visual noise, and V15 village cleanliness. V18 had the highest coefficient value, 0.742. These nine variables reflect the surrounding and internal environment of the village, the whole settlement, and the characteristics of the residential buildings.
In the examination of factor/dimension 2, there were three variables with a coefficient loading above 0.675: V21 folk customs, V22 activation inheritance of folk art, and V23 settlement function continuity. V21 had the highest value, 0.722. These variables reflect the cultural characteristics and authenticity of the village, so they were named intangible cultural factors.
In the evaluation of common factor 3, there were three variables with a factor loading above 0.644: V16 external traffic accessibility, V11 common meeting space type, and V14 traffic organization in the village. V16 had the highest value, 0.711. This group of variables mainly reflects the external and internal traffic organization of the village as well as the nodes of common assembly space. Therefore, they were named traffic and common space factors.
In the examination of dimension 4, there were three variables whose coefficient loading was above 0.652: V6 texture level of farmland, V2 coverage area of farmland, and V4 color and type of farmland/orchard/vegetable garden/tea garden. V6 was the highest, with a coefficient of 0.743. This group of variables reflects the characteristics of the productive landscape in the villages, so they were named productive landscape factors.
Principal component analysis and factor analysis obtained new dimensions (clustered vectors of variables). They represent two different views of the same data. The PCA generated dimensions with the largest orthogonal variance possible, and it is possible to lump all variables together in one large dimension which explains 47.969% of the variance (Table 5).
Factor analysis revealed sets of variables in dimensions that seemed to be, in this instance, an understandable set of dimensions: settlements and environmental factors; intangible cultural factors; transport and public space factors; and productive landscape factors. In addition, six of the variables were not strongly affiliated with any of the four rotated dimensions: V12, V3, V24, V5, V1, and V7. For comparison purposes, the first four principal components and the factor analysis dimensions can be employed in linear equations to assess and compare additional villages and environments, as illustrated by the multivariate efforts of other investigators [40][41][42][43][44][45][46][47][48][49][50][51][52].
A three-dimensional plot can be constructed of the factor analysis and the principal component results (Figure 6). The plots of the four villages can be compared to plots of nearby cities in the area, including Changsha, Nantong, Wuhan, Guangzhou, Zhuzhou, Yueyang, Jishou, and Chongqing. The plots represent the data from two different perspectives. The factor analysis plot separates the villages from the cities along two primarily parallel planes by factor 3, the transportation and public space factor. The principal component plot separates the villages from the cities, with the villages occupying an orbit beyond the cluster of the cities in a three-dimensional setting. Both approaches can differentiate the cities from the villages with the same data but present the ordination differently. The two approaches show that villages can be differentiated and are indeed spatially different along the variables measured.
Once these results have been obtained, illustrating that there are characteristics of the villages that can be quantifiably identified, the next step in the investigatory process is to examine the variables in detail that explain the nuances among the villages and their differences from the cities. This step is being accomplished in a publication by Wen, Li, and Zhou (in publication) [57].
According to different common factors, the quality evaluation of rural landscape space in each village was quantified, and the result was consistent with the actual situation of each village. The research results provide a theoretical basis for the construction of new villages and the protection and development of traditional villages. The subjects of this research questionnaire are experienced tourists who have been or are in the process of traveling. Although the subjects spanned different ages and cultural backgrounds, the main source was still urban residents. Only 8% of the respondents were farmers. Moreover, due to the different cultural background and training of planning/design professions, this expert-based opinion response survey may not reflect the values of the public. Therefore, the findings may be biased.
In addition, the comparison between principal component analysis and factor analysis in statistics is used to understand the differences and similarities. In future studies, this method needs to be further examined for assessment approaches employing ordinal data and data suitable for nonparametric statistical tests.
The study is limited by the selection of villages, variables employed, images chosen, and the experts interviewed in the study. The results presented in the study, while significant, are not definitive. Numerous studies are required to corroborate or refute such findings.
Figure 6. The left side presents the plots of the four villages (surrounded by a blue ellipse) and the cities examined (surrounded by a yellow ellipse) with the scores from factor analysis' first three factors. The right side presents a cluster of the cities (yellow ellipse) surrounded by the plots of the villages surrounded by a blue ellipse from the first three eigenvalues and associated eigenvectors (copyright © 2020 Bin Wen all rights reserved, used by permission).
In Figure 6, note that the four villages were not identical and contained differences, but that through FA and PCA, their location was different from the location of the other towns/cities plotted in the figure. The results illustrate that, spatially, something is expressed differently across 18 variables in FA and 24 variables in PCA. For planning and design assessment and landscape management, villages in the Xiangxi study area should occupy positions in the general three-dimensional space revealed in this study. Villages with calculated scores that occupy a different region may have drifted away from their historic characteristics or may not be part of the set of traditional communities. The equations from FA and PCA can calculate the position of other communities; however, the equations are not the final answer in planning, design, and management but rather information about the general spatial character of an existing community or the impacts of proposed changes. The equations and plots may provide feedback on alternatives and management of these spaces. In addition, villages, towns, and cities can be considered design treatments and compared for statistical difference by employing Friedman's Two-way Analysis of Variance by Ranks, a procedure that compares treatments across all the variables of interest [58].
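The Friedman comparison mentioned above treats each variable as a block and each site as a treatment, ranking the sites within every variable. The following is a minimal sketch of the test statistic, assuming no tied values within a row (so no tie correction is applied); the function name and the site scores are hypothetical and synthetic.

```python
import numpy as np

def friedman_statistic(table):
    """Friedman chi-square for an (n_blocks, k_treatments) table, assuming no ties within a row."""
    table = np.asarray(table, dtype=float)
    n, k = table.shape
    ranks = table.argsort(axis=1).argsort(axis=1) + 1    # ranks 1..k within each block
    rank_sums = ranks.sum(axis=0)
    q = 12.0 / (n * k * (k + 1)) * np.sum(rank_sums ** 2) - 3.0 * n * (k + 1)
    return q, k - 1

# Hand-checkable case: 4 blocks, 3 treatments, identical ordering in every block.
# Rank sums are 4, 8, 12, so Q = (12 / 48) * 224 - 48 = 8 with 2 degrees of freedom.
small = [[1.0, 2.0, 3.0]] * 4
q_small, dof_small = friedman_statistic(small)

# Synthetic mean scores (NOT the study's data): 24 variables (blocks) by 5 sites (treatments),
# with the last two hypothetical sites shifted lower.
rng = np.random.default_rng(4)
site_means = 3.5 + 0.3 * rng.normal(size=(24, 5))
site_means[:, 3:] -= 0.8
q_sites, dof_sites = friedman_statistic(site_means)
print(q_small, dof_small, round(q_sites, 2), dof_sites)
```

The statistic would be referred to a chi-square distribution with k - 1 degrees of freedom. Ordinal survey data often contain ties, in which case average ranks and a tie correction would be needed, as provided by standard statistical packages.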

Conclusions
This research investigating the environmental quality of mountain villages is at a formative stage. Much more work can be conducted to refute or corroborate these results. Evaluation and comparison of rural landscape factors with different variables will take time and a series of investigations. This study revealed that the PCA approach can generate a single comprehensive dimension, while the factor analysis approach generated four distinct dimensions. This research demonstrates that the relatively isolated and undisturbed village environment can be quantitatively measured, described, and differentiated from other nearby urban settings. The two methods illustrated different representations/interpretations of the same data. Factor analysis differentiated the villages from nearby cities by separating them into two distinct elliptical planes (Figure 6). Principal component analysis separated the villages on the edges of a three-dimensional elliptical orbit, with the cities closer to the center of the three-dimensional plane (Figure 6). These two methods suggest that there is a multivariate statistical difference between the villages and nearby cities. A detailed discourse describing the statistical and physical differences between these villages and cities and their characteristics is presented in a forthcoming article [57].
Acknowledgments: The respondent survey study was conducted with the supervision and authority of Beijing Forestry University and in conjunction with Hunan Agricultural University.

Conflicts of Interest:
The authors declare no conflict of interest.
Appendix A
Table A1. List of the variables employed in the study of rural landscape in mountain villages, rated by respondents where: 5 = very satisfied, 4 = satisfied, 3 = in general, 2 = not satisfied, 1 = very dissatisfied.