A Measurement of Visual Complexity for Heterogeneity in the Built Environment Based on Fractal Dimension and Its Application in Two Gardens

Abstract: In this study, a fractal dimension-based method is developed to compute the visual complexity of heterogeneity in the built environment. The built environment is a very complex combination, structurally consisting of both natural and artificial elements. Its fractal dimension computation is often disturbed by homogeneous visual redundancy, which is textured but needs less attention to process, leading to a pseudo-evaluation of visual complexity in the built environment. Based on human visual perception, the study developed a method, the fractal dimension of heterogeneity in the built environment, which includes Potts segmentation and Canny edge detection as the image preprocessing procedure and fractal dimension as the computation procedure. The proposed method effectively extracts perceptually meaningful edge structures from the visual image and computes a visual complexity that is consistent with human visual characteristics. In addition, an evaluation system combining the proposed method and the traditional method has been established to classify and assess the visual complexity of a scenario more comprehensively. Two different gardens were computed and analyzed to demonstrate that the proposed method and the evaluation system provide a robust and accurate way to measure visual complexity in the built environment.


Introduction
Visual complexity is one of the essential factors of human perception in the built environment. It helps us to understand people's visual experience, hence further affecting design activities. To better evaluate and analyze people's visual complexity in the built environment, fractal dimension provides a reliable approach by virtue of its characteristics of accurate complexity measurement.

Visual Research in the Built Environment
Vision is the most intuitive aspect in our daily interaction with the built environment, which often determines the first impression of the space and thus influences people's movements and behaviors. A high-quality visual experience can not only enrich the exploration and enjoyment in the built environment, but also effectively boost the vitality and attractiveness of the district. Therefore, visual experience and its subsequent cognition have been considered as an important factor in the design of architecture and the built environment. It provides a humanistic thinking for the design theory and human settlement optimization.
Previous studies have conducted various vision-related research in the field of the built environment. A traditional method is to manually record visual information in the built environment by human observation and assessment [1,2]. It investigates visual factors in the built environment in detail. Even so, its success largely depends on the objectivity of the survey and the number of subjects, and its application is limited to some extent. Another method, which is more widely used, involves quantitative measurement of the visual characteristics of the built environment from the perspective of data. Space Syntax [3] is a convincing example, which includes computational approaches such as isovist [4], viewshed analysis [5], and visibility graph analysis [6], among others. However, these approaches mainly focus on space configuration planning rather than on human visual scenes.
In recent decades, computer vision and image processing have developed into a powerful complement in terms of efficiency, accuracy, and objectivity in visual research of the built environment [7,8]. This algorithm-based method focuses on real visual scenes and tries to make computers capable of image understanding. This is a challenging new direction of built environment research in the digital age.

Visual Complexity as a Stimulus in the Built Environment
Various related factors in visual cognition have been examined by previous studies. In terms of environmental elements, there are spatial layout [9], building style [10], vegetation [11], public facilities [12], etc. In terms of human emotional judgments, there are aesthetics [13], safety [14], comfort [15], etc. In terms of space type, there are cityscape [11-13], driving environment [16], window views [17], etc. However, fewer studies have paid attention to the visual stimuli behind all of the above visual factors.
Visual complexity is a bottom-up visual stimulus. The mechanism of bottom-up indicates visual attention is driven by raw sensory input, making observers shift their attention rapidly and unconsciously to the salient visual spot of potential importance [18]. Therefore, visual complexity can be regarded as the richness and intensity of the visual information people receive, and a measure index for how much is 'going on' and how much to look at in a particular scenario [19].
In the built environment, visual complexity has been considered a vital variable in human cognition. Ulrich [20] listed complexity as the first element in his psychoevolutionary framework, which influences human preference for the outdoor environment. Rachel and Stephen Kaplan [19] formulated an evaluative matrix (consisting of complexity, mystery, coherence, preference) in environmental psychology, where complexity plays a role that enhances exploration, and further information is promised if view time is available. Consequently, visual complexity effectively affects human cognition, judgment, and behavior. Berlyne [21] and Sun et al. [22] pointed out that visual complexity is a priority in the visual process, where perceptual curiosity and subsequent exploratory behavior can be aroused by it. Kacha et al. [23] studied the dynamics between human visual complexity perception of street views and brain oscillations by electrophysiological analysis, and found a positive correlation between perceived complexity and Beta brainwaves.
Hence, as an indicator of the level of intricacy, visual complexity is worth further investigation to better understand the structural attributes of the built environment and its effects on human cognition.

Measuring Visual Complexity by Fractal Dimension
With the development of computer-aided image processing technology, the measurement of visual complexity is now treated as computing the statistical properties of visual information, which is often presented through images. Numerous methods have been employed to measure the complexity of visual images, e.g., the size of image files [24], the compression rate of image files [25], entropy [26], color hue variation [27], and fractal dimension [28], to name a few. Among them, fractal dimension has been shown to have a significant correlation with human visual cognition [29,30]. Fractal dimension is a statistical parameter in the theory of fractal geometry [31]. It measures how complex a pattern is, in other words, the space-filling ability of the pattern. The more complex the pattern, the larger the fractal dimension.
In the built environment, visual information is often perceived by its geometrical properties [32], and fractal dimension originally specialized in measuring the complexity of geometric patterns. Based on this, fractal dimension has motivated many scholars to investigate the visual complexity and variety level of the built environment. Vaughan et al. measured the visual properties of falling water and its natural surroundings, and demonstrated an application of fractal dimension to examine the harmony between buildings and nature [28]. Cooper et al. [33] evaluated the fractal dimension of the streetscape and found that streetscape diversity and visual quality are positively correlated with a change of fractal dimension. Juliani et al. [34] argued that the fractal dimension of the environment directly impacts people's navigational ability in goal searching.
In addition, the measurement of fractal dimension is closely associated with scale, which gives it great potential for further development in the field of the built environment. Interestingly, Crompton [35] proposed that the size of a space can be measured by the number of individuals of different sizes occupying it rather than by absolute square meters. This theory implies the fractal features of measuring space at different scales. Ma et al. [36] proposed a fractal dimension trend method to demonstrate a building's visual complexity at different observation scales. Fractal dimension has thus shown its accuracy and adaptability within the quantitative research of building and environmental science.

Research Motivation
Although fractal dimension has been widely used to investigate visual complexity in many studies, some limitations have arisen.
As a complexity index of geometric patterns, fractal dimension is commonly applied to binary images consisting of black and white pixels only, which depict the edge structures in visual images. Since fractal dimension was first introduced in architectural science [37], an 'image-preprocessing + fractal dimension' workflow has been used to address the visual complexity of buildings. A proper image-preprocessing step is therefore necessary, because it not only represents the design presentation, but also connects with people's visual perception.
The traditional method was 'building's CAD wireframe + fractal dimension' [38]. It is accurate, but a proper CAD file is sometimes difficult to obtain, which limits the research to some extent. Later, 'edge detection + fractal dimension' was shown to perform better in terms of computational efficiency and sample size [36], and its outstanding performance for individual buildings has been recognized. However, when it comes to the built environment, which contains many more environmental elements such as buildings, vegetation, public facilities, etc., the current methods do not work well, because the fractal dimension of the entire scenario seems to be easily disturbed by redundant information.
The built environment is randomly mixed with artificial and natural elements, where the element filled with intensive homogenous details, such as leafy vegetations, duplicate windows, textured pavements, and others, will generate cluttered lines in the edge version of the visual image. The homogeneous textures remarkably increase the fractal dimension of the scenario. Hence, it leads to a pseudo-estimation of its visual complexity. For example, the more trees, the greater the visual complexity. As some studies argued, the application of fractal dimension is limited because it cannot distinguish different scenes in the built environment [39].
The scope of this study is to face the limitation and to explore the potentials of fractal dimension in the complexity evaluation of the built environment. This study suggests that the complex combination in the built environment requires a more proper preprocessing method to distinguish the valid component from the redundant information. Therefore, according to human visual features of receiving information in the built environment, the workflow of 'Potts segmentation + Canny edge detection + fractal dimension' is proposed to evaluate the visual complexity for the valid heterogeneity in the built environment. In addition, this study suggests the establishment of an evaluation system, combining the previous fractal dimension method and the proposed 'Potts segmentation + Canny detection + fractal dimension' method, to evaluate the comprehensive visual complexity in terms of detailed texture and heterogeneous components in the built environment.

Materials and Methods
Based on the research motivation, this study developed a 'fractal dimension of heterogeneity' (FDH) method, which includes the workflow of 'Potts segmentation + Canny edge detection + fractal dimension', to effectively evaluate the visual complexity in the built environment. Here, heterogeneity refers to the dissimilar composition parts in the built environment that are distinguished by different colors or textures.

Study Area and Data Collection
Two different gardens were taken as examples to test and verify the proposed FDH method: the Gardens of Versailles in Paris, France, and the Qinghui Garden in Foshan, China. A garden is an exquisite combination of building and landscape, where natural and artificial elements complement each other to form an attractive sight. This makes the garden an excellent example for the study of visual complexity in the built environment.
The Gardens of Versailles, a famous royal residence, in front of the Chateau of Versailles with a core area of about 720,000 m², was chosen for analysis. Using the Google Street View Static API, visual images from 2324 spots covered by Google Street View were collected programmatically. In addition, four directions of visual perspective (front, back, left, and right along the path direction) were collected at each spot for omni-directional evaluation. A total of 2324 × 4 = 9296 visual images were collected. The distance between the spots is around 5-10 m. Based on the Google API requests, the collecting parameters were set at image size = 640 × 640 pixels (max), field of view = 90°, and camera pitch = 10°.
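The collection step described above can be sketched as follows. This is only an illustration of how the four view directions per spot might be requested, not the authors' script: the function name `street_view_urls`, the `path_bearing` argument, the offset table, and the sample coordinates and key are assumptions introduced here, while `size`, `location`, `heading`, `fov`, `pitch`, and `key` are parameters of the Street View Static API.

```python
from urllib.parse import urlencode

BASE_URL = "https://maps.googleapis.com/maps/api/streetview"

# Hypothetical mapping from view name to heading offset (degrees) relative to the path.
VIEW_OFFSETS = {"front": 0, "right": 90, "back": 180, "left": 270}


def street_view_urls(lat, lng, path_bearing, api_key):
    """Build one request URL per view direction for a single spot."""
    urls = {}
    for name, offset in VIEW_OFFSETS.items():
        params = {
            "size": "640x640",                       # image size stated in the text
            "location": f"{lat},{lng}",
            "heading": (path_bearing + offset) % 360,
            "fov": 90,                               # field of view stated in the text
            "pitch": 10,                             # camera pitch stated in the text
            "key": api_key,                          # placeholder, supply a real key
        }
        urls[name] = BASE_URL + "?" + urlencode(params)
    return urls


# Illustrative coordinates only; the study's spot list is not reproduced here.
urls = street_view_urls(48.8049, 2.1204, path_bearing=270, api_key="YOUR_KEY")
```

Each spot yields four URLs, so 2324 spots give the 9296 images reported in the text.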
The Qinghui Garden, a famous residence built in the Ming Dynasty in South China with an area of about 22,000 m², was also chosen. Since it is not covered by any map street view service, the visual images at 269 observation points were photographed by the authors, and 269 × 4 = 1076 visual images in total were collected. The distance between observation points is around 3-5 m, and the photograph parameters were set at image size = 640 × 640 pixels, field of view = 90°, and camera height = 1.6 m.

Image Preprocessing
An image usually needs to be preprocessed before its fractal dimension computation, since fractal dimension mostly focuses on the edge structures of the images. Hence, a proper binary version of the original image is necessary. Based on the previous workflows of 'building's CAD wireframe + fractal dimension' [38] and 'edge detection + fractal dimension' [36,40,41], 'CAD wireframe' and 'edge detection' are commonly considered as the image-preprocessing procedures. This study developed a workflow of 'Potts segmentation + Canny edge detection + fractal dimension' and regarded 'Potts segmentation + Canny edge detection' as the proposed image-preprocessing procedure.
Firstly, an image segmentation algorithm, Potts segmentation [42], was introduced to eliminate the redundant homogeneous textures from the visual image in the built environment. The human visual system has limited perceptual sensitivity. A global-to-local, coarse-to-fine framework is generally adopted in the visual dynamic process, especially within complex scenes, where the finer details of the context need a longer time to be recognized [43,44]. In addition, the human visual system tends to minimize the effects of redundant homogeneous textures and usually picks up the visual information in terms of a whole (e.g., building, forest) rather than the details (e.g., window lattice, leaves) [45]. Therefore, in general, human visual perception is essentially determined by the basic compositions in scenarios, where different compositions can be distinguished by visual features, such as color, texture, etc.
Inspired by this theory of eliminating visual redundancy, the current study introduced Potts segmentation as the image partitioning procedure to minimize the homogeneous textures of visual images in the built environment. The Potts segmentation plug-in of the Icy image processing software was employed; it groups image pixels that share homogeneous features, such as the same kind of color or texture (Figure 1a). The segmentation results depend on a scale parameter called the granularity coefficient γ (γ > 0) [46], which determines the size of the correlated neighboring pixels; a smaller γ value results in a finer segmentation.
Potts segmentation was originally inspired by a model from statistical theory, the Ising model, in which the state of each lattice site (i, j) (0 ≤ i ≤ m, 0 ≤ j ≤ n) in an m × n matrix is determined by the interaction with its neighbors. After iterative processing, the lattice sites with the same or similar states are grouped into regions. The size of a region is determined by the interaction strength: the stronger the interaction, the larger the regions tend to be [47]. This statistical theory was subsequently introduced into the field of image partitioning [48]. Potts segmentation for images is a coarse-graining process, which can also be related to the theory of superpixels. It transforms an image from pixel level to district level by grouping similar pixels and provides a perceptually meaningful visual abstraction in the manner of computer vision. As Figure 1a illustrates, different values of the scale parameter γ were set to obtain different levels of segmentation, and the Potts segmentation extracted the effective and meaningful parts from the original images of the built environment.
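To make the role of the scale parameter concrete, the following minimal sketch solves a one-dimensional analogue of the Potts problem by dynamic programming. It is not the Icy plug-in used in this study, which operates on two-dimensional color images. It assumes the commonly used Potts functional, γ × (number of jumps) + squared deviation from the data, and the function name `potts_1d` is introduced here for illustration only.

```python
def potts_1d(signal, gamma):
    """Piecewise-constant fit minimizing gamma * (#jumps) + sum((u - f)**2)."""
    n = len(signal)
    # Prefix sums give the squared error of any segment in constant time.
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, value in enumerate(signal):
        s1[i + 1] = s1[i] + value
        s2[i + 1] = s2[i] + value * value

    def segment_error(left, right):
        # Squared deviation of signal[left:right] from its own mean.
        total = s1[right] - s1[left]
        return (s2[right] - s2[left]) - total * total / (right - left)

    best = [0.0] * (n + 1)   # best[r]: optimal cost of fitting signal[:r]
    best[0] = -gamma         # so that the first segment pays no jump penalty
    start = [0] * (n + 1)    # start[r]: where the last segment of that fit begins
    for right in range(1, n + 1):
        candidates = [
            (best[left] + gamma + segment_error(left, right), left)
            for left in range(right)
        ]
        best[right], start[right] = min(candidates)

    # Walk back through the stored segment starts and fill in segment means.
    result = [0.0] * n
    right = n
    while right > 0:
        left = start[right]
        mean = (s1[right] - s1[left]) / (right - left)
        result[left:right] = [mean] * (right - left)
        right = left
    return result


noisy = [0.0, 0.1, -0.1, 0.0, 5.0, 5.1, 4.9, 5.0]
coarse = potts_1d(noisy, gamma=1.0)    # small texture is merged into two regions
fine = potts_1d(noisy, gamma=0.001)    # a smaller gamma keeps the fine detail
```

In this toy signal the small fluctuations play the role of homogeneous texture: with a moderate γ they are absorbed into two flat regions, while a very small γ preserves them, which mirrors the statement that a smaller γ gives a finer segmentation.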
Next, edge detection was applied to the Potts-segmented image to extract its boundary edges. Canny edge detection was chosen to implement this step, since its good performance in the architectural field was verified in our previous work [36]. It extracts clean, intact, one-pixel-wide edges without noisy clutter (Figure 1b).
To sum up, the image preprocessing includes two procedures of Potts segmentation and Canny edge detection, after which perceptually meaningful edge information in the built environment can be collected.
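The idea of turning a segmented image into a binary edge set can be illustrated with a much simpler stand-in for Canny detection: marking every pixel whose right or lower neighbor carries a different region label. This is not the Canny algorithm used in the study (it has no smoothing, gradient estimation, or hysteresis thresholds), and the function name `label_boundaries` and the nested-list label map are assumptions made for this sketch.

```python
def label_boundaries(labels):
    """Return the set of (row, col) pixels whose right or lower neighbor has a different label."""
    rows, cols = len(labels), len(labels[0])
    edges = set()
    for r in range(rows):
        for c in range(cols):
            right_differs = c + 1 < cols and labels[r][c] != labels[r][c + 1]
            below_differs = r + 1 < rows and labels[r][c] != labels[r + 1][c]
            if right_differs or below_differs:
                edges.add((r, c))
    return edges


# A 4 x 4 label map with two regions: the edge is the column where they meet.
segmented = [[0, 0, 1, 1] for _ in range(4)]
edge_pixels = label_boundaries(segmented)
```

On a label map of this kind the boundary pixels form thin outlines between regions, which is the type of binary input that the fractal dimension computation expects.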

Fractal Dimension Computation
After image preprocessing, this study employed the box-counting method to compute the fractal dimension of the visual images in the built environment.
The box-counting method is a widely used algorithm in fractal dimension computation, especially since it was introduced in the architectural field [37]. Its geometric nature of measurement makes it closely associated with that of building patterns. The box-counting method covers the computed pattern O with grid boxes and counts the number of nonempty boxes that contain any part of the pattern, then repeats this process while shrinking the box size. In each repeated step, the number of nonempty boxes N(ε) and the box size ε are recorded and plotted on a log-log diagram, where the slope of the line fitted to all data points is the fractal dimension of this specific pattern (Formula (1)). Therefore, the box-counting method essentially measures the growth rate of the pattern's complexity over the shrinking scales.
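The standard box-counting definition, which matches the description above, can be written as follows. It is given here as a reconstruction under that assumption; the symbol D for the fractal dimension and the intercept c of the fitted line are notation introduced for this block.

```latex
D = \lim_{\varepsilon \to 0} \frac{\log N(\varepsilon)}{\log (1/\varepsilon)},
\qquad
\log N(\varepsilon) \approx D \, \log (1/\varepsilon) + c .
```

The first expression is the limiting definition, and the second is the linear relation whose slope is estimated in practice from a finite set of box sizes.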
According to the fundamental concept of the box-counting method, we programmed it in Python to efficiently manage the fractal dimension computation of one or more images.
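A minimal sketch of such a computation is given below. It is not the authors' program: the function names, the representation of the binary edge image as a set of (row, column) pixel coordinates, and the choice of box sizes are all assumptions made for illustration.

```python
import math


def box_count(pixels, size):
    """Number of size-by-size grid boxes that contain at least one edge pixel."""
    return len({(r // size, c // size) for r, c in pixels})


def fractal_dimension(pixels, sizes):
    """Least-squares slope of log N(eps) against log(1 / eps)."""
    xs = [math.log(1.0 / s) for s in sizes]
    ys = [math.log(box_count(pixels, s)) for s in sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator


sizes = [2, 4, 8, 16, 32, 64]
line = {(0, c) for c in range(256)}                         # a straight edge
square = {(r, c) for r in range(256) for c in range(256)}   # a filled region
d_line = fractal_dimension(line, sizes)
d_square = fractal_dimension(square, sizes)
```

The two test patterns bracket the range discussed in this paper: a straight line gives a slope of 1 and a completely filled region gives a slope of 2, so edge images of real scenes fall in between.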
In conclusion, the FDH method proposed here, including 'Potts segmentation + Canny edge detection + fractal dimension', was aimed at finding a perceptually adaptive way to compute the fractal dimension and its referred visual complexity in the built environment. As a tool of complexity measurement, fractal dimension has been applied in many fields such as medicine [49], geography [50], biology [51], among others. In the field of built environment, the key point is to embrace the theory of fractal dimension adaptively. The proposed FDH method extracts effective edges from the visual images of the built environment based on human visual characteristics. Hence, its fractal dimension computation indicates the actual complex level of human visual heterogeneity in the built environment.

Comprehensive Evaluation System of Visual Complexity
However, one may argue that too much visual information is erased from the built environment in the proposed FDH method, such that spatial features and identities are lost. Therefore, this study suggests evaluating the visual complexity of the built environment by a comprehensive evaluation system, which includes two kinds of fractal dimension methods. One is the proposed FDH method, which reduces the detail information and computes the heterogeneous composition complexity in visual images. The other is the previous method, which retains all the detail information and computes the texture complexity in visual images; it is referred to as 'fractal dimension of texture' (FDT) in this study. An 'FDT-FDH' system is therefore established to jointly evaluate the comprehensive visual complexity in terms of detailed texture and heterogeneous composition in the built environment.
In the 'FDT-FDH' evaluation system, the FDT indicates the texture complexity of the scenario, while the FDH can be seen as the composition complexity of the scenario. This study first made a rough classification of different scenarios, which are low-texture & low-composition, high-texture & low-composition, high-texture & high-composition, and low-texture & high-composition (Figure 2). In previous FDT-based studies, the fractal dimension of different scenarios in the built environment often falls into the interval of 1.0-2.0. The intervals 1.0-1.3 and 1.3-1.5 are defined as the low-to-mid level, where the scenarios can be seen as 'well-recognized' and 'better for goal-directed navigation' [34,52]. Meanwhile, the interval beyond 1.5 is defined as the mid-to-high level, where the scenarios can be seen as 'arousing' and 'exciting' [29]. Therefore, this study set the fractal dimension value 1.5 as the division point to distinguish the 'high-texture' and 'low-texture' scenarios in FDT (y-axis in Figure 2).
On the other hand, according to the computation in the proposed FDH method, the fractal dimensions of most scenarios are between 1.0-1.6 due to the reduction of details in Potts segmentation. The division point of fractal dimension in FDH needs to be set to distinguish the 'low-composition' and 'high-composition' scenarios (x-axis in the Figure 2).

The Performance of the FDH Method
There were four images chosen from two gardens to demonstrate the performance of the proposed FDH method.
As Figure 3 illustrates, Figure 3(a1,a2) are from the Qinghui Garden, while Figure 3(a3,a4) are from the Gardens of Versailles. Figure 3b shows the FDT method, in which Canny edge detection (with the two thresholds set at threshold_1 = 30 and threshold_2 = 90 (0 ≤ threshold ≤ 255)) is performed directly on the original images. It shows clustered lines generated by natural vegetation or textured materials, which cause a higher fractal dimension and a pseudo complexity level of human visual variety. The fractal dimension of these four scenarios sorted from high to low is Figure 3

On the other hand, through the Potts segmentation procedure in the proposed FDH method (Figure 3c), the study set the scale parameter to γ = 0.8 (γ > 0) to eliminate most of the visual details and retain only the basic shapes distinguished by different colors or textures in the image. Hence, their Canny-edge versions depict the basic outlines of the compositions in the images (Figure 3d). The ranking of their fractal dimension and corresponding visual complexity became Figure 3(d1) > Figure 3(d3) > Figure 3(d2) > Figure 3(d4), which is more compatible with human visual perception.

The Comparison between the FDT Method and the FDH Method
Since people receive continuous visual information while moving in space, just a few visual images are not enough to represent its visual complexity characteristics, and certainly not enough to analyze design philosophy and human behaviors accordingly. In fact, some scholars have realized that multi-perspective evaluation of visual complexity can be more comprehensive and realistic than a single one [53,54]. Therefore, the fractal dimension distributions of FDT and FDH were mapped respectively in each garden, to further compare the performance of these two methods (Figure 4).
Visual complexity is an essential factor in creating energetic spatial scenarios or dynamic sightseeing experiences. Changes in visual complexity imply changes in the physical structures of the built environment. From the color-coded distribution shown in Figure 4, it is clear that the FDH distribution presents more hierarchical structures than the FDT distribution, and better reflects the design philosophies of these two gardens as well. By contrast, because a garden is a typical place full of textured elements, such as vegetation, decoration, and pavement patterns, its FDT computation is easily disturbed. Therefore, the FDTs of these two gardens do not reflect their corresponding design philosophies.
The Gardens of Versailles and the Qinghui Garden represent the typical design protocol of western and eastern gardens. The FDH distribution in the Gardens of Versailles shows a roughly symmetric situation, which is consistent with the garden layout. Dominated by the main axis, the FDH distribution reflects the characteristics of balance, order, and strictness in the Euclidean-geometric layout of the garden. Moreover, the areas with different landscape design themes are also distinguished clearly by different structures of the FDH distribution. On the contrary, the FDH distribution in the Qinghui Garden shows a non-structural layout, which corresponds to the eclectic and flexible arrangement of traditional Chinese gardens. Just as Chambers once said after his visit to some Chinese gardens: 'Nature is their (Chinese garden) pattern, and their aim is to imitate her in all her beautiful irregularities' [55].
In addition, people's sightseeing experiences consist of a group of visual perceptions along the viewing path, and most of an individual's interaction with the built environment happens during movement within space. Based on the FDH distribution, the Gardens of Versailles produces a relatively steady visual experience, with a nearly constant fractal dimension along the viewing path. This is probably because the concept of this garden was to belittle the human in order to manifest its overwhelming gesture, so its geometric straight paths are more appreciated from above than by walking through [56]. Compared to the Gardens of Versailles, the Qinghui Garden is much smaller in area, and its route is much shorter. Even so, its FDH distribution fluctuates randomly, which means that with the ever-changing visual complexity, one can hardly feel bored in this compact but full-of-surprise place, since an eventful experience makes the route perceptually shorter [35].

Scenario Classification in the FDT-FDH Evaluation System
In the section '2.4. Comprehensive Evaluation System of Visual Complexity', four types of scenarios were roughly classified based on the FDT-FDH evaluation system. The fractal dimension of 1.5 was set as the division point between 'low texture' and 'high texture' in the FDT method. According to the FDH results of the two gardens, different fractal dimension values were tried as the division point between 'low-composition' and 'high-composition' scenarios in FDH, and a fractal dimension of 1.3 performed best in the scenario classification. The final FDT-FDH classification of scenarios is listed below.

1.
Classification of 'Simple' scenario: low texture & low composition. Fractal dimension thresholds: 1.0 < FDT ≤ 1.5 & 1.0 < FDH ≤ 1.3. This type of scenario usually contains little texture because of the glaze surface or wide view. In addition, the composition is very simple so that the scenario does not have too much to see, making it easily to recognize ( Figure 5).

2.
Classification of 'Textured but Simple' scenario: high texture & low composition. Fractal dimension thresholds: 1.5 < FDT ≤ 2.0 and 1.0 < FDH ≤ 1.3. This type of scenario has a lot of homogeneous details, but its environmental composition is relatively simple. Mass homogeneous textures cause a higher range of fractal dimension in the FDT method, while the proposed FDH method can significantly reduce the influence of homogeneous texture on the perceptually evaluation the visual complexity, since the redundant textures have been largely diminished ( Figure 6).

3. Classification of 'Diverse' scenario: high texture & high composition. Fractal dimension thresholds: 1.5 < FDT ≤ 2.0 and 1.3 < FDH ≤ 2.0. This type of scenario has a high degree of both texture and composition, which means that it contains diverse visual elements with textured details (Figure 7).

4. Classification of 'Invalid': low texture & high composition. Fractal dimension thresholds: FDT < FDH. This type of scenario usually does not exist, because FDH eliminates many textured details in visual images, so generally FDT ≥ FDH.
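The thresholds listed above can be written as a small lookup. The following is only a minimal sketch and not the authors' implementation: the function name `classify_scenario` and the constant names are hypothetical, and it assumes that the 'Invalid' class is assigned to the whole low-texture & high-composition quadrant, whereas the text states its threshold as FDT < FDH.

```python
# Division points reported in the text; the names are illustrative only.
FDT_SPLIT = 1.5  # boundary between 'low texture' and 'high texture'
FDH_SPLIT = 1.3  # boundary between 'low composition' and 'high composition'


def classify_scenario(fdt: float, fdh: float) -> str:
    """Map an (FDT, FDH) pair to one of the four scenario classes.

    Assumption: 'invalid' covers the low-texture & high-composition
    quadrant; the text gives its threshold as FDT < FDH.
    """
    for name, value in (("FDT", fdt), ("FDH", fdh)):
        # The listed intervals all lie within (1.0, 2.0].
        if not 1.0 < value <= 2.0:
            raise ValueError(f"{name}={value} is outside the interval (1.0, 2.0]")

    high_texture = fdt > FDT_SPLIT
    high_composition = fdh > FDH_SPLIT

    if high_texture and high_composition:
        return "diverse"
    if high_texture:
        return "textured but simple"
    if high_composition:
        return "invalid"
    return "simple"
```

Values that fall exactly on a division point are assigned to the lower class, following the '≤' in the listed intervals.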
Based on the above classification and the corresponding sample images, the FDT-FDH evaluation system delivers clear classification results. The classification diagram shown in Figure 2 can be further refined as shown in Figure 8, in which the FDT-FDH values of all the observation points in the two gardens are also plotted. It can be noted that most points of the Gardens of Versailles fell into the 'textured but simple' interval, while most of those of the Qinghui Garden fell into the 'diverse' interval.
As can be seen from Figures 5-8, the 'textured but simple' scenarios take up almost the entire Gardens of Versailles, which is mainly due to plants with regular Euclidean geometric shapes but dense textures. The 'simple' scenarios immediately surround the Chateau of Versailles, highlighting the dominance of the main building through expansive walkways and short plants that lead the sight into the distance. Meanwhile, the 'diverse' scenarios are mostly distributed in areas with different design themes (i.e., parterres or woods), where the combination of buildings, plants, fences, and other elements provides richness for human vision. In the Qinghui Garden, on the other hand, the 'textured but simple' and 'diverse' scenarios are evenly distributed, while 'simple' scenarios are absent. This balanced distribution reflects how the characteristics of different scenarios merge to express the harmony between nature and artifacts and to create a hierarchical visual experience with movement. Meanwhile, most of the scenarios in the garden consist of multiple elements such as buildings, pavilions, ponds, rocks, and plants, which pursue a natural layout and shape to express great diversity to human vision.
In this classification, different types of scenarios in the built environment were preliminarily classified into three categories: 'simple', 'textured but simple', and 'diverse'. The difference between 'simple' and 'diverse' scenarios is usually clear, but the proposed FDH method corrects the pseudo-evaluation of high visual complexity (high fractal dimension) due to the visual redundancy, and adds 'textured but simple' in the classification system. With the expansion of the sample scale and the optimization of the evaluation method, a more precise classification of the fractal dimension of the built environment is expected.

Discussion
The built environment is a very complex combination, and it is quite a challenge to compute its fractal dimension. Perry et al. [57] and Patuano et al. [39] raised the question of whether fractal dimension can distinguish different types of environmental scenarios and, if the answer is positive, why two entirely different scenes (e.g., totally natural and urban) may share very similar fractal dimension values. These thought-provoking questions are related to the development and application of fractal dimension in the research field of the built environment.
For fractal dimension to perform better on real scenarios in the built environment, it is important to extract perceptually effective and meaningful edges from visual images according to the research needs, since the edge information in visual images is the focus of fractal dimension computation.
In the built environment, this study holds that the visual complexity of geometric structures is not all perceived at the same level, in view of the coarse-to-fine framework of visual processing. Hence, for the assessment of visual complexity, it would be arbitrary to take all the geometric structures in one's field of vision into account without any priority. Many visually redundant details are involved in our daily built environment. Vegetation, for instance, usually contains dense and fine texture but generally does not need much brainpower or visual attention to process. However, it generates chaotic lines and dots after edge detection, which can greatly disturb the result of the fractal dimension computation. Dupont et al. [58] demonstrated that people show weaker visual exploration, quicker glances, and less interest when facing homogeneous landscape scenarios. Because the intensive textures generated by edge detection raise the computed fractal dimension, the corresponding visual complexity evaluation of a scenario with vegetation (trees, bushes, grassland, etc.) or any other textured element is falsely higher than its actual level. This might lead to the consequence that an empty grassland has a spuriously similar fractal dimension to crowded urban buildings [39].
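The effect described above can be illustrated with a small numerical sketch. It assumes a box-counting estimator, which is a common way to compute fractal dimension on binary edge maps; this section does not state which estimator the study used. The function name `box_counting_dimension` is hypothetical, and the two synthetic images (a single straight edge and a dense random speckle standing in for edge-detected vegetation) are illustrative inputs rather than data from the study.

```python
import numpy as np


def box_counting_dimension(edges) -> float:
    """Estimate the box-counting dimension of a binary edge map.

    Assumption: non-zero pixels are edge pixels. The image is cropped to
    the largest power-of-two square so that boxes tile it exactly.
    """
    img = np.asarray(edges) > 0
    k = min(img.shape).bit_length() - 1  # floor(log2(shorter side))
    size = 2 ** k
    if size < 4:
        raise ValueError("edge map must be at least 4 x 4 pixels")
    img = img[:size, :size]
    if not img.any():
        raise ValueError("edge map contains no edge pixels")

    sizes = 2 ** np.arange(k)  # box side lengths 1, 2, 4, ..., size / 2
    counts = []
    for s in sizes:
        s = int(s)
        # Split the image into s x s boxes and count the occupied ones.
        boxes = img.reshape(size // s, s, size // s, s)
        counts.append(int(boxes.any(axis=(1, 3)).sum()))

    # The slope of log N(s) against log(1 / s) is the dimension estimate.
    slope, _ = np.polyfit(np.log(1.0 / sizes), np.log(counts), 1)
    return float(slope)


# A single straight edge, e.g. the outline of a wall or a path.
line = np.zeros((256, 256), dtype=bool)
line[128, :] = True

# Dense speckle, a stand-in for edge-detected homogeneous vegetation.
rng = np.random.default_rng(0)
texture = rng.random((256, 256)) < 0.2

fd_line = box_counting_dimension(line)
fd_texture = box_counting_dimension(texture)
print(f"straight edge: {fd_line:.2f}, dense texture: {fd_texture:.2f}")
```

Under these assumptions the speckled image receives a much higher estimate than the single edge, although it contains no compositional structure. This is the kind of inflated value that the FDH preprocessing is intended to suppress.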
In fact, this study takes the view that a natural landscape is not necessarily more complex than an urban view, and vice versa; visual complexity depends on the human perception of effective and meaningful structures in spatial scenes. The proposed FDH method extracts the edges of perceptible heterogeneous compositions, which are distinguished by different colors and textures. Compared with the FDT method, which retains all detailed information and computes the texture complexity of the scenario, the proposed FDH method represents the true level of heterogeneity of the environmental compositions by eliminating chaotic or duplicate visual details.
However, this does not mean that the proposed FDH method is simply 'better' than the FDT method. In fact, FDT and FDH complement each other and together provide a more accurate and comprehensive evaluation than either one alone. An 'FDT-FDH' system was therefore established, in which the FDT performs 'better' in evaluating the visual complexity of detailed texture, while the FDH performs 'better' in evaluating the visual complexity of heterogeneous composition.
Four types of scenarios are classified according to the 'FDT-FDH' system. It is a delicate issue to classify and define different types of scenarios according to specific ranges of fractal dimension. This study therefore made a preliminary classification into three types: 'simple' (low-texture & low-composition), 'textured but simple' (high-texture & low-composition), and 'diverse' (high-texture & high-composition). The fourth type, 'invalid' (low-texture & high-composition), hardly ever occurs because usually FDT ≥ FDH, as Potts segmentation in FDH diminishes details. Although a strong correlation between fractal dimension and human perception has been demonstrated [33,59-61], the classification of fractal dimension according to human perception varies with the computation samples and computation methods.
Previous FDT-based research computed the fractal dimension of all the detailed textures in real or simulated scenarios of the built environment. Cooper et al. [40] found a positive relationship between fractal dimension and judgments of visual quality: the city street with a mean fractal dimension of 1.718 received the highest score in the visual quality survey, while another street with a mean fractal dimension of 1.455 received the lowest score. Abboushi et al. [29] suggested that natural shadow patterns with mid-to-high complexity (fractal dimension of 1.5-1.7) arouse more visual interest. However, Juliani et al. [34] illustrated that scenarios with low-to-mid fractal dimension (1.0-1.5) are better for individuals' navigation activities. Based on the above FDT-based research, it can be concluded that scenarios with mid-to-high visual complexity of detailed textures (fractal dimension of 1.5-2.0) provide more visual excitation, while scenarios with low-to-mid visual complexity of detailed textures (fractal dimension of 1.0-1.5) are clearer for goal-searching. This is consistent with this study's choice of a fractal dimension of 1.5 as the division point between 'high-texture' and 'low-texture'.
Previous FDH-related research has considered the influence of redundant textures on fractal dimension computation in the built environment. Patuano [41] pointed out that it is critical to address image preprocessing properly, because different levels of edge extraction have a significant effect on the fractal dimension of landscape images, and only the silhouette outline of the landscape can discriminate between two different scenario types. Hagerhall et al. [60] investigated the correlation between the fractal dimension of the silhouettes of 80 landscapes and human preference; the results showed that human preference peaks at a fractal dimension of 1.3. The extraction of the landscape silhouette is similar to the proposed FDH method to some extent: both extract the perceptually meaningful edges in landscape images, except that the FDH extracts a larger number of effective edges than just the silhouette of the scenario and thus captures more information relevant to visual perception. Therefore, one can assume that the fractal dimension of highest human preference is greater than 1.3 in the FDH method, but this should be examined thoroughly in future research, in which questionnaire surveys and physiological monitoring (e.g., eye tracking, electroencephalography, skin conductance response, etc.) can be conducted to obtain human perception data.
Furthermore, recognizing and distinguishing complexity is one of the fundamental logics of visual processing in our daily lives, but this does not mean that people always search for complex scenes. People share similar experiences: a crowd on the street inevitably grabs one's attention and entices one to find out what is happening; after walking through a dense forest, a tranquil lake coming into view spontaneously makes one feel calm and peaceful. Therefore, visual complexity is more like a trade-off searching mechanism in the sense of Gestalt figure-ground theory, which indicates that complexity and simplicity are related to and dependent on each other [62]. However, complexity stands out from simplicity more easily than the other way around in visual perception; it is a classic search asymmetry that complexity has priority in the visual system [22]. The proposed FDT-FDH system brings a new perspective to the assessment of visual complexity in the built environment and provides a reference for architects and city planners seeking to create diversity and interest in design. More importantly, with the collaboration of computer technology and big data, it lays a foundation for future large-scale scenario recognition in terms of visual complexity. This study made a preliminary attempt to extend the scale of fractal dimension computation from a few visual images to an entire garden site in order to examine the distribution characteristics of its visual complexity. Visual complexity assessment at larger spatial scales, such as a district or a city, is expected in future research.

Conclusions
In summary, this research highlights the promising potential of fractal dimension as a quantitative tool in the assessment of the built environment. The proposed FDH method in this study evaluated the visual complexity of the built environment in terms of heterogeneous compositions, which is in consonance with the characteristics of human visual perception. Furthermore, the combination of the FDT method and the FDH method establishes a system to evaluate visual complexity more comprehensively.
However, some limitations are noted. Firstly, a preliminary classification of different scenarios has been made according to previous studies and the results of this study; the correlation between the classification and human preference needs to be further explored through more experiments [63] and larger sample sizes [64]. In addition, since the fractal dimension distribution has been presented in this study, the complexity of the changes in visual complexity along the path and its relationship with human dynamic perception can be further investigated. Secondly, the edge structure in the built environment is the primary focus of this study, but color is also a vital factor that affects human visual perception. Some studies have explored fractal dimension algorithms for the quantification of color information [65,66], and their combination with the built environment can be considered as a direction for further research. Therefore, a 'texture-composition-color' evaluation of visual complexity is also expected in future research. Thirdly, visual complexity based on fractal dimension is only one aspect of built environment assessment, and it needs to be connected to other advanced technologies to further evaluate the quality of the built environment.

Data Availability Statement:
The data that support the findings of this study are available from the corresponding author upon reasonable request.