Article

An Environmental Pattern Recognition Method for Traditional Chinese Settlements Using Deep Learning

1
School of Information and Control Engineering, Xi’an University of Architecture and Technology, Xi’an 710055, China
2
College of Architecture, Xi’an University of Architecture and Technology, Xi’an 710055, China
*
Author to whom correspondence should be addressed.
Appl. Sci. 2023, 13(8), 4778; https://doi.org/10.3390/app13084778
Submission received: 20 February 2023 / Revised: 3 April 2023 / Accepted: 7 April 2023 / Published: 11 April 2023
(This article belongs to the Section Computing and Artificial Intelligence)

Abstract

The recognition of environmental patterns for traditional Chinese settlements (TCSs) is a crucial task for rural planning. Traditionally, this task primarily relies on manual operations, which are inefficient and time consuming. In this paper, we study the use of deep learning techniques to achieve automatic recognition of environmental patterns in TCSs based on environmental features learned from remote sensing images and digital elevation models. Specifically, due to the lack of available datasets, a new TCS dataset was created featuring five representative environmental patterns. We also use several representative CNNs to benchmark the new dataset, finding that overfitting and geographical discrepancies largely contribute to low classification performance. Consequently, we employ a semantic segmentation model to extract the dominant elements of the input data, utilizing a metric-based meta-learning method to enable the few-shot recognition of TCS samples in new areas by comparing their similarities. Extensive experiments on the newly created dataset validate the effectiveness of our proposed method, indicating a significant improvement in the generalization ability and performance of the baselines. In sum, the proposed method can automatically recognize TCS samples in new areas, providing a powerful and reliable tool for environmental pattern research in TCSs.

1. Introduction

Traditional Chinese settlements (TCSs) are the products of agricultural societies [1,2]. Because productivity was limited, the advantages and disadvantages of natural conditions played a decisive role in agricultural production and human life [3,4]. Making rational use of natural resources and topographic advantages while avoiding hazards was the primary criterion for settlement site selection. The ancient Chinese were therefore adept at choosing a place of residence based on geography and natural resources, such as mountain shapes, riverbanks, forests, and pastures, to create relatively suitable production and living spaces [5,6]. These patterns with regional environmental characteristics are regarded as settlement environmental patterns, which combine architecture and environment to realize harmonious coexistence between humans and nature. Understanding environmental patterns helps modern urban and rural planners discover the traditional habitat wisdom embedded in TCSs, thereby guiding new village planning. However, most TCSs are located in remote areas with limited access, making manual surveys difficult. It is therefore valuable to study methods that automatically recognize environmental patterns from the environmental data of TCSs.
Traditionally, research on environmental patterns primarily relies on phenomenal summaries and inductive descriptions, focusing on the internal and external conditions, such as geographical distribution, topographic conditions, and material culture, to classify TCSs in a qualitative manner [7,8,9]. To improve quantification, researchers have conducted quantitative comparative studies of TCSs using statistical learning methods [10,11,12,13]. In recent years, statistical machine learning has achieved great success in many fields, such as agriculture [14], forestry [15], and climatology [16] by combining the experience, prior knowledge, and conceptual understanding of human experts, which inspires experts in traditional settlement studies to explore digital transformation. Consequently, scholars attempt to use the analytic hierarchy process to quantify the dominant elements of TCSs and then apply machine learning methods, such as SVM [17], clustering [17,18], and random forest [19], to scientifically compare and classify them. Although these methods provide some ideas for environmental pattern recognition in TCSs, manually constructed features not only have difficulty describing complex human-land relationships, but also rely on features selected by the algorithm designer and environmental pattern recognition rules that depend on user-configured parameters. Setting these parameter values to obtain accurate recognition is difficult, especially when multiple parameters are involved.
In recent years, high-resolution remote sensing images have enabled the rapid acquisition of spatial structures, morphological characteristics, and environmental information related to settlements [20,21,22], which makes remote sensing image scene classification a new way to automatically recognize TCSs. Among these methods, deep neural networks can automatically learn global features from the input data and treat scene classification as an end-to-end problem [23,24,25]. The features learned by deep learning methods are often better and simpler than those designed manually. However, deep learning also has an obvious drawback: high classification performance presupposes sufficient labeled training samples, which are often difficult to obtain. In most cases, only a small number of TCS samples are available, and the classic classification networks [26,27,28,29] perform poorly when trained on few samples. In addition, TCSs form spontaneously through long-term interaction between humans and nature and have distinctive regional characteristics, which makes it difficult for trained models to recognize the patterns of TCSs in unseen areas.
To address these limitations, we investigate the application of deep learning techniques to the environmental pattern recognition task of TCSs, following the workflow shown in Figure 1. First, we download remote sensing images and digital elevation model (DEM) data of TCSs from three different topographic areas in China. A DEM is a gridded digital representation of the Earth’s surface morphology in which each cell stores the mean elevation of the sampled area; it can effectively supplement the elevation information that is missing from remote sensing images. After preprocessing, a new TCS dataset organized by region was built, which includes 648 TCSs with five environmental patterns, such as river valley, foothill, and riverine. Second, three representative convolutional neural networks (CNNs), AlexNet [26], ResNet [27], and DenseNet [28], are used to benchmark the new dataset. The three CNNs achieved near-perfect performance on the training set but performed poorly on the test set. Because the training and test samples come from different areas, the CNNs overfit severely under the combined conditions of sparse samples and regional differences. To solve this problem, we propose a new deep learning method that introduces pre-segmentation and metric-based meta-learning techniques to CNNs. Specifically, a semantic segmentation model is used to segment the input data of remote sensing images and DEM data into settlement environment maps composed of seven elements, including mountains, water, forests, and farmland.
Subsequently, the environmental pattern recognition of TCSs in unknown areas is treated as a few-shot classification problem [29,30]: areas with a large number of samples are used as the base dataset to train the model, and areas containing only a small number of samples are used as the novel dataset, enabling few-shot recognition based on the similarity between support and query samples. Finally, we perform model training, evaluation, and comparative experiments to demonstrate the effectiveness of the proposed method. This work provides a new way to research and recognize the environmental patterns of TCSs. The primary contributions are as follows:
  • The environmental pattern recognition of TCSs is formalized as an image processing task, addressed by a deep learning model trained with remote sensing images and DEM data. More specifically, these two types of data are combined into four-channel inputs to extract environmental features and perform automatic recognition using CNNs.
  • A semantic segmentation model is used to segment the input data into settlement environment maps consisting of dominant elements, which helps to fuse expert prior knowledge to reduce the influence of noise in remote sensing images and improve interpretability.
  • A metric-based meta-learning method incorporating pre-segmentation strategies is proposed to achieve the few-shot recognition of environmental patterns under conditions of sample scarcity and geographical differences in TCSs.
The rest of this paper is organized as follows. In Section 2, we review the related work on TCS classification research and introduce our idea. Section 3 describes the construction of the new dataset. Section 4 introduces the proposed method. Section 5 contains experiments and analysis, and the last section concludes.

2. Related Work

In this section, we review work related to our research from two perspectives: classification studies of TCSs and few-shot learning.

2.1. Classification Study of TCSs

Traditionally, classification research on TCSs is carried out manually [7,8,9], which is often limited in scope and time-consuming. To overcome this problem, researchers have attempted to conduct comparative studies of TCSs using statistical learning methods. For example, refs. [10,11] applied space syntax to classify settlements by topological features, such as spatial depth, connectivity, and integration. Additionally, refs. [12,13] focused on an abstracted plane to quantify the shape, structure, and order of the settlement plane in order to establish a two-dimensional morphological analysis model for classification purposes. With the rise in digital research on TCSs, machine learning techniques have also been applied. For instance, Zheng et al. [17] used Support Vector Machines (SVM) to distinguish between new-style and old-style rural settlements based on a landscape analysis approach. Jia et al. [18] employed an Analytic Hierarchy Process (AHP) to construct a three-dimensional morphological quantification index for mountain settlements and then applied clustering methods to recognize the spatial morphology of Miao settlements in Qiandongnan. Subsequently, Wu et al. [19] studied the environmental factors of 4000 TCSs, constructing a quantitative index system of settlement environmental factors covering topography, meteorology, and natural resources, and then applied clustering algorithms to recognize the geospatial patterns of these settlements.
However, the classification of TCS environmental patterns is still an unresolved issue due to the lack of clear rules: researchers usually limit their research to a certain area and formulate rules according to the characteristics of that area. These rules force researchers to manually extract a large number of features based on the study perspective, which is a tedious process that lacks generalizability. Although deep learning methods have powerful feature extraction and characterization capabilities that can replace manual feature construction to some extent, they rely on large amounts of labeled data, which are difficult to obtain given the limited number of TCSs. In addition, the lack of interpretability and the security risks [31] of deep learning are criticized by researchers in the field of urban and rural planning, which restricts the adoption of deep learning techniques in environmental pattern research on TCSs. To address these issues, we investigated environmental factors and found that elements of the natural environment such as mountains, rivers, farmlands, forests, and vegetation regions play dominant roles in environmental patterns [17,18,19,32,33,34]. Therefore, our goal is to segment the remote sensing images of TCSs into settlement environment maps composed of dominant elements, thus incorporating expert knowledge from the field of urban and rural planning. On this basis, the similarity between TCSs is used to achieve automatic recognition, enabling the use of explainable artificial intelligence (XAI) [35] methods to drive digital transformation in the field of environmental pattern research.

2.2. Few-Shot Learning

Deep learning methods have achieved remarkable success in many computer vision tasks [23,24,25] due to their powerful feature extraction and characterization capabilities. However, deep neural networks are prone to overfitting when the number of samples is insufficient. Although transfer learning [36] and data augmentation [37] methods can mitigate overfitting, they do not eliminate it.
Meta-learning, also called learning to learn, extracts transferable meta-knowledge from historical tasks to avoid overfitting and improve generalizability. Inspired by metric learning [38], most existing meta-learning image classification methods use the similarity of images in the feature space for classification. The idea is to learn a feature encoder that transforms an input image into a deep representation suitable for comparison. Once samples are encoded in the feature space, the similarity between support and query samples can be computed with a measurement function such as cosine similarity or Euclidean distance, and each query sample is assigned to the class of the most similar support samples. Matching Networks [39] are the first metric-based meta-learning method; they map the support set to a function via an attention model and then classify the query samples with a weighted nearest-neighbor classifier in an embedding space. Similarly, Prototypical Networks [40] create a prototype representation for each category and classify a query sample by computing the distance between it and each prototype vector. Relation Networks [41] propose a parametric relation model to replace the ordinary distance metric, using neural networks to compute the degree of match between feature vectors. Inspired by Prototypical Networks [40], TADAM [42] extends the concept of metric space by integrating metric scaling, task conditioning, and auxiliary task co-training. MetaOptNet [43] uses discriminative linear classifiers such as support vector machines instead of nearest-neighbor classifiers (e.g., based on cosine similarity or Euclidean distance), which allows it to use high-dimensional embeddings with improved generalization.
In addition, several approaches based on pre-training have achieved competitive performance [44,45,46,47]; among them, Meta-Baseline [47] provides better generalization by pre-training a classifier on all base classes and then meta-learning with a nearest-centroid few-shot classification algorithm. Meanwhile, Zhang et al. [48] proposed a graph-based few-shot learning method that converts the features extracted by a pre-trained self-supervised feature extractor into a Gaussian-like distribution to reduce feature distribution mismatch. Recently, Meta DeepBDC [49] achieved state-of-the-art performance by measuring the joint distribution between sample pairs, thus obtaining more accurate similarity.
While meta-learning approaches have seen great success in few-shot classification, we think that fusing expert prior information and learning good embedding features is more effective than complex meta-learning algorithms in some cross-disciplinary studies. Therefore, our work prefers to improve the feature representation.

3. Dataset Construction

In this section, we construct a labeled TCS environmental pattern dataset for the training and validation of the deep learning models. First, remote sensing images and DEM data of TCSs are collected from three areas with different topographic conditions in China. Next, the collected data are preprocessed to generate usable training and testing data.

3.1. Data Collection

TCSs are formed spontaneously over the course of long-term interactions between humans and nature, and they demonstrate distinct geographical characteristics [50]. According to the Ministry of Housing and Urban-Rural Development of China, there are currently 6821 recorded TCSs throughout the nation. Information on them is available from the Traditional Chinese Settlements Digital Museum (http://www.dmctv.cn/directories.aspx) (accessed on 20 February 2023). For the automatic recognition of environmental patterns using deep learning techniques, data were collected from three areas with a variety of landforms. The first area is the Qiandongnan Miao and Dong Autonomous Prefecture (Qiandongnan, Figure 2b, 107°17′ E–109°35′ E, 25°19′ N–27°31′ N) on the Yunnan-Kweichow Plateau, with an area of 30,282 km². This region is home to many well-preserved TCSs. The second area is Shaanxi Province (Figure 2c, 105°29′ E–111°15′ E, 31°42′ N–39°35′ N), situated in the Chinese hinterland. It encompasses an area of 205,624 km² and is renowned for its diverse range of landforms, including loess tablelands, mountains, plains, and basins. The third area is Anhui Province (Figure 2d, 114°54′ E–119°37′ E, 29°41′ N–34°38′ N), located in the Yangtze River Delta region of East China, which covers an area of 140,100 km² and consists of various plains, hills, rivers, and lakes.
Next, we investigate five environmental patterns of TCSs in the above areas, including river valley, foothill, riverine, hillside, and plain. Among them, the settlements with river valley patterns usually occupy a valley, are surrounded by mountains, and are often located near rivers. The settlements with foothill patterns tend to be located on flat slopes, along ridges, or at the base of hills. Riverine settlements are typically situated near rivers or dispersed along these waterways in a band. Hillside patterns of settlements often exist on the sides of mountains, characterized by a steep drop-off between peaks and deep ravines. The settlement planning boundary is broken and follows the terrain in multiple directions. Finally, plain settlements are usually situated on flat plains in patches, and they often display higher densities and structural forms.
To acquire the settlements and their environmental data, Pleiades satellite images from Airbus Defence and Space (with a spatial resolution of 0.5 m) are selected. The remote sensing images of TCSs with the five environmental patterns in the mentioned areas are cropped to 2560 × 2560 pixels, with the settlement located at the center. The corresponding DEM data, with a 10 m spatial resolution, are obtained for the same latitude and longitude. In total, 648 TCS samples are collected, each consisting of a remote sensing image paired with its DEM data, as shown in Table 1. The number of TCSs of each type varies greatly, and even settlements of the same type differ significantly across regions, as illustrated by the hillside pattern settlements in Figure 3.

3.2. Data Preprocessing

After data collection, a dataset of TCS environment patterns was constructed for training deep learning models. This dataset consists of two parts: one is the environmental pattern classification labels for training a classification model, and the other is the semantic segmentation labels for training a segmentation model.
For the semantic segmentation labels, research has identified natural environmental elements such as mountains, rivers, forests, vegetation, and farmland, in addition to their spatial relationships with settlements, as the main components of TCS environment patterns [51,52]. Mountains are especially important, having a significant influence on the characteristics of a settlement, such as production methods and factor safety [53,54,55]. Rivers and lakes provide essential resources such as drinking water and transportation [56,57], while forests offer materials for settlement development and maintain comfortable microclimates [58]. The distribution of agricultural land dictates the productive and social properties of a settlement [59,60], and vegetation enhances soil quality, water conservation, and disaster prevention [61]. All these factors combine to form a relationship pattern between the settlement and its environment [62].
Based on this research, seven categories that represent environmental features have been identified. These categories and their contents for semantic segmentation are outlined in Table 2. Remote sensing images of TCSs are then semantically annotated according to the definitions given in Table 2. In cases where one element conflicts with another, such as mountains and water, the element with higher influence is given precedence, e.g., water. Therefore, this is not a land cover classification, but a description of the human–land relationship in TCS using the settlement environment map. Five samples of TCSs with different environmental patterns and their corresponding settlement environment maps are shown in Figure 4.
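The precedence rule for conflicting elements can be illustrated with a minimal sketch. The function names and the full priority order below are hypothetical; the text only states that water takes precedence over mountains, and the actual annotation was done manually according to Table 2.

```python
def resolve_label(candidates, priority):
    """Return the candidate label that ranks first in `priority`."""
    rank = {name: i for i, name in enumerate(priority)}
    return min(candidates, key=lambda name: rank[name])


def merge_masks(masks, priority):
    """Merge per-element binary masks into one label map.

    masks: dict mapping label -> 2D list of 0/1 values (all the same shape).
    Returns a 2D list of labels, with None where no mask covers the pixel.
    """
    labels = list(masks)
    rows = len(masks[labels[0]])
    cols = len(masks[labels[0]][0])
    merged = []
    for r in range(rows):
        row = []
        for c in range(cols):
            present = [name for name in labels if masks[name][r][c]]
            row.append(resolve_label(present, priority) if present else None)
        merged.append(row)
    return merged


# Illustrative order: only "water before mountain" is stated in the text.
PRIORITY = ["water", "farmland", "forest", "mountain"]

masks = {
    "mountain": [[1, 1], [1, 0]],
    "water": [[0, 1], [0, 0]],
}
merged = merge_masks(masks, PRIORITY)
```

In this toy case the pixel covered by both masks is labeled as water, which mirrors the rule that the element with the higher influence wins.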
Finally, the dataset is split into training, validation, and testing sets. The training and validation sets come from Qiandongnan and are randomly split by a ratio of 4:1. The test sets come from the Shaanxi and Anhui provinces. The reason for this dataset partition approach is that TCSs are widely distributed and have regional differences. It is not practical to train a model for each region, and collecting all the TCSs’ data to build a massive dataset would require a significant amount of manpower. Therefore, we used two different regions to construct the test set to verify the model’s generalization ability and performance. The details of the dataset division are shown in Table 3.
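The random 4:1 split of the Qiandongnan samples can be sketched as follows. The sample identifiers, the count of 100, and the fixed seed are placeholders chosen for illustration; the real per-region counts are those listed in Table 3.

```python
import random


def split_train_val(sample_ids, ratio=(4, 1), seed=0):
    """Shuffle sample IDs and split them into training and validation parts."""
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)  # seeded for reproducibility (an assumption)
    n_train = len(ids) * ratio[0] // sum(ratio)
    return ids[:n_train], ids[n_train:]


# Placeholder IDs standing in for the Qiandongnan samples.
train_ids, val_ids = split_train_val(range(100))
```

The Shaanxi and Anhui samples are not passed through this function, since they are held out entirely as the two test sets.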

4. Proposed Methods

In this paper, we propose a metric-based meta-learning method for the few-shot recognition of environmental patterns in TCSs. The proposed framework consists of four stages, as illustrated in Figure 5. First, a semantic segmentation model is trained using a cross-entropy (CE) loss function to extract settlement environment maps consisting of environmental elements from TCS remote sensing images and DEM data. Second, a CNN is trained on all base categories, and its final fully connected (FC) layer is removed to obtain a feature extractor, denoted by $f_\theta$. Third, in the meta-training stage, a meta-classifier, $M(\cdot \mid S_{base})$, is trained on multiple episodes, each containing a support set, $S$, and a query set, $Q$. Within each episode, query features are compared with the class means of the support features using a scaled cosine similarity. During training, each minibatch contains several tasks, and the average loss is calculated. Finally, the classifier, $M(\cdot \mid S_{base})$, is evaluated on episodes drawn from the test set during the meta-testing stage. In the following sections, the segmentation model and meta-classification model are discussed in detail.

4.1. Pre-Segmentation Model

The purpose of the pre-segmentation model is to extract the dominant environmental elements of a TCS from remote sensing images and DEM data to obtain the settlement environment map. To this end, the DeepLab V3+ model [63] is modified to accept four-channel inputs, as presented in Figure 6. The kernel sizes, strides, convolution layers, pooling layers, deconvolution layers, and activation functions are the same as those proposed by Chen et al. [63], except that the input layer is adapted to accommodate four-channel data obtained by concatenating the remote sensing images and the DEM data.
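As a rough illustration of the four-channel input, the sketch below appends an elevation channel to an RGB image. The paper does not describe how the 10 m DEM is aligned with the 0.5 m imagery, so the nearest-neighbor upsampling, the min-max normalization, and all function names here are assumptions made for the example.

```python
def upsample_nearest(grid, factor):
    """Repeat each cell of a 2D list `factor` times along both axes."""
    out = []
    for row in grid:
        wide = [value for value in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def min_max_normalize(grid):
    """Scale all values of a 2D list into [0, 1]."""
    flat = [v for row in grid for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # avoid division by zero on flat terrain
    return [[(v - lo) / span for v in row] for row in grid]


def stack_rgb_dem(rgb, dem, factor):
    """Append a normalized elevation channel to every RGB pixel.

    rgb: H x W list of (r, g, b) tuples; dem: (H/factor) x (W/factor) elevations.
    Returns an H x W list of (r, g, b, elevation) tuples.
    """
    dem_up = min_max_normalize(upsample_nearest(dem, factor))
    if len(dem_up) != len(rgb) or len(dem_up[0]) != len(rgb[0]):
        raise ValueError("DEM and image grids do not align")
    return [
        [pixel + (elev,) for pixel, elev in zip(rgb_row, dem_row)]
        for rgb_row, dem_row in zip(rgb, dem_up)
    ]


# Toy 4 x 4 image with a 2 x 2 DEM (factor 2). With the resolutions in the
# text (0.5 m imagery, 10 m DEM) the factor would be 20.
rgb = [[(0.2, 0.4, 0.1)] * 4 for _ in range(4)]
dem = [[500.0, 600.0], [700.0, 900.0]]
x = stack_rgb_dem(rgb, dem, factor=2)
```

A real pipeline would more likely use a geospatial library for resampling and a tensor library for stacking; plain lists are used here only to keep the sketch self-contained.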

4.2. Meta-Classification Model

In few-shot settings, the environmental pattern recognition problem of TCSs can be considered a set of $N$-way $K$-shot $M$-query tasks, each containing $N$ classes with $K$ support samples and $M$ query samples in each class. These tasks are called episodes; the $N \times K$ labeled TCS samples form the support set, and the $N \times M$ TCS samples to be recognized form the query set. The goal is to classify the unlabeled query samples using only the $N \times K$ labeled samples for the $N$ environmental patterns. To solve this problem, a meta-classification model has been designed, containing three distinct stages.
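The episode construction can be sketched as follows. The function name, the toy sample names, and the choice of drawing M query samples per class are illustrative assumptions rather than the authors' implementation.

```python
import random


def sample_episode(samples_by_class, n_way, k_shot, m_query, rng):
    """Draw one N-way K-shot episode with M query samples per class.

    samples_by_class: dict mapping class label -> list of samples.
    Returns (support, query) as lists of (sample, label) pairs.
    """
    classes = rng.sample(sorted(samples_by_class), n_way)
    support, query = [], []
    for label in classes:
        # Draw without replacement so support and query never overlap.
        picked = rng.sample(samples_by_class[label], k_shot + m_query)
        support += [(s, label) for s in picked[:k_shot]]
        query += [(s, label) for s in picked[k_shot:]]
    return support, query


# Toy data: ten placeholder samples for each of the five patterns.
samples_by_class = {
    name: [f"{name}_{i}" for i in range(10)]
    for name in ["river_valley", "foothill", "riverine", "hillside", "plain"]
}
support, query = sample_episode(
    samples_by_class, n_way=3, k_shot=5, m_query=5, rng=random.Random(0)
)
```

With these settings each episode holds 15 support samples and 15 query samples drawn from three of the five patterns.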
The first stage trains the feature extractor: a CNN is trained on all base categories, and its final fully connected layer is then removed to obtain $f_\theta$. The second stage is the meta-learning stage, which is the main component of the model and aims to improve generalization. Here, many episodes are sampled from the base dataset, each with $K$ labeled samples randomly selected from each of the $N$ categories. This gives a total of $N \times K$ support samples per episode, and the parameters of the classifier $M(\cdot \mid S_{base})$ are shared over all episodes. Given a few-shot task with support set $S$, let $S_c$ denote the few-shot samples in class $c$ and compute a mean embedding $w_c$ as the centroid of class $c$ as follows:
$$w_c = \frac{1}{|S_c|} \sum_{x \in S_c} f_\theta(x)$$
For a query sample $x$ in the few-shot task, the probability that $x$ belongs to class $c$ is predicted based on the similarity between the embedding of $x$ and the centroid $w_c$ of class $c$. Here, cosine similarity is used as the metric, and a learnable scale parameter $\tau$ is added to adjust the original value range $[-1, 1]$ of the cosine similarity; $\tau$ is initialized to 10. With $\langle \cdot, \cdot \rangle$ denoting the cosine similarity of two vectors, the prediction can be formalized as follows:
$$p(y = c \mid x) = \frac{\exp\left(\tau \cdot \langle f_\theta(x), w_c \rangle\right)}{\sum_{c'} \exp\left(\tau \cdot \langle f_\theta(x), w_{c'} \rangle\right)}$$
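The centroid and the scaled-cosine softmax can be written out in a few lines. This is a minimal sketch in which short toy vectors stand in for the embeddings produced by the feature extractor, and τ is fixed at its initial value of 10 rather than learned; the function names are hypothetical.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors (the class centroid)."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def predict_proba(query_embedding, support_embeddings, tau=10.0):
    """Softmax over tau-scaled cosine similarities to each class centroid.

    support_embeddings: dict mapping class label -> list of embedding vectors.
    """
    centroids = {c: mean_vector(vs) for c, vs in support_embeddings.items()}
    logits = {c: tau * cosine(query_embedding, w) for c, w in centroids.items()}
    top = max(logits.values())  # subtract the max for numerical stability
    exps = {c: math.exp(z - top) for c, z in logits.items()}
    total = sum(exps.values())
    return {c: e / total for c, e in exps.items()}


# Toy 2-D embeddings standing in for the feature extractor's output.
support = {
    "foothill": [[1.0, 0.0], [0.8, 0.2]],
    "riverine": [[0.0, 1.0], [0.2, 0.8]],
}
probs = predict_proba([0.9, 0.1], support)
```

Because cosine similarity is bounded in [-1, 1], an unscaled softmax would yield nearly uniform probabilities; multiplying by τ widens the logit range so the distribution can become peaked.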
The last stage is the meta-testing stage, where the generalization performance of $M(\cdot \mid S_{base})$ is evaluated on $D_{novel}$, which is constructed from TCS samples from different areas. In the meta-testing stage, new episodes are randomly sampled from $D_{novel}$; each consists of a new support set $S_{novel}$ and a new query set $Q_{novel}$, and $S_{novel}$ is used to predict the labels of $Q_{novel}$.

5. Experiments and Results

In this section, three representative CNNs are first benchmarked on the new TCS dataset to survey their recognition accuracies. Next, the proposed method is applied to obtain new models, and the performance of those models is compared with the baseline to observe the improvement in generalizability.

5.1. Results of the Baselines

To test the effect of training deep learning models on the TCS environmental pattern dataset, three CNNs are used as benchmarks: AlexNet [26], ResNet50 [27], and DenseNet121 [28]. All models are trained on the training set for 200 epochs with a batch size of 32, using the Adam optimizer with an initial learning rate of 0.001 that is decayed by a factor of 0.1 at epochs 50 and 100. The input images are remote sensing images of TCSs, and each image is resized to 256 × 256 pixels. Standard data augmentation techniques are applied, including random resizing, cropping, and rotation. The reported accuracy is the mean per-class accuracy: for each category, the ratio of correctly recognized samples to all samples in that category is computed, and these ratios are averaged over the categories. The results are shown in Table 4, and the learning curves are shown in Figure 7.
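The accuracy measure described above can be made concrete with a short sketch; the function name and the toy labels are illustrative only.

```python
from collections import defaultdict


def mean_per_class_accuracy(y_true, y_pred):
    """Average, over classes, of the fraction of correctly recognized samples."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        correct[t] += int(t == p)
    return sum(correct[c] / total[c] for c in total) / len(total)


# Unbalanced toy example: 8 plain samples and 2 hillside samples.
y_true = ["plain"] * 8 + ["hillside"] * 2
y_pred = ["plain"] * 8 + ["plain", "hillside"]
score = mean_per_class_accuracy(y_true, y_pred)
```

Here the overall accuracy would be 9/10, but the per-class mean is (1.0 + 0.5) / 2 = 0.75, which shows why this measure suits a dataset in which the number of TCSs per pattern varies greatly.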
As can be seen from the learning curves in Figure 7, all three CNNs suffer from severe overfitting. They achieve almost perfect performance on the training set while performing considerably worse on the validation set. For example, the ResNet50 model achieved 100% accuracy on the training set but only 63.75% accuracy on the validation set, and even lower accuracy on the two test sets, with an average of only 48.46%. In the loss curve of the ResNet50 model, the training loss keeps decreasing, whereas the validation loss oscillates and then keeps increasing. A similar overfitting phenomenon is observed in the AlexNet and DenseNet121 models. This indicates that overfitting is a significant problem when training neural networks on small, unbalanced datasets, particularly when dealing with complex input data.

5.2. Results of the Proposed Methods

To address the overfitting problem caused by sparse data, the CNNs are trained using the proposed method. The semantic segmentation model is first trained to extract the settlement environment map. Next, the three CNNs are adapted to accept single-channel maps and named AlexNet-PS, ResNet50-PS, and DenseNet121-PS. Following the same configuration, we retrained the three adapted CNNs and obtained the test results in Table 5 and Figure 8.
As shown in Table 5, the accuracy of all models improved once they were given the settlement environment map as input. ResNet50-PS achieved an average accuracy of 70.37%, an improvement of 21.91 percentage points over its baseline (48.46%). This supports the notion that remote sensing images of TCSs contain noise that interferes with the relationship between inputs and outputs, so that the models memorize noise features instead.
The learning curves of the models are shown in Figure 8. They show that overfitting is suppressed, yet a substantial gap remains between the validation and test accuracies. For example, DenseNet121-PS reached a maximum accuracy of 90% on the validation set but only 72.13% on the test set. This reflects the inconsistency of the data distributions between the training and test sets. Considering that TCSs have geographical differences, we applied the meta-classification method, which maps the input data into feature vectors suitable for comparison, and constructed a series of meta-tasks for training.
The resulting models, called AlexNet-PS-MC, ResNet50-PS-MC, and DenseNet121-PS-MC, are each trained with 3-way 1-shot and 3-way 5-shot tasks, each with five query samples. After 30 epochs, the model with the highest validation accuracy was selected for testing, and its accuracy was measured as the average over 200 tasks from the test set. In addition, we implement two state-of-the-art few-shot classification models, Meta-Baseline [47] and Meta DeepBDC [49], and adjust them to accept four-channel input data. Both models are trained using the same settings and compared with the proposed method. The recognition results for the Shaanxi and Anhui provinces are shown in Table 6.
In Table 6, as the number of samples in the support set increases, every model gains accuracy because a larger support set helps the model learn generalized features. Compared with Meta-Baseline and Meta DeepBDC, the proposed method achieves the highest accuracy in both settings. This supports our conjecture that the excessive noise in remote sensing images makes it difficult for the feature extractor to extract good features. It also reflects the importance of the pre-segmentation module, which helps the feature extraction module obtain better features by incorporating human prior information, thus improving the classification performance. The recognition accuracy of the proposed method in the 1-shot case is already higher than the baseline because the classic classification networks contain fully connected layers with a strong fitting ability, and these layers can overfit severely when samples are scarce. Although the dominant elements of the input data are extracted, the recognition performance of the three CNNs in new areas is still low. Removing the fully connected layers of the three CNNs and applying meta-learning methods remedied the overfitting issue. The experiments show that the proposed method can effectively improve the generalization capability and performance of deep neural networks in TCS environment pattern recognition tasks.

5.3. Ablation Study

In this section, we conduct ablation studies to analyze how each component affects environmental pattern recognition performance. We study the following five factors: (a) pre-segmentation; (b) data augmentation; (c) pre-training; (d) meta-training; (e) input size. We use DenseNet121-PS-MC, the model with the highest recognition accuracy. The test set is Test2, from Anhui Province, which contains 146 samples covering five environmental patterns. Table 7 shows the results of the ablation study.
When the pre-segmentation module is removed, the classification performance of the model degrades significantly in both shot settings; in the 3-way 1-shot case, accuracy drops by 10.41 percentage points. Without pre-segmentation, the background information in remote sensing images “spoofs” the model into focusing on irrelevant noise, an effect that is especially noticeable on small datasets. Data augmentation improves performance to some extent, but the improvement is limited. Pre-training, introduced before meta-training, improves the feature representation ability of the model when few samples are available and provides a good initialization; removing it therefore lowers recognition performance. When the meta-training stage is removed, recognition accuracy decreases by 9.78 percentage points in the 3-way 1-shot case, because meta-training adjusts the scaling parameters in the metric module and optimizes the feature extractor to learn task-level distributions. The input image size also affects performance, since downscaling the inputs loses information: reducing the input size to 128 × 128 lowers accuracy by 1.71 percentage points in the 3-way 1-shot case. Increasing the input size to 512 × 512 improves accuracy by only 0.19 percentage points while substantially raising the computational cost, since the number of input pixels quadruples. Therefore, we chose 256 × 256 as the final input size for the model.
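The changes quoted above can be reproduced directly from Table 7. The short script below recomputes each change relative to the default model, in percentage points, together with the relative pixel count of the alternative input sizes. The dictionary keys and function names are chosen only for this illustration.

```python
# (3-way 1-shot, 3-way 5-shot) accuracies in percent, copied from Table 7.
ABLATION = {
    "default": (85.45, 90.46),
    "without pre-segmentation": (75.04, 81.46),
    "without data augmentation": (81.68, 85.72),
    "without pre-training": (80.71, 86.25),
    "without meta-training": (75.67, 81.82),
    "input 128 x 128": (83.74, 89.69),
    "input 512 x 512": (85.64, 90.53),
}


def deltas(results, base="default"):
    """Change of every variant relative to the base row, in percentage points."""
    base_1shot, base_5shot = results[base]
    return {
        name: (round(acc_1shot - base_1shot, 2), round(acc_5shot - base_5shot, 2))
        for name, (acc_1shot, acc_5shot) in results.items()
        if name != base
    }


def relative_pixels(side, base_side=256):
    """Pixel count of a square input relative to the base input size."""
    return (side / base_side) ** 2


if __name__ == "__main__":
    for name, (d1, d5) in deltas(ABLATION).items():
        print(f"{name:28s} 1-shot {d1:+.2f}  5-shot {d5:+.2f}")
    print("512 x 512 vs 256 x 256 pixels:", relative_pixels(512))
```

Pre-segmentation and meta-training stand out as the two components whose removal costs the most in both shot settings, while the 512 × 512 input buys a small gain for four times as many pixels.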

6. Conclusions

This paper investigates the application of deep learning techniques to the automatic recognition of environmental patterns in TCSs. To represent the complex human–land relationships in TCSs, we formalize the task as an image processing problem and use deep learning methods to automatically extract environmental features from TCS remote sensing images and DEM data. Specifically, we construct a new labeled TCS environmental pattern dataset and benchmark it with several representative CNNs. To address the low recognition rate caused by sample scarcity and geographical differences, we use a semantic segmentation model to construct settlement environment maps informed by human prior information, and we employ a metric-based meta-learning method to perform few-shot recognition based on sample similarity. Extensive experiments verify the effectiveness of the proposed method. The DenseNet121-PS-MC model achieves 90.46% recognition accuracy on the self-constructed TCS environmental pattern dataset (3-way 5-shot, Anhui test set), outperforming the other methods compared, and it recognizes environmental patterns effectively across different areas.
This study explores the use of intelligent methods to assist TCS surveys, providing an effective analytical tool for urban and rural planners. However, the proposed method still has some limitations; for example, only five representative environmental patterns are selected in the study, and it is not known whether other environmental patterns can be effectively recognized. For future work, we plan to expand our dataset to incorporate more areas of TCSs and additional environmental patterns. Meanwhile, the latest deep network structures will be investigated to further improve recognition accuracy. Finally, we are also interested in exploring unsupervised techniques to avoid the tedious task of manual data labeling.

Author Contributions

Conceptualization, Y.K. and P.X.; methodology, Y.K. and P.X.; study design, Y.K. and P.X.; validation, Y.K., P.X., Y.X. and X.L.; formal analysis, Y.K., P.X. and Y.X.; figure, Y.K.; resources, Y.K.; writing—review and editing, Y.K.; writing—original draft preparation, P.X.; investigation, P.X. and X.L.; data collection, P.X., Y.X. and X.L.; literature search, P.X., Y.X. and X.L.; data curation, P.X.; visualization, Y.X.; supervision, Y.K.; project administration, Y.K.; funding acquisition, Y.K., Y.X. and X.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the People’s Government of Shaanxi Province (grant number 2020ZDLNY06-02) and Xi’an University of Architecture and Technology (grant number 2022SCZH15).

Data Availability Statement

The processed data used to support the findings of this study have not been made available because they are part of ongoing research.

Acknowledgments

We would like to express our gratitude to the editor and reviewers for their valuable comments.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Han, K.-T. Traditional Chinese Site Selection-Feng Shui: An Evolutionary/Ecological Perspective. J. Cult. Geogr. 2001, 19, 75–96.
  2. Xiao, J.; Cao, K. Analysis of Feng Shui Pattern in Traditional Chinese Settlements Based on the Concept of Cultural Landscape—A Case Study of Ancient Town of Shangli, Ya’an in Sichuan Province. J. Hum. Settl. West China 2014, 29, 108–113.
  3. Ye, Z.R. Modern residential environment & traditional inhabitable culture. Archit. J. 2001, 12, 21–24.
  4. Council of Europe. European landscape convention. Eur. Treaty Ser. 2000, 176, 1–7.
  5. Long, H.; Liu, Y. Rural restructuring in China. J. Rural Stud. 2016, 47, 387–391.
  6. Tao, J.; Chen, H.; Xiao, D. Influences of the natural environment on traditional settlement patterns: A case study of Hakka traditional settlements in Eastern Guangdong Province. J. Asian Archit. Build. Eng. 2017, 16, 9–14.
  7. Zhao, M.-D.; Tang, G.-A.; Shi, W.-Z.; Liu, Y.-M. A GIS-based research on the distribution of rural settlements in Yulin of northern Shaanxi. J. Geogr. Sci. 2002, 12, 171–176.
  8. Ma, X.D.; Li, Q.L.; Shen, Y. Morphological difference and regional types of rural settlements in Jiangsu Province. Acta Geogr. Sin. 2012, 67, 516–525.
  9. Yang, R.; Xu, Q.; Long, H. Spatial distribution characteristics and optimized reconstruction analysis of China’s rural settlements during the process of rapid urbanization. J. Rural Stud. 2016, 47, 413–424.
  10. Lee, H.W.; Kim, Y.J.; Choi, S.M. A study on spatial structure analysis for comprehensive rural clustered villages development area using the space syntax method technique. J. Korean Soc. Rural Plan. 2004, 10, 19–28.
  11. Fladd, S.G. Social syntax: An approach to spatial modification through the reworking of space syntax for archaeological applications. J. Anthropol. Archaeol. 2017, 47, 127–138.
  12. Bu, X.C. Quantitative Research on the Integrated Form of the Two-Dimensional Plan to Traditional Rural Settlement; Zhejiang University: Hangzhou, China, 2012.
  13. Du, J.; Hua, C.; Wu, Y.; Tong, L. A study on the spatial characteristics of tunpu settlements in the karst and mountainous areas of central Guizhou. Arch. J. 2016, 5, 19.
  14. Balducci, F.; Impedovo, D.; Pirlo, G. Machine learning applications on agricultural datasets for smart farm enhancement. Machines 2018, 6, 38.
  15. Holzinger, A.; Saranti, A.; Angerschmid, A.; Retzlaff, C.O.; Gronauer, A.; Pejakovic, V.; Medel-Jimenez, F.; Krexner, T.; Gollob, C.; Stampfer, K. Digital transformation in smart farm and forest operations needs human-centered AI: Challenges and future directions. Sensors 2022, 22, 3043.
  16. Bochenek, B.; Ustrnul, Z. Machine learning in weather prediction and climate analyses—Applications and perspectives. Atmosphere 2022, 13, 180.
  17. Zheng, X.; Wu, B.; Weston, M.V.; Zhang, J.; Gan, M.; Zhu, J.; Deng, J.; Wang, K.; Teng, L. Rural settlement subdivision by using landscape metrics as spatial contextual information. Remote Sens. 2017, 9, 486.
  18. Jia, Z.; Meng, C.; Zhou, Z. A 3-D morphological approach on spatial form and cultural identity of ethnic mountain settlements: Case from Guizhou, China. J. Mt. Sci. 2021, 18, 1144–1158.
  19. Wu, S.; Di, B.; Ustin, S.L.; Stamatopoulos, C.A.; Li, J.; Zuo, Q.; Wu, X.; Ai, N. Classification and detection of dominant factors in geospatial patterns of traditional settlements in China. J. Geogr. Sci. 2022, 32, 873–891.
  20. Lisetskii, F.N.; Zemlyakova, A.V.; Terekhin, E.A.; Naroznyaya, A.G.; Pavlyuk, Y.V.; Ukrainskii, P.A.; Kirilenko, Z.A.; Marinina, O.A.; Samofalova, O.M. New opportunities of geoplanning in the rural area with the implementing of geoinformational technologies and remote sensing. Adv. Environ. Biol. 2014, 8, 536–539.
  21. Conrad, C.; Rudloff, M.; Abdullaev, I.; Thiel, M.; Löw, F.; Lamers, J. Measuring rural settlement expansion in Uzbekistan using remote sensing to support spatial planning. Appl. Geogr. 2015, 62, 29–43.
  22. Liang, J.; Jiayu, C.; De, T.; Xin, L. Planning control over rural land transformation in Hong Kong: A remote sensing analysis of spatio-temporal land use change patterns. Land Use Policy 2022, 119, 106159.
  23. Cheng, G.; Yang, C.; Yao, X.; Guo, L.; Han, J. When deep learning meets metric learning: Remote sensing image scene classification via learning discriminative CNNs. IEEE Trans. Geosci. Remote Sens. 2018, 56, 2811–2821.
  24. Nogueira, K.; Penatti, O.A.; Dos Santos, J.A. Towards better exploiting convolutional neural networks for remote sensing scene classification. Pattern Recognit. 2017, 61, 539–556.
  25. Zhang, R.; Xu, L.; Yu, Z.; Shi, Y.; Mu, C.; Xu, M. Deep-irtarget: An automatic target detector in infrared imagery using dual-domain feature extraction and allocation. IEEE Trans. Multimed. 2021, 24, 1735–1749.
  26. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. Imagenet classification with deep convolutional neural networks. Commun. ACM 2017, 60, 84–90.
  27. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
  28. Huang, G.; Liu, Z.; Van Der Maaten, L.; Weinberger, K.Q. Densely connected convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 4700–4708.
  29. Fei-Fei, L.; Fergus, R.; Perona, P. One-shot learning of object categories. IEEE Trans. Pattern Anal. Mach. Intell. 2006, 28, 594–611.
  30. Wang, Y.; Yao, Q.; Kwok, J.T.; Ni, L.M. Generalizing from a few examples: A survey on few-shot learning. ACM Comput. Surv. 2020, 53, 1–34.
  31. Hoenigsberger, F.; Saranti, A.; Angerschmid, A.; Retzlaff, C.O.; Gollob, C.; Witzmann, S.; Nothdurft, A.; Kieseberg, P.; Holzinger, A.; Stampfer, K. Machine Learning and Knowledge Extraction to Support Work Safety for Smart Forest Operations. In Machine Learning and Knowledge Extraction: 6th IFIP TC 5, TC 12, WG 8.4, WG 8.9, WG 12.9 International Cross-Domain Conference, CD-MAKE 2022, Vienna, Austria, 23–26 August 2022; Springer International Publishing: Cham, Switzerland, 2022; pp. 362–375.
  32. Wu, C. The core of study of geography: Man-land relationship areal system. Econ. Geogr. 1991, 11, 1–6.
  33. Liu, S.J. Academician Wu Chuanjun’s human geographical thoughts and man-nature relationship system theory. Prog. Geogr. 1998, 17, 12–18.
  34. Moghadam, D.M.; Singh, H.J.; Yahya, W.R. A brief discussion on human/nature relationship. Int. J. Humanit. Soc. Sci. 2015, 5, 90–93.
  35. Schwalbe, G.; Finzel, B. A comprehensive taxonomy for explainable artificial intelligence: A systematic survey of surveys on methods and concepts. Data Min. Knowl. Discov. 2023, 1–59.
  36. Chan, J.Y.; Bea, K.T.; Leow, S.M.; Phoong, S.W.; Cheng, W.K. State of the art: A review of sentiment analysis based on sequential transfer learning. Artif. Intell. Rev. 2023, 56, 749–780.
  37. Haruna, Y.; Qin, S.; Mbyamm Kiki, M.J. An improved approach to detection of rice leaf disease with gan-based data augmentation pipeline. Appl. Sci. 2023, 13, 1346.
  38. Kaya, M.; Bilge, H.Ş. Deep metric learning: A survey. Symmetry 2019, 11, 1066.
  39. Vinyals, O.; Blundell, C.; Lillicrap, T.; Wierstra, D. Matching networks for one shot learning. Adv. Neural Inf. Process. Syst. 2016, 29, 3630–3638.
  40. Snell, J.; Swersky, K.; Zemel, R. Prototypical networks for few-shot learning. Adv. Neural Inf. Process. Syst. 2017, 30, 4077–4087.
  41. Sung, F.; Yang, Y.; Zhang, L.; Xiang, T.; Torr, P.H.; Hospedales, T.M. Learning to compare: Relation network for few-shot learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 1199–1208.
  42. Oreshkin, B.; Rodríguez López, P.; Lacoste, A. TADAM: Task dependent adaptive metric for improved few-shot learning. In Advances in Neural Information Processing Systems 31; Curran Associates, Inc.: Red Hook, NY, USA, 2018; pp. 721–731.
  43. Lee, K.; Maji, S.; Ravichandran, A.; Soatto, S. Meta-learning with differentiable convex optimization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–17 June 2019; pp. 10657–10665.
  44. Tian, Y.; Wang, Y.; Krishnan, D.; Tenenbaum, J.B.; Isola, P. Rethinking few-shot image classification: A good embedding is all you need? In Proceedings of the Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, 23–28 August 2020; Springer: Cham, Switzerland, 2020; pp. 266–282.
  45. Chen, W.Y.; Liu, Y.C.; Kira, Z.; Wang, Y.C.; Huang, J.B. A closer look at few-shot classification. arXiv 2019, arXiv:1904.04232.
  46. Gupta, A.; Thadani, K.; O’Hare, N. Effective few-shot classification with transfer learning. In Proceedings of the 28th International Conference on Computational Linguistics, Barcelona, Spain, 8–13 December 2020; pp. 1061–1066.
  47. Chen, Y.; Liu, Z.; Xu, H.; Darrell, T.; Wang, X. Meta-baseline: Exploring simple meta-learning for few-shot learning. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 10–17 October 2021; pp. 9062–9071.
  48. Zhang, R.; Yang, S.; Zhang, Q.; Xu, L.; He, Y.; Zhang, F. Graph-based few-shot learning with transformed feature propagation and optimal class allocation. Neurocomputing 2022, 470, 247–256.
  49. Xie, J.; Long, F.; Lv, J.; Wang, Q.; Li, P. Joint distribution matters: Deep brownian distance covariance for few-shot classification. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 7972–7981.
  50. Han, Q.; Yin, C.; Deng, Y.; Liu, P. Towards Classification of Architectural Styles of Chinese Traditional Settlements Using Deep Learning: A Dataset, a New Framework, and Its Interpretability. Remote Sens. 2022, 14, 5250.
  51. Tian, G.; Qiao, Z.; Zhang, Y. The investigation of relationship between rural settlement density, size, spatial distribution and its geophysical parameters of China using Landsat TM images. Ecol. Model. 2012, 231, 25–36.
  52. Fang, F.; Ma, L.; Fan, H.; Che, X.; Chen, M. The spatial differentiation of quality of rural life based on natural controlling factors: A case study of Gansu Province, China. J. Environ. Manag. 2020, 264, 110439.
  53. Song, M.; Zhang, Y. Research on the relationship between geographical factors, sports and culture. Adv. Phys. Educ. 2017, 8, 66–70.
  54. Tambassi, T. From geographical lines to cultural boundaries. Mapp. Ontol. Debate. Riv. Estet. 2018, 67, 150–164.
  55. Potosyan, A.H. Geographical features and development regularities of rural areas and settlements distribution in mountain countries. Ann. Agrar. Sci. 2017, 52, 32–40.
  56. Fang, Y.; Jawitz, J.W. The evolution of human population distance to water in the USA from 1790 to 2010. Nat. Commun. 2019, 10, 430.
  57. Shao, Y.; Chen, Y.; Su, J. Understanding of the settlements with coexisting water and earth under the background of climate change—The case of Liang Village in Pingyao County, China. Built Herit. 2022, 6, 1–20.
  58. Yadava, R.N.; Sinha, B. Vulnerability assessment of forest fringe villages of Madhya Pradesh, India for planning adaptation strategies. Sustainability 2020, 12, 1253.
  59. Xu, D.; Deng, X.; Huang, K.; Liu, Y.; Yong, Z.; Liu, S. Relationships between labor migration and cropland abandonment in rural China from the perspective of village types. Land Use Policy 2019, 88, 104164.
  60. Shoji, G.; Yoshida, K.; Yokoyama, S.; Thompson, E.C. Transition of farmland use in a Japanese mountainside settlement: An analysis of the residents’ career histories. Geogr. Rev. Jpn. Ser. B 2020, 93, 15–26.
  61. Shoji, G.; Yoshida, K.; Yokoyama, S.; Thompson, E.C. Vegetation series as a marker of interactions between rural settlements and landscape: New insights from the archaeological record in Western Sicily. Landsc. Res. 2020, 45, 484–502.
  62. Wang, Y.; Jin, C.; Lu, M.; Lu, Y. Assessing the suitability of regional human settlements environment from a different preferences perspective: A case study of Zhejiang Province, China. Habitat Int. 2017, 70, 1–12.
  63. Chen, L.C.; Zhu, Y.; Papandreou, G.; Schroff, F.; Adam, H. Encoder-decoder with atrous separable convolution for semantic image segmentation. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 801–818.
Figure 1. The workflow of the proposed method, which has four major components: (a) dataset construction; (b) benchmarking the dataset; (c) proposing a new method; (d) model training and evaluation.
Figure 2. Location of the three researched areas. (a) The provincial boundaries of China; (b) the location of the Qiandongnan Miao and Dong Autonomous Prefecture; (c) the location of the Shaanxi Province; (d) the location of the Anhui Province.
Figure 3. Remote sensing images of hillside-pattern settlements. (a) Zeli settlement in Qiandongnan, Guizhou Province; (b) Zhangzhuang settlement in Shaanxi Province.
Figure 4. Examples of remote sensing images (top), DEM data (middle), and settlement environment maps (bottom) belonging to the following five environmental patterns: (a) river valley; (b) foothill; (c) riverine; (d) plain; (e) hillside.
Figure 5. The framework of the proposed method. (a) Training the pre-segmentation model using the segmentation labels of the TCS environment pattern dataset; (b) training a CNN with all of the categories; by removing the FC layer, the network generates a feature encoder, f_θ; (c) training the meta-classification model by optimizing the parameter θ from multiple episodes; (d) evaluating the performance of the meta-classification model by sampling new episodes from the test set.
Figure 6. The network structure of the pre-segmentation model in our method. The encoder module encodes multi-scale contextual information, while the decoder module refines the segmentation results along object boundaries.
Figure 7. Accuracy and loss learning curves for the representative CNNs, including AlexNet, ResNet50, and DenseNet121.
Figure 8. Accuracy and loss learning curves of the tuned CNNs, including AlexNet-PS, ResNet50-PS, and DenseNet121-PS.
Table 1. The number of TCSs for each environmental pattern in the three areas.

| Patterns | Qiandongnan | Shaanxi | Anhui | Total |
| --- | --- | --- | --- | --- |
| River valley | 140 | 13 | 41 | 194 |
| Foothill | 96 | 28 | 24 | 148 |
| Riverine | 19 | 8 | 23 | 50 |
| Hillside | 138 | 20 | 32 | 190 |
| Plain | 16 | 24 | 26 | 66 |
| Total | 409 | 93 | 146 | 648 |
Table 2. Definitions of seven segmentation categories.

| Categories | List of Contents and Description |
| --- | --- |
| Settlement | Man-made, built-up areas with human artifacts |
| Mountain | The sum of a number of adjacent mountain ranges that are regularly distributed along a certain direction |
| Water | Rivers, oceans, lakes, wetlands, ponds |
| Forest | Any land with at least 20% tree crown density |
| Farmland | Any planned plantation, cropland, orchards |
| Vegetation | Any non-forest, non-farm, green land, grass |
| Wasteland | Rock, desert, land with no vegetation |
Table 3. Splitting the TCS environmental pattern dataset.

| Split | Areas | Numbers |
| --- | --- | --- |
| Train | Qiandongnan | 327 |
| Validation | Qiandongnan | 82 |
| Test1 | Shaanxi | 93 |
| Test2 | Anhui | 146 |
Table 4. Performance of three CNNs on the test set. The best metrics are highlighted in bold font.

| Model | Acc1 (%) | Acc2 (%) | mAcc (%) |
| --- | --- | --- | --- |
| AlexNet | 42.52 | 47.26 | 44.89 |
| ResNet50 | 46.24 | 50.68 | 48.46 |
| DenseNet121 | **53.76** | **57.53** | **55.65** |

Note: Acc1 refers to the recognition accuracy on the test set constructed from TCS data in Shaanxi Province, while Acc2 refers to that on the test set from Anhui Province; mAcc is the average of the accuracies of the two areas.
Table 5. Performance of three CNNs with settlement environment maps on the test set. The best metrics are highlighted in bold font.

| Model | Acc1 (%) | Acc2 (%) | mAcc (%) |
| --- | --- | --- | --- |
| AlexNet-PS | 58.06 (+15.54) | 60.96 (+13.70) | 59.51 (+14.62) |
| ResNet50-PS | 68.82 (+22.58) | 71.92 (+21.24) | 70.37 (+21.91) |
| DenseNet121-PS | **72.04** (+18.28) | **75.34** (+17.81) | **73.69** (+18.04) |

Note: Acc1 refers to the recognition accuracy on the test set constructed from TCS data in Shaanxi Province, while Acc2 refers to that on the test set from Anhui Province; mAcc is the average of the accuracies of the two areas.
Table 6. Performance comparison of the proposed method and others on the test set. The symbol * indicates that we have re-implemented Meta-Baseline [47] and Meta DeepBDC [49] with DenseNet121 [28] as the backbone and adapted them to accept four-channel inputs consisting of remote sensing images and DEM data. The best metrics are highlighted in bold font.

| Test set | Model | 3-Way 1-Shot (%) | 3-Way 5-Shot (%) |
| --- | --- | --- | --- |
| Acc1 | AlexNet-PS-MC | 70.58 (+28.06) | 76.88 (+34.36) |
| Acc1 | ResNet50-PS-MC | 78.58 (+32.34) | 85.58 (+37.12) |
| Acc1 | DenseNet121-PS-MC | **81.65** (+27.89) | **87.79** (+34.03) |
| Acc1 | Meta-Baseline * [47] | 70.53 | 78.24 |
| Acc1 | Meta DeepBDC * [49] | 72.26 | 80.45 |
| Acc2 | AlexNet-PS-MC | 71.63 (+24.37) | 78.74 (+31.48) |
| Acc2 | ResNet50-PS-MC | 82.57 (+31.89) | 88.32 (+37.64) |
| Acc2 | DenseNet121-PS-MC | **85.45** (+27.92) | **90.46** (+32.93) |
| Acc2 | Meta-Baseline * [47] | 74.75 | 80.68 |
| Acc2 | Meta DeepBDC * [49] | 76.48 | 83.29 |

Note: Acc1 refers to the recognition accuracy on the test set constructed from TCS data in Shaanxi Province, while Acc2 refers to that on the test set from Anhui Province.
Table 7. Ablation study using DenseNet121-PS-MC on the Test2 test set; “default” indicates the base model, “without” indicates the removal of the component from the base model, “input size” indicates the scale of the input settlement environment map, and the arrow indicates the adjustment made. The best metrics are highlighted in bold font.

| Difference | 3-Way 1-Shot (%) | 3-Way 5-Shot (%) |
| --- | --- | --- |
| default | 85.45 | 90.46 |
| without pre-segmentation | 75.04 | 81.46 |
| without data augmentation | 81.68 | 85.72 |
| without pre-training | 80.71 | 86.25 |
| without meta-training | 75.67 | 81.82 |
| 256 × 256 input size → 128 × 128 input size | 83.74 | 89.69 |
| 256 × 256 input size → 512 × 512 input size | **85.64** | **90.53** |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Kong, Y.; Xue, P.; Xu, Y.; Li, X. An Environmental Pattern Recognition Method for Traditional Chinese Settlements Using Deep Learning. Appl. Sci. 2023, 13, 4778. https://doi.org/10.3390/app13084778
