DL-SLICER: Deep Learning for Satellite-Based Identification of Cities with Enhanced Resemblance

Bissarinova, Ulzhan; Tleuken, Aidana; Alimukhambetova, Sofiya; Varol, Huseyin Atakan; Karaca, Ferhat

doi:10.3390/buildings14020551

¹ Institute of Smart Systems and Artificial Intelligence, Nazarbayev University, Astana 010000, Kazakhstan

² Department of Civil Engineering, School of Engineering and Digital Sciences, Nazarbayev University, Astana 010000, Kazakhstan

* Author to whom correspondence should be addressed.

Buildings 2024, 14(2), 551; https://doi.org/10.3390/buildings14020551

Submission received: 7 December 2023 / Revised: 1 February 2024 / Accepted: 11 February 2024 / Published: 19 February 2024

(This article belongs to the Special Issue Artificial Intelligence and Buildings: Design, Analysis, and Construction)

Abstract

This paper introduces a deep learning (DL) tool capable of classifying cities and revealing the features that characterize each city from a visual perspective. The study utilizes city view data captured from satellites and employs a methodology involving DL-based classification for city identification, along with an Explainable Artificial Intelligence (AI) tool to unveil the defining features of each city considered in this study. The city identification model, implemented using the ResNet architecture, yielded an overall accuracy of 84% on 45 cities worldwide with varied geographic locations, Human Development Index (HDI) values, and population sizes. The characteristic attributes of urban locations were investigated using an explanatory visualization tool named Relevance Class Activation Maps (CAM). The methodology and findings of the current study enable decision makers, city managers, and policymakers to identify similar cities through satellite data, understand the salient features of the cities, and make decisions based on similarity patterns that can lead to effective solutions across a wide range of objectives such as urban planning, crisis management, and economic policies. Analyzing city similarities is crucial for urban development, transportation strategies, zoning, improvement of living conditions, fostering economic success, shaping social justice policies, and providing data for indices and concepts such as sustainability and smart cities for urban zones sharing similar patterns.

Keywords:

city identification; city similarity; urban planning; satellite data; machine learning; deep learning; explainable AI; saliency map

1. Introduction

Currently, the majority of the global population lives in cities, and urbanization rates are expected to continue rising [1]. Thus, cities play a significant role in shaping the sociological, economic, and environmental landscapes of the world. Their growing importance makes it vital to analyze and measure them accurately [2]. However, this is complicated by the unique identity and urban features of each city.

In general, spatial similarity refers to the extent to which two geographical entities share the same characteristics [3]. Cities can be compared based on different aspects, including infrastructure, layout, societal and cultural peculiarities, historical background, economic situation, and even local user-generated content [3]. Moreover, understanding urban similarities could be important in shaping decisions that effectively improve the living conditions across cities with comparable features. In addition, analyzing the commonalities of cities can help in studying economic growth and understanding which features lead to the success of a city [4]. Furthermore, city commonality analyses can be used as data for further research, for example, index or ranking system development [3].

Identifying cities and developing knowledge about salient features can be used to guide the urban development of cities, for example, to keep or change their unique identity. In addition, such features can be useful in city categorization or analysis of city similarity [5]. Easy classification of city patterns can also aid in developing industrial solutions for natural resource management (air, water, and waste) [2]. Nevertheless, creating such a tool might be complicated because of rapid urbanization rates, which could change the urban typology and morphology.

While early research in measuring city similarity often relied on versatile data sources and distinct city features, to the best of our knowledge, there is limited evidence of dedicated efforts to identify cities based on satellite images. Analyzing separate features in isolation may offer a focused perspective on cities, but it can also lead to strong biases and an overwhelming reliance on them, potentially overlooking complex similarity patterns. In this regard, satellite images might serve as an integrated source capable of capturing a wealth of urban characteristics, including architectural patterns, transportation routes, geographical peculiarities, temporal and climatic conditions, and even air pollution rates. Several studies [2,6,7] have focused on classifying certain urban features based on satellite images, demonstrating the high accuracy of such models. This justifies considering satellite images as a reliable source for capturing various measurable traits, thereby integrating multiple features within themselves. Although these visual recognition methods prove highly accurate in specific cases, their use across multiple cities can be resource intensive and time consuming. Hence, a more adaptable methodology is needed to enable the development of more ubiquitous tools for the classification of cities.

Considering the gaps mentioned above, this research offers a novel artificial intelligence (AI)-driven city classification method which provides a homogeneous and unbiased result, employing visual and publicly accessible data focusing on factual circumstances and complex visual causalities. It offers a new perspective in the research domain by developing a deep learning (DL) tool that analyzes visual information from city satellite image patches. The main research question was to investigate the efficiency of a DL tool in classifying urban areas and identifying cities that exhibit similar characteristics using a collected dataset of satellite images. The contributions of this work are as follows:

We introduce a dedicated DL model for Satellite-based Identification of Cities with Enhanced Resemblance (DL-SLICER). Our model for identifying cities uses the ResNet architecture with various numbers of layers (18, 34, 50, 101). This model is the first DL city classification model in the literature, achieving an 84% accuracy rate for identifying a city from a satellite patch covering a 200 m by 200 m area.
We also present an open-source and publicly available dataset containing satellite images from 45 cities worldwide, comprising 565,938 labeled satellite patches of 200 m by 200 m regions.
With the open-source data and models, our work can serve as a benchmark for identifying cities.
We conducted experiments using one of the latest Explainable AI tools, Relevance-CAM, to determine the features that characterize the cities from top views.

This paper is structured into six sections. This introduction is followed by a literature review (Section 2) focused on the scholarly challenges of using AI or satellite images in city assessments. Section 3 details the methods and tools we used and developed. Section 4 provides the main findings and discussions. Finally, Section 5 offers our concluding remarks, and Section 6 discusses future implementations and limitations and explores the practical and scholarly implications of the results.

2. Literature Review

2.1. City Similarity Tools

A comprehensive and uniform method to measure the similarity between cities does not exist. Frequently, cities are compared on the basis of certain characteristics such as basic indicators, including income distribution, costs, and ethnic composition. However, these characteristics only capture a small facet of the identity of a city, mostly related to economic and social dimensions. Earlier attempts at city clustering and identifying similar features can be found in the existing literature, and they use different data types and distinct methodologies.

Within academic research, the number of tools available for measuring city similarity is limited. For example, Saxena et al. developed a tool to identify cities similar to Delhi based on air quality metrics [8]. Cheng et al. (2022) developed an urban classification tool that aids in understanding and identifying urban environmental patterns [2]. This approach allowed them to measure the perception of city users through the photos they produced and shared. The researchers used certain city identity attributes for the image analysis, such as green areas, water resources, urban transportation systems, architectural forms, buildings, sports, and social activities.

In another study focusing on 385 European cities, researchers found that cities could be clustered according to their typology and environmental features [9]. A peer city identification tool was used for 960 cities in the United States (US), grouping them based on tabular data related to various topics such as equity, resilience, outlook, and housing [10]. The authors of [11] clustered city maps by applying k-means to smart card data, aiding in the identification of city structures and clusters. Costa and Tokuda (2022) investigated the similarity of 20 European cities based on their topology, utilizing a clustering method of street networks [4]. Seth et al. (2011) [12] found city similarities through query logs, suggesting that cities could be grouped not only by geographic location but also by the professional occupations of the populations (e.g., university students, high-tech companies, and defense contractors). For example, in their analysis of US cities, they found similarities among cities such as Boston, Brookline, New York, and Bethesda as well as Bethesda VA, Arlington VA, and Fort Myer VA [12]. A summary of early methods for measuring city similarity is provided in Table 1. The majority of the tools under consideration utilize unsupervised machine learning methods, which can result in reduced accuracy in certain instances.

Many non-academic methods for comparing and assessing cities focus on the cost of living. Examples include “Numbeo” [14], “Forbes Calculator” [15], and others [16,17]. The Urban Observatory, on the other hand, offers city comparisons across a variety of topics, including the type of work, transportation, and population density [18]. The ArcGIS Similarity Search tool allows for city comparisons based on attributes such as population, crime, and education [19]. ArcGIS developers emphasize the utility of this tool for various stakeholders, including retailers, policymakers, human resource specialists, law enforcement agencies, and academia. Shell has developed a tool that compares cities by considering such factors as the population density, the use of energy, and the need for energy resources [20].

In the case of tools focusing exclusively on US cities, they often compare cities using metrics related to the cost of living, employment, housing, and metropolitan statistical areas [21,22,23]. Additionally, some tools assess the impact of COVID-19 on local businesses as a basis for city comparisons [24]. It is worth noting that the majority of city similarity tools originate from the business sector rather than academic research.

2.2. Use of AI and Satellite Images in Urban Planning

AI techniques have been applied to analyze urban designs and city characteristics. These include fuzzy logic (FL), genetic algorithms (GA), neural networks (NNs), and simulated annealing (SA) [25]. NNs have been commonly used to predict land pollution (noise, air, water, waste) and changes in land use and form [26]. One notable application has been forecasting extreme temperatures and possible droughts in urban environments [27]. The prospects of AI in urban construction and building design are also promising. Applications include the design of sustainable structures, structural health monitoring, soil analysis, and energy efficiency enhancements (e.g., variable heat flow and efficient use of solar panels) [25,28].

The combination of satellite imagery and AI has become increasingly prevalent, facilitating the analysis of parking, agricultural crops, and geological implications. Despite its increasing accessibility thanks to technological advancements, this approach is challenging due to imprecisions and artifacts in satellite imagery [29,30]. Convolutional NNs, namely U-Net and Mask R-CNN, have proven successful in satellite image understanding for building detection [29]. Other research has demonstrated the utility of deep neural networks (DNNs) for clustering urban land images even in low-resolution aerial photos [31]. In another study, Google Earth data were used to train an AI model to categorize cities by their formality level [2].

AI applications in geography extend to recognizing terrain features and land classification [32]. In one case, satellite images of green spaces in Colombo helped to predict air quality in the city [33]. Researchers have also studied urban growth using satellite images and NN algorithms as well as maximum likelihood and shortest distance methods [34]. Object-based image analysis (OBIA) and a support vector machine (SVM) have been employed to understand urban expansion using satellite imagery from different regions (e.g., Canada, Sweden, and China) [35]. In another study, satellite images were processed to identify poor urban regions, which were also indexed by ten categories, from slums to more structured neighborhoods [36].

Overall, DL methods are frequently employed for land classification with satellite imagery. They are instrumental in mapping urban areas over different years using high-resolution satellite images for further analysis of urban development [37]. Pixel-based satellite images are commonly preferred over object-based ones for city mapping because of their efficiency [38]. However, the existing literature suggests a research gap concerning the use of satellite images and urban plans for city indexing, rating, or ranking, particularly for purposes such as sustainability assessment or district similarities, thus warranting further investigation. This limitation indicates a gap in the development of a more adaptable and efficient methodology. Such a methodology is essential for enabling broader application of tools in the classification of cities, reducing the need for extensive resources and time commitment. The research gap, therefore, lies in creating a scalable and less resource-intensive approach that maintains accuracy while being applicable to a diverse range of urban environments across multiple cities. Furthermore, there is a notable deficiency in the comprehensive analysis of Central Asian cities within the existing literature on urban studies. This research gap highlights the need for more focused inquiries to comprehend the unique characteristics of this region.

Analyzing Central Asian cities has been a missed opportunity in the global spectrum of urban patterns that can be extracted from satellite data. The region is often overlooked in literature, including city knowledge databases, owing to its historical background and barriers imposed by socio-economic development levels. Nevertheless, the region holds significant interest for audiences thanks to the intersection of unique cultural and historical traits in Central Asia, rooted in Turkic groups, and echoing the Soviet Era through city infrastructure and architecture. Central Asia comprises Kazakhstan, Kyrgyzstan, Tajikistan, Turkmenistan, and Uzbekistan. After the collapse of the Soviet Union in 1991, these countries emerged as separate entities, often omitted in most geographic-related studies. Despite the centralized management of these countries, each Central Asian nation possesses distinct cultural and architectural peculiarities specific to the region, yet shares some commonalities. In our study, we included either the capital city or economically significant metropolitan areas of each country in Central Asia.

3. Methods

This work deals with a DL task that relies on a classification model. The aim of the developed model is to identify a city from a given satellite image patch. This section provides a detailed flow of the work, covering the data collection process (Section 3.1), the structure of the collected dataset (Section 3.2), the data preprocessing steps (Section 3.3), the DL-SLICER model (Section 3.4), and the explanatory visualizations related to the performance of the model (Section 3.5).

3.1. Data Collection

To train a city classification model, we collected a dataset consisting of images captured by satellites for 45 global cities, ensuring a diverse representation in terms of geographical locations and socioeconomic development levels. Additionally, we included eight cities (Almaty, Ankara, Ashgabat, Astana, Baku, Bishkek, Shymkent, and Tashkent) from regions of the world that are underrepresented in city studies, most of them located in Central Asia. Figure 1 presents all cities in the dataset, offering key details about each city, including a city code, population, human development index (HDI), country, continent, latitude, and longitude. This figure illustrates the breadth of the dataset, covering cities from various regions across the globe, exhibiting not only diverse geographical locations but also a wide range of population sizes and HDI values. A corresponding tabular representation of the same data can be found in Table A1 in Appendix A.

For each city, high-resolution satellite images (4800 × 4800 pixels) of 2 km × 2 km regions from the Google Earth Pro 7.1 software in jpg format were downloaded. A minimum of three regions (on average, 3.1 regions) with high construction activity for each city were chosen. The ‘Historical Imagery’ feature of Google Earth Pro was used to collect available images from 2018 to 2022, resulting in an average of 35 images for each city. This approach provides longitudinal information as it includes images of the same region for different dates. Images with substantial cloud coverage, bold shadows, and other artifacts were removed to ensure the dataset consists of high-quality images. Examples of such low-quality images are shown in Figure 2. The final dataset contains 1585 satellite images with a total size of 16 GB. The images for the dataset were downloaded manually.

3.2. Dataset Structure

The naming convention for each satellite image follows the format ‘City IATA Code_# of the site_year_month_day.jpg’, where the IATA code represents a unique three-letter code assigned by the International Air Transport Association. For example, the Almaty city image from site 1, captured on 9 August 2022, was saved with the filename ‘ALA_S1_2022_08_09.jpg’.
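The naming convention can be unpacked programmatically. The following is a minimal sketch, assuming every filename follows the ‘ALA_S1_2022_08_09.jpg’ layout exactly; the names `ImageSource` and `parse_image_name` are hypothetical and are not part of the published dataset tools.

```python
from datetime import date
from typing import NamedTuple


class ImageSource(NamedTuple):
    """Fields encoded in a dataset filename (hypothetical helper type)."""
    city_code: str   # three-letter IATA code, e.g. "ALA"
    site: int        # site number within the city, e.g. 1 for "S1"
    captured: date   # acquisition date encoded in the filename


def parse_image_name(filename: str) -> ImageSource:
    """Split a name such as 'ALA_S1_2022_08_09.jpg' into its components."""
    stem, _, extension = filename.rpartition(".")
    if extension.lower() != "jpg":
        raise ValueError(f"expected a .jpg file, got {filename!r}")
    # Exactly five underscore-separated fields are assumed.
    code, site, year, month, day = stem.split("_")
    if len(code) != 3 or not site.startswith("S"):
        raise ValueError(f"unexpected name layout: {filename!r}")
    return ImageSource(code, int(site[1:]), date(int(year), int(month), int(day)))


info = parse_image_name("ALA_S1_2022_08_09.jpg")
print(info.city_code, info.site, info.captured.isoformat())  # ALA 1 2022-08-09
```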

In the dataset, each city is organized into a dedicated folder, and within this folder, there is a subfolder for each region. These subfolders contain images of the region captured on different dates, along with a metadata text file. The metadata file is named according to the convention ‘City IATA Code_# of the site.txt’ and provides information in text format, including the lower-left pixel coordinates, upper-right pixel coordinates, and camera elevation details. The corresponding folder structure for all three sites of a city is illustrated in Figure 3.

3.3. Data Preprocessing

The developed DL models are designed to predict a city based on small satellite image patches. Given that the original satellite images covered 2 km by 2 km regions at 4800 × 4800 pixels, we created 200 m by 200 m patches (equivalent to 480 × 480 pixels) with a 240-pixel overlap. As a result, 361 processed patches (a 19 × 19 grid) were generated for each full raw satellite image, leading to a total of 565,938 image patches in the ‘preprocessed’ part of the dataset. This ‘preprocessed’ portion of the dataset occupies 35.7 GB of storage. The folder structure for this part is depicted in Figure 3.
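The patch grid follows from the three stated numbers: image size, patch size, and overlap. The sketch below is our own illustration of that arithmetic, not the authors' preprocessing code; `patch_origins` is a hypothetical name, and the resulting count assumes that only windows lying fully inside the image are kept.

```python
def patch_origins(image_size: int = 4800, patch_size: int = 480, overlap: int = 240):
    """Return the (top, left) pixel offsets of every full patch in a square image."""
    stride = patch_size - overlap  # 240 px between neighbouring patch origins
    positions = range(0, image_size - patch_size + 1, stride)
    return [(top, left) for top in positions for left in positions]


origins = patch_origins()
print(len(origins))           # 361, i.e. a 19 x 19 grid
print(480 * 2000 / 4800)      # 200.0 metres covered by one 480-pixel patch side
```

Under these assumptions the count per image depends only on the stride, so a different overlap or different edge handling would change it.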

Images from region S3 (or S4 for cities where four sites were available) were split into east and west halves. The east halves were assigned to the validation set, whereas the west halves were used as an independent test set. Patches from the remaining regions (S1 and S2) of each city were utilized for machine-learning model training. As a result, the training, validation, and test splits contain 370,386, 86,526, and 86,526 patch images, respectively. To provide an illustration, the regions for Astana and a selection of sample patches are presented in Figure 4.
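One way to express this split is sketched below. It rests on assumptions the text does not state: images are north-up, so the east half is the right half; a patch belongs to a half only if it lies entirely inside it; and patches straddling the midline are discarded. The function `assign_split` and its parameter `held_out_site` are hypothetical names.

```python
from typing import Optional


def assign_split(site: int, held_out_site: int, left: int,
                 patch_size: int = 480, image_size: int = 4800) -> Optional[str]:
    """Return 'train', 'val', 'test', or None for a patch that straddles the midline.

    `left` is the x offset (in pixels) of the patch's left edge.
    """
    if site != held_out_site:
        return "train"                 # all other sites feed the training set
    middle = image_size // 2
    if left >= middle:
        return "val"                   # east half (assumed to be the right side)
    if left + patch_size <= middle:
        return "test"                  # west half
    return None                        # overlaps both halves, so it is dropped


print(assign_split(site=1, held_out_site=3, left=0))      # train
print(assign_split(site=3, held_out_site=3, left=2400))   # val
print(assign_split(site=3, held_out_site=3, left=1920))   # test
```

Discarding straddling patches keeps validation and test pixels disjoint, which is the point of holding out separate halves.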

3.4. DL-SLICER Model for City Classification

One of the most influential DL models for image recognition in computer vision is the Deep Residual Network architecture, also known as ResNet [39]. Since its introduction, it has proven to be highly effective for various tasks. The complexity of the ResNet model can be adjusted by changing the number of its layers, resulting in variants such as ResNet-18, ResNet-34, ResNet-50, and ResNet-101. In these model names, the number indicates how many NN layers the architecture contains.

ResNet is a special type of convolutional neural network (CNN) that enables the training of extremely deep neural networks without running into the “vanishing gradient” problem. The emergence of CNNs improved the performance of deep learning models in image-related tasks such as object detection, image classification, and segmentation. CNNs achieved state-of-the-art performance by using sparse connections instead of fully connected layers, sharing weights across the whole image, and applying pooling. However, adding more layers to a CNN eventually saturates its performance, and architectures with too many layers have shown degraded accuracy attributed to “vanishing gradients” [39]. The residual blocks that make up ResNets resolve this problem. The main idea behind residual blocks is the skip connection, in which the activations of one layer are passed directly to a later layer, bypassing the layers in between. Through regularization, the network can then effectively skip layers that would otherwise damage its performance. In this way, very deep networks with 100–1000 layers can be trained without the vanishing gradient problem and gain a significant boost in model performance.
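The effect of a skip connection on gradients can be written out in one line. The notation here is ours and is only a sketch: x is the input to a residual block, F(x; W) is the stack of layers being bypassed, y is the block output, and \mathcal{L} is the training loss.

```latex
y = x + F(x; W)
\quad\Longrightarrow\quad
\frac{\partial \mathcal{L}}{\partial x}
  = \frac{\partial \mathcal{L}}{\partial y}
    \left( I + \frac{\partial F(x; W)}{\partial x} \right)
```

Because of the identity term I, the gradient reaching x does not shrink to zero even when the contribution of F is very small, which is why stacking many such blocks remains trainable.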

For the purposes of this work, ResNet versions with 18, 34, 50, and 101 layers were trained to address a city classification task. The city classification models were trained for 100 epochs with a learning rate of 10⁻³ and a batch size of 128 using the Adam optimizer [40]. To evaluate the model performance for city classification, we employed the accuracy score (1), which represents the ratio of correctly identified samples to the total number of samples:

\[
\text{Accuracy score} = \frac{TP + TN}{TP + TN + FP + FN}, \tag{1}
\]

where TP is the number of correctly identified positive testing samples, TN is the number of correctly identified negative testing samples, FP is the number of negative samples incorrectly identified as positive, and FN is the number of positive samples incorrectly identified as negative.
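For a multi-class problem such as city identification, the numerator of Equation (1) reduces to the number of patches whose predicted city matches the true city. The sketch below illustrates this; the function name `accuracy_score` mirrors the term used above, and the labels are toy placeholders rather than results from the study.

```python
def accuracy_score(true_labels, predicted_labels):
    """Fraction of samples whose predicted label equals the true label."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")
    if not true_labels:
        raise ValueError("at least one sample is required")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Toy labels for four patches (placeholders, not data from the paper).
truth = ["A", "A", "B", "C"]
predicted = ["A", "C", "B", "C"]
print(accuracy_score(truth, predicted))  # 0.75
```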

3.5. Explanatory Visualizations

To identify the features that have the most significant impact on the performance of the DL model, we conducted an explanatory visualization analysis. While the primary source of input features is satellite images, which can cover various elements, such as architectural patterns, transportation routes, geographical characteristics, temporal and climatic conditions, and even visual representations of air pollution rates, it is not immediately evident which specific patterns influence the ability of a model to identify a particular city. This part of AI, which sheds light on the decision-making processes occurring within the “black box” of machine learning models, is known as Explainable AI.

One of the latest Explainable AI tools available for inspecting classification algorithms and deducing the salient features of computer vision models is Relevance CAM [41]. Relevance CAM relies on class activation maps (CAM) [42] and layer-wise relevance propagation to compute the features that play a conclusive role in a class identification problem. It produces saliency maps highlighting image areas with significant weight in determining the city class. In our research, we utilized this tool to identify the visual features of cities that influenced the model decision in class determination.
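As a rough guide to what such a saliency map is, CAM-style methods form a weighted sum of the feature maps of a convolutional layer. The notation below is ours, not the paper's: A_k is the k-th feature map, \alpha_k^c is its weight for class c, and R_k^c is the relevance that layer-wise relevance propagation assigns to that map. Taking the weights as spatially summed relevance is our reading of Relevance CAM and should be checked against [41].

```latex
L^{c}(i,j) = \sum_{k} \alpha_k^{c} \, A_k(i,j),
\qquad
\alpha_k^{c} = \sum_{i,j} R_k^{c}(i,j)
```

Pixels where L^c is large are those highlighted as most influential for assigning the patch to city c.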

4. Results and Discussions

4.1. City Classification

Our findings demonstrate that DL applied to satellite images of cities is an effective and objective approach to the city classification challenge. Table 2 presents the results obtained for city classification using different ResNet architectures.

We identified the epoch number that yielded the best performance on the validation set and subsequently evaluated the model on the independent test set. The results from both the validation and test sets consistently indicate that ResNet-50 delivers the best performance for the city classification task. The overall testing accuracy reached 83.9% with the ResNet-50 model. These validation and test results reflect a high degree of accuracy, affirming the effectiveness of the method in classifying cities based on their visible urban characteristics. It should be noted that our methodology specifically focuses on visible urban characteristics, including urban morphology, structures, and the overall appearance of buildings and other urban assets.

It is well known that some cities share similar urban characteristics, which can pose a significant challenge to achieving higher prediction performance in city classification. To comprehend how these city similarities impact the classification performance of the proposed method, we present additional test results in Table 3 and Table 4, demonstrating the best and worst performing ten cities, along with a list of the three most frequently confused cities (those classified as the target city).
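The list of most frequently confused cities can be derived from paired true and predicted labels. The sketch below is a hypothetical helper, not the authors' evaluation code; `most_confused_with` is an assumed name and the labels are toy placeholders.

```python
from collections import Counter


def most_confused_with(target, true_labels, predicted_labels, top_n=3):
    """Cities whose patches were most often wrongly predicted as `target`."""
    wrong = Counter(
        true for true, pred in zip(true_labels, predicted_labels)
        if pred == target and true != target
    )
    return wrong.most_common(top_n)


# Toy labels (placeholders, not results from the study).
truth = ["B", "B", "C", "A", "D"]
predicted = ["A", "A", "A", "A", "D"]
print(most_confused_with("A", truth, predicted))  # [('B', 2), ('C', 1)]
```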

High classification accuracy underscores the presence of unique urban patterns that are prevalent across most areas within the designated urban districts. The best-predicted cities in the classification analysis include Ankara, Buenos Aires, Cairo, Chicago, Hanoi, Mumbai, Oslo, Seoul, Melbourne, and Lisbon (Table 3).

For example, Cairo (see Figure 5) is distinguishable by its specific housing shapes and the prevalent use of brown-colored materials for house walls and roofs. This specific spatial organization has been highlighted in [43] as an influence of Islamic culture, while the common use of brick as a wall material aligns with the findings of the tool. Ankara is identified by its distinctive red roofs. Thus, one of the Milan patches was misclassified as Ankara because of the presence of red roofs (see Figure 6). Meanwhile, Hanoi is identified through its unique densely populated areas with red roofs and the presence of water objects. The tremendous urban density in Hanoi, resulting in its distinct urban typology, has also been emphasized in other works [44,45].

In contrast, the cities with the lowest prediction accuracy were Astana, Baku, Istanbul, Shymkent, Singapore, Milan, Bishkek, Paris, Brisbane, and Tashkent. The confusion or misclassifications may be attributed to the selection of patches used during training, as the distinctive features of these cities were not sufficiently represented in the training data for the model to learn. In addition, these cities may share similar features, such as patterns, designs, forms, or other characteristics (Table 4).

Most of the cities that were confused with each other are located either in the same country or in neighboring countries. Refer to Table 4 for instances of such confusions (e.g., Astana, Almaty, Baku, Tashkent, Bishkek, and Shymkent; and Paris, Dublin, and Lisbon). Geographic proximity is claimed to be a contributing factor to the similarity of cities, as noted in [13]. Munich and Milan, which are frequently confused, are also grouped together because of shared characteristics such as high administrative area, population density, and a low presence of green spaces [9].

In contrast, Paris, Lisbon, and Dublin, which were confused in this study, fall into entirely different clusters in the study of Gregor et al. [9], while another study observes similarities among the cities of France, Spain, and Germany and among those of Italy, Spain, and Germany [4].

Similarly, the current research confuses Italian and German cities (e.g., Milan and Munich) but does not find many similarities between French (Paris) and Spanish and other German cities. In another study, Paris, Vienna, Prague, and Barcelona were claimed to be similar in terms of architectural attributes [13].

With respect to US cities, it has been suggested that resilient features, such as economic change and labor conditions, are similar in Boston, Chicago, and Washington [10], and these cities were also confused with each other in the current study. Chicago and Boston were found to share similarities in terms of the economic impacts of COVID-19 by another city similarity tool [24].

It has been observed that South Asian cities are losing their uniqueness due to rapid urbanization rates, making them appear more similar, particularly in terms of skyscrapers and contemporary glass buildings. Strong analogies have been noted between Kuala Lumpur and Beijing [46]. In contrast, this study did not find classification confusion between these two cities. The complete confusion matrix of the 45 cities and the corresponding confused labels are provided in Figure A1 of Appendix B.

Overall, the diverse geographic locations of the cities enhance the strength of the study in terms of big data; however, this diversity also dilutes its focus. It is worth noting that some of the referenced studies and tools [4,9,10] focus on a smaller number of urban areas in a particular region (e.g., the US or Europe), making their findings more specific.

4.2. Salient Features of Urban Patterns

Saliency feature maps were created for all the cities whose satellite images were used in this study. In this section, specific attention is given to cities from different continents: Almaty, Paris, Tokyo, and San Francisco. Subsequently, a discussion on the most and least accurate classifications is presented. All the saliency feature maps are accessible via our GitHub repository (https://github.com/IS2AI/city-identification, accessed on 1 February 2024).

4.2.1. Almaty

Founded in 1854, Almaty is the former capital of Kazakhstan. It is located in the vicinity of the Ile Alatau mountains. The city is famous for being an industrial center (food and light industries). The name is associated with the abundance of apple trees growing in the region. Because of the close proximity of mountains, there is a significant geological risk for the city, which has already been subject to dangerous earthquakes and mudflow. The city is secured with a 140 m dam to prevent potential mudflows [47].

An examination of the saliency map for Almaty (see Figure 7) reveals several salient features. Notably, the map highlights the presence of private housing with grey and brown-colored roofs situated near or intertwined with trees. This recurring pattern led to misclassifications of Almaty with cities such as Astana and Tashkent, which exhibit similar urban characteristics.

4.2.2. San Francisco

San Francisco (CA, USA) is a hub for culture and finance in the western US and stands out as one of the nation’s most diverse and metropolitan cities. It is situated on a hilly, square-shaped landmass at the northern end of a peninsula. Because early city planners favored a grid pattern, downtown streets were laid out across the hills. San Francisco’s urban landscape features office buildings in the central area, green spaces in the western part, and residential buildings in other districts. Residential buildings are known for pastel-colored plastering on houses and multi-colored wooden buildings. Another significant and historic feature of the city is its cable tram transportation, which is still in operation [48]. San Francisco is also distinguished by its unique tram cars [13].

The Relevance CAM results in Figure 8 identify specific features that contribute to the classification of San Francisco. These include (1) grey and light grey roofs, (2) high-rise buildings with discernible long shadows, and (3) triangular building blocks formed by two roads intersecting at acute angles. Confusion between San Francisco and Tashkent primarily occurs when satellite images contain large red roofs.

4.2.3. Paris

Paris, the capital of France, is located along the river Seine, which historically divides the city into the Central part, the Left bank (an intellectual center), and the Right bank (an economic heart). In general, the shape of the city is circular and consists of 20 districts. Paris is renowned for preserving its architectural heritage, including buildings, gardens, and streets. The city is full of green areas, which include parks, gardens, and squares [49].

The Relevance CAM results for Paris in Figure 9 highlight salient patterns that enable the identification of Paris. These patterns include (1) dense, high, and circular-shaped trees, (2) mid-height buildings with minimal shadows on satellite images, and (3) specific arrangements of buildings characterized by narrow, non-linear roads between housing rooftops.

These saliency patterns align with the findings of other research on Parisian urban typology, which emphasizes the presence of balconies with railings arranged in a grid pattern, lamp posts on tall bases, mid-height buildings arranged in a regular pattern, and distinctive vegetation [50]. Our findings support these observations.

4.2.4. Tokyo

Tokyo, the capital of Japan, is situated in the northern part of Tokyo Bay along the Pacific coastline of central Honshu. Unlike cities with a central business district, Tokyo features multiple urban areas clustered around railway stations, surrounded by department stores, hotels, corporate towers, and cafés. The architectural landscape in these districts spans from historic stone and brick constructions to modern skyscrapers made of concrete and steel. Traditional Japanese wooden houses are also prevalent, and green gardens dot the urban landscape [51]. In terms of transportation attributes, Tokyo features narrow streets—a consequence of its high population density [13].

The saliency map for Tokyo, as shown in Figure 10, reveals that our instrument identifies Tokyo primarily through patterns of private houses with various colors of roofs (blue, red, brown, grey, and green) located in close proximity to one another.

In this work, we approached the well-known city similarity problem from a different perspective and addressed it with an AI-based method. The key finding is that a DL-based methodology using satellite data is applicable to the city similarity problem and has proven effective, with results comparable to those obtained via other approaches. To date, city similarity has often been measured via clustering and other machine learning methods such as SVMs, using tabular data and street topologies, which may capture only limited information. In addition, existing approaches mostly do not use historical data and therefore disregard how urban locations looked in the past, unlike the present study. However, this research still has some limitations due to the inherent nature of satellite data. Social parameters are not captured, since they do not leave an effective footprint in satellite images. In addition, hidden infrastructure such as the subway in London or Paris cannot be extracted from satellite images unless this information is explicitly provided as input.

The city classification method developed in this research has the potential to open a new direction for urban developers and the remote sensing research community.

5. Conclusions

This study conducts a comprehensive analysis of satellite maps from 45 different cities, successfully identifying cities with similar characteristics. The findings suggest that cities can exhibit similarities based on their visual layouts. These similarities often result from factors such as historical and geographical proximity, the use of similar structural materials (e.g., roofing materials or contemporary glass in buildings), and the proportion of administrative areas relative to green spaces. This research emphasizes the potential for using visual features in satellite imagery to discern patterns and commonalities in urban design, highlighting the influence of history, geography, and urban planning choices on the visual identity of cities.

Although a number of tools dedicated to finding similarity patterns across metropolitan areas are presented in the literature, urban locations have rarely been analyzed from an AI perspective. AI techniques are increasingly being adopted worldwide to solve a range of tasks, and the purpose of the presented study is to showcase the efficiency of AI-based techniques in solving the well-known problem of city similarity. The results of this work enable similarity accuracy and the generalizability of results to be extended using AI-based methods. This, in turn, can facilitate AI-based analysis of other related tasks in urban studies, such as the prediction of sustainability indices.

Nonetheless, the present study is subject to several limitations that warrant discussion. The scope of the data employed in this study may be regarded as inadequate to comprehensively capture the full complexity of sustainable city development. As mentioned in earlier sections, the findings do not reveal the underlying factors behind the classification challenges of AI, owing to the inherent “black box” characteristics of DL methodologies. This limitation could be mitigated by increasing the number of city maps incorporated into the analytical model, which would enable a more comprehensive and nuanced assessment of sustainable urban development across diverse geographical contexts. Furthermore, this study has not extensively explored cities in Latin America and Africa, presenting a valuable opportunity for future research to delve into these regions. We suggest that further investigation into these areas could significantly enhance our understanding of urban similarities and differences on a global scale. To support improvement of the model through expanded geographical coverage of cities, our models, the dataset (both original and pre-processed), and the code base are publicly available in our GitHub repository (https://github.com/IS2AI/city-identification, accessed on 1 February 2024) and can be further extended.

6. Implementations

The instrument formulated in the present investigation has utility across diverse domains. One of its applications lies in enabling the analysis of urban saliency maps, which, in turn, facilitates the development of an AI model for urban area identification. With its efficacy and versatility, this tool is a promising addition to the arsenal of urban planners and policymakers with a wide range of goals. The current study benefited from implementing exploratory data analysis through data visualization techniques, enhancing its capacity to investigate and identify potential constituents of the input images that correspond with the visual urban characteristics employed in regression analysis. Another possible application is the incorporation of real-time traffic data into traffic congestion heatmaps to provide a dynamic view of the transportation efficiency of a city or the impact of certain activities, such as construction operations and sites, on urban traversability. Furthermore, the overlay of zoning data on satellite imagery could deliver a visually compelling portrayal of land use and zoning compliance, highlighting the extent to which a city adheres to its urban planning regulations.

Meanwhile, assessing public space accessibility, water body condition, and the preservation of cultural heritage sites through the lens of satellite imagery deepens our understanding of the sustainability journey of a city.

Lastly, the visual representation of waste disposal sites and landfills can elucidate waste management practices, making the assessment more tangible. These enhancements are among the many potential future applications that the tool developed in this research can support. They can provide a robust and scientifically sound evaluation of urban sustainability, giving city planners and policymakers a potent tool to drive cities toward a more sustainable future.

Author Contributions

Conceptualization, H.A.V. and F.K.; methodology, U.B. and H.A.V.; software, U.B. and S.A.; validation, U.B.; formal analysis, A.T.; investigation, U.B. and A.T.; resources, H.A.V. and F.K.; data curation, S.A.; writing—original draft preparation, U.B. and A.T.; writing—review and editing, H.A.V. and F.K.; visualization, U.B.; supervision, H.A.V. and F.K.; project administration, F.K.; funding acquisition, F.K. All authors have read and agreed to the published version of the manuscript.

Funding

The authors acknowledge the financial support from the Nazarbayev University Collaborative Research Program (Funder Project Reference 20122022CRP1606).

Data Availability Statement

The data presented in this study are openly available at https://doi.org/10.48333/2yyk-qw88, accessed on 1 December 2023.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Abbreviations

The following abbreviations are used in this manuscript:

AI	Artificial Intelligence
DL	Deep Learning
HDI	Human Development Index
CAM	Class Activation Maps
SVM	Support Vector Machine
FL	Fuzzy Logic
GA	Genetic Algorithms
NN	Neural Networks
SA	Simulated Annealing
OBIA	Object-based image analysis
IATA	International Air Transport Association
CNN	Convolutional Neural Networks
TP	True Positive
TN	True Negative
FP	False Positive
FN	False Negative

Appendix A

Table A1. List of cities in the dataset, along with their key characteristics.

City	IATA Code	Population	HDI	Country	Continent	Latitude	Longitude
Almaty	ALA	1,977,011	0.855	Kazakhstan	Asia	43°16′39″ N	76°53′45″ E
Ankara	ESB	5,747,325	0.832	Turkiye	Asia	39°55′48″ N	32°51′0″ E
Ashgabat	ASB	1,031,992	0.770	Turkmenistan	Asia	37°56′15″ N	58°22′48″ E
Astana	NQZ	1,136,008	0.840	Kazakhstan	Asia	51°10′0″ N	71°26′0″ E
Baku	GYD	2,293,100	0.826	Azerbaijan	Asia	40°23′43″ N	49°52′56″ E
Bangkok	BKK	8,305,218	0.814	Thailand	Asia	13°45′9″ N	100°29′39″ E
Beijing	PEK	21,893,095	0.904	China	Asia	39°54′24″ N	116°23′51″ E
Bishkek	FRU	1,074,075	0.745	Kyrgyzstan	Asia	42°52′29″ N	74°36′44″ E
Bogota	BOG	8,034,649	0.813	Colombia	South America	4°42′40″ N	74°4′20″ W
Boston	BOS	675,647	0.956	United States	North America	42°21′40″ N	71°3′25″ W
Brisbane	BNE	2,472,000	0.937	Australia	Oceania	27°28′12″ S	153°1′15″ E
Buenos Aires	AEP	3,003,000	0.882	Argentina	South America	34°36′12″ S	58°22′54″ W
Cairo	CAI	10,025,657	0.751	Egypt	Africa	30°2′40″ N	31°14′9″ E
Chicago	CHI	2,746,388	0.934	United States	North America	41°52′54″ N	87°37′23″ W
Dublin	DUB	554,554	0.965	Ireland	Europe	53°21′0″ N	6°15′37″ W
Hanoi	HAN	8,426,500	0.748	Vietnam	Asia	21°1′42″ N	105°51′15″ E
Hong Kong	HKG	7,413,070	0.949	China	Asia	22°18′10″ N	114°10′38″ E
Istanbul	IST	15,636,000	0.846	Turkiye	Europe	41°0′49″ N	28°57′18″ E
Jakarta	CGK	11,261,595	0.773	Indonesia	Asia	6°12′0″ S	106°49′0″ E
Kinshasa	FIH	17,071,000	0.577	Congo	Africa	4°19′30″ S	15°19′20″ E
Kuala Lumpur	KUL	8,420,000	0.867	Malaysia	Asia	3°8′27″ N	101°41′35″ E
Lagos	LOS	7,937,932	0.675	Nigeria	Africa	6°27′18.1″ N	3°23′2.69″ E
Lahore	LHE	11,126,285	0.564	Pakistan	Asia	31°32′59″ N	74°20′37″ E
Lisbon	LIS	544,851	0.901	Portugal	Europe	38°43′30″ N	9°9′0.07″ W
Manila	MNL	1,846,513	0.732	Philippines	Asia	14°35′44″ N	120°58′37″ E
Melbourne	MEL	4,917,750	0.941	Australia	Oceania	37°48′51″ S	144°57′47″ E
Mexico City	MEX	9,209,944	0.784	Mexico	North America	19°26′0″ N	99°8′0″ W
Milan	MIL	3,149,000	0.912	Italy	Europe	45°27′52″ N	9°11′18″ E
Mumbai	BOM	12,479,608	0.697	India	Asia	19°4′34″ N	72°52′39″ E
Munich	MUC	1,488,202	0.956	Germany	Europe	48°8′15″ N	11°34′30″ E
Nairobi	NBO	4,397,073	0.665	Kenya	Africa	1°17′11″ S	36°49′2″ E
Oslo	OSL	634,293	0.975	Norway	Europe	59°54′48″ N	10°44′20″ E
Paris	PAR	2,165,423	0.947	France	Europe	48°51′23″ N	2°21′8″ E
Riga	RIX	614,618	0.933	Latvia	Europe	56°56′56″ N	24°6′23″ E
San Francisco	SFO	873,965	0.936	United States	North America	37°46′39″ N	122°24′59″ W
Sao Paulo	GRU	12,400,232	0.791	Brazil	South America	23°33′0″ S	46°38′0″ W
Seoul	ICN	9,765,869	0.943	South Korea	Asia	37°33′36″ N	126°59′24″ E
Shymkent	CIT	1,200,000	0.808	Kazakhstan	Asia	42°19′0″ N	69°35′45″ E
Singapore	SIN	5,453,600	0.938	Singapore	Asia	1°17′25″ N	103°51′7″ E
Sydney	SYD	5,231,150	0.945	Australia	Oceania	33°51′54″ S	151°12′35″ E
Taipei	TPE	2,704,810	0.916	Taiwan	Asia	25°4′0″ N	121°31′0″ E
Tashkent	TAS	2,750,000	0.807	Uzbekistan	Asia	41°18′0″ N	69°16′0″ E
Tokyo	TKY	37,274,000	0.944	Japan	Asia	35°39′10″ N	139°50′22″ E
Vancouver	YVR	2,632,000	0.960	Canada	North America	49°14′46″ N	123°6′58″ W
Washington	IAD	5,434,000	0.946	United States	North America	47°45′3″ N	120°44′24″ W
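The coordinates in Table A1 are given in degrees, minutes, and seconds. For readers who want to use them programmatically (for example, to plot the cities), a minimal Python sketch of the conversion to signed decimal degrees is given below. The helper name `dms_to_decimal` and the accepted string pattern are assumptions made for illustration and are not part of the released code base.

```python
import re

# Pattern for strings such as "43°16′39″ N" or "6°27′18.1″ N" (assumed format).
_DMS = re.compile(r"(\d+)°(\d+)′(\d+(?:\.\d+)?)″\s*([NSEW])")


def dms_to_decimal(text):
    """Convert a degrees/minutes/seconds string to signed decimal degrees."""
    match = _DMS.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognized coordinate: {text!r}")
    degrees, minutes, seconds, hemisphere = match.groups()
    value = int(degrees) + int(minutes) / 60 + float(seconds) / 3600
    # Southern latitudes and western longitudes are negative by convention.
    return -value if hemisphere in "SW" else value


# Almaty, as listed in Table A1.
almaty = (dms_to_decimal("43°16′39″ N"), dms_to_decimal("76°53′45″ E"))
print(almaty)
```

Entries that do not match the expected pattern raise a `ValueError` rather than being silently guessed.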

Appendix B

Figure A1. Full confusion matrix of the city classification model featuring 45 cities, showing the ratio of correctly identified patches as well as misclassified samples on the test set.
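As an illustration of how a matrix of this kind can be assembled, the following minimal Python sketch counts (true, predicted) label pairs for test patches and normalizes each row, so that every entry is the ratio of patches of a given true city assigned to each predicted city. The function name `row_normalized_confusion` and the toy labels are hypothetical; they are not taken from the released code base or from the reported results.

```python
from collections import Counter, defaultdict


def row_normalized_confusion(true_labels, predicted_labels):
    """Return {true_city: {predicted_city: ratio}}; each row sums to 1."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    counts = defaultdict(Counter)
    for true_city, predicted_city in zip(true_labels, predicted_labels):
        counts[true_city][predicted_city] += 1
    matrix = {}
    for true_city, row in counts.items():
        total = sum(row.values())
        matrix[true_city] = {city: n / total for city, n in row.items()}
    return matrix


# Toy example with made-up labels (not results from the paper).
y_true = ["NQZ", "NQZ", "NQZ", "NQZ", "ALA", "ALA"]
y_pred = ["NQZ", "ALA", "GYD", "NQZ", "ALA", "ALA"]
confusion = row_normalized_confusion(y_true, y_pred)
print(confusion["NQZ"])
```

The diagonal entry of each row is the per-city accuracy, and the off-diagonal entries are the shares of patches confused with other cities.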

References

[1] Berry, B.J. Urbanization. In Proceedings of the Urban Ecology: An International Perspective on the Interaction between Humans and Nature; Marzluff, J.M., Ed.; Springer: Berlin/Heidelberg, Germany, 2018; pp. 25–48.
[2] Cheng, Q.; Zaber, M.; Rahman, A.M.; Zhang, H.; Guo, Z.; Okabe, A.; Shibasaki, R. Understanding the urban environment from satellite images with new classification Method—Focusing on formality and informality. Sustainability 2022, 14, 4336.
[3] McKenzie, G.; Romm, D. Measuring urban regional similarity through mobility signatures. Comput. Environ. Urban Syst. 2021, 89, 101684.
[4] Costa, L.d.F.; Tokuda, E.K. A similarity approach to cities and features. Eur. Phys. J. B 2022, 95, 155.
[5] Bell, D.A.; de Shalit, A. Introduction: Cities and identities. Crit. Rev. Int. Soc. Political Philos. 2022, 25, 637–646.
[6] Fumega, J.; Niza, S.; Ferrão, P. Identification Of Urban Typologies Through The Use Of Urban Form Metrics For Urban Energy And Climate Change Analysis. In Proceedings of the Urban Futures-Squaring Circles: Europe, China and the World in 2050, Lisbon, Portugal, 10–11 October 2014.
[7] Albert, A.; Kaur, J.; Gonzalez, M.C. Using Convolutional Networks and Satellite Imagery to Identify Patterns in Urban Environments at a Large Scale. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, NY, USA, 23 June 2017; pp. 1357–1366.
[8] Saxena, P.; Jagdeesh, M.K. Similarity indexing & GIS analysis of air pollution. arXiv 2019, arXiv:1906.08756.
[9] Gregor, M.; Löhnertz, M.; Schröder, C.; Aksoy, E.; Fons, J.; Garzillo, C.; Wildman, A.; Kuhn, S.; Prokop, G.; Cugny-Seguin, M. Similarities and Diversity of European Cities: A Typology Tool to Support Urban Sustainability. ETC/ULS Report 03/2018, European Topic Centre on Urban, Land and Soil Systems (ETC/ULS), Environment Agency Austria, Spittelauer Lände 5, A-1090 Vienna, Austria. 2018. Available online: http://www.eionet.europa.eu/ (accessed on 21 September 2023).
[10] Federal Reserve Bank of Chicago. About the Peer City Identification Tool. Available online: https://www.chicagofed.org/region/peer-cities-identification-tool/pcit (accessed on 30 October 2023).
[11] Kim, K. Identifying the structure of cities by clustering using a new similarity measure based on smart card data. IEEE Trans. Intell. Transp. Syst. 2019, 21, 2002–2011.
[12] Seth, R.; Covell, M.; Ravichandran, D.; Sivakumar, D.; Baluja, S. A Tale of Two (Similar) Cities: Inferring City Similarity Through Geo-Spatial Query Log Analysis. In Proceedings of the International Conference on Knowledge Discovery and Information Retrieval, Paris, France, 26–29 October 2011.
[13] Zhou, B.; Liu, L.; Oliva, A.; Torralba, A. Recognizing city identity via attribute analysis of geo-tagged images. In Proceedings of the Computer Vision—ECCV 2014: 13th European Conference, Zurich, Switzerland, 6–12 September 2014; Proceedings, Part III 13. Springer: Berlin/Heidelberg, Germany, 2014; pp. 519–534.
[14] Numbeo. Numbeo—Cost of Living. 2023. Available online: https://www.numbeo.com/cost-of-living/rankings_by_country.jsp?title=2023 (accessed on 21 September 2023).
[15] Forbes. Forbes—Cost of Living Calculator. 2023. Available online: https://www.forbes.com/advisor/mortgages/real-estate/cost-of-living-calculator/ (accessed on 21 September 2023).
[16] NerdWallet. NerdWallet—Cost of Living Calculator. 2023. Available online: https://www.nerdwallet.com/cost-of-living-calculator (accessed on 21 September 2023).
[17] Move. Moving.com—Compare Cities. 2023. Available online: https://www.moving.com/real-estate/compare-cities/ (accessed on 21 September 2023).
[18] Urban Observatory. Urban Observatory. 2014. Available online: https://www.urbanobservatory.org (accessed on 21 September 2023).
[19] ArcGIS Pro Documentation. How Similarity Search Works—ArcGIS Pro|Documentation, n.d. Available online: https://pro.arcgis.com/en/pro-app/latest/tool-reference/spatial-statistics/how-similarity-search-works.htm# (accessed on 21 September 2023).
[20] Shell. Shell Energy and Innovation—Compare Cities. 2023. Available online: https://www.shell.com/energy-and-innovation/the-energy-future/future-cities/compare-cities.html (accessed on 21 September 2023).
[21] Select Georgia. Select Georgia. Research Tool Spotlight: City Comparison, n.d. Available online: https://www.selectgeorgia.com/services/research-solutions-2021/city-and-state-comparisons/ (accessed on 21 September 2023).
[22] AreaVibes. City Comparison, n.d. Available online: https://www.areavibes.com/city-comparison/ (accessed on 21 September 2023).
[23] Dwellics. Dwellics. 2023. Available online: https://dwellics.com (accessed on 21 September 2023).
[24] Homebase. Homebase—City-Wise Comparison Data. 2023. Available online: https://joinhomebase.com/data/city-wise-comparison/ (accessed on 21 September 2023).
[25] Mehmood, M.U.; Chun, D.; Han, H.; Jeon, G.; Chen, K. A review of the applications of artificial intelligence and big data to buildings for energy-efficiency and a comfortable indoor living environment. Energy Build. 2019, 202, 109383.
[26] Casali, Y.; Aydin, N.Y.; Comes, T. Machine learning for spatial analyses in urban areas: A scoping review. Sustain. Cities Soc. 2022, 85, 104050.
[27] Huntingford, C.; Jeffers, E.S.; Bonsall, M.B.; Christensen, H.M.; Lees, T.; Yang, H. Machine learning and artificial intelligence to aid climate change research and preparedness. Environ. Res. Lett. 2019, 14, 124007.
[28] Manzoor, B.; Othman, I.; Durdyev, S.; Ismail, S.; Wahab, M.H. Influence of artificial intelligence in civil engineering toward sustainable development—A systematic literature review. Appl. Syst. Innov. 2021, 4, 52.
[29] Mohanty, S.P.; Czakon, J.; Kaczmarek, K.A.; Pyskir, A.; Tarasiewicz, P.; Kunwar, S.; Rohrbach, J.; Luo, D.; Prasad, M.; Fleer, S.; et al. Deep learning for understanding satellite imagery: An experimental survey. Front. Artif. Intell. 2020, 3, 534696.
[30] Sisodiya, N.; Dube, N.; Thakkar, P. Next-Generation Artificial Intelligence Techniques for Satellite Data Processing. In Artificial Intelligence Techniques for Satellite Image Analysis; Hemanth, D., Ed.; Springer: Berlin/Heidelberg, Germany, 2020; Chapter 11; pp. 235–254.
[31] Cao, R.; Zhu, J.; Tu, W.; Li, Q.; Cao, J.; Liu, B.; Zhang, Q.; Qiu, G. Integrating aerial and street view images for urban land use classification. Remote Sens. 2018, 10, 1553.
[32] Hu, Y.; Li, W.; Wright, D.J.; Aydin, O.; Wilson, D.; Maher, O.; Raad, M. Artificial Intelligence Approaches. In The Geographic Information Science & Technology Body of Knowledge; Wilson, J.P., Ed.; University Consortium for Geographic Information Science Symposium: Pasadena, CA, USA, 2019; Volume 3.
[33] Senanayake, I.; Welivitiya, W.; Nadeeka, P. Urban green spaces analysis for development planning in Colombo, Sri Lanka, utilizing THEOS satellite imagery—A remote sensing and GIS approach. Urban For. Urban Green. 2013, 12, 307–314.
[34] Nazmfar, H.; Jafarzadeh, J. Classification of satellite images in assessing urban land use change using scale optimization in object-oriented processes (a case study: Ardabil city, Iran). J. Indian Soc. Remote. Sens. 2018, 46, 1983–1990.
[35] Furberg, D. Satellite Monitoring of Urban Growth and Indicator-Based Assessment of Environmental Impact. Ph.D. Thesis, KTH Royal Institute of Technology, Stockholm, Sweden, 2014.
[36] Taubenböck, H.; Kraff, N.J.; Wurm, M. The morphology of the Arrival City—A global categorization based on literature surveys and remotely sensed data. Appl. Geogr. 2018, 92, 150–167.
[37] Wang, H.; Gong, X.; Wang, B.; Deng, C.; Cao, Q. Urban development analysis using built-up area maps based on multiple high-resolution satellite data. Int. J. Appl. Earth Obs. Geoinf. 2021, 103, 102500.
[38] Jamil, A.; Al-Shareef, A.; Al-Thubaiti, A. Classifications of Satellite Imagery for Identifying Urban Area Structures. Adv. Remote Sens. 2020, 9, 1.
[39] He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
[40] Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. In Proceedings of the International Conference on Learning Representations (ICLR), San Diego, CA, USA, 7–9 May 2015; Volume 5, p. 6.
[41] Lee, J.R.; Kim, S.; Park, I.; Eo, T.; Hwang, D. Relevance-CAM: Your model already knows where to look. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 14944–14953.
[42] Zhou, B.; Khosla, A.; Lapedriza, A.; Oliva, A.; Torralba, A. Learning Deep Features for Discriminative Localization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016.
[43] Abdelkader, R.; Park, J.H. Spatial Principles of Traditional Cairene Courtyard Houses in Cairo. J. Asian Archit. Build. Eng. 2018, 17, 245–252.
[44] Ho, T.P.; Stevenson, M.; Thompson, J.; Nguyen, T.Q. Evaluation of Urban Design Qualities across Five Urban Typologies in Hanoi. Urban Sci. 2021, 5, 76.
[45] Hibayama, H.; Duan, O.D.; Mamoru, S. Studies on Hanoi Urban Transition in the Late 20th Century Based on GIS/RS. Southeast Asian Stud. 2009, 46, 4.
[46] Chepelianskaia, O. Why Should Asia Build Unique Cities? Isocarp Review; International Society of City and Regional Planners (ISOCARP): The Hague, Netherlands, 2019.
[47] Britannica, E. Almaty. 2023. Available online: https://www.britannica.com/place/Almaty-Kazakhstan (accessed on 10 March 2023).
[48] Britannica, E. San Francisco. 2023. Available online: https://www.britannica.com/place/San-Francisco-California (accessed on 10 March 2023).
[49] Britannica, E. Paris. 2023. Available online: https://www.britannica.com/place/Paris (accessed on 10 March 2023).
[50] Nice, K.A.; Thompson, J.; Wijnands, J.S.; Aschwanden, G.D.P.A.; Stevenson, M. The “Paris-End” of Town? Deriving Urban Typologies Using Three Imagery Types. Urban Sci. 2020, 4, 27.
[51] Britannica, E. Tokyo. 2023. Available online: https://www.britannica.com/place/Tokyo (accessed on 10 March 2023).

Figure 1. Allocation of cities in the presented dataset by Population and Human Development Index (HDI) on the world map.

Figure 2. Omitted image samples with (a) substantial cloud coverage, (b) bold shadows, and (c) other image artifacts.

Figure 3. Dataset folder structure for raw and processed parts. Here, the files for Almaty city are shown, consisting of raw satellite images at different dates (left), processed patches for machine learning, and associated metadata for each region (right).

Figure 4. Division of the images from one region into training, validation, and test splits. Sample patches generated out of raw images are also shown.

Figure 5. Class-specific information extraction via (a) depth-wise heatmaps (b) masked images by Relevance CAM for Cairo (CAI), true label: CAI, predicted label: CAI.

Figure 6. Class-specific information extraction via (a) depth-wise heatmaps (b) masked images by Relevance CAM for Milan (MIL), true label: MIL, predicted label: ESB.

Figure 7. Class-specific information extraction via (a) depth-wise heatmaps (b) masked images by Relevance CAM for Almaty (ALA), true label: ALA, predicted label: ALA.

Figure 8. Class-specific information extraction via (a) depth-wise heatmaps (b) masked images by Relevance CAM for San Francisco (SFO), true label: SFO, predicted label: SFO.

Figure 9. Class-specific information extraction via (a) depth-wise heatmaps (b) masked images by Relevance CAM for Paris (PAR), true label: PAR, predicted label: PAR.

Figure 10. Class-specific information extraction via (a) depth-wise heatmaps (b) masked images by Relevance CAM for Tokyo (TKY), true label: TKY, predicted label: TKY.

Table 1. Summary of city similarity methods.

City Similarity Method	# of Cities	Region	Data Source	Analyzed Features	Method
Zhou et al. (2014) [13]	21	Asia, Europe, North America	geo-tagged images	green areas, water resources, transport, architectural forms, buildings, sport and social activities	SVM classifier
Gregor et al. (2018) [9]	385	Europe	tabular data	typology and environmental features	clustering
Federal Reserve Bank of Chicago [10]	960	United States	tabular data	equity, resilience, outlook, and housing	clustering
Kim et al. (2019) [11]	1	South Korea	city maps, smart card data	spatial interactions, city structure	clustering
Costa and Tokuda (2022) [4]	20	Europe	topology, street networks	Jaccard and interiority indices	K-means clustering
Seth et al. (2011) [12]	20	Europe	query logs	professional occupation	clustering
Ours	45	Worldwide	satellite images	urban areas, unique salient city features	deep learning (ResNet)

Table 2. City classification results for different ResNet models.

Model	Training Time (h)	Epoch # at Best Validation Accuracy	Validation Accuracy	Test Accuracy
ResNet-18	17.5	88/100	0.8336	0.8228
ResNet-34	18	54/100	0.8337	0.8287
ResNet-50	18.5	80/100	0.8511	0.8390
ResNet-101	25	60/100	0.8484	0.8340

Table 3. Best performing ten cities for city identification along with three most similar cities and corresponding similar sample patches (based on % of accuracy).

City	Accuracy	Three Most Similar Cities
Ankara (ESB)	100%	-	-	-
Buenos Aires (AEP)	100%	-	-	-
Cairo (CAI)	100%	-	-	-
Chicago (CHI)	100%	-	-	-
Hanoi (HAN)	100%	-	-	-
Mumbai (BOM)	100%	-	-	-
Oslo (OSL)	99.8%	Manila (0.2%)	-	-
Seoul (ICN)	98.6%	Shymkent (0.4%)	Beijing (0.2%)	Hong Kong (0.2%)
Melbourne (MEL)	98.4%	Mumbai (0.6%)	Kinshasa (0.4%)	Oslo (0.2%)
Lisbon (LIS)	97.0%	Seoul (0.8%)	Washington (0.6%)	Milan (0.2%)

Table 4. Worst-performing ten cities for city identification along with three most similar cities and corresponding similar sample patches (based on % of accuracy).

City	Accuracy	Three Most Similar Cities
Astana (NQZ)	25.2%	Almaty (14.8%)	Baku (14.0%)	Bishkek (12.8%)
Baku (GYD)	52.0%	Tashkent (9.8%)	Ashgabat (7.8%)	Istanbul (6.2%)
Istanbul (IST)	55.8%	Ankara (11.6%)	Sao Paulo (6.6%)	Hong Kong (5.6%)
Shymkent (CIT)	56.4%	Baku (12.6%)	Bishkek (12.6%)	Tashkent (5.2%)
Singapore (SIN)	61.8%	Dublin (6.0%)	Bangkok (5.4%)	Sao Paulo (5.2%)
Milan (MIL)	68.2%	Munich (5.6%)	Ankara (5.0%)	San Francisco (3.4%)
Bishkek (FRU)	70.8%	Shymkent (20.6%)	Astana (3.0%)	Baku (0.8%)
Paris (PAR)	71.0%	Hong Kong (5.8%)	Dublin (3.6%)	Lisbon (3.6%)
Brisbane (BNE)	72.2%	Sydney (12.4%)	Nairobi (8.2%)	Sao Paulo (3.0%)
Tashkent (TAS)	72.6%	Shymkent (8.2%)	Bishkek (4.0%)	Astana (2.8%)
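The "three most similar cities" columns in Tables 3 and 4 correspond to the largest off-diagonal entries in a city's row of the confusion matrix. A minimal Python sketch of that lookup is given below, assuming the row is available as a mapping from predicted city to percentage. The helper name `most_similar_cities` is hypothetical, and the example row contains only the values reported for Astana in Table 4 (its remaining misclassifications are omitted).

```python
def most_similar_cities(row, true_city, k=3):
    """Return the k predicted cities, other than true_city, with the largest share."""
    others = [(city, share) for city, share in row.items()
              if city != true_city and share > 0]
    # Sort by share in descending order; break ties alphabetically.
    others.sort(key=lambda item: (-item[1], item[0]))
    return others[:k]


# Partial row for Astana (NQZ): only the percentages reported in Table 4.
astana_row = {"Astana": 25.2, "Almaty": 14.8, "Baku": 14.0, "Bishkek": 12.8}
top3 = most_similar_cities(astana_row, "Astana")
print(top3)
```

A city that is always classified correctly has no positive off-diagonal entries, so the function returns an empty list, which matches the dashes shown for the 100% rows in Table 3.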

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

