1. Introduction
In recent years, the process of global urbanization has been accelerating. According to data from the United Nations, by 2050, it is expected that 68% of the world’s population will live in cities [
1]. This trend has led to a dramatic increase in the complexity of urban structures, manifesting in multiple issues such as traffic congestion [
2], environmental pollution [
3], energy shortages [
4], and increased public safety risks [
5], all of which severely affect the quality of life for urban residents and the sustainable development of cities [
6]. These challenges also impose new demands on urban managers [
7]. To effectively address these issues, countries around the world are advancing smart city construction, aiming to achieve more efficient and sustainable urban management through technological innovation [
8]. In the process of building smart cities, emerging technologies such as the Internet of Things (IoT) [
9], big data [
10], cloud computing [
11], and 5G have played important roles [
12,
13]. However, with the explosive growth of urban data, establishing how to extract valuable information from vast amounts of data and apply it to practical urban management decisions has become a major challenge in the construction of smart cities.
Machine learning, as one of the core technologies of artificial intelligence, is playing an increasingly important role in the construction of smart cities [
14]. It is a method that learns and optimizes algorithmic models through data, enabling the automatic discovery of hidden patterns and trends from vast amounts of urban data, which can then be used to predict future behaviors and trends [
15]. Compared to traditional urban management and service methods, machine learning-based smart city applications offer advantages such as real-time responsiveness, dynamic adaptability, and personalization, allowing for more efficient, precise, and flexible responses to the complexity and uncertainty of urban environments [
16]. Many developed countries and regions are actively promoting the application of machine learning across various domains of smart cities, achieving remarkable results. For instance, New York City in the United States uses machine learning to analyze traffic flow data and optimize traffic signal timing, reducing travel time by 10% [
17]. Singapore employs computer vision and deep learning technologies to enable the real-time monitoring and analysis of urban environments, enhancing the efficiency and accuracy of city management [
18]. In 2016, the Dubai government, in collaboration with the Dubai Future Foundation and Smart Dubai, launched the Dubai Blockchain Strategy, aiming to widely adopt blockchain technology in government operations to improve efficiency, transparency, and security. Tokyo, Japan, utilizes machine learning algorithms to process massive energy consumption data, achieving the precise prediction and optimized allocation of urban energy demand, resulting in a 20% reduction in energy consumption [
19]. These case studies not only demonstrate the potential of machine learning in smart cities but also reflect the urgent global demand for intelligent management solutions in the process of urbanization.
Meanwhile, Beatley proposed the concept of “biophilic cities”, emphasizing that cities should closely integrate nature with urban life through design, policies, and community engagement [
20]. In recent years, the field of urban planning and construction has gradually placed greater emphasis on the concept of “Biophilic Cities”, which involves deeply integrating natural elements into urban spaces to improve residents’ health, enhance social resilience, and promote urban sustainability [
21]. Some scholars study how to achieve the goals of biophilic cities through green infrastructure, ecological design, and community participation. The development of smart cities is not isolated from the concept of biophilic cities; rather, the two can form a complementary relationship. Machine learning technologies can play a critical role in this integration process. Machine learning algorithms can optimize the distribution of urban green spaces and increase vegetation coverage, thereby improving air quality and regulating urban microclimates. Ecological monitoring systems based on sensor networks and computer vision technologies can track and analyze urban biodiversity in real time, providing essential support for formulating scientifically sound ecological conservation policies. Additionally, intelligent water resource management systems can utilize machine learning to predict precipitation patterns and optimize rainwater collection and reuse strategies, thereby enhancing urban ecological adaptability. Through these technological means, smart city construction can not only improve urban operational efficiency but also strengthen urban resilience—the ability of cities to cope with natural disasters, environmental changes, and socio-economic shocks. The rise of urban resilience can enhance residents’ subjective well-being by increasing their income or consumption, improving their mental and physical health, and boosting their social trust or social integration [
22].
However, despite the enormous potential of machine learning in smart city applications, the challenges it faces cannot be ignored. First, smart city systems are typically highly heterogeneous and dynamic, involving multi-source, multi-modal data from various fields, making it a key challenge to handle the complexity and uncertainty of such data [
23]. Second, data privacy and security issues are particularly prominent in the context of smart cities, especially when it comes to public safety and personal privacy [
24]. Ensuring the security and legal compliance of data has become an important consideration in the application of technology. In addition, the issues of explainability and fairness in machine learning models have also posed new requirements for the construction of smart cities [
25]. Urban management decisions often involve multiple stakeholders, and the black-box nature of some models may lead to non-transparent and unfair decision-making, further exacerbating social conflicts [
26]. Therefore, how to effectively apply machine learning technology in the complex environment of smart cities while ensuring its transparency, fairness, and security is a key challenge that academia and industry urgently need to resolve.
Given that the application of machine learning in smart cities has achieved preliminary results and demonstrated significant potential in urban management, public service optimization, and resource allocation, it is necessary to systematically review and analyze the current state of research in this field. However, existing studies often focus on specific application scenarios or the optimization of individual technologies, lacking a comprehensive examination of overall development trends. Additionally, discussions on issues such as model transparency, data security, fairness assurance, and cross-domain collaborative applications remain insufficient. A bibliometric analysis is therefore well suited to systematically reviewing the research progress on machine learning in smart cities, identifying the developmental trajectory of key technologies and emerging critical issues, and revealing possible future research directions. Based on large-scale academic research data, it can quantitatively uncover the research scale, thematic evolution, core research forces, and frontier trends in this field, providing a scientific foundation for subsequent studies and valuable references for policymakers and practitioners.
Therefore, this study focuses on the application prospects of machine learning technology in smart cities, highlighting the theoretical and practical significance of bibliometric research. The field is currently experiencing rapid growth, but the highly fragmented knowledge system and its interdisciplinary nature have led to three significant challenges. First, a vast number of research findings are dispersed across multiple disciplines, such as computer science, urban planning, and public administration, lacking the systematic integration of core issues, evolutionary pathways, and paradigm shifts. Second, the dynamic adaptation mechanism between the rapid iteration of technology and the evolving demands of urban governance remains unclear, resulting in a significant knowledge gap between fundamental theoretical research and technological applications. Third, the global distribution of research efforts is uneven, and the efficiency of constructing cross-institutional and cross-regional collaborative innovation networks needs to be improved. Using bibliometric methods, this study aims to construct a knowledge map to reveal the migration patterns of research hotspots, identify key academic communities and their collaboration models, and provide empirical support for overcoming technological application bottlenecks. This approach not only helps optimize resource allocation and facilitates interdisciplinary knowledge integration but also provides a scientific foundation for policymakers to establish technology risk assessment frameworks and promote a data-driven paradigm shift in urban governance. Ultimately, it contributes to the realization of sustainable smart city development goals. This research perspective offers important insights for both academic inquiry and practical applications, fostering the responsible use of machine learning in smart cities and cultivating a more intelligent, equitable, and trustworthy human–machine collaboration ecosystem.
The remainder of this paper is organized as follows:
Section 2 describes the research methodology and data sources;
Section 3 presents the temporal and spatial analyses;
Section 4 presents the hotspot analyses;
Section 5 presents the collaborative analyses;
Section 6 presents the results and the discussion; and
Section 7 presents the conclusions, including the limitations and outlook.
2. Research Methodology and Data Sources
2.1. Data-Screening Process
The first step of this study involved selecting reference journal articles from the Web of Science (WoS) database to create a unified analytical database. The WoS is widely recognized as a primary source of authoritative and representative citation data, with its indexed journal articles representing high-quality achievements in international social science research [
27]. Additionally, it enables more comprehensive citation analyses. The data for this study were sourced from five sub-databases within the Web of Science Core Collection: the Science Citation Index Expanded (SCI-EXPANDED), the Social Sciences Citation Index (SSCI), the Conference Proceedings Citation Index—Science (CPCI-S), Current Chemical Reactions (CCR-EXPANDED), and the Index Chemicus (IC). The search query was set to “Topic = (‘machine learning’) AND Topic = (‘smart city’)”, with the retrieval period extending until 31 December 2024. A total of 2326 published records were generated and saved in the Plain Text file format.
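As a rough illustration of how records saved in the WoS Plain Text format can be loaded for analysis, the sketch below parses the two-letter field tags (for example `AU`, `TI`, `PY`), treats indented lines as continuations of the previous tag, and closes a record at `ER`. The function name and the sample records are hypothetical and are not part of this study's pipeline; the tools described in Section 2.3 ship their own importers.

```python
from typing import Dict, List


def parse_wos_plaintext(text: str) -> List[Dict[str, List[str]]]:
    """Parse a WoS Plain Text export into records mapping tag -> list of values."""
    records: List[Dict[str, List[str]]] = []
    current: Dict[str, List[str]] = {}
    tag = None
    for line in text.splitlines():
        if not line.strip():
            continue  # blank separator between records
        head = line[:2]
        if head == "ER":  # end of record
            if current:
                records.append(current)
            current, tag = {}, None
        elif head in ("FN", "VR", "EF"):  # file header / end-of-file markers
            tag = None
        elif head == "  ":  # continuation of the previous tag
            if tag is not None:
                current[tag].append(line.strip())
        else:  # a new two-letter field tag
            tag = head
            current.setdefault(tag, []).append(line[3:].strip())
    return records


# Hypothetical two-record export used only to exercise the parser.
SAMPLE = """FN Clarivate Analytics Web of Science
VR 1.0
PT J
AU Doe, J
   Roe, R
TI Machine learning for
   smart city traffic
PY 2021
ER

PT J
AU Poe, E
TI Smart city energy forecasting
PY 2022
ER

EF
"""

for rec in parse_wos_plaintext(SAMPLE):
    print(rec["PY"][0], " ".join(rec["TI"]), "|", "; ".join(rec["AU"]))
```

Keeping every field as a list preserves multi-valued tags such as authors, while multi-line titles can be rejoined with a single space.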
In the second step, this study followed the standardized framework of the PRISMA statement (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) and the process guidelines of PRISMA-ScR (its extension for scoping reviews) to ensure transparency and reproducibility in the literature-screening process. Because this study aims to systematically map the knowledge evolution and structure of the interdisciplinary field of “machine learning + smart cities”, a scoping review approach was used to scan the existing literature broadly, rather than focusing on specific intervention effects as a systematic review would. Specifically, starting from the initial 2326 records retrieved from the WoS database, automated deduplication was performed using the EndNote X9 reference management software. Subsequently, two researchers independently conducted a double-blind screening of the titles and abstracts, excluding non-research items, such as book reviews and conference announcements. Discrepancies were resolved through group discussions, resulting in a final analytical sample of 828 valid articles. The screening decisions are summarized in a PRISMA flow diagram (Figure 1) with three stages: identification, screening, and inclusion.
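The deduplication step can be pictured with the minimal sketch below. It is not EndNote's matching algorithm, which is not described here. It assumes that two records are duplicates when they share a DOI (case-insensitive) or a normalized title. The field names `doi` and `title` and the sample records are hypothetical.

```python
import re
from typing import Dict, Iterable, List


def normalize_title(title: str) -> str:
    """Lowercase a title and collapse punctuation/whitespace runs to single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def deduplicate(records: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep the first record for each DOI or normalized title (assumed rule)."""
    seen = set()
    unique = []
    for rec in records:
        keys = set()
        doi = (rec.get("doi") or "").strip().lower()
        if doi:
            keys.add(("doi", doi))
        title = normalize_title(rec.get("title") or "")
        if title:
            keys.add(("title", title))
        if keys & seen:
            continue  # matches an earlier record on DOI or title
        seen |= keys
        unique.append(rec)
    return unique


# Hypothetical records; the DOIs are placeholders, not real identifiers.
raw = [
    {"doi": "10.1000/abc", "title": "Machine Learning in Smart Cities"},
    {"doi": "10.1000/ABC", "title": "Machine learning in smart cities."},
    {"doi": "", "title": "Machine Learning in Smart Cities"},
    {"doi": "10.1000/xyz", "title": "Urban energy forecasting"},
]
kept = deduplicate(raw)
print(f"identified={len(raw)} duplicates_removed={len(raw) - len(kept)} screened={len(kept)}")
```

The counts printed at the end correspond to the kind of numbers a PRISMA flow diagram reports at the identification stage.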
2.2. Inclusion and Exclusion Criteria
This section helps readers reproduce the selection process and explains why only a limited number of papers met the selection criteria. The following inclusion and exclusion criteria were used to filter the relevant papers.
Inclusion criteria:
Research topic: studies must involve both machine learning and smart cities, with a direct connection between the two.
Source of articles: papers must be indexed in one of the five sub-databases of the Web of Science (WoS) Core Collection used in this study: SCI-EXPANDED, SSCI, CPCI-S, CCR-EXPANDED, and IC.
Language: to ensure accessibility and comprehension, only papers written in English were included.
Full-text access: only articles with full-text availability were included to allow for an in-depth analysis.
Publication date: papers must have been published by 31 December 2024, to maintain the timeliness of the study.
Exclusion criteria:
Research topic: studies that focused solely on either machine learning or smart cities, without addressing their intersection, were excluded.
Not peer-reviewed: non-scientific publications, such as grey literature and opinion papers, which are commonly found in non-peer-reviewed sources, were excluded.
Language: papers not published in English were excluded to ensure international readability and comparability.
Duplicate studies: redundant studies were excluded to avoid data duplication.
Inaccessible papers: papers that could not be fully accessed were not considered for the final selection.
After applying the above screening criteria, a total of 829 high-quality research papers were included, providing a solid data foundation for the subsequent analysis.
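The criteria above can be expressed as an ordered set of checks, which makes the reason for each exclusion explicit. The sketch below is illustrative only: the actual screening was done by two researchers reading titles and abstracts, and the keyword test, the `Paper` fields, and the list of non-research document types are assumptions introduced here.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple

CUTOFF = date(2024, 12, 31)
# Assumption: document-type labels treated as non-research items.
NON_RESEARCH_TYPES = {"book review", "conference announcement", "opinion"}


@dataclass
class Paper:
    title: str
    abstract: str
    language: str
    doc_type: str
    published: date
    full_text: bool


def exclusion_reason(p: Paper) -> Optional[str]:
    """Return the first criterion a paper fails, or None if it is included."""
    text = f"{p.title} {p.abstract}".lower()
    # Crude keyword proxy for "addresses both topics"; "smart cit" covers city/cities.
    if "machine learning" not in text or "smart cit" not in text:
        return "topic"
    if p.doc_type.lower() in NON_RESEARCH_TYPES:
        return "not peer-reviewed research"
    if p.language.lower() != "english":
        return "language"
    if p.published > CUTOFF:
        return "publication date"
    if not p.full_text:
        return "full text unavailable"
    return None


def screen(papers: List[Paper]) -> Tuple[List[Paper], Counter]:
    """Split papers into the included set and a tally of exclusion reasons."""
    reasons = [exclusion_reason(p) for p in papers]
    included = [p for p, r in zip(papers, reasons) if r is None]
    return included, Counter(r for r in reasons if r is not None)
```

Returning a tally of reasons alongside the included set gives the per-criterion exclusion counts that a flow diagram typically reports.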
2.3. Research Methods
Scimago Graphica 1.0.49 is a visualization tool commonly used in bibliometric analyses. It comes from the SCImago group, whose Scimago Journal and Country Rank (SJR) draws on the Scopus database, and it supports the systematic analysis of academic publications, journals’ impact, and research trends by combining multiple bibliometric indicators. The tool is widely applied in scientometrics and knowledge mapping research, where it helps reveal the academic influence, research hotspots, and developmental trends of a particular field. In the research domain of smart cities and machine learning, a bibliometric analysis using Scimago Graphica can help researchers identify high-impact journals, core research institutions, key research topics, and their evolutionary trends, thereby providing data support and scientific evidence for future studies. Furthermore, it can uncover international collaboration patterns and the research contributions of different countries and institutions, offering insights for the coordinated development of global smart city construction and intelligent technology applications.
CiteSpace 6.3.R1 is a widely used software tool for scientometric and visual analyses, developed by Chaomei Chen. Based on co-citation analysis, it visually presents the structure, evolution, and dynamics of knowledge domains [
28]. This study employed CiteSpace to create knowledge maps, identify research frontiers and breakthroughs in the field, and reveal the evolutionary trajectory of research topics. Through features such as time zone views, cluster views, and timeline views in CiteSpace, the developmental context and key nodes of research topics can be intuitively displayed, aiding in understanding the global structure and trends of machine learning applications in smart cities.
VOSviewer 1.6.20 is a text mining and visual analysis software developed by Nees Jan van Eck and Ludo Waltman, utilizing VOS (Visualization of Similarities) technology to generate network structure diagrams based on similarity calculations between nodes [
29]. This study used VOSviewer for a keyword co-occurrence analysis and a cluster analysis to identify the semantic structure and knowledge base of research topics, revealing the connections and distinctions between different themes. Through features such as density visualization and overlay visualization in VOSviewer, the distribution characteristics and evolutionary trends of machine learning applications in smart cities can be systematically and intuitively revealed.
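The raw input to a keyword co-occurrence map is a count of how often two keywords appear on the same paper. The sketch below computes only those pair counts. It does not reproduce VOSviewer's similarity normalization, layout, or clustering, and the keyword lists are hypothetical.

```python
from collections import Counter
from itertools import combinations
from typing import Iterable, List


def cooccurrence(keyword_lists: Iterable[List[str]]) -> Counter:
    """Count unordered keyword pairs that appear together on the same paper."""
    counts: Counter = Counter()
    for keywords in keyword_lists:
        # Normalize case and drop repeats so each pair counts once per paper.
        unique = sorted({k.strip().lower() for k in keywords if k.strip()})
        counts.update(combinations(unique, 2))
    return counts


# Hypothetical author keywords for three papers.
docs = [
    ["Machine Learning", "Smart City", "IoT"],
    ["smart city", "machine learning"],
    ["IoT", "Smart City", "Deep Learning"],
]
pairs = cooccurrence(docs)
for (a, b), n in pairs.most_common(3):
    print(f"{a} -- {b}: {n}")
```

Sorting the keywords within each paper gives every pair a single canonical order, so (a, b) and (b, a) are never counted separately.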
Bibliometrix is a bibliometric analysis toolkit written in R, developed by Aria and Cuccurullo. It provides a series of functions and commands for the descriptive analysis, collaboration network analysis, and thematic co-occurrence analysis of research data [
30]. This study employed Bibliometrix for an author collaboration network analysis, revealing the distribution of research strengths and collaboration patterns in the field and identifying core authors and research institutions. Through features such as collaboration network maps and country collaboration maps in Bibliometrix, the collaborative relationships and centrality among authors and institutions in the field of machine learning applications in smart cities can be clearly displayed.
In summary, to comprehensively analyze the research status and trends of machine learning applications in smart cities, this study adopted a quantitative bibliometric approach that integrates Scimago Graphica, CiteSpace, VOSviewer, and Bibliometrix to examine the field from multiple dimensions. The specific research process is illustrated in
Figure 2. First, the time-slicing technique of Scimago was employed to observe the temporal evolution trends of research topics, and Bibliometrix was used to identify high-impact journals, authors, and institutions, thereby clarifying the core literature and foundational research in the field. Second, VOSviewer and CiteSpace were utilized for study clustering and a burst analysis to identify research frontiers and trend shifts at different stages. Third, Scimago was used to map collaboration networks among authors, institutions, and countries, analyzing academic collaboration patterns. Finally, the research results from various dimensions were synthesized to summarize the knowledge system of the field, predict future development trends, and explore emerging technological directions.
3. Analysis of the Overall Research Situation
3.1. Analysis of Publication Quantity and Trends
As shown in
Figure 2, the trend in publication volume from 2006 to 2024 divides the research process into three distinct periods, each with its own output pattern.
During the first period, up to 2017, the number of publications was relatively low and grew slowly, with annual publications consistently remaining below 50. The number increased gradually from 2 in 2014 to 24 in 2017. This indicates that the application of machine learning in smart cities was still in the exploratory stage, with limited attention from the academic community. Research primarily focused on theoretical discussions and preliminary technical attempts, with few practical applications. In the early 2010s, machine learning technologies were not yet mature, and their applications in smart cities remained experimental, with research mainly concentrating on exploring fundamental algorithms. Additionally, governmental and corporate interest in smart city development had not yet peaked, resulting in limited investment and policy support.
Starting in 2017, the number of publications entered a period of rapid growth, reaching a peak of 147 in 2022. With advancements in deep learning, the capabilities of machine learning in urban data analyses significantly improved, driving explosive growth in smart city-related research during this phase. Research directions gradually expanded to include key areas such as traffic optimization [
31], energy management [
32], environmental monitoring [
33] and urban planning [
34]. Concurrently, governments worldwide introduced smart city development strategies, such as the EU’s “Digital Europe Programme” and China’s “Guidelines for New Smart City Construction”, coupled with increased investment from governments and enterprises, further propelling research in this field.
After 2022, the number of publications fluctuated but remained at a relatively high level. Following the peak of 147 in 2022, the number decreased to 127 in 2023 but rebounded to 136 in 2024. This suggests that research in this field has reached a mature stage. With the maturation of foundational technologies, the research focus gradually shifted from theoretical exploration to the optimization of specific applications, such as intelligent traffic management, urban energy scheduling, and smart security systems. At the same time, there was an in-depth exploration of issues such as technological ethics, data privacy, and model interpretability [
25].
Looking at the trend in cumulative publications in
Figure 3, research was limited before 2014 but entered a period of rapid development after 2014, with annual output peaking in 2022. Overall, research on machine learning in smart cities underwent a process of initial exploration → rapid growth → stable development. In the future, this field may further deepen in areas such as intelligent governance, cross-domain integration, and sustainable urban development, aiming to enhance the intelligence level and practical application value of smart cities.
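Annual and cumulative publication curves of this kind can be derived from a list of publication years, and the year-on-year change makes the post-2022 fluctuation concrete. In the sketch below, `sample_years` is a hypothetical toy list rather than the study's data, while the last two calls use the annual counts reported in the text (147, 127, and 136).

```python
from collections import Counter
from itertools import accumulate
from typing import Iterable, List, Tuple


def yearly_and_cumulative(years: Iterable[int]) -> List[Tuple[int, int, int]]:
    """Return (year, annual count, cumulative count), filling empty years with 0."""
    counts = Counter(years)
    span = range(min(counts), max(counts) + 1)
    annual = [counts.get(y, 0) for y in span]
    return list(zip(span, annual, accumulate(annual)))


def pct_change(previous: int, current: int) -> float:
    """Year-on-year change in percent."""
    return 100.0 * (current - previous) / previous


# Hypothetical publication years, one entry per paper.
sample_years = [2014, 2014, 2016, 2017, 2017, 2017]
for year, n, total in yearly_and_cumulative(sample_years):
    print(year, n, total)

# Annual counts reported in the text for 2022, 2023, and 2024.
print(round(pct_change(147, 127), 1))
print(round(pct_change(127, 136), 1))
```

With the reported counts, the change from 2022 to 2023 is about −13.6% and the rebound from 2023 to 2024 is about +7.1%.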
3.2. High-Yield Country Information
Figure 4 illustrates the global distribution of scientific research achievements related to machine learning and smart cities. The number of publications from each country is represented by varying shades of blue, with darker colors indicating a higher output of related research articles. As shown in
Figure 4, China and the United States are the two countries with the highest number of publications globally, with the darkest colors, indicating their dominant positions in research on machine learning and smart cities. This dominance may stem from the long-term, large-scale funding support provided by the Chinese and U.S. governments for artificial intelligence (AI) and smart city research. For instance, the U.S. National Science Foundation (NSF) and the National Natural Science Foundation of China (NSFC) have established numerous specialized funding programs. Additionally, both countries possess world-leading cloud computing platforms (e.g., Alibaba Cloud, AWS) and smart city pilot projects, while many developing countries still lack high-performance computing centers, extensive data collection systems, and infrastructure such as smart transportation and power grids, which hinders related research. Furthermore, several European countries (e.g., Germany, the United Kingdom, France) also stand out, with darker shades indicating a high scientific output.
However, the lighter colors in South America, Africa, and parts of Asia suggest relatively lower research outputs in these regions. This disparity is not merely a matter of research interest or capability but stems from multiple systemic barriers. First, funding gaps are a critical factor. Developing and underdeveloped countries often face severe shortages in research funding, limiting their investments in high-tech fields [
35]. Second, infrastructure limitations pose another significant obstacle. Many African and South American countries lack advanced research facilities, high-speed internet, and a stable electricity supply, all of which are essential for conducting research into machine learning and smart cities. Third, the distribution of talent reinforces the gap. The United States boasts world-leading universities and research institutions that attract scholars globally, while China has strengthened talent recruitment in recent years through policies such as the “Thousand Talents Plan”. In contrast, higher education systems in Africa and South America have fewer resources for scientific training, and many talented scholars migrate to Europe and the United States for better salaries or research conditions, further weakening local research capabilities [
36].
This map provides a global perspective, highlighting the disparities in AI and smart city development across different regions. Overall, developed countries, particularly those with advanced economies and technological capabilities, are significantly ahead in this field. As technological innovation plays an increasingly vital role in driving economic and social development, strengthening international cooperation and promoting the sharing of knowledge and experience will help narrow the technological gap between nations.
3.3. High-Yield Journal Information
As shown in
Figure 4,
IEEE Access ranks first, with 136 publications, making it one of the most influential journals in this field and highlighting its significant role in research related to smart cities and machine learning.
Sensors follows closely, with 114 publications, indicating that sensor technology also holds a critical position in smart city research, likely due to its close connection with technologies such as the Internet of Things (IoT). The journal
Sustainability has 69 publications, demonstrating that research in this field is closely tied to sustainable development. The construction of smart cities involves not only technology but also emphasizes the sustainable use of the environment and resources.
Sustainable Cities and Society (60 publications) and
Applied Sciences-Basel (56 publications) reflect the broad links between smart city development and society, as well as scientific applications, indicating a trend toward interdisciplinary research. The
IEEE Internet of Things Journal also published 56 articles, underscoring the importance of the IoT in smart cities, as IoT technology is one of the core technologies in this domain.
Electronics (39 publications) and
Energies (25 publications) reflect the research interest in and applications of electronic technology and energy systems in smart cities. Additionally,
Future Generation Computer Systems-
The International Journal and
Multimedia Tools and Applications have 25 and 19 publications, respectively, indicating the potential applications of future computing systems and multimedia tools in smart cities.
As shown in
Figure 5,
IEEE Access ranks first, with 45 publications, indicating its important position in research related to smart cities and machine learning [
37].
Sensors follows closely, with 44 publications, demonstrating the significant role of sensor technology in smart cities, likely closely related to technologies such as the Internet of Things (IoT) [
38]. The journal
Sustainability has 29 publications, indicating that research in this field is closely linked to sustainable development, emphasizing that the construction of smart cities involves not only technology but also the sustainable use of the environment and resources [
39]. The IEEE Internet of Things Journal (24 publications) [40] underscores the importance of the IoT in smart cities, with IoT technology being one of the core technologies of smart cities. Applied Sciences-Basel (22 publications) [41] and Sustainable Cities and Society (17 publications) [42] show the broad connections between smart city construction and social and scientific applications, indicating a trend of multidisciplinary research.
Electronics (nine publications) [
43],
Energies (nine publications) [
44], and
Sustainable Energy Technologies and Assessments (nine publications) [
45] reflect the applications of and research interest in electronics technology and energy systems in smart cities. Additionally, the publication volume of
CMC-Computers, Materials & Continua (seven publications) [
46] reflects the application of computer science and materials science in this field.
Overall, the distribution of these journals reflects the interdisciplinary nature of smart city research, encompassing multiple fields, such as the Internet of Things (IoT), machine learning, sustainable development, and energy management, showcasing the research landscape of smart cities as a comprehensive innovation product. Research trends indicate that technological exploration in smart cities is not limited to engineering and computer science but is also closely integrated with fields such as social sciences and environmental sciences. In the future, with the continuous advancement of the IoT, next-generation computing systems, and multimedia tools, the depth and breadth of smart city research will further expand, offering more possibilities for the intelligent and sustainable development of cities.
Figure 6 shows the top ten most frequently cited journals in the field of machine learning and smart cities. From the figure, it is evident that
IEEE Access is the most cited journal, leading significantly with 1158 local citations. This large number of citations is likely due to its broad coverage and the high number of articles it publishes. Following this is
Sensors-
Basel, with 608 citations, indicating its important position in research within this field. The IEEE Internet of Things Journal ranks third, with 541 citations. Additionally, Sustainable Cities and Society (390 citations), arXiv (383 citations), and Future Generation Computer Systems (363 citations) are also on the list. Although their citation counts are lower than those of the top journals, their content still holds considerable influence in the fields of smart cities and machine learning. The distribution of citations shows that research outcomes in this field are concentrated in a few high-impact outlets, reflecting the concentration and authority of research hotspots. At the same time, these outlets span engineering, computing, and sustainability, demonstrating the cross-disciplinary dissemination of research outcomes.
3.4. High-Yield Author Information
Figure 7 shows the top 10 authors with the most publications in the field of machine learning and smart cities, along with their number of publications. NESI P ranks first, with 10 articles. NESI P’s research mainly focuses on data management and optimization in smart cities, including smart urban infrastructure, traffic flow optimization, and energy management based on machine learning [
47,
48,
49]. His research outcomes have played an important role in enhancing the operational efficiency of smart cities and promoting sustainable urban development. Following closely are KHAN MA and KIM K, both of whom have published eight articles. KHAN MA mainly focuses on the development of intelligent systems in smart cities, such as intelligent transportation, environmental monitoring, and urban safety systems, aiming to enhance urban governance through deep learning and data analysis [
50,
51,
52]. KIM K focuses on the integration of big data analysis and smart city applications, using machine learning techniques to optimize urban resource scheduling and improve the quality of public services [
53]. ABBAS S and KUMAR A have each published six articles. ABBAS S primarily researches the application of the Internet of Things (IoT) and machine learning in smart cities, including smart sensor networks, real-time data processing, and urban monitoring systems [
54,
55,
56,
57]. KUMAR A has contributed to network security and privacy protection in smart cities, focusing on the application of machine learning methods to ensure data security and prevent network threats [
58,
59]. Additionally, BELLINI P [
60], BOUKERCHE A [
61], CHEN X [
62], and GUPTA S [
63] have each published five articles. Their research covers several key areas of smart cities, such as the optimization of intelligent transportation systems, urban health monitoring, large-scale data processing, and AI-driven urban service innovation.
The research of these prolific scholars is concentrated on different aspects of smart city development, particularly in the application of machine learning. Their contributions provide a solid theoretical foundation and technical support for the development of core technologies in smart cities. Smart cities involve the real-time processing and analysis of large volumes of heterogeneous data, spanning areas such as transportation, energy, environment, and public services. The application of machine learning has significantly enhanced the intelligence of urban management. The studies conducted by these scholars not only address practical issues in smart cities, such as traffic congestion [
64], energy consumption optimization [
65], and environmental monitoring [
66], but also offer innovative ideas for building smarter and more efficient cities in the future [
67].
Figure 8 shows significant differences in scientific research output across countries in this field, indicating a certain degree of imbalance. China leads by a large margin, with its single-country publications (SCPs) far exceeding those of other countries, demonstrating the strong capability of Chinese scholars to conduct independent research domestically. China has also engaged in international collaboration, measured as multiple-country publications (MCPs), but compared to its substantial SCP volume, the proportion of international collaboration is relatively low. India and the United States rank next, with roughly comparable research outputs that still lag significantly behind China. India has a higher SCP count, indicating strong domestic independent research capabilities, while the United States has a relatively higher proportion of international collaboration, reflecting its more open academic cooperation model.
From a regional perspective, East Asian countries (such as China and South Korea) play a significant role in research in this field, with China’s dominance being particularly notable. Meanwhile, Western European countries (such as Italy, the United Kingdom, France, and Spain) and North American countries (such as the United States and Canada) also contribute significantly to the research output, with higher MCP proportions, indicating their active participation in international collaboration. Additionally, countries in the Middle East and South Asia (such as Saudi Arabia, Pakistan, and the United Arab Emirates) have shown a certain level of research activity in this field, reflecting a growing regional interest in related studies. Other countries (such as Malaysia, Morocco, Greece, and Turkey) have relatively lower research outputs, highlighting the regional disparities in research resources and development in this field.
Overall, China has the highest proportion of single-country research, while countries such as the United States, Canada, and the United Kingdom exhibit higher proportions of international collaboration. This distribution pattern reflects the regional development characteristics of machine learning and smart city research, while also revealing the importance and diversity of international scientific collaboration.
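The SCP and MCP shares discussed above can be sketched in a few lines. The snippet below is a minimal illustration, not the procedure used to produce Figure 8. It assumes that each record carries the corresponding author's country and the list of all author countries, and that a paper counts as a multiple-country publication (MCP) when its authors come from more than one country. The field names `corresponding_country` and `countries`, the country labels, and the records themselves are hypothetical.

```python
from collections import Counter


def scp_mcp_by_country(papers):
    """Count single-country (SCP) and multiple-country (MCP) papers.

    Each paper is attributed to its corresponding author's country
    (an assumed convention; the source does not state the rule used).
    """
    scp, mcp = Counter(), Counter()
    for paper in papers:
        home = paper["corresponding_country"]
        if len(set(paper["countries"])) > 1:
            mcp[home] += 1  # authors from at least two countries
        else:
            scp[home] += 1  # all authors from one country
    summary = {}
    for country in set(scp) | set(mcp):
        total = scp[country] + mcp[country]
        summary[country] = {
            "SCP": scp[country],
            "MCP": mcp[country],
            "MCP_ratio": mcp[country] / total,
        }
    return summary


# Hypothetical toy records, for illustration only.
toy_papers = [
    {"corresponding_country": "X", "countries": ["X"]},
    {"corresponding_country": "X", "countries": ["X", "Y"]},
    {"corresponding_country": "X", "countries": ["X"]},
    {"corresponding_country": "Y", "countries": ["Y"]},
]
for country, stats in sorted(scp_mcp_by_country(toy_papers).items()):
    print(country, stats)
```

A low `MCP_ratio` combined with a high total corresponds to the pattern described for China, while a higher ratio corresponds to the more collaboration-oriented profile described for the United States, Canada, and the United Kingdom.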
3.5. High-Yield Institutions Information
As shown in Figure 9, in terms of publication volume, King Saud University ranks first, with 17 publications. It is followed by the National Institute of Technology (NIT System) and Wuhan University, with 15 publications each. The Egyptian Knowledge Bank (EKB), the Indian Institute of Technology System (IIT System), King Abdulaziz University, and the Vellore Institute of Technology (VIT) have each contributed 13 papers, demonstrating notable performance. Additionally, the University of Florence and the University of Ottawa are tied, with 12 publications each, while the Chinese Academy of Sciences follows closely, with 11 publications.
From an institutional perspective, universities and research institutions in Saudi Arabia, India, Egypt, and China exhibit strong research capabilities in this field. Saudi Arabia’s King Saud University and King Abdulaziz University are particularly active in research on smart cities and artificial intelligence, likely due to government investments in scientific research and the emphasis on smart city development. India’s NIT System, IIT System, and VIT also show high publication volumes, reflecting the country’s growing research focus on smart cities and technological innovation. Egypt’s Egyptian Knowledge Bank (EKB), as a key academic resource platform, plays a significant role in advancing research on smart cities and artificial intelligence. Meanwhile, the Chinese Academy of Sciences and Wuhan University, as leading research institutions in China, have also demonstrated considerable influence in the fields of smart cities and machine learning.
Overall, smart cities and artificial intelligence technologies are attracting attention from universities and research institutions worldwide. Research institutions in countries such as Saudi Arabia, India, Egypt, and China have shown outstanding performance in this field, driving advancements in related technologies.
4. Analysis of Research Hotspots and Frontiers
4.1. Keyword Co-Occurrence Analysis
Figure 10 illustrates the clustering of keywords in the fields of machine learning and smart cities, reflecting the interconnections between different concepts and thematic focuses in the research. The red cluster revolves around “smart city” and is closely associated with keywords such as the IoT, neural network, deep learning, and intelligent transportation system. The research focuses on using technologies like deep learning and neural networks to optimize data processing, pattern recognition, and predictive analysis in smart cities [
68,
69]. It also involves applications such as traffic flow optimization, route planning, and smart parking to improve urban transportation efficiency [
70]. The green cluster mainly focuses on artificial intelligence, data mining, and data analysis, and also involves keywords like renewable energy, sentiment analysis, and energy efficiency. Machine learning and data mining are widely applied in smart cities, including social media analysis, urban sentiment analysis, and energy optimization. The blue cluster involves keywords like cybersecurity, information security, digital twin, and challenges, reflecting that with the extensive application of IoT devices [
71], information security and cybersecurity have become key research directions. This involves issues like data privacy and network attack defense, and it highlights the challenges faced in smart city construction, such as data privacy, infrastructure security, and social acceptance [
72]. The yellow cluster revolves around internet-of-things, edge computing, wireless networks, and 5G, indicating that the high-speed, low-latency characteristics of 5G technology, combined with the local data processing capabilities of edge computing, provide more efficient computing and communication methods for smart cities. The purple cluster covers keywords like big data analytics, smart grid, reinforcement learning, and artificial neural network [
73]. Optimizing energy consumption through machine learning can enhance the efficiency of power systems, for example through smart meters and demand response.
Keywords are interconnected through lines, showing their interdependence and co-occurrence in research. From the figure, it can be seen that machine learning is the core concept of multiple clusters and is closely linked with keywords like big data analytics, cloud computing, and internet-of-things, indicating that big data analysis in smart cities relies on cloud computing and AI technologies for processing. In addition, 5G and edge computing show that smart cities are evolving toward more efficient computing and communication methods. These technological advancements provide more efficient data processing and communication means for smart cities, promoting intelligent urban management and improving residents’ quality of life.
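The co-occurrence links behind a map such as Figure 10 can be illustrated with a short sketch. It assumes that two keywords co-occur when they appear in the keyword list of the same paper and that the weight of a link is the number of such papers. The function name, the normalization step (lowercasing and trimming), and the keyword lists are hypothetical, and the clustering and layout performed by the visualization software are not reproduced here.

```python
from collections import Counter
from itertools import combinations


def keyword_cooccurrence(keyword_lists):
    """Return a Counter mapping keyword pairs to the number of papers
    in which both keywords appear."""
    pair_counts = Counter()
    for keywords in keyword_lists:
        # Normalize and deduplicate so a pair is counted once per paper.
        unique = sorted({k.strip().lower() for k in keywords if k.strip()})
        for first, second in combinations(unique, 2):
            pair_counts[(first, second)] += 1
    return pair_counts


# Hypothetical keyword lists, one per paper.
toy_keywords = [
    ["Smart City", "IoT", "deep learning"],
    ["smart city", "iot"],
    ["IoT", "edge computing"],
]
for pair, weight in keyword_cooccurrence(toy_keywords).most_common():
    print(pair, weight)
```

Pairs with larger counts would correspond to thicker or shorter links in a co-occurrence map, and keywords that take part in many pairs would appear as central nodes.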
4.2. Keyword Emergence Analysis
Figure 11 presents the “Top 15 Keywords with the Strongest Citation Bursts”, that is, the keywords whose frequency of use rose sharply during specific time periods. This analysis is significant for revealing dynamic changes and emerging hotspots in the research field. The “Year” of a keyword indicates its first appearance, “Strength” quantifies the intensity of the burst, and “Begin” and “End” mark the duration of the burst. The red bars represent the active periods of the keywords.
The year 2016 marked an important milestone, with keywords such as “data mining” (strength 3.92), “data analytics” (strength 3.34), “global positioning system” (strength 2.37), “genetic algorithm” (strength 1.77), and “low cost” (strength 1.77) emerging and forming bursts that lasted until 2018–2019. These keywords are closely related to research areas such as big data, machine learning, positioning technologies, and intelligent algorithms, reflecting the academic community’s strong focus on data mining, analytics, and low-cost optimization technologies at the time. From 2019 to 2020, the burst keywords “random forest” (strength 2.07), “intelligent transportation system (ITS)” (strength 1.97), and “optimization” (strength 1.93) indicated a shift in the research focus toward intelligent transportation systems and optimization algorithms. This trend reflects the widespread application of artificial intelligence and data-driven technologies in smart transportation, such as intelligent scheduling, route optimization, and autonomous driving. In 2021, the keywords “intelligence” (strength 2.23), “attacks” (strength 2.02), and “things” (strength 1.94) highlighted the growing attention to topics such as artificial intelligence, cybersecurity attacks, and the Internet of Things (IoT). This is likely linked to the rise of AI safety concerns and IoT security threats, reflecting the academic community’s emphasis on the security of intelligent systems. From 2022 to 2023, the keywords “feature extraction” (strength 1.91), “cloud” (strength 1.85), “federated learning” (strength 2.27), and “smart home” (strength 2.11) indicated a shift in research hotspots toward feature extraction, cloud computing, federated learning, and smart home technologies. Notably, federated learning emerged as a burst keyword in 2023 and continues into 2024, demonstrating increasing attention to privacy-preserving distributed machine learning techniques.
This trend illustrates the evolution of the research field from foundational technologies in data and algorithms to the security of intelligent systems, and then to broader smart applications, such as smart homes and cloud computing. It reflects the continuous advancement of technology and the expansion of application scenarios.
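To make the notions of “Begin” and “End” concrete, the sketch below flags the years in which a keyword's share of all keyword occurrences is well above its overall share. This is a deliberately simple frequency-ratio heuristic and is not the burst detection algorithm used to produce Figure 11, so it does not reproduce the “Strength” values reported there. The function name, the `ratio` threshold, and the yearly counts are hypothetical.

```python
def burst_intervals(keyword_counts, total_counts, ratio=2.0):
    """Return (begin, end) year intervals in which the keyword's yearly
    share is at least `ratio` times its share over the whole period.

    keyword_counts: {year: occurrences of the keyword}
    total_counts:   {year: occurrences of all keywords}, each value > 0
    """
    years = sorted(total_counts)
    overall = (sum(keyword_counts.get(y, 0) for y in years)
               / sum(total_counts[y] for y in years))
    intervals = []
    begin = end = None
    for year in years:
        share = keyword_counts.get(year, 0) / total_counts[year]
        if overall > 0 and share >= ratio * overall:
            if begin is None:
                begin = year  # a new burst starts
            end = year
        elif begin is not None:
            intervals.append((begin, end))  # the burst has ended
            begin = end = None
    if begin is not None:
        intervals.append((begin, end))  # burst still active in last year
    return intervals


# Hypothetical yearly counts, for illustration only.
toy_totals = {2016: 100, 2017: 100, 2018: 100, 2019: 100, 2020: 100}
toy_keyword = {2016: 1, 2017: 11, 2018: 12, 2019: 1}
print(burst_intervals(toy_keyword, toy_totals))
```

With these toy counts the keyword is flagged for one interval covering the two years in which its share is highest, which mirrors how a red bar in a burst chart spans the active period of a keyword.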
4.3. Keyword Time Zone Analysis
Figure 12 presents a keyword co-occurrence time zone map in the fields of machine learning and smart cities. The size of the nodes reflects the frequency of the keywords in the literature, while the color gradient from blue to yellow indicates the temporal evolution, spanning from around 2020 to 2022. As shown in the figure, “smart city”, “machine learning”, and “internet of things” are core concepts in the research, with larger and densely connected nodes, highlighting their importance in smart city technology applications. From a temporal perspective, the gradual transition of some keywords from blue to green and yellow reflects the evolution of research hotspots in recent years. For example, keywords such as “cybersecurity” and “data privacy” exhibit colors leaning toward green and yellow, indicating that data security and privacy protection have gained increasing attention as smart cities develop [
42]. Given that smart cities involve the collection, storage, and analysis of vast amounts of data, ensuring data security has become a key research focus. Additionally, technologies such as “cloud computing”, “edge computing”, and “blockchain” also hold significant positions in smart city research [
74], with node colors showing their rising research prominence during 2021–2022. In particular, cloud computing and edge computing play critical roles in the intelligent infrastructure of smart cities, such as in data storage, real-time computing, and resource optimization.
Notably, keywords such as “5G”, “smart grid”, and “renewable energy” exhibit node colors closer to yellow, indicating increased research activity in these areas in recent years [
75]. The widespread adoption of 5G technology supports efficient communication in smart cities, while advancements in smart grids and renewable energy reflect progress in sustainability and energy management within smart cities [
76]. In artificial intelligence (AI)-related research, the appearance of keywords such as “deep learning”, “LSTM”, and “ensemble learning” demonstrates the growing application of deep learning methods in smart cities, including intelligent transportation, air pollution prediction, and urban planning.
Overall, in the construction of smart cities, machine learning, as a core technology, is being widely applied across various subfields, such as cloud computing, the Internet of Things, data privacy, and cybersecurity. Over time, the co-occurrence relationships among these keywords have become increasingly close, indicating their synergistic roles within the overall framework of smart cities, thereby driving continuous development in this field.
Figure 13 uses different colors to represent distinct research themes, ranging from #0 to #9, which include “digital twin”, “electric vehicles”, “internet of things”, “big data”, “smart grid”, “neural networks”, “attacks”, “edge computing”, “support vector machines”, and “5G”. The size of each keyword’s bubble indicates the activity level of the research theme at a specific time point, while darker colors signify greater influence or citation concentration in the field.
First, digital twin is one of the most prominent research topics, spanning the entire timeline and showing rapid growth in recent years. Digital twin technology, through the synchronous mapping of virtual models with the physical world, provides robust data support for smart cities, industrial manufacturing, and infrastructure management. With advancements in artificial intelligence and IoT technologies, digital twin applications have become increasingly widespread in areas such as smart cities, energy management, and intelligent transportation [
77]. Second, IoT technology plays a critical role in smart city development by enhancing urban management efficiency through sensor networks, data collection, and intelligent analysis [
58]. As seen in the figure, IoT-related keywords, such as “data analytics”, “intelligent transportation systems”, and “deep learning”, are closely linked, reflecting the mainstream trend of data-driven smart cities. Third, research on smart grids and electric vehicles is highly active, particularly in areas such as energy consumption and resource management. As the world transitions toward sustainable energy, smart grid technologies, integrated with big data, AI, and IoT, optimize power distribution and improve energy efficiency [
78]. Fourth, machine learning techniques, such as neural networks and support vector machines, play significant roles in anomaly detection, security protection, and intelligent decision-making in smart cities [
79]. Fifth, with the increasing digitalization of smart cities, cybersecurity has become a key research focus [
80]. Keywords such as “attacks”, “privacy”, and “intrusion detection” reflect growing concerns about cybersecurity, especially in IoT and smart infrastructure applications. Meanwhile, edge computing, as a critical technology for optimizing data processing and reducing latency, is also widely applied in smart cities, 5G networks, and intelligent manufacturing.
Research in the smart city domain is evolving toward data-driven and intelligent decision-making. With the continuous maturation of technologies such as deep learning, edge computing, and digital twins, smart city development is gradually transitioning from theoretical exploration to practical applications, aiming to address real-world challenges in energy management, environmental optimization, and security protection. In the future, advancements in 5G networks and 6G mobile communication will further drive the upgrade of smart city network infrastructure, enhancing data transmission efficiency and intelligent capabilities.
5. Analysis of Cooperation Networks
5.1. Core Author Collaborative Network
Using the Scimago software, the literature database in the field of “machine learning and smart cities” was analyzed, and an author collaboration network diagram was generated. Figure 14 presents the collaboration network among different scholars in the form of a chord diagram, with the colors distinguishing various research groups and the thickness of the connecting lines representing the strength of collaboration (weight), reflecting the research hotspots and collaboration trends in this field.
From Figure 14, it can be observed that researchers such as Paolo Nesi, Muhammad Adnan Khan, Muhammad Ali Imran, and Azzedine Boukerche occupy central positions in the collaboration network. Their research teams maintain close collaborations with multiple scholars, forming concentrated collaboration clusters [
47,
49,
60,
81,
82]. Paolo Nesi’s collaboration network is extensive, involving multiple research groups, and he has significant research influence in areas such as intelligent transportation and urban big data analytics. Muhammad Adnan Khan and Muhammad Ali Imran also play important roles in this field, with their collaboration clusters being tightly knit and focused on network optimization, smart city communication architectures, and IoT-related research [
57,
83]. Azzedine Boukerche and his team concentrate on research in security, privacy protection, and intelligent computing infrastructure in smart cities, collaborating with numerous scholars [
84]. Additionally, scholars such as Zohra Ahmed and Sergeher Abbas form a relatively independent collaboration network, focusing on specific applications in smart cities, such as intelligent healthcare or smart energy management [
85]. Chen Xi, Chen Huijun, and Chen Jiayuan exhibit strong collaborative relationships, with in-depth cooperation in areas such as edge computing and data mining for smart cities [
86,
87].
Overall, academic collaboration in this research field exhibits a multi-centric characteristic, with several core scholars and their teams forming dense collaboration networks, while some relatively independent research groups also exist. This indicates that research in smart cities and machine learning not only relies on international cross-team collaboration but also includes scholars focusing on specific research directions. In the future, this field could further strengthen cross-team and interdisciplinary collaboration to more comprehensively advance smart city technologies, achieving the deeper integration of data intelligence, the IoT, edge computing, and other technologies, thereby enhancing the overall intelligence level of smart cities.
5.2. Institutional Cooperation Network
From Figure 15, it can be observed that the institutional collaboration network in the current research field exhibits a high degree of complexity and an internationalized trend of cooperation. The connections between core institutions are tight, forming multiple academic collaboration hubs, reflecting the close cooperation patterns among institutions from different regions and types. The structure of this network reveals several prominent central nodes, which represent the major research institutions in the field and play a key role in global academic exchange and collaboration.
The Chinese Academy of Sciences occupies a central position in the collaboration network, indicating its significant influence in global academic research. It maintains extensive collaborations with multiple domestic and international institutions, such as Wuhan University, Shanghai Jiao Tong University, and Korea University, forming a relatively tight collaboration network. This reflects the trend of domestic and international cooperation among Chinese research institutions in related technological fields and underscores China’s important position in the global landscape of scientific and technological innovation. Additionally, King Saud University and King Abdulaziz University, as representative research institutions in the Middle East, also exhibit strong connectivity in the collaboration network. They have formed regional collaboration clusters with multiple universities in the region, such as Qatar University, Taibah University, and Umm Al Qura University. The Middle East’s investment in research areas such as smart cities, artificial intelligence, and related fields is continuously growing, and its academic influence is expanding through strengthened cooperation.
Notably, the Hong Kong University of Science and Technology and the University of Hong Kong, as representative institutions in Hong Kong, China, maintain collaborative ties with multiple international institutions. This demonstrates the bridging role of Hong Kong universities in the global academic network, facilitating academic exchanges between Asia and regions such as Europe, America, and the Middle East, while also promoting the cross-border flow of knowledge. In terms of collaboration intensity, the connections between some institutions are thicker, indicating closer cooperation in this field. For example, the connection between the Chinese Academy of Sciences and Wuhan University is strong, reflecting deep collaboration among domestic research institutions. Similarly, the collaboration between King Saud University and King Abdulaziz University is relatively tight, indicating that research institutions within the same country or region often form stable collaboration networks.
Overall, Figure 15 highlights the evident trends of the internationalization and diversification of collaboration in the fields of machine learning and smart cities. By establishing cross-border and cross-institutional cooperation hubs, the field has facilitated the global sharing and dissemination of academic research and technological innovation. In the future, by further strengthening cross-border and cross-institutional collaboration, the field of smart cities is expected to achieve more efficient technology sharing and academic exchange, providing innovative momentum for the intelligent and sustainable development of cities worldwide.
5.3. International Cooperative Research
Figure 16 illustrates the collaborative relationships among multiple countries, presented in the form of an arc network that reflects the intensity of cooperation. The lines of different colors in the figure represent distinct collaborative groups, while the thickness of the lines indicates the strength of collaboration. Table 1 displays the top ten countries ranked by centrality scores. As shown in Figure 16, the collaboration exhibits a clear global characteristic, involving major regions such as Asia, Europe, North America, and Oceania, and forming a transcontinental collaboration network. Countries such as China, the USA, and several European nations (e.g., Germany, France, and the UK) play central roles in the collaboration network. The dense and widely radiating collaboration lines of these countries indicate their strong international influence in research or technological cooperation and their leading role in promoting global collaboration. The close collaborative relationships among European countries (e.g., Germany, France, the UK, Spain, and Sweden) reflect the tight connections in academic research or technological fields within Europe. This phenomenon may be related to the European Union’s internal research collaboration frameworks, such as the Horizon Europe program, which has facilitated cooperation among member states.
Collaboration in Asia is particularly active, with strong connections among countries such as China (0.31), India (0.2), Japan, and South Korea (0.12), where the values in parentheses are the centrality scores listed in Table 1. Among these, China serves as the collaborative hub in Asia, maintaining close ties with multiple Asian countries (e.g., Malaysia, Pakistan, and Saudi Arabia (0.1)), demonstrating its growing technological capabilities and international influence in this field. Additionally, the USA (0.11), as a key player in global scientific research and technological collaboration, maintains strong connections with numerous countries. Italy (0.09) and Australia (0.1) also hold significant positions in the collaboration network, particularly in their notable collaborations with the USA, Europe, and Asia. This may be attributed to the high degree of internationalization in higher education and technological research and development in these countries.
The figure reveals the patterns of global scientific research or technological collaboration, showcasing a network centered around China, the USA, and several European countries, while also highlighting the tight intra-regional cooperation and transcontinental collaboration trends. This globalized collaboration model not only promotes knowledge sharing but also drives technological advancements and industrial development. In the future, with the deepening of technological innovation and the strengthening of international cooperation, this collaboration network is likely to expand and deepen further.
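The idea of a centrality score for a country in a collaboration network can be illustrated as follows. The text does not state which centrality measure underlies Table 1. The sketch assumes normalized betweenness centrality, a common choice in bibliometric tools, computed with Brandes' algorithm on an unweighted, undirected graph. The node labels and edges are hypothetical and do not represent the data behind Figure 16.

```python
from collections import deque


def betweenness_centrality(adjacency):
    """Normalized betweenness centrality for an unweighted, undirected
    graph given as {node: set of neighbours} (Brandes' algorithm)."""
    nodes = list(adjacency)
    score = {v: 0.0 for v in nodes}
    for source in nodes:
        # Breadth-first search that counts shortest paths from source.
        order = []
        preds = {v: [] for v in nodes}
        paths = {v: 0 for v in nodes}
        dist = {v: -1 for v in nodes}
        paths[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        # Accumulate dependencies from the farthest nodes backwards.
        delta = {v: 0.0 for v in nodes}
        while order:
            w = order.pop()
            for v in preds[w]:
                delta[v] += paths[v] / paths[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    n = len(nodes)
    # Each unordered pair was visited twice, so dividing by
    # (n - 1)(n - 2) gives the usual undirected normalization.
    scale = 1.0 / ((n - 1) * (n - 2)) if n > 2 else 1.0
    return {v: score[v] * scale for v in nodes}


# Hypothetical collaboration graph: hub "A" links three partners.
toy_graph = {"A": {"B", "C", "D"}, "B": {"A"}, "C": {"A"}, "D": {"A"}}
print(betweenness_centrality(toy_graph))
```

In this toy graph every shortest path between two partners passes through the hub, so the hub receives the maximum score and the partners receive zero. In a real collaboration network, a country with a high score lies on many of the shortest collaboration paths between other countries.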
6. Discussion and Research Needs
Research on machine learning in the field of smart cities has made significant progress in recent years, with expanding research hotspots and increasing international collaboration. However, addressing issues such as fairness, privacy and security, and technological sustainability in urban digital transformation remains a critical challenge for both academia and industry. In the future, the deep integration of machine learning and smart cities will further drive innovation in urban governance models, enhance the ability to optimize resource allocation, and provide global urban residents with a more intelligent, efficient, and sustainable urban living experience, as illustrated in Figure 17.
6.1. Spatial and Temporal Distribution
The phased and regional characteristics of research in this field are influenced by multiple factors. From a temporal perspective, the evolution of research is driven not only by technological advancements but also by policy support, industrial application demands, and the improvement of data infrastructure. During the initial development stage (2014–2017), research was primarily constrained by computational capabilities and algorithm development, focusing on theoretical exploration and preliminary applications. The rapid growth stage (2017–2021) benefited from breakthroughs in deep learning technologies, the widespread adoption of Internet of Things (IoT) devices, and the promotion of smart city construction policies in various countries, leading to the expansion of research hotspots into multiple areas of urban governance. Entering the stable stage (2022 to the present), the research focus has gradually shifted from technological breakthroughs to application optimization, interdisciplinary integration, and ethical and privacy concerns, reflecting the maturation of smart city construction and the increasing trend of practice-driven academic research.
From a spatial perspective, the imbalance in research capabilities is primarily related to economic development levels, government investment, the distribution of scientific resources, and international collaboration networks. Countries such as China, the United States, and India have high research outputs in the fields of smart cities and machine learning, benefiting from rapid urbanization, strong technological industrial foundations, and government policy support. European countries exhibit high activity in international collaborations, reflecting the openness of their research systems and their capacity for transnational cooperation. In contrast, regions such as South America and Africa have lower research outputs, mainly due to limited funding, inadequate infrastructure, and a shortage of high-end technical talent.
Future research needs to place a greater emphasis on practicality and inclusivity. While maintaining technological innovation, attention should also be given to lowering the barriers to technology application and promoting the equitable sharing of smart city construction achievements. At the same time, interdisciplinary and cross-domain research collaborations will become crucial pathways for advancing this field. It is recommended to strengthen the construction of international collaboration networks, promote balanced global development in smart city research, and provide more diverse solutions to address urbanization challenges.
6.2. Hotspot Analyses
From the hotspot analyses, it is evident that the research fields of machine learning and smart cities exhibit a multi-thematic and dynamically evolving characteristic. This trend is primarily driven by factors such as technological advancements, the demand for urban digital transformation, and challenges related to data security. The red cluster revolves around technologies like smart cities, the Internet of Things (IoT), and deep learning, reflecting core application scenarios such as intelligent traffic optimization. This highlights the critical demand for smart cities to enhance urban operational efficiency and residents’ quality of life. The green cluster focuses on artificial intelligence, data mining, and energy efficiency, indicating that the widespread application needs—from social media analysis to urban energy management—are driving the popularization of data-driven decision-making. The blue cluster emphasizes cybersecurity and digital twins, underscoring the urgency of data privacy protection and the security of critical infrastructure, especially as urban systems increasingly rely on interconnected technologies. The yellow cluster integrates new infrastructures, such as 5G and edge computing, providing efficient computational and communication support for smart cities, enabling distributed intelligent computing and real-time data processing. The purple cluster focuses on big data analytics and smart grids, showcasing the potential of machine learning in energy optimization and sustainable development.
From a temporal perspective, research hotspots have undergone significant evolution. In the early stages (2016–2019), the focus was primarily on foundational technologies, such as data mining, analysis, and optimization algorithms, laying the theoretical groundwork for subsequent applications. From 2019 to 2020, research began shifting toward specific application scenarios, like intelligent transportation systems, marking the transition of smart city technologies from laboratories to real-world environments. Starting in 2021, issues such as AI security and IoT security gained more attention, reflecting the growing complexity of smart city systems and the increasing scale of data, making security a critical concern. Recently (2022–2023), emerging hotspots include feature extraction, cloud computing, federated learning, and smart homes. The rise of privacy-preserving technologies like federated learning indicates that, in the context of increasingly widespread data sharing, achieving efficient intelligent computing while ensuring data security has become one of the core challenges in research.
Future smart city research should focus further on interdisciplinary integration, data security and privacy protection, the optimization of intelligent infrastructure, sustainable development and carbon neutrality, human–machine collaboration, and explainability. We recommend strengthening industry–academia–research collaboration, promoting the deep integration of technological innovation with practical applications, and driving the high-quality development of smart city construction. At the same time, ethical norms and standards systems need to be strengthened to ensure that technological development aligns with societal values.
6.3. Analysis of Cooperation
The global collaboration network in the field of smart cities and machine learning is becoming increasingly dense. This trend stems not only from the complexity and interdisciplinary nature of the technology itself but also from national policy support, growing industry demand, and the need to share data resources. In the scholars’ collaboration network, research teams involving key scholars, such as Paolo Nesi, Muhammad Adnan Khan, Muhammad Ali Imran, and Azzedine Boukerche, continue to deepen their cooperation in critical subfields of smart cities, including intelligent transportation, network optimization, security and privacy, and the Internet of Things. These close academic connections help drive technological innovation and its practical implementation. In the institutional collaboration network, institutions such as the Chinese Academy of Sciences and King Saud University play a leading role in international cooperation. In particular, the Chinese Academy of Sciences, leveraging its extensive international cooperation network, has established solid collaborative relationships with both domestic and international universities and research institutions, making it an important hub for research in smart cities and machine learning.
Several factors explain the formation of this global model of scientific cooperation. First, the construction of smart cities involves multiple technical fields, such as data analysis, artificial intelligence, the Internet of Things, and 5G communication. No single institution or country can independently achieve breakthroughs across the entire technological chain, which makes transnational cooperation an essential path to technological innovation. Second, the growing demand for smart city applications from governments and enterprises prompts research institutions to strengthen international cooperation in order to share research outcomes, optimize resource allocation, and accelerate technology transfer. Third, as the development of artificial intelligence brings challenges in data privacy and security, the international academic community is paying increasing attention to security, ethics, and regulation. This encourages scholars from different countries to cooperate more closely and jointly explore feasible technical standards and policy frameworks on a global scale.
Future research should focus on establishing multi-level international cooperation platforms to promote knowledge sharing and technology transfer; enhancing exchanges among young scholars to cultivate cross-cultural research capabilities; promoting the deep integration of industry, academia, and research to accelerate the translation of scientific achievements into practice; improving the evaluation mechanisms for international cooperation projects to increase cooperation efficiency; and exploring new cooperation models in the “Internet+” era to overcome geographical limitations. By sustaining innovation in smart cities and machine learning and deepening international cooperation, collaborative research is expected to achieve major progress in the intelligent transformation of cities and to provide strong support for sustainable urban development worldwide.
7. Conclusions
This study employs bibliometric methods to systematically analyze research on machine learning in the field of smart cities, revealing its spatial and temporal distribution, research hotspots, and cooperation network structures. The results show that this research exhibits distinct phases and regional characteristics and continues to evolve with technological advances and application demands. By clarifying these developmental stages and regional patterns, the study helps researchers understand the evolutionary trajectory and global landscape of the domain; in particular, it identifies the period from 2017 to 2021 as a critical phase of rapid growth, which provides a temporal reference for future research directions. The analysis of research hotspots highlights the importance of data privacy, network security, sustainable development, and intelligent transportation, reflecting the dual demand for technological innovation and green development in smart city construction and indicating priority directions for policymakers and technology developers. The cooperation network analysis indicates that international academic collaboration is becoming increasingly close, with research institutions in China, the United States, and Europe playing central roles in the global cooperation system. This collaboration drives technology sharing and interdisciplinary integration and offers practical guidance for fostering collaborative innovation in global smart city development. Overall, the study provides a systematic analytical framework for academic research and technological innovation in smart cities and machine learning, as well as a scientific basis for optimizing future research directions and allocating resources rationally.
However, this study has several major limitations. First, in terms of methodology, relying solely on the Web of Science database for publication retrieval may have omitted some important studies; in particular, high-quality conference papers or relevant research indexed only in other databases might not have been analyzed. Second, selecting only studies published in English introduces a language bias, potentially overlooking significant contributions from non-English academic communities, especially from countries and regions with innovative practices in smart city development. To mitigate these limitations, future research could draw on other major research databases and include multilingual publications to obtain a more comprehensive perspective. Third, smart city research is a multidisciplinary field involving computer science, urban planning, sociology, environmental science, and more, whereas this study focuses mainly on machine learning technology itself and explores its interdisciplinary applications and impacts less thoroughly, which may lead to an incomplete understanding of the overall development of smart cities. Future research should integrate multidisciplinary perspectives and examine the impacts of machine learning on the social, economic, cultural, and policy aspects of smart cities. For example, it could investigate how machine learning promotes social inclusiveness and equitable resource distribution in smart cities. With the rapid development of technologies such as generative artificial intelligence, edge computing, and blockchain, future research could also examine the potential applications of these emerging technologies in smart cities. Furthermore, future studies should pay more attention to smart city practices in developing countries and non-English-speaking regions to produce more globally representative findings.
Overall, research on machine learning in the field of smart cities has entered a mature stage and is gradually deepening toward multidisciplinary integration, technology security, and sustainable development. Sustaining technological innovation while protecting data privacy, enhancing technological inclusiveness, and promoting sustainable development will become important research topics in this field. By strengthening international cooperation, promoting technology sharing, and advancing policy formulation, the construction of smart cities is expected to achieve more intelligent, efficient, and sustainable development, creating a more livable future for urban residents worldwide.