3.1. Statistics by Subject of Publication
From 1 January 2014 to 1 May 2024, 596 articles were published, involving 175 journals, 198 countries and regions, 1716 institutions, and 2360 authors. The statistical results are summarized in
Table 1.
Figure 2 depicts the quantity and trend of publications related to GI and air pollution over the past decade. In 2022, there were 130 relevant publications. Although there was a slight decrease in 2023, which can be attributed to the impact of COVID-19, the steadily increasing trendline indicates a growing interest among researchers in this field.
Regarding national distribution (
Table 2), China leads with 125 publications (20.93%), making it the country with the highest number of publications in this research area. The United States and England follow China, with 112 (18.79%) and 104 (17.45%) publications, respectively. Italy and Australia also rank among the top five countries, indicating significant attention from researchers in these countries to the impact of GI on air pollution.
In terms of institutional affiliations, the top ten research institutions are predominantly from European and North American countries, with Italy (three institutions), the United Kingdom (two institutions), and the United States (two institutions) leading the list (
Table 3). Only two research institutions from China make it to the top ten, indicating a greater focus on green sustainable development topics in European and North American countries. Conversely, there appears to be a lack of emphasis on environmental and air pollution issues in Asia, where air pollution is more severe.
In terms of disciplinary distribution, the top ten research disciplines are environmental sciences and ecology, other scientific and technological fields, forestry, urban studies, plant sciences, engineering, architectural technology, meteorology and atmospheric sciences, public, environmental, and occupational health, and energy and fuels (
Table 4). Certain articles belong to interdisciplinary fields, which may be simultaneously classified into different disciplinary domains. Hence, the disciplinary classification count exceeds the total number of journal articles.
3.2. Author Collaboration Network
To some extent, the total number of papers an author has published reflects that author’s academic standing in the field, whereas the author collaboration network reveals the core group of researchers and their collaborative relationships. This network was visualized and analyzed using CiteSpace, with the results shown in
Figure 3. The size of the nodes represents the number of papers co-authored by the authors; the connections between nodes indicate collaborative relationships among different authors; and the node colors represent the publication years of the papers. Analyzing the number of papers authored by researchers and their connections in the research field helps identify prolific and influential authors [
27].
From
Figure 3, we screened authors with at least four publications. Each node corresponds to an author, the connecting lines between nodes represent links between authors, and the node color corresponds to the time of appearance. The software calculation yielded 2361 nodes and 7185 edges. The network diagram depicts the largest connected network. Regarding the number of published articles, Kumar P has the highest number, with 32 papers (
Table 5), making him the most prolific author in this field. Kumar P is also extensively cited, has formed close collaborative relationships with numerous researchers, and has significantly influenced the field. Other noteworthy authors include Branquinho C, Zhang KM, McPhearson T, and Abhijith KV. Although they have not published the most papers, their works have been cited to varying degrees by other scholars in the field (
Table 6), indicating their influence and research potential in this area.
3.3. Journal Citation Path Dual-Map
The citation path dual-map of journals provides an intuitive depiction of the paths between citing and cited literature in various disciplinary fields, offering insights into the development and changes within the field (
Figure 4). The left side of the figure represents the disciplinary fields of the citing journals, while the right side represents those of the cited journals. These disciplinary classifications are based on the journal classifications provided by the Web of Science database in 2011. The red pathways indicate the development of disciplinary fields within the literature.
The graph shows that the literature in this research field has been mainly published over the past decade in “Ecology, earth, marine” and “Veterinary, animal, science”. Between 2014 and 2024, there has been a shift towards “Mathematics, systems, mathematical”. A smaller portion of the literature is in “Psychology, education, health” and “Economics, and politics”. The cited literature primarily originates from disciplines such as “Environmental, toxicology, nutrition” and “Plant, ecology, geology, geophysics”, while another portion comes from “Psychology, education, social” and “Economics, economic, political” fields. Furthermore, a small portion is sourced from “Health, nursing, medicine”.
This research field exhibits distinct interdisciplinary characteristics, likely involving complex systems and multidimensional issues necessitating the integration of knowledge and methodologies from various disciplines. Moreover, from a disciplinary development perspective, the field appears to be transitioning from predominantly natural and biological sciences toward greater reliance on mathematical and systems science methodologies. This shift may be driven by the need for more precise models and data analysis to address the complexity of research questions. Additionally, research in this field may require handling large volumes of complex data and developing comprehensive models to understand system behavior and changes, incorporating methods such as drone measurements and geographic information systems (GIS).
3.4. Keyword Co-Occurrence Network
One of the primary approaches in keyword co-occurrence analysis is to extract information such as keywords, abstracts, and titles from the target literature to form an intuitive knowledge map based on co-occurrence relationships. By recording the frequency of keyword occurrences, high-frequency keywords in the target literature can be identified, combined with a timeline to reveal hot topics in a research field over a period of time [
27]. In terms of parameter settings, because the literature database for this study spans the ten years from 2014 to 2024, the time slice is set to one year, and the node selection criterion is the top 50, meaning that the 50 most cited or most frequently occurring items in each time slice are retained. Sub-networks consisting of keywords that appear only once are also pruned.
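The counting step behind a keyword co-occurrence network can be illustrated with a minimal sketch. It is not the CiteSpace implementation: the keyword lists are hypothetical, and the function name `cooccurrence` and the `min_freq` parameter are assumptions introduced only for this example. Keywords that occur only once are dropped before pairs are counted, loosely mirroring the pruning described above.

```python
from collections import Counter
from itertools import combinations

# Hypothetical keyword lists, one per article (not data from this study).
records = [
    ["green infrastructure", "air pollution", "particulate matter"],
    ["green infrastructure", "air pollution", "deposition"],
    ["air pollution", "particulate matter", "street canyon"],
    ["green infrastructure", "urban heat island"],
]


def cooccurrence(records, min_freq=2):
    """Return (node frequencies, edge weights) for a co-occurrence network.

    Keywords occurring in fewer than `min_freq` articles are dropped, and
    each remaining keyword pair is counted once per article containing both.
    """
    freq = Counter(kw for rec in records for kw in set(rec))
    kept = {kw for kw, n in freq.items() if n >= min_freq}
    edges = Counter()
    for rec in records:
        present = sorted(set(rec) & kept)  # sort so each pair has one key
        for a, b in combinations(present, 2):
            edges[(a, b)] += 1
    return {kw: freq[kw] for kw in kept}, dict(edges)


nodes, edges = cooccurrence(records)
```

In this toy example, node size would correspond to the values in `nodes` and line thickness to the values in `edges`.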
The results (
Figure 5) show a co-occurrence network with 216 keyword nodes and 948 connections. Node size represents the frequency of keyword occurrence, and the connections between nodes represent co-occurrence relationships. The thickness of a connection indicates the strength of keyword co-occurrence (the thicker the line, the stronger the connection between keywords), and the color of nodes and lines represents the year of appearance. Unless otherwise specified, all nodes, colors, and connections in the figures of this paper follow the above description.
Analyzing the graph, aside from the two largest nodes, “Green infrastructure” and “air pollution”, other noteworthy keywords include “ecosystem services”, “climate change”, “particulate matter”, “deposition”, “vegetation”, “urban heat island”, and “health”. These keywords indicate that the research focus in this field mainly revolves around the impact of GI on ecological environments, urban climate change, and health.
Examining the timeline, starting from 2022, research has primarily revolved around keywords such as “particulate matter”, “deposition”, “emissions”, “street canyon”, “urban air quality”, “physical activity”, and “pollutant dispersion”. This indicates that the research trend in this field is gradually shifting towards particulate matter deposition, street canyon dynamics, and pollutant dispersion, which may become new directions for future research.
Furthermore, keyword emergence visually indicates a particular keyword’s first appearance, end time, and citation intensity. Analyzing keyword bursts (
Figure 6), we observe that “Built environment” first appeared in 2016 and experienced a surge in citations from 2022 to 2024, with an intensity of 3.85, confirming the earlier prediction. Additionally, “Temperature” first appeared in 2020 and experienced a surge in citations from 2022 to 2024, with an intensity of 2.68, indicating it remains a current research hotspot. The other two hotspot keywords are “urban forestry” and “mortality”, further confirming the current research focus in this field.
In addition, the centrality of keywords is a crucial indicator of the positional significance of different keywords in the research field. It serves as an important basis for determining scholars’ focus of attention. Examining the centrality indicators representing the importance of node positions in the network (
Table 7), “Physical activity” and “Climate change” exhibit close connections with other hotspot keywords, indicating their frequent placement along paths connecting other keywords. This suggests that they play an active role in facilitating document interrelationships.
3.5. Keyword Cluster Network
Besides the keyword co-occurrence network, CiteSpace can automatically classify these keywords into varying-sized clusters using built-in algorithms. Observing these clusters can help us to discern the distribution of research topics in the field. Additionally, the size of the clusters reflects the popularity of the research topics and their changes over time. The parameter settings for this graph are the same as those for the keyword co-occurrence network, with the largest ten clusters selected from the network. The resulting keyword cluster graph is depicted in
Figure 7. Each colored block represents a cluster, with the color indicating the cluster’s average publication year. This graph comprises 527 nodes and 2450 links. The modularity value (Q) is associated with the density of nodes, where a higher Q value indicates better clustering effectiveness. The average silhouette value (S) measures the homogeneity of clusters, with a higher S indicating greater homogeneity and, thus, higher credibility. In this graph, Q = 0.7011, suggesting good clustering effectiveness, while S = 0.8669 indicates high credibility for this clustering [
27].
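To make the role of Q more concrete, the following sketch computes the standard Newman modularity for a small undirected network with a given partition. It is an illustration under stated assumptions, not the CiteSpace implementation: the toy graph (two triangles joined by one edge), the function name `modularity`, and the cluster labels are all hypothetical.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman modularity Q of an undirected, unweighted graph.

    edges: iterable of (u, v) pairs (at least one edge is assumed).
    community: dict mapping each node to its cluster label.
    Q = sum over clusters c of [L_c / m - (d_c / (2m))**2], where m is the
    number of edges, L_c the number of edges inside c, and d_c the total
    degree of the nodes in c.
    """
    edges = list(edges)
    m = len(edges)
    internal = defaultdict(int)
    degree = defaultdict(int)
    for u, v in edges:
        degree[community[u]] += 1
        degree[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    return sum(internal[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


# Hypothetical toy network: two triangles connected by a single edge.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
clusters = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
q = modularity(edges, clusters)
```

For this toy network, Q is positive for the two-triangle partition and zero when all nodes are placed in a single cluster, which is why a higher Q is read as a clearer cluster structure.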
The largest ten clusters in this research field are listed in
Table 8. The average year data shows that the average year of these clusters is around 2016, indicating that this field has already formed a relatively mature research system. The “Size” column represents the number of keywords included in each cluster. By examining the connections in the graph, it can be seen that in 2024, the research topics in this field mainly revolve around keywords such as “impact”, “particulate matter”, “environmental”, “city”, “urban green infrastructure”, “trees”, and “urban street canyon”. However, no new research topics have emerged in recent years, indicating the need for more opportunities for interdisciplinary integration and greater attention from researchers to drive the development of this research field.
In CiteSpace, the intermediary centrality of a piece of literature can be calculated based on its citation and co-citation relationships with the other literature. A higher intermediary centrality value indicates a more important node position in the network. Such nodes fall into two categories: (a) hub nodes highly connected with other nodes and (b) nodes located between different clusters. Since nodes between clusters are likely to connect different research topics, they may represent emerging research trends [
33].
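As an illustration of why nodes located between clusters score highly, the sketch below computes normalized betweenness centrality with Brandes' algorithm, on the assumption that the intermediary centrality discussed here corresponds to betweenness centrality. The toy network and the function name `betweenness` are hypothetical and are not taken from this study.

```python
from collections import deque


def betweenness(adj):
    """Normalized betweenness centrality (Brandes' algorithm).

    adj: dict mapping each node to a set of neighbours (undirected,
    unweighted graph). Returned values are scaled to the range [0, 1].
    """
    score = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        order = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:  # first visit
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate pair dependencies in reverse BFS order.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    n = len(adj)
    # Every unordered pair was counted twice, so divide by (n - 1)(n - 2).
    scale = (n - 1) * (n - 2)
    return {v: (c / scale if scale else 0.0) for v, c in score.items()}


# Hypothetical toy network: two triangles bridged by the edge c-d.
adj = {
    "a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"},
    "d": {"c", "e", "f"}, "e": {"d", "f"}, "f": {"d", "e"},
}
bc = betweenness(adj)
```

In this example only the two bridging nodes, c and d, receive a non-zero score, which matches category (b) above.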
In Cluster #0, the literature with the highest intermediary centrality related to the theme “Green Space” is the study by Ebisu et al. [
34]: “Association between greenness, urbanicity, and birth weight”. This study utilized birth certificate data from Connecticut, USA (2000–2006) to investigate the relationship between green spaces or green areas and birth weight. The findings revealed that increased green space near residences was associated with increased birth weight and a reduced risk of low birth weight (LBW). The results suggest that integrating more green spaces into urban environments can reduce the risk of adverse birth outcomes and play a significant role in public health planning. Meanwhile, another piece of literature with high intermediary centrality [
35] confirms from the public health perspective that GI positively affects the community and individual physical and mental health. Weerakkody et al. [
36] explored the differences in atmospheric particulate matter (PM) capture ability among plant species. The results indicated that selecting plant combinations with small leaves, hairs, and waxy surfaces may enhance the effectiveness of green wall systems as atmospheric PM filters. Additionally, refs. [
37,
38], respectively, confirmed from other perspectives the positive impact of GI on improving air pollution.
In Cluster #1, the most cited keywords are air pollution (206), vegetation (102), and deposition (61). The literature with the highest intermediary centrality related to the theme “Roadside Vegetation Barrier” is by Tong et al. [
39], titled “Roadside vegetation barrier designs to mitigate near-road air pollution impacts”. This study assessed the impact of roadside vegetation barriers on air quality, indicating that wide and dense vegetation barriers are more effective. Additionally, ref. [
28] discussed the influence of roadside green belts on individual exposure to local air pollution in street and building environments. The literature mentioned that greening of building envelope structures—green walls and green roofs—is a passive atmospheric pollution control measure considered a sustainable control method.
In Cluster #2, the most cited keywords are green infrastructure (334), quality (89), urban (42), dispersion (33), and street canyon (32). The literature with the highest intermediary centrality related to the theme “Quantitative assessment” is by Derkzen et al. [
40], titled “Quantifying urban ecosystem services based on high-resolution data of urban green space: an assessment for Rotterdam, the Netherlands”. This study quantified and mapped six ecosystem services supplied by green spaces in Rotterdam using high-resolution land cover data, emphasizing the importance of green space design in urban planning for healthier and more climate-resilient cities. Meanwhile, Baró et al. [
41] proposed a new method to assess the mismatch between supply and demand of ecosystem services in urban areas. Based on environmental quality standards and policy objectives, it was revealed that regulating ecosystem services has a minor role in reducing urban air pollution and greenhouse gas emissions. The research results indicate that the contribution of regulatory ecosystem services provided by urban GI to maintain environmental quality standards is limited, serving as a complementary measure compared to other urban policies. Russo et al. [
42] integrated ENVI-met and UFORE models with the local field, pollution, and climate data to explore the local-scale effects of different tree-dominated streetscapes in Bolzano, Italy, on mitigating temperature and air pollution. The results demonstrate the positive role of urban trees in reducing pollution, improving temperature, and enhancing human comfort, contributing to assessing the role of urban GI in improving human well-being and mitigating the impacts of climate change.
Another noteworthy cluster is Cluster #4, focusing on sustainable cities. The most frequently occurring keywords are “urban heat island” (32), “space” (32), and “roofs” (14). This cluster encompasses topics related to sustainable urban development, including green cities, sponge cities, drainage systems, and urban heat islands. The literature with the highest intermediary centrality in this cluster is by Song et al. [
43], titled “Diurnal changes in urban boundary layer environment induced by urban greening”. Using integrated urban-land-atmosphere modeling, this study found that green roofs significantly reduce urban boundary layer temperatures, while the effect of street canyon greening is limited. Urban greening induces different trends in air quality changes over diurnal cycles, contributing to improved urban environments and sustainability. Another highly central piece of literature [
44] in this cluster summarizes the various environmental and social benefits brought to cities by combining urban drainage systems and GI.
3.6. Keyword Cluster Timeline
The authors created a timeline graph of keyword clusters to analyze the evolution of research development over time from a temporal perspective (
Figure 8). The timeline graph arranges the keywords of research topics according to chronological order, providing a clear display in a two-dimensional coordinate system with time as the horizontal axis. In the timeline graph, the size of the nodes represents the frequency of occurrence of the keywords; the year associated with the nodes indicates the year of the first appearance of the keyword; the lines between nodes represent keywords appearing together in a document, indicating the relationship between keywords; and red nodes represent hotspot keywords.
Through analysis, it was found that the high-frequency keywords first appeared predominantly in 2014 across various clusters. Although new keywords continue to emerge after that, there is an overall trend of decline. To some extent, this suggests a decrease in the overall research interest in this field in recent years, a phenomenon also supported by the data presented in
Table 8. Furthermore, in recent years, the main research topics in this field have mainly focused on two directions: Cluster #0, green space; and Cluster #1, roadside vegetation barrier. From a temporal perspective, the research content has evolved from the initial focus on green roofs and vegetation to more in-depth topics such as modeling exercises, nature-based solutions, PM2.5 dispersion, etc.
3.7. The Literature Co-Citation Network and Overlay Network
The literature co-citation network is a scientific bibliometric method used in literature analysis. It quantifies the co-citation relationships between documents [
45]. Co-citation analysis measures how often pairs of cited references are cited together by other (citing) documents, analyzing and calculating the cross-referencing relationships between selected papers. When two documents are frequently cited together by other documents, this indicates a strong correlation between them [
45,
46]. Network analysis is employed to visualize the co-citation structure of a network of documents. To better understand the connections and development process between research areas, the authors calculated and visualized the literature co-citation network using CiteSpace (
Figure 9). The authors then overlaid the visualization networks to observe more intuitively how the literature was cited from 2023 to 2024. The logic behind overlaying the networks is to superimpose the smaller dataset as one layer onto the layer of the larger dataset. This allows for a more intuitive observation of the citation situation of the years under analysis (small data) within the entire data range (
Figure 10) and the development process and changes in research areas. The Q value of this visualization result is 0.7062, and the S value is 0.8931, indicating high credibility for the results.
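The co-citation counting that underlies such a network can be sketched as follows. This is a simplified illustration rather than the CiteSpace procedure: the citing papers, the reference labels, and the function name `cocitation_counts` are hypothetical. Two references are co-cited once for every citing paper whose reference list contains both.

```python
from collections import Counter
from itertools import combinations

# Hypothetical citing papers and their reference lists (labels are made up).
reference_lists = {
    "citing_1": ["ref_A", "ref_B", "ref_C"],
    "citing_2": ["ref_A", "ref_B"],
    "citing_3": ["ref_B", "ref_C"],
}


def cocitation_counts(reference_lists):
    """Count, for each pair of references, how many papers cite both."""
    counts = Counter()
    for refs in reference_lists.values():
        # Sort so that each unordered pair is stored under a single key.
        for pair in combinations(sorted(set(refs)), 2):
            counts[pair] += 1
    return counts


counts = cocitation_counts(reference_lists)
```

An overlay for selected years could, under the same assumptions, be obtained by running the count once on all citing papers and once on the subset published in those years, then drawing the second result on top of the first.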
By comparing the two figures, it can be observed that the literature frequently cited from 2023 to 2024 mainly focuses on the following topics: evergreen woody plants, sustainable communities, unmanned aerial vehicle measurement, wind condition, green roofs, living walls, spatiotemporal analysis, etc. (indicated by red lines). This indicates that the recent research directions in GI and air pollution predominantly lean towards several areas, including (a) green plants and ecology [
47,
48,
49,
50,
51]: researching plant species, growth habits, and their roles in ecosystems; (b) sustainable development and communities [
15,
52,
53,
54]: establishing and maintaining environmentally friendly and resource-efficient urban communities; (c) drone technology and applications [
55,
56,
57,
58]: drones are used for environmental monitoring and data collection, including measuring wind conditions and thermal data; (d) green technology and urban planning [
16,
59,
60,
61,
62]: improving urban environments, conserving energy, and beautifying cities through plant cultivation; (e) spatial and temporal data analysis [
63,
64,
65,
66]: analyzing data changes over time and space using GIS or other techniques; (f) life safety and health [
67,
68,
69]: investigating the impact of GI on human health.
Furthermore, these research areas emphasize environmental protection and sustainable development, highlighting the application of technological innovations in addressing environmental issues. However, compared to the research domain clusters from 2018 to 2019, which mainly focused on traditional environmental issues and urban management areas, such as urban ecosystem services, traffic-related pollutant concentration, and metropolitan cities (indicated by green lines), there is an overall trend of weakening. This could be attributed to several reasons: (a) Technological advancements: In recent years, the advancement and widespread adoption of technologies such as drones and smart devices have provided new methods and higher precision for environmental detection and analysis. (b) Increased environmental awareness: With a significant global increase in attention to environmental protection and sustainable development, research directions have shifted from singular pollution source control towards comprehensive ecosystem management and sustainable community development. (c) Escalation of climate change: The increasingly significant impacts of climate change have prompted research to focus more on ecosystem adaptation and mitigation strategies, such as studying the influence of wind conditions on climate patterns and the role of plants in mitigating climate change. (d) Urbanization processes: With the advancement of global urbanization, research focuses have expanded from pollution control in major cities to broader studies on urban ecosystem services and GI to address complex urban environmental issues.
Among the specific nodes the literature represents, the most central document regarding intermediary centrality is [
37]. This paper connects clusters #0, #5, #7, and #8 and occupies a central position in cluster #1, indicating its significant influence on multiple research topics. Using Barcelona as a case study, the results suggest that urban forests significantly mitigate pollution, albeit moderately compared to overall urban pollution levels and greenhouse gas emissions. Thus, there is a need to coordinate efforts in GI at a broader spatial scale to counteract urban pollution effectively. Additionally, in terms of positioning, documents authored by Abhijith et al. [
70] (centrality 0.17) and Nowak et al. [
71] (centrality 0.15) are located at the boundary of clusters #5 and #6, connecting multiple nodes, indicating that these two papers have, to some extent, propelled the development of the research field. Ref. [
70] investigated the impact of roadside GI on different particulate matter concentrations, while ref. [
71] studied the positive effects of ecosystem services on health. Another influential review paper is authored by Abhijith and Kumar [
28] (centrality 0.15), which examines the influence of urban GI on air pollution levels and suggests a comprehensive and quantitative approach to deploying various vegetation types in the built environment. The study evaluated individual exposure to air pollution sources in open roads and built street canyon environments in the presence of vegetation, critically summarizing the available literature to better understand the interaction between vegetation and surrounding built environments and identify methods for using GI to reduce local air pollution exposure. This node is the most frequently cited throughout the research field, indicating its high impact.
3.8. Modeling of Structural Variation in the Literature in the Field of GI and Air Pollution
Chen first proposed the theory of structural variation in 2012 [
72]. Its core issue is how to quantify the novelty of a document. According to Chen’s theory, the creativity or innovation of scientific research largely involves connectivity: the ability to connect previously unrelated concepts or ideas, essentially building bridges between different islands. The second condition is that the appearance of such a bridge quickly attracts other researchers’ attention to the research field and leads to more research results being published. Put simply, it triggers a butterfly effect, which CiteSpace refers to as the structural variation model.
The structural variation model mainly includes three indicators measuring the degree of structural transformation. The first is the Modularity Change Rate (ΔModularity, ΔM), which refers to the increase in connections in the knowledge base network caused by the citing literature; the larger the ΔM value, the greater the potential impact of the literature on disciplinary development. The second is the Cluster Linkage Change Rate (ΔCluster linkage, ΔCL), which indicates the change in the span of node connections between different clusters in the basic network caused by the citing literature; a larger ΔCL value indicates a stronger interdisciplinary attribute of the literature. The third is Centrality Dispersion (ΔCentrality, ΔC), which measures the change in centrality of nodes between two time periods; a larger ΔC value indicates that the literature becomes more important during the period [
27].
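As a rough illustration of the first indicator, the sketch below computes a modularity change rate as the percentage change in modularity Q after the links contributed by one new paper are added to a baseline network, with the cluster assignment held fixed. This formula is an assumption made for the example and may differ from the exact definition used in CiteSpace; the toy network and the function names are hypothetical.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman modularity Q for an undirected, unweighted edge list."""
    edges = list(edges)
    m = len(edges)
    internal = defaultdict(int)
    degree = defaultdict(int)
    for u, v in edges:
        degree[community[u]] += 1
        degree[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    return sum(internal[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


def modularity_change_rate(base_edges, new_edges, community):
    """Percentage change in Q after adding new_edges (assumed definition).

    The partition is kept fixed, and the baseline Q is assumed non-zero.
    """
    q_before = modularity(base_edges, community)
    q_after = modularity(list(base_edges) + list(new_edges), community)
    return 100.0 * (q_after - q_before) / q_before


# Hypothetical baseline: two triangles joined by one edge.
base = [("a", "b"), ("b", "c"), ("a", "c"),
        ("d", "e"), ("e", "f"), ("d", "f"),
        ("c", "d")]
clusters = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}

# A new paper that links node a (cluster 0) with node f (cluster 1).
rate = modularity_change_rate(base, [("a", "f")], clusters)
```

In this toy case a single link that bridges the two clusters changes Q by 30% of its baseline value (a decrease under a fixed partition), which illustrates how a boundary-spanning paper can alter the network structure noticeably.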
Among all the literature in this research field, the significant literature in all three indicators is the paper by Abhijith et al. [
28], published in 2017, as visualized in
Figure 11. In the figure, the solid blue lines represent citation relationships before the publication of this paper. In contrast, the dashed red lines represent citation relationships formed after the publication of this paper. In essence, after the publication of this paper, the research field gradually attracted more attention from researchers, forming a larger cross-citation subnetwork, demonstrating that this paper has the strongest impact in this research field.
The field has another highly influential article, published by Bottalico et al. [
73] in 2017, as visualized in
Figure 12. This article, based in Florence, Italy, utilized high spatial resolution remote sensing data and on-site observations to create a spatial distribution map of urban forests and estimate the leaf area index (LAI). It measured and analyzed the effectiveness of urban forests in removing the air pollutants PM10 and O3. The results emphasized the different contributions of various vegetation types to air pollution removal and suggested increasing the diversity of green coverage in urban planning to enhance air quality improvement. Through comprehensive screening of the three parameters, the literature in this research field includes articles with strong influence and research potential, as shown in
Table 9. It is worth noting that, due to the increasing citation and impact over time, there may be articles with potential that have not yet gained attention.