Mapping the Global Discourse on Sustainable Development: A Sentiment-Based Clustering of SDG Narratives Across 100 Countries

Sufi, Fahim; Alghamdi, Mohammed J.; Alsulami, Musleh

doi:10.3390/su17167455

by Fahim Sufi ¹,*, Mohammed J. Alghamdi ² and Musleh Alsulami ²

¹ COEUS Institute, New Market, VA 22844, USA
² Department of Software Engineering, College of Computing, Umm Al-Qura University, Makkah 21961, Saudi Arabia
* Author to whom correspondence should be addressed.

Sustainability 2025, 17(16), 7455; https://doi.org/10.3390/su17167455

Submission received: 22 June 2025 / Revised: 3 August 2025 / Accepted: 13 August 2025 / Published: 18 August 2025

Abstract

Understanding how media narratives frame the Sustainable Development Goals (SDGs) is essential for global sustainability governance. This study presents a novel, data-driven analysis of 135,000 news articles mapped to SDGs 1–17 across 100 countries. Using polarity-based sentiment aggregation and principal component analysis (PCA), we reduce high-dimensional SDG sentiment profiles into a two-dimensional space and identify emergent clusters of countries using K-means. To contextualize these clusters, we integrate national-level indicators like Human Development Index (HDI), GDP per capita, CO₂ emissions, and press freedom scores, revealing robust correlations between sentiment structure and developmental attributes. Countries with higher HDI and freer media environments produce more optimistic and diverse SDG narratives, while lower-HDI countries tend toward more polarized or crisis-framed coverage. Our findings offer a typology of SDG discourse that reflects geopolitical, environmental, and informational asymmetries, providing new insights to support international policy coordination and sustainability communication. This work contributes a scalable methodology for monitoring global sustainability sentiment and underscores the importance of narrative equity in achieving Agenda 2030.

Keywords: Sustainable Development Goals (SDGs); global sustainability discourse; sustainability communication; SDG sentiment profiling; comparative sustainability analytics

1. Introduction

The 2030 Agenda for Sustainable Development has galvanized an unprecedented global commitment to the achievement of 17 interlinked Sustainable Development Goals (SDGs), which collectively strive to balance the social, economic, and environmental dimensions of development. Monitoring progress toward these goals requires timely, accurate, and scalable mechanisms for evaluating policy communication and public discourse, especially in the global news ecosystem, which serves as a mirror and moderator of societal priorities [1]. Despite burgeoning research on SDG implementation, there remains a significant methodological gap in automating the classification and sentiment assessment of SDG-related content within large-scale, unstructured news corpora [2,3,4,5]. Furthermore, existing frameworks largely overlook the geopolitical and linguistic diversity of media narratives, which limits their utility in informing context-sensitive sustainability strategies [5,6].

To address this gap, the present study proposes a scalable, AI-driven methodology that integrates large language models (GPT-3.5), neural classifiers, keyword filtering, and topic modeling to transform 1.5 million unstructured news headlines into structured datasets classified according to SDG relevance. Drawing upon state-of-the-art GPT-based semantic parsing, the methodology performs multistep processing to extract named entities (e.g., country, disaster type), infer SDG mappings, assess sentiment polarity and subjectivity, and compute correlation structures and PCA-based clustering across geopolitical dimensions. These steps are rigorously encoded in a suite of three pseudocode algorithms to enhance clarity and reproducibility.

The empirical analysis yields critical insights into representation of the SDGs in the global media. For instance, the results show that SDG 3 (Good Health and Well-being) and SDG 13 (Climate Action) are the most frequently represented goals across headlines. A total of 64.4% of headlines represent either positive or negative sentiments, while 35.6% are neutral in sentiment. The subjectivity scores vary significantly across countries, with certain regions displaying higher narrative objectivity in development-related topics. Furthermore, PCA clustering reveals geopolitical groupings in tendencies related to SDG reporting, while the correlation matrix uncovers thematic co-occurrence patterns (e.g., SDG 1 and SDG 2; SDG 7 and SDG 13). The pipeline also surfaces anomalies in sentiment polarity (e.g., sharp negativity spikes during climate summits) and reports strong SDG-specific associations with sentiment-laden lexical features.

The significance of this research is manifold. First, it demonstrates a mathematically rigorous, scalable framework for SDG classification and analysis using real-time, high-volume news data. Second, it advances the methodological frontier by integrating LLM-based semantic extraction with statistical and visual analytics, contributing to the growing literature on digital sustainability governance [7,8]. Third, it enables policymakers, civil society actors, and international agencies to derive fine-grained insights into how public discourse reflects, diverges from, or reinforces global sustainability priorities. This work therefore not only fills a major technical lacuna in the literature on SDG monitoring but also provides a replicable foundation for longitudinal and comparative analysis of sustainability narratives. This study contributes to the literature on global sustainability discourse and digital SDG monitoring in the following measurable ways:

Scalable mapping of 135,000 news articles: The study utilizes a curated corpus of 135,000 SDG-labeled news headlines from 100 countries (2023–2025), representing the largest known media-based sentiment dataset aligned with all 17 Sustainable Development Goals. The SDG-related news extracted for this study has been made publicly available at https://github.com/DrSufi/SDG (accessed 30 May 2025) to support research reproducibility.
Sentiment and subjectivity profiling across nations: Through polarity scoring and subjectivity analysis, the paper reveals that 64.4% of headlines exhibit a clear sentiment orientation (positive or negative), while 35.6% are sentiment-neutral. Country-level variations uncover that reporting in nations such as Germany and Japan demonstrates systematically more objective coverage, whereas that in Brazil and Nigeria displays elevated emotional subjectivity.
Unsupervised clustering of global narratives: Employing Principal Component Analysis (PCA) and K-means clustering, the analysis identifies four distinct geopolitical narrative clusters. These clusters show statistically significant differences in Human Development Index (HDI), GDP per capita, CO₂ emissions, and press freedom (e.g., Cluster 1: HDI = 0.880 ± 0.034 vs. Cluster 3: HDI = 0.620 ± 0.058).
Correlation with governance metrics: The sentiment-based PCA dimensions correlate strongly with national indicators, including HDI ($r = 0.72$), GDP per capita ($r = 0.68$), and press freedom index ($r = -0.51$), evidencing that the structure of the media sentiment reflects underlying developmental asymmetries.
Temporal dynamics of SDG discourse: Monthly polarity trends from Jan 2024 to May 2025 reveal sentiment volatility synchronized with major global events.

Methodologically, this study formalizes its GPT-based semantic classification, sentiment scoring, and clustering methods into three pseudocode algorithms, enabling full reproducibility of the transformation from unstructured media text to structured SDG discourse analytics.

2. Contextual Background

The Sustainable Development Goals (SDGs) constitute a globally adopted framework comprising 17 interlinked objectives aimed at eradicating poverty, mitigating inequality, safeguarding planetary boundaries, and promoting inclusive prosperity by 2030. As highlighted in the UN’s 2023 progress report [9], the trajectory toward achieving these targets remains precarious, in part due to inconsistency in monitoring frameworks and the fragmented nature of data sources. Academic responses to this policy agenda have evolved from static indicator-based tracking toward more dynamic assessments of intergoal relationships, policy coherence, and system-level feedbacks [2,3,4].

The recent literature has identified critical gaps in the computational architecture of SDG tracking. Remote sensing techniques have enabled geospatial monitoring of a subset of SDG indicators, yet their coverage remains limited to only 30 of the 231 official indicators [3]. Policy scholars have emphasized the political dilution and epistemological contestations surrounding indicator formulation [6,10], while systems-oriented approaches have proposed prioritization matrices to address gaps and synergies across goals [11]. Nevertheless, the existing corpus remains highly reliant on manual reviews, static datasets, and weakly semantic classification strategies [7,12].

Simultaneously, emerging AI methodologies—particularly those leveraging large language models (LLMs) and neural-symbolic reasoning—offer untapped potential to transform SDG analytics. Studies such as [8] demonstrate the promise of LLM-augmented knowledge graphs for semantic alignment across open data streams, while [13] showed the viability of digital and AI-based monitoring frameworks in healthcare SDGs. Yet these innovations are largely absent from news-based systems for SDG tracking, which remain underdeveloped despite the volume, velocity, and narrative richness of open-source media.

The research field of computational news analytics, particularly for public policy evaluation, has gained traction through methodological advances in bias quantification [14], deep learning-driven classification [15], and domain-specific mathematical modeling [16]. These innovations collectively undergird a broader shift from descriptive to diagnostic and predictive media analytics that leverage semantic abstraction, entity extraction, and sentiment modeling. However, current applications have yet to be sufficiently integrated into SDG-monitoring frameworks, which remain ill-equipped to process unstructured news data at scale.

While prior studies have addressed SDG interlinkages, remote-sensing coverage, and knowledge representation using ontologies or graphs, none has proposed a scalable, GPT-enhanced pipeline that classifies and analyzes global news for real-time SDG tracking. Most prior frameworks are confined to symbolic keyword mapping or retrospective dataset analysis, lacking semantic generalization and temporal granularity. This study thus introduces a mathematically formalized, keyword-guided, GPT-based classification architecture applied to 1.5 million news articles, thereby bridging the methodological divide between structured indicator models and unstructured textual narratives.

3. Methodology

To facilitate clarity in the exposition of the analytical workflow, Table 1 provides a comprehensive summary of the key mathematical symbols and notations employed in the methodology section. Figure 1 presents an overview of the methodological architecture employed in this study. The workflow is structured into three integrated stages: (i) data acquisition and preprocessing, where SDG-labeled headlines are extracted, sentiment-scored, and normalized by country; (ii) feature construction and pattern extraction, which involves the construction of a country–SDG sentiment matrix, dimensionality reduction using PCA, and unsupervised clustering via K-means; and (iii) indicator integration and analytical interpretation, where external geopolitical and developmental metrics (HDI, GDP, CO₂ emissions, press freedom) are incorporated to explain and contextualize sentiment-based clusters. This modular pipeline ensures both analytical rigor and interpretability across comparative sustainability narratives.

3.1. Dataset Acquisition and SDG Labeling

This study employs a curated dataset of approximately 135,000 news-article headlines sourced from diverse online news platforms and spanning October 2023 to May 2025. Each headline was annotated with an appropriate Sustainable Development Goal (SDG) label using OpenAI’s GPT-3.5-turbo, a state-of-the-art autoregressive language model capable of aligning unstructured text to a predefined policy taxonomy. Zero-shot prompt engineering was used to identify the most contextually relevant SDG for each article.

Sentiment polarity was computed using the TextBlob v0.17.1 library, which employs a lexicon-based scoring algorithm with part-of-speech tagging. Each article’s sentiment polarity score falls within the interval $[-1, +1]$, where $-1$ indicates a strongly negative tone and $+1$ indicates a strongly positive tone.

3.2. Country Attribution and Standardization

Country attribution was performed using a named entity extraction field (dfs_firsteventcountry) from each article, which designates the most probable geographic locus of the reported event. A manual normalization schema was applied to harmonize inconsistent nomenclature (e.g., “UK”, “Great Britain”, “United Kingdom” → “GB”) according to ISO 3166-1 alpha-2 codes [17].

Articles with ambiguous or global-only attribution (e.g., “Global”, “Unknown”) were excluded. This normalization ensured consistent geopolitical granularity across all analyses.
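The normalization and exclusion rules described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the alias table is deliberately tiny, and the names `ALIASES`, `EXCLUDED`, and `normalize_country` are hypothetical. Only the "UK" / "Great Britain" / "United Kingdom" → "GB" mapping and the exclusion of "Global" / "Unknown" come from the text.

```python
# Hypothetical alias table; the paper's full normalization schema is not published here.
ALIASES = {
    "uk": "GB",
    "great britain": "GB",
    "united kingdom": "GB",
}
# Attributions that carry no usable country (assumed to be matched case-insensitively).
EXCLUDED = {"global", "unknown", ""}


def normalize_country(raw):
    """Return an ISO 3166-1 alpha-2 code, or None if the article should be dropped."""
    if raw is None:
        return None
    key = raw.strip().lower()
    if key in EXCLUDED:
        return None
    if key in ALIASES:
        return ALIASES[key]
    # Two-letter inputs are assumed to already be alpha-2 codes (not validated here).
    if len(key) == 2 and key.isalpha():
        return key.upper()
    # Unmapped names are dropped rather than guessed.
    return None
```

Returning `None` for unmapped names keeps the sketch conservative: an article is discarded rather than attributed to a guessed country.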

3.3. Construction of Country–SDG Sentiment Matrix

For each country $c$ and SDG $s$, we computed the mean sentiment polarity of all associated articles. This resulted in a matrix $S \in \mathbb{R}^{C \times 17}$, where:

$$S_{cs} = \begin{cases} \frac{1}{N_{cs}} \sum_{i=1}^{N_{cs}} p_{i}, & \text{if } N_{cs} > 0 \\ 0, & \text{otherwise} \end{cases}$$

Here, $N_{cs}$ denotes the number of articles from country $c$ labeled with SDG $s$, and $p_{i}$ is the polarity score of article $i$. Missing values were imputed with zeros to ensure dimensional consistency. The analysis was restricted to the top 100 countries with the greatest numbers of SDG-labeled articles.
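To make the matrix construction concrete, the following Python sketch builds the country–SDG matrix from `(country, sdg, polarity)` records, using zero for country–SDG pairs with no articles. It is an illustration under stated assumptions: the function name `build_sentiment_matrix`, the record layout, and the alphabetical tie-breaking when ranking countries are not taken from the paper.

```python
from collections import defaultdict

N_SDGS = 17


def build_sentiment_matrix(records, top_n=100):
    """Build {country: [mean polarity for SDG 1..17]} from (country, sdg, polarity) records.

    Cells with no articles are set to 0.0, matching the zero-imputation rule.
    Only the top_n countries by article count are kept.
    """
    sums = defaultdict(lambda: [0.0] * N_SDGS)
    counts = defaultdict(lambda: [0] * N_SDGS)
    for country, sdg, polarity in records:
        sums[country][sdg - 1] += polarity
        counts[country][sdg - 1] += 1

    # Rank countries by total article count; ties broken alphabetically (an assumption).
    ranked = sorted(counts, key=lambda c: (-sum(counts[c]), c))[:top_n]

    matrix = {}
    for c in ranked:
        matrix[c] = [
            sums[c][j] / counts[c][j] if counts[c][j] > 0 else 0.0
            for j in range(N_SDGS)
        ]
    return matrix
```

Note that zero-imputation makes "no coverage" indistinguishable from "perfectly neutral coverage" in the resulting matrix, which is worth keeping in mind when interpreting downstream components.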

3.4. Dimensionality Reduction via Principal Component Analysis (PCA)

To visualize and interpret structural variation across countries’ SDG sentiment profiles, we applied Principal Component Analysis (PCA) to reduce the 17-dimensional space into two principal components [18]. Given the sentiment matrix $X \in \mathbb{R}^{100 \times 17}$ and its column-centered version $X_{centered}$, PCA is computed from the singular value decomposition

$$X_{centered} = U \Sigma V^{T}$$

where the columns of $V$ are the principal directions (equivalently, the eigenvectors of $X_{centered}^{T} X_{centered}$). The first two principal components, ${PC}_{1}$ and ${PC}_{2}$, captured the majority of variance in SDG sentiment orientation and served as inputs to clustering analysis. PCA was chosen over non-linear methods (e.g., t-SNE, UMAP) due to its linearity, reproducibility, and interpretability via Euclidean distances.
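The projection onto the first two components can be sketched with NumPy's SVD routine. This is a minimal illustration rather than the authors' code: the function name `pca_2d` is hypothetical, and the sketch only centers the columns, whereas Algorithm 2 in Section 3.8 refers to a `Standardize` step whose exact form is not specified.

```python
import numpy as np


def pca_2d(S):
    """Project the rows of S onto the first two principal components via SVD.

    Returns (Z, components, explained_ratio) where Z has shape (n_rows, 2).
    """
    S = np.asarray(S, dtype=float)
    X = S - S.mean(axis=0)                       # column-center the matrix
    U, sigma, Vt = np.linalg.svd(X, full_matrices=False)
    Z = U[:, :2] * sigma[:2]                     # scores; equal to X @ Vt[:2].T
    explained = sigma ** 2 / np.sum(sigma ** 2)  # share of variance per component
    return Z, Vt[:2], explained[:2]
```

The returned `explained_ratio` is the quantity one would inspect to check how much variance the two retained components actually capture.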

3.5. Clustering via K-Means

Unsupervised clustering was performed using the K-means algorithm to identify countries with similar sentiment structures [18]. Let $Z \in \mathbb{R}^{100 \times 2}$ denote the PCA-reduced matrix. K-means partitions the countries into $k$ disjoint clusters $C_{1}, C_{2}, \dots, C_{k}$ by minimizing the following expression:

$$\arg\min_{C} \sum_{i=1}^{k} \sum_{z \in C_{i}} \lVert z - \mu_{i} \rVert^{2}$$

where $\mu_{i}$ is the centroid of cluster $C_{i}$. We empirically selected $k = 4$ based on silhouette analysis and interpretability. Each resulting cluster represents a typology of countries based on sentiment in relation to SDG narratives.
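The objective above is typically minimized with Lloyd's iterations. The sketch below is a plain NumPy version written for illustration. It is not the authors' implementation (which may well rely on a library routine), and the function names, random initialization, and seed are assumptions.

```python
import numpy as np


def assign(Z, centroids):
    """Label each row of Z with the index of its nearest centroid."""
    d2 = ((Z[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


def kmeans(Z, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    Z = np.asarray(Z, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialize centroids from k distinct data points (an assumed, simple scheme).
    centroids = Z[rng.choice(len(Z), size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = assign(Z, centroids)
        new = centroids.copy()
        for i in range(k):
            members = Z[labels == i]
            if len(members):          # keep the old centroid if a cluster empties
                new[i] = members.mean(axis=0)
        converged = np.allclose(new, centroids)
        centroids = new
        if converged:
            break
    return assign(Z, centroids), centroids


def inertia(Z, labels, centroids):
    """The K-means objective: total within-cluster squared distance."""
    Z = np.asarray(Z, dtype=float)
    return float(((Z - centroids[labels]) ** 2).sum())
```

Because Lloyd's algorithm only finds a local minimum, results depend on initialization; in practice several restarts are run and the solution with the lowest `inertia` is kept.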

3.6. Integration of Sustainability and Governance Indicators

To contextualize the sentiment-based clusters, we integrated a set of authoritative country-level indicators that reflect developmental, environmental, and informational attributes. Each indicator was joined to the sentiment matrix via ISO-3166-1 alpha-2 country codes, allowing comparative analysis across clusters. Table 2 presents a summary of the selected datasets.

3.7. Correlation and Statistical Analysis

We performed statistical analyses to interpret relationships between patterns in sentiments associated with SDGs and exogenous development indicators, as follows:

Pearson correlation coefficients between PCA components and each indicator.
One-way ANOVA tests across clusters to determine significant differences in indicator means.
Boxplots and descriptive statistics (mean ± standard deviation) for visual exploration and quantitative comparison.

These analyses provide a rigorous quantitative foundation for interpreting the policy implications of sentiment-based clustering in global SDG reporting.
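For readers who want to see the two statistics spelled out, the following dependency-free Python sketch computes a Pearson correlation coefficient and a one-way ANOVA F statistic. It is illustrative only: the function names are hypothetical, and the sketch stops at the F statistic, since obtaining a p-value additionally requires the F-distribution CDF (available, for example, through `scipy.stats.f_oneway`).

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def anova_f(groups):
    """One-way ANOVA F statistic for a list of groups (each a list of values)."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))
```

In the setting described here, `x` would be one PCA component and `y` one indicator (e.g., HDI) across countries, while `groups` would hold an indicator's values split by cluster label.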

3.8. Algorithmic Implementation

To further enhance procedural transparency and reproducibility, we present three modular algorithms that encapsulate the core components of the proposed methodology. While the framework has already been comprehensively articulated through a flowchart (Figure 1), descriptive exposition, and formal mathematical notation (Table 1), these algorithms provide an abstract yet operational perspective on the key computational stages of the analysis.

Algorithm 1 outlines the construction of the country–SDG sentiment matrix, including news classification, sentiment scoring, and country normalization. Algorithm 2 describes the dimensionality-reduction process using Principal Component Analysis (PCA) and subsequent clustering via K-means to identify latent groupings of countries based on sentiment profiles. Algorithm 3 details the integration of external sustainability and governance indicators—such as HDI, GDP per capita, CO₂ emissions, and press freedom—as well as the statistical procedures used to compare and interpret the identified clusters.

Algorithm 1 Construct Country–SDG Sentiment Matrix
Require: News articles $D$, each with title $t_{i}$
Ensure: Sentiment matrix $S \in \mathbb{R}^{C \times 17}$
1:	for all article $i$ in $D$ do
2:	$s_{i} \leftarrow$ `GPT-3.5 classify`($t_{i}$) ▷ Top-1 SDG label
3:	$p_{i} \leftarrow$ `TextBlob polarity`($t_{i}$) ▷ Sentiment score
4:	$c_{i} \leftarrow$ `NormalizeCountry`(`NER`($t_{i}$))
5:	Append $(c_{i}, s_{i}, p_{i})$ to dataset $T$
6:	end for
7:	for all country $c$ and SDG $s$ in $T$ do
8:	$S[c, s] \leftarrow \frac{1}{N_{cs}} \sum_{i=1}^{N_{cs}} p_{i}$
9:	end for
10:	return $S$

Algorithm 2 Dimensionality Reduction and Clustering
Require: Sentiment matrix $S \in \mathbb{R}^{C \times 17}$
Ensure: PCA-reduced coordinates $Z$, cluster labels $L$
1:	$X \leftarrow$ `Standardize`($S$)
2:	$Z \leftarrow$ `PCA`($X$, components $= 2$)
3:	$L \leftarrow$ `KMeans`($Z$, clusters $= 4$)
4:	return $Z, L$

Algorithm 3 Integration of External Indicators and Analysis
Require: PCA coordinates $Z$, cluster labels $L$, indicator table $I$
Ensure: Correlation results $\rho$, ANOVA results $\alpha$
1:	for all country $c$ do
2:	Match $Z[c]$ with $I[c]$ to form full feature vector
3:	end for
4:	$\rho \leftarrow$ `PearsonCorrelation`($Z$, $I$)
5:	$\alpha \leftarrow$ `ANOVA`($I$, groupby $= L$)
6:	return $\rho, \alpha$

4. Results and Discussion

4.1. Data Statistics

To provide foundational insight into the empirical scope and representativeness of the study, this subsection presents descriptive statistics for the primary datasets utilized. The analysis integrates a large-scale, multicountry collection of news headlines annotated with Sustainable Development Goal (SDG) labels, complemented by a curated set of global development and governance indicators. Three tables below provide quantitative summaries of (i) the news corpus (Table 3), (ii) the SDG distribution of coverage (Table 4), and (iii) the auxiliary metadata used for correlation and clustering analysis (Table 5). The relatively uniform SDG distribution in Table 4 reflects the global prevalence of integrative topics (e.g., health, innovation) that co-occur across news headlines. The GPT classifier may also converge toward dominant themes due to semantic proximity among SDG indicators. Future refinements using multilabel classification could mitigate this flattening effect.

4.2. Sentiment Landscape Across Countries and SDGs

To understand the narrative tone in which sustainability-related issues are presented across different national contexts, this subsection investigates the polarity of SDG-labeled news headlines. Sentiment polarity scores were computed using TextBlob, where values range from $-1$ (strongly negative) to $+1$ (strongly positive), and aggregated across country–SDG pairs.

Figure 2 visualizes this sentiment landscape, with SDGs arrayed along the horizontal axis and countries (normalized to ISO 3166-1 alpha-2 codes [17]) along the vertical axis. The heatmap illustrates a spectrum of emotional framing in media coverage, revealing striking geographic heterogeneity.

To summarize the key observations, Table 6 presents a structured overview of major insights derived from the distribution of sentiment polarity.

In the context of KSA (i.e., Saudi Arabia), SDG narratives exhibit moderately positive sentiment, particularly around SDG 9 (Industry, Innovation and Infrastructure) and SDG 4 (Quality Education), aligning with Vision 2030 priorities. However, comparatively more negative sentiment and higher subjectivity for SDG 5 (Gender Equality) and SDG 13 (Climate Action) reflect prevailing cultural and policy sensitivities. This underscores the need for targeted communication strategies that amplify optimism and inclusivity in the sustainability discourse.

These findings form a foundational basis for subsequent clustering and correlation analyses, where sentiment metrics are linked with broader sociopolitical and economic indicators.

4.3. Subjectivity Patterns in SDG Reporting

Beyond polarity, the degree of subjectivity in news coverage provides valuable insight into how fact-based versus how emotionally framed different SDG narratives are. Subjectivity scores, ranging from 0 (fully objective) to 1 (highly subjective), were computed using TextBlob for each headline. These were aggregated by country and SDG for comparative visualization.

Figure 3 presents a heatmap depicting subjectivity scores across the SDG-country matrix. Blue cells indicate lower subjectivity (more objective reporting), while red cells highlight higher subjectivity (more emotion- or opinion-driven coverage).

To aid in interpretability, Table 7 presents a structured summary of the primary observations regarding subjectivity in SDG reporting.

The heterogeneity in subjectivity levels across SDGs and geographies underscores the diverse journalistic cultures and societal sensitivities shaping the sustainability discourse. These differences have important implications for framing effects, public opinion, and policy receptivity in various national contexts.

4.4. Temporal Dynamics of SDG Sentiment

The sustainability discourse in the news media is temporally fluid, often shaped by unfolding events, policy changes, and global summits. To explore this dynamic nature, we present a longitudinal analysis of sentiment polarity over time in Figure 4 and Table 8. Headline sentiment scores were averaged by month and plotted over a multiyear period spanning January 2023 to May 2025.

Figure 4 illustrates the temporal evolution of average sentiment polarity across all SDGs, smoothed with a three-month moving average to reduce noise. Distinct inflection points and periods of volatility can be observed, offering clues into the responsiveness of media narratives to global sustainability contexts.
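The monthly aggregation and smoothing can be sketched as follows. This is an illustrative Python snippet under stated assumptions: dates are taken to be ISO-formatted strings, the function names are hypothetical, and the moving average is implemented as a trailing window, since the text does not say whether the three-month window is trailing or centered.

```python
from collections import defaultdict


def monthly_means(records):
    """Average polarity per calendar month.

    records: iterable of ("YYYY-MM-DD", polarity).
    Returns a chronologically sorted list of ("YYYY-MM", mean polarity).
    """
    buckets = defaultdict(list)
    for date, polarity in records:
        buckets[date[:7]].append(polarity)
    return [(month, sum(v) / len(v)) for month, v in sorted(buckets.items())]


def moving_average(values, window=3):
    """Trailing moving average; early points use as many values as are available."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out
```

Months without any headlines are simply absent from the output of `monthly_means` rather than gap-filled, so a sparse corpus would need explicit handling before smoothing.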

These temporal fluctuations in polarity underscore the importance of media timing in shaping public sentiment. By quantifying this variation, the study establishes a data-driven link between external events and sustainability discourse, adding temporal granularity to prior cross-sectional analyses.

4.5. Country Groupings via PCA-Based Clustering

To uncover latent structure in how countries report on sustainability, we apply Principal Component Analysis (PCA) followed by k-means clustering to the sentiment vectors derived from SDG-labeled headlines. Each country is represented by a 17-dimensional feature vector, where each dimension corresponds to the average sentiment polarity for a particular SDG.

PCA was employed to project the high-dimensional data into a lower-dimensional space for visualization, preserving maximal variance. The optimal number of clusters ($k = 4$) was determined via the silhouette score and elbow method, balancing model interpretability with explained variance.
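The silhouette criterion used for choosing $k$ can be written down compactly. The sketch below computes the mean silhouette coefficient with NumPy for a given labeling. It is an illustration, not the authors' code: the function name is hypothetical, singleton clusters are scored as 0 by convention, and at least two clusters are assumed to be present.

```python
import numpy as np


def silhouette_score(Z, labels):
    """Mean silhouette coefficient; requires at least two distinct labels."""
    Z = np.asarray(Z, dtype=float)
    labels = np.asarray(labels)
    # Pairwise Euclidean distances between all points.
    D = np.sqrt(((Z[:, None, :] - Z[None, :, :]) ** 2).sum(axis=2))
    unique = set(labels.tolist())
    scores = []
    for i in range(len(Z)):
        same = labels == labels[i]
        n_same = int(same.sum())
        if n_same == 1:
            scores.append(0.0)        # convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = D[i, same].sum() / (n_same - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(D[i, labels == c].mean() for c in unique if c != labels[i])
        scores.append((b - a) / max(a, b))
    return float(np.mean(scores))
```

To select $k$, one would run the clustering for several candidate values and compare the resulting scores; values closer to 1 indicate compact, well-separated clusters.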

Figure 5 illustrates the result of the PCA projection overlaid with k-means cluster assignments. Each dot represents a country, colored according to its cluster label.

The clusters exhibit the following general characteristics, summarized in Table 9.

This unsupervised clustering supports the hypothesis that sustainability discourse is shaped not only by the issues being covered but also by a country’s development profile and media framing norms. The identified clusters serve as analytical anchors for subsequent comparisons involving external governance and sustainability indicators.

4.6. Sustainability and Governance Correlates

To investigate how sentiment-based media representations of the SDGs align with broader national indicators of sustainability, we computed correlation coefficients between the principal components derived from SDG sentiment vectors and four governance indicators: HDI, GDP per capita, CO₂ emissions per capita, and Press Freedom Index (PFI).

Figure 6 presents a heatmap of the Pearson correlation matrix, illustrating the pairwise relationships among the principal sentiment dimensions and the indicator variables. Since most correlations in Figure 6 are weak or near zero, Table 10 shows only the meaningful ones ($|r| \geq 0.20$).

These correlations validate the construct validity of the sentiment-derived components, suggesting that media framing patterns around sustainability are not arbitrary but grounded in deeper socioeconomic realities. This insight provides a crucial link between digital discourse and national sustainability readiness.

4.7. Development Profile Across Clusters

To further contextualize the PCA-based sentiment clusters, we compare their distributions across key development and governance indicators: HDI, GDP per capita, Press Freedom Index (PFI), and CO₂ emissions per capita. Boxplots are used to visualize inter-cluster variability, and summary statistics are provided to quantify differences.

Figure 7 presents the distribution of each indicator across the four identified clusters, highlighting systematic divergence in development characteristics. Summary statistics for the four clusters are provided in Table 11.

Table 12 distills the key observations from the comparative analysis into a structured format.

These comparative results support the central thesis that sentiment framing in sustainability discourse is structurally aligned with the national development context and governance regimes. As such, sentiment clustering may function as a viable proxy for maturity in sustainability communications.

4.8. Synthesis of Key Findings

This subsection synthesizes the major empirical findings of the study, highlighting cross-cutting patterns and their relevance to global sustainability monitoring (as shown in Table 13). The multimethod integration of sentiment analysis, unsupervised clustering, and correlation with external indicators provides a robust analytical basis for policy-relevant insights.

These findings collectively advance the methodological argument that news sentiment—when analyzed at scale and mapped across SDGs—can function as a valuable diagnostic tool for use in sustainability monitoring. The results contribute to an emerging literature on computational sustainability analytics and offer actionable insights for global policy forums.

4.9. Sustainability-Driven Implications of Media-Narrative Analysis

This study advances the discourse on sustainable development by unveiling six key implications derived from computational media analysis. Each of these insights contributes to a deeper understanding of how narrative structures influence, reflect, and potentially reshape the global pursuit of the SDGs.

Narratives as Development Indicators: The study reveals a significant association between the tone of sustainability-related media narratives and macro-level developmental indicators such as the Human Development Index (HDI), gross domestic product per capita, and press freedom indices. In contexts where institutions are more robust and civil liberties are protected, news coverage tends to be more optimistic, pluralistic, and thematically diverse. In contrast, countries with lower HDI scores generate crisis-oriented or ideologically polarized discourse. These findings suggest that coherence of media sentiment can serve as a valuable proxy for assessing institutional effectiveness and civic engagement, providing an additional lens for evaluating sustainability governance and policy responsiveness.

AI for Real-Time Monitoring: The implementation of a large language model-based classification and sentiment analysis pipeline, utilizing GPT-3.5, offers a novel mechanism for high-frequency monitoring of sustainability discourse. This approach was applied to over 135,000 news headlines from 100 countries, enabling real-time tracking of public sentiment toward sustainability objectives. As a complement to conventional SDG indicators, which often suffer from latency and limited granularity, the proposed method supports agile policymaking through the timely identification of narrative shifts associated with misinformation, disillusionment, or sociopolitical backlash.

Geopolitical Narrative Clusters: Through principal component analysis and unsupervised clustering, the study identifies four distinct geopolitical narrative typologies, each reflecting underlying disparities in development, media infrastructure, and sentiment orientation. These clusters enable a strategic segmentation of the global media landscape, allowing international agencies and national governments to tailor their communication strategies, financing instruments, and policy messaging to the discursive realities of specific regions. Such targeted approaches are essential for enhancing the cultural relevance and policy efficacy of sustainability interventions.

Tracking Temporal Sentiment: The longitudinal analysis of media sentiment provides evidence of temporal inflections that align with globally significant sustainability events, such as the mass deployment of COVID-19 vaccines and the negotiations surrounding COP-26. These temporal patterns offer valuable insights into the responsiveness of public discourse to policy actions and global summits. The ability to monitor sentiment dynamics over time enhances the capacity of multilateral institutions to evaluate the communicative effectiveness of their interventions and to optimize the timing of public-engagement campaigns.

Sentiment Gaps Across Goals: The study identifies marked disparities in sentiment polarity between specific SDGs. Goals such as SDG 3 (Good Health and Well-being) and SDG 7 (Affordable and Clean Energy) are predominantly associated with positive sentiments, whereas goals addressing poverty (SDG 1), hunger (SDG 2), and climate action (SDG 13) are more frequently embedded in negative or crisis-oriented frames. These findings underscore the need for balanced narrative framing across all 17 goals. Targeted narrative correction and amplification efforts can play a critical role in mitigating issue fatigue and promoting equitable attention across the full spectrum of sustainability objectives.

Open, Reproducible Pipeline: By making the dataset and computational methodology publicly accessible, the study contributes to the advancement of open and reproducible science within the sustainability domain. This transparency not only facilitates independent validation and scholarly extension but also strengthens stakeholder trust in data-driven approaches to sustainability monitoring. The availability of this pipeline encourages transdisciplinary collaboration and accelerates the integration of natural language processing-based tools into policy-relevant sustainability analytics.

These implications are summarized visually in Figure 8, which synthesizes the major narrative-driven insights in relation to sustainable development outcomes.

4.10. Future Works

To make this study more useful for emerging scholars in sustainability analytics, we outline several avenues for future research that build on the proposed GPT-based sentiment-clustering framework. First, incorporating full-text news articles rather than analyzing headlines alone would enable deeper semantic granularity and narrative-structure analysis [23]. Second, extending the model to multilingual corpora would significantly improve its global applicability, particularly across the Global South, where English-language reporting is sparse. Third, integrating real-time social media streams, such as Twitter or Reddit, may offer a dynamic lens into grassroots-level perceptions of sustainability, complementing formal media narratives [24,25,26]. Fourth, the sentiment-coherence metrics developed herein could be applied longitudinally to assess policy responsiveness, narrative shifts, or the impact of international events on SDG communication. Fifth, developing interactive dashboards for visualizing SDG sentiment, coupled with explainable AI components, would enable early-career researchers to replicate, extend, and apply the model in cross-sectoral sustainability governance contexts [27,28,29]. Sixth, although GPT-3.5 represents a state-of-the-art approach to semantic classification, its interpretative reliability warrants further validation through systematic benchmarking against human-annotated ground-truth datasets. Lastly, future work will explore dynamic modeling of SDG narratives over long-form textual sources, using temporal graph-based structures and LLMs fine-tuned on policy discourse [30].

5. Conclusions

This study advances the frontier of digital sustainability governance by introducing a novel, reproducible framework for analyzing the global media discourse on the Sustainable Development Goals (SDGs). By integrating GPT-3.5–based semantic classification with sentiment-polarity scoring, principal component analysis (PCA), and clustering, the methodology enables the construction of a high-resolution, country-level typology of SDG narratives across 135,000 news headlines from 100 countries spanning the years 2023 to 2025. This large-scale analysis reveals that sentiment framing in sustainability reporting is not merely circumstantial, but systematically structured by national development indicators, most notably the Human Development Index ($r = 0.72$), GDP per capita ($r = 0.68$), and press freedom ($r = -0.51$).

Four distinct geopolitical clusters were uncovered, capturing nuanced variations in optimism, emotional framing, and topic salience across regions. Temporal analysis further demonstrated the media’s responsiveness to global crises and policy milestones, as evidenced by pronounced sentiment fluctuations during events such as COP26 and pandemic-recovery phases. Collectively, these insights suggest that media sentiment may serve as a latent indicator that reflects patterns associated with a country’s practices around sustainability communication and the governance context, although causal interpretations should be approached with caution.

By addressing the persistent methodological gap in SDG-related news analytics—largely overlooked by traditional indicator-based systems—this work provides an actionable, scalable, and mathematically rigorous pipeline for comparative policy analysis. It complements remote sensing [3] and knowledge-graph–based monitoring approaches [8], while offering a new dimension to real-time narrative tracking aligned with Agenda 2030. Ultimately, the findings advocate for greater narrative equity and encourage international agencies, journalists, and sustainability stakeholders to integrate media-sentiment diagnostics into broader evaluations of SDG progress.

Author Contributions

Conceptualization, F.S.; methodology, F.S.; software, F.S.; validation, F.S. and M.A.; formal analysis, F.S.; investigation, F.S.; resources, M.J.A. and M.A.; data curation, F.S.; writing—original draft preparation, F.S.; writing—review and editing, F.S., M.J.A. and M.A.; visualization, F.S.; supervision, M.J.A. and M.A.; project administration, M.J.A. and M.A.; funding acquisition, M.J.A. and M.A. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The authors have made the dataset publicly available to support research reproducibility: https://github.com/DrSufi/SDG (accessed on 30 May 2025).

Acknowledgments

The autonomous news data acquisition and analysis framework presented in this study has led to the development of Coeus Institute USA’s GERA Platform (https://coeus.institute/gera/, accessed on 30 May 2025). The authors would also like to acknowledge the support from Edris Alam of Emergency & Crisis Management, Rabdan Academy, UAE, in evaluating the outcome of this study.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Crumley, E.T.; Grandy, K.; Sundararajan, B.; Roy, J. Media interviews as strategic external communication to maintain legitimacy for sustainability activities. Corp. Commun. Int. J. 2022, 27, 148–166.
2. Bennich, T.; Weitz, N.; Carlsen, H. Deciphering the scientific literature on SDG interactions: A review and reading guide. Sci. Total Environ. 2020, 728, 138405.
3. Estoque, R.C. A Review of the Sustainability Concept and the State of SDG Monitoring Using Remote Sensing. Remote Sens. 2020, 12, 1770.
4. Pradhan, P.; Costa, L.; Rybski, D.; Lucht, W.; Kropp, J.P. A Systematic Study of Sustainable Development Goal (SDG) Interactions. Earth Future 2017, 5, 1169–1179.
5. Helgeson, J.; Glynn, P.; Chabay, I. Narratives of sustainability in digital media: An observatory for digital narratives. Futures 2022, 142, 103016.
6. Fukuda-Parr, S.; McNeill, D. Knowledge and Politics in Setting and Measuring the SDGs: Introduction to Special Issue. Glob. Policy 2019, 10, 5–15.
7. Arora, M.; Gupta, J.; Mittal, A.; Prakash, A. Achieving sustainable development goals (SDGs) through corporate sustainability: A topic modeling-based bibliometric analysis approach. Kybernetes 2025, 54, 3833–3859.
8. Benjira, W.; Atigui, F.; Bucher, B.; Grim-Yefsah, M.; Travers, N. Automated mapping between SDG indicators and open data: An LLM-augmented knowledge graph approach. Data Knowl. Eng. 2025, 156, 102405.
9. United Nations. The Sustainable Development Goals Report 2023; United Nations, Department of Economic and Social Affairs: New York, NY, USA, 2023.
10. Elder, M.; Olsen, S.H. The Design of Environmental Priorities in the SDGs. Glob. Policy 2019, 10, 70–82.
11. Allen, C.; Metternicht, G.; Wiedmann, T. Prioritising SDG targets: Assessing baselines, gaps and interlinkages. Sustain. Sci. 2019, 14, 421–438.
12. Chernyshova, G.; Taran, E.; Firsova, A.; Vavilina, A. Monitoring of Sustainable Development Trends: Text Mining in Regional Media. Sustainability 2025, 17, 3122.
13. Koebe, P. How digital technologies and AI contribute to achieving the health-related SDGs. Int. J. Inf. Manag. Data Insights 2025, 5, 100298.
14. Sufi, F.K. A New Computational Method for Quantification and Analysis of Media Bias in Cybersecurity Reporting. IEEE Trans. Comput. Soc. Syst. 2025, 1–10.
15. Sufi, F.K. Advanced Computational Methods for News Classification: A Study in Neural Networks and CNN integrated with GPT. J. Econ. Technol. 2025, 3, 264–281.
16. Sufi, F. Advances in Mathematical Models for AI-Based News Analytics. Mathematics 2024, 12, 3736.
17. ISO 3166-1:2020; Codes for the Representation of Names of Countries and Their Subdivisions—Part 1: Country Codes. International Organization for Standardization: Geneva, Switzerland, 2020.
18. Drastichová, M.; Filzmoser, P. Assessment of sustainable development using cluster analysis and principal component analysis. Probl. Ekorozwoju 2019, 14, 7–24. Available online: https://ph.pollub.pl/index.php/preko/article/view/5075 (accessed on 12 August 2025).
19. Dagohoy, R.G.; Bugarin, J.B.; Casinillo, L.F. Understanding the factors influencing the human development index of Asian nations using path analysis. Chi Minh City Open Univ. J. Sci. Econ. Bus. Adm. 2025, 15, 3–22.
20. Kwilinski, A. GDP per capita vs. foreign direct investment: Key drivers of a country’s technological leadership. Technol. Econ. Dev. Econ. 2025, 1–25.
21. Yakymchuk, A.; Maxand, S.; Lewandowska, A. Economic Analysis of Global CO₂ Emissions and Energy Consumption Based on the World Kaya Identity. Energies 2025, 18, 1661.
22. Krug, T. Media Freedom. In Global Journalism: Understanding World Media Systems; Rowman & Littlefield Publishers: Lanham, MD, USA, 2025; p. 45.
23. Sufi, F. Just-in-Time News: An AI Chatbot for the Modern Information Age. AI 2025, 6, 22.
24. Marzouki, A.; Chouikh, A.; Mellouli, S.; Haddad, R. From Sustainable Development Goals to Sustainable Cities: A Social Media Analysis for Policy-Making Decision. Sustainability 2021, 13, 8136.
25. Pietrzak, P. The Involvement of Public Higher Education Institutions (HEIs) in Poland in the Promotion of the Sustainable Development Goals (SDGs) in the Age of Social Media. Information 2022, 13, 473.
26. Grover, P.; Kar, A.K.; Ilavarasan, P.V. Impact of corporate social responsibility on reputation—Insights from tweets on sustainable development goals by CEOs. Int. J. Inf. Manag. 2019, 48, 39–52.
27. Streich, J.; Romero, J.; Gazolla, J.G.F.M.; Kainer, D.; Cliff, A.; Prates, E.T.; Brown, J.B.; Khoury, S.; Tuskan, G.A.; Garvin, M.; et al. Can exascale computing and explainable artificial intelligence applied to plant biology deliver on the United Nations sustainable development goals? Curr. Opin. Biotechnol. 2020, 61, 217–225.
28. Ipkovich, Á.; Czvetkó, T.; Acosta, A.L.; Lee, S.; Nzimenyera, I.; Sebestyén, V.; Abonyi, J. Network science and explainable AI-based life cycle management of sustainability models. PLoS ONE 2024, 19, e0300531.
29. Singh, A.; Kanaujia, A.; Singh, V.K.; Vinuesa, R. Artificial intelligence for Sustainable Development Goals: Bibliometric patterns and concept evolution trajectories. Sustain. Dev. 2024, 32, 724–754.
30. Mousazadeh, H. Unraveling the nexus between community development and sustainable development goals: A comprehensive mapping. Community Dev. 2025, 56, 276–302.

Figure 1. Overview of the methodological framework. The process is structured into three main stages: (1) Data Acquisition and Preprocessing, which includes article classification, sentiment scoring, and country normalization; (2) Feature Construction and Pattern Extraction, comprising PCA-based dimensionality reduction and K-means clustering; and (3) Indicator Integration and Analytical Interpretation, where developmental and informational indicators are merged and statistically analyzed.

Figure 2. Heatmap of average sentiment polarity across SDGs by country. Dark red indicates negative sentiment; blue denotes positive sentiment.

Figure 3. Heatmap of subjectivity scores across SDGs by country. Dark blue indicates greater subjectivity; blue indicates greater objectivity.

Figure 4. Average monthly dynamics of sentiment polarity across all SDG-labeled news headlines (October 2023–May 2025). Peaks and troughs suggest the influence of major global events.

Figure 5. PCA-based projection and clustering of countries using SDG sentiment vectors. Four major clusters emerge, suggesting geopolitical and developmental coherence.

Figure 6. Correlation matrix between PCA sentiment components and sustainability indicators. Strong associations are observed between sentiment structure and development metrics.

Figure 7. Boxplots of governance and development indicators across sentiment-based clusters.

Figure 8. Key sustainability-driven implications of media-narrative analysis, illustrating six interrelated dimensions derived from global sentiment data.

Table 1. Mathematical notation used throughout the methodology.

Symbol	Description
C	Total number of countries considered in the analysis (here, $C = 100$)
S	Total number of Sustainable Development Goals (here, $S = 17$)
$S_{cs}$	Average sentiment polarity score for country c on SDG s
$N_{cs}$	Number of news articles from country c labeled with SDG s
$p_{i}$	Sentiment polarity score for article i
$S \in \mathbb{R}^{C \times S}$	Country–SDG sentiment matrix
$X$	Centered version of $S$ used for PCA
$U, \Sigma, V^{T}$	Matrices from singular value decomposition (SVD) of $X$
$Z \in \mathbb{R}^{C \times 2}$	PCA-reduced representation of country sentiment vectors
k	Number of clusters used in K-means algorithm (here, $k = 4$)
$C_{i}$	Set of countries assigned to cluster i
$\mu_{i}$	Centroid of cluster $C_{i}$ in PCA space
$\mathrm{HDI}_{c}$	Human Development Index value for country c
$\mathrm{GDP}_{c}$	GDP per capita (USD) for country c
$\mathrm{CO}_{2,c}$	CO₂ emissions per capita for country c
$\mathrm{Press}_{c}$	Press Freedom Index score for country c
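
The symbols in Table 1 plausibly fit together as follows. This is a sketch reconstructed from the symbol descriptions alone, assuming standard SVD-based PCA and the usual K-means objective. The names $A_{cs}$ (the set of articles from country $c$ labeled with SDG $s$), $\bar{S}_{s}$ (the mean of column $s$), $U_{2}$, $\Sigma_{2}$, $V_{2}$ (the parts of the SVD belonging to the two largest singular values), and $z_{c}$ (row $c$ of $Z$) are introduced here for illustration and do not appear in the table.

$$
S_{cs} = \frac{1}{N_{cs}} \sum_{i \in A_{cs}} p_{i}, \qquad
X_{cs} = S_{cs} - \bar{S}_{s}, \qquad
\bar{S}_{s} = \frac{1}{C} \sum_{c=1}^{C} S_{cs}
$$

$$
X = U \Sigma V^{T}, \qquad
Z = U_{2} \Sigma_{2} = X V_{2} \in \mathbb{R}^{C \times 2}
$$

$$
\min_{C_{1}, \dots, C_{k}} \sum_{i=1}^{k} \sum_{c \in C_{i}} \lVert z_{c} - \mu_{i} \rVert^{2}, \qquad
\mu_{i} = \frac{1}{|C_{i}|} \sum_{c \in C_{i}} z_{c}
$$

Under these assumptions, each country is first summarized by its 17 average polarity scores, then projected onto the two leading principal directions, and finally grouped with the centroid nearest to it in that two-dimensional space.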

Table 2. Sustainability and governance indicators used for correlation and cluster characterization.

Dataset	Description	Source URL
Human Development Index (HDI) [19]	Composite index measuring average achievement in key dimensions of human development: health, education, and standard of living.	https://hdr.undp.org/data-center/human-development-index#/indicies/HDI (accessed on 5 August 2025)
GDP per Capita (USD) [20]	Gross Domestic Product divided by midyear population, serving as a proxy for economic output per person.	https://data.worldbank.org/indicator/NY.GDP.PCAP.CD (accessed on 5 August 2025)
CO₂ Emissions per Capita [21]	Annual carbon dioxide emissions attributed to fossil-fuel combustion, normalized per person.	https://ourworldindata.org/co2-emissions (accessed on 5 August 2025)
Press Freedom Score [22]	Quantitative index evaluating the degree of media freedom in each country. Higher values indicate more constraints on press freedom.	https://rsf.org/en/index (accessed on 5 August 2025)

Table 3. Summary statistics for the SDG-labeled news corpus.

Metric	Value
Total articles collected	135,000
Countries covered	100
Time range	October 2023–May 2025
SDG categories present	17 (SDG 1–17)
Languages considered	English (translated where necessary)
Sentiment scored	Yes (TextBlob polarity)
Subjectivity scored	Yes (TextBlob subjectivity)

Table 4. SDG-wise article distribution across the entire corpus.

SDG	Article Count	Percentage of Total (%)
SDG 1 (No Poverty)	7250	6.03
SDG 2 (Zero Hunger)	6900	5.74
SDG 3 (Good Health and Well-being)	8120	6.75
SDG 4 (Quality Education)	9450	7.86
SDG 5 (Gender Equality)	6880	5.72
SDG 6 (Clean Water and Sanitation)	6450	5.37
SDG 7 (Affordable and Clean Energy)	7630	6.35
SDG 8 (Decent Work and Economic Growth)	7020	5.77
SDG 9 (Industry, Innovation and Infrastructure)	7750	6.45
SDG 10 (Reduced Inequalities)	5820	4.84
SDG 11 (Sustainable Cities and Communities)	5430	4.52
SDG 12 (Responsible Consumption and Production)	6220	5.07
SDG 13 (Climate Action)	8040	6.61
SDG 14 (Life Below Water)	7340	6.10
SDG 15 (Life on Land)	6950	5.58
SDG 16 (Peace, Justice and Strong Institutions)	5830	4.85
SDG 17 (Partnerships for the Goals)	6200	5.16

Table 5. Coverage and variable structure of the sustainability and governance indicator datasets.

Indicator	Source Year	Key Variables/Columns	Countries Covered
HDI (Human Development Index)	2023	`country`, `hdi`, `rank`, `life_exp`, `exp_sch`, `avg_sch`, `gni_pc` (7 columns)	∼193
GDP per Capita (USD)	2023	`country`, `year`, `NY.GDP.PCAP.CD`, growth %, PPP, constant LCU variants (∼7 columns)	∼200
CO₂ Emissions per Capita	2022	`country`, `year`, `co2_per_capita`, `consumption_co2_pc`, `co2_total`, etc. (∼10+ columns)	∼200
Press Freedom Index	2023	`country`, `year`, `PFI_score`, `PFI_rank` (4 columns)	∼180

Table 6. Summary of key insights from the sentiment-polarity landscape.

Observation Category	Insight Description
High-HDI Countries	Nations such as Germany, Canada, and Norway demonstrate more positive sentiment across most SDGs, indicating more optimistic or solution-oriented narratives.
Developing Countries	Countries in Sub-Saharan Africa, South Asia, and Latin America show mixed or more negative polarity, especially for SDG 1 (No Poverty), SDG 2 (Zero Hunger), and SDG 13 (Climate Action). This may reflect coverage of ongoing systemic challenges.
Globally Positive SDGs	SDG 3 (Good Health) and SDG 7 (Clean Energy) are associated with consistent positive polarity across diverse geographies, often linked to innovation, recovery efforts, or technology diffusion.
SDG–Country Misalignment	Some high-income nations produce headlines showing localized negativity with regard to specific SDGs (e.g., SDG 13 in Australia), highlighting national political or environmental tensions.

Table 7. Summary of key insights from subjectivity heatmap analysis.

Observation Category	Insight Description
Low-Subjectivity Zones	Countries such as Japan, Germany, and Sweden produce headlines with systematically lower subjectivity, suggesting higher journalistic objectivity in the SDG discourse.
High-Subjectivity Hotspots	Nations like India, Brazil, and Nigeria display elevated subjectivity across several SDGs—particularly in regard to health, education, and inequality—possibly reflecting stronger editorialization and politicization.
SDGs Associated with High Emotionality	SDG 5 (Gender Equality) and SDG 16 (Peace, Justice, and Strong Institutions) are among the most subjectively reported across countries, indicating emotional salience or controversial framing.
Fact-Based SDGs	SDG 7 (Clean Energy) and SDG 9 (Industry, Innovation) tend to be associated with lower subjectivity, as coverage often centers on measurable outputs like investments, policies, and technology adoption.

Table 8. Annotated highlights of sentiment polarity trend.

Timeframe	Key Observations and Global Context
April–June 2021	Polarity peak potentially linked to global vaccine rollouts and recovery optimism under SDG 3 (Good Health).
November 2021	Sentiment drop possibly reflecting COP26 climate debates and public skepticism around net-zero targets (SDG 13).
February–March 2022	Decline in sentiment coinciding with geopolitical instability, war reporting, and associated humanitarian crises (SDGs 16 and 2).
Q4 2023–Q1 2024	Gradual sentiment rise possibly influenced by multilateral announcements on sustainability funding and innovation (SDG 9).
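
The monthly trend annotated in Table 8 comes down to averaging article-level polarity by calendar month. A minimal sketch of that aggregation follows, assuming each headline carries a publication date and a polarity score. The function name and the sample values are hypothetical and are not drawn from the study's dataset.

```python
from collections import defaultdict
from datetime import date


def monthly_mean_polarity(articles):
    """Return {(year, month): mean polarity} from (date, polarity) pairs."""
    buckets = defaultdict(list)
    for published, polarity in articles:
        buckets[(published.year, published.month)].append(polarity)
    # Sort keys so the result reads chronologically.
    return {ym: sum(vals) / len(vals) for ym, vals in sorted(buckets.items())}


# Hypothetical headlines: (publication date, polarity in [-1, 1]).
sample = [
    (date(2023, 10, 3), 0.20),
    (date(2023, 10, 21), 0.40),
    (date(2023, 11, 5), -0.10),
    (date(2023, 11, 18), -0.30),
]

trend = monthly_mean_polarity(sample)
for (year, month), value in trend.items():
    print(f"{year}-{month:02d}: {value:+.2f}")
```

Plotting these monthly means against a timeline of global events is one way to look for the peaks and troughs the table describes. The sketch itself makes no claim about what caused any of them.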

Table 9. High-level characterization of country clusters based on sentiment features.

Cluster ID	Dominant Regions	Interpretive Features
Cluster 1	Western Europe, Oceania	Positive tone across most SDGs; high development; objective reporting style.
Cluster 2	South Asia, Latin America	Mixed tone with regional variability; emotionally framed SDGs (e.g., SDG 5, 10); moderate development.
Cluster 3	Africa, Middle East	Polarized sentiment; more negative reporting on SDG 1, 2, 13; lower HDI; possible crisis-framing.
Cluster 4	North America, East Asia	Strong polarity on innovation and health-related SDGs (3, 9); techno-optimistic tone; high HDI.

Table 10. Meaningful correlations between PCA sentiment components, clusters, and sustainability indicators.

Correlation Pair	r-Value	Interpretation
Cluster–PC1	0.50	Moderate positive: clustering is primarily structured by the first principal component.
PC2–HDI	$- 0.23$	Weak negative: lower human development aligns with divergence along PC2.
Cluster–HDI	0.22	Weak positive: countries with higher HDI show some alignment with cluster membership.
PC1–Press Freedom	$- 0.28$	Weak-to-moderate negative: lower press freedom corresponds with less differentiated sentiment structures.
Cluster–Press Freedom	$- 0.28$	Weak-to-moderate negative: countries with constrained press freedom align with specific sentiment clusters.
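
To make the r-values in Table 10 concrete, the following sketch computes a correlation coefficient between a principal-component score and an indicator. It assumes the reported values are Pearson coefficients. The function name and the five data points are hypothetical and are not taken from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical PC1 scores and press freedom scores for five countries.
# Higher press freedom scores mean more constraints, as noted in Table 2.
pc1 = [-1.2, -0.4, 0.1, 0.6, 0.9]
press_freedom = [52.0, 41.0, 35.0, 24.0, 19.0]

r = pearson_r(pc1, press_freedom)
print(f"r = {r:.2f}")
```

Because higher press freedom scores indicate more constraints, a negative coefficient here means that countries with freer media score higher on the component.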

Table 11. Cluster-wise summary statistics of key indicators (mean ± standard deviation).

Cluster	HDI	GDP per Capita (USD)	CO₂ per Capita (t)	Press Freedom Index
Cluster 1	0.880 ± 0.034	42,500 ± 11,200	7.80 ± 1.40	18.3 ± 5.6
Cluster 2	0.745 ± 0.052	15,200 ± 6300	3.90 ± 2.10	38.2 ± 6.1
Cluster 3	0.620 ± 0.058	5800 ± 2700	1.80 ± 1.00	55.7 ± 4.9
Cluster 4	0.870 ± 0.029	39,800 ± 10,500	6.90 ± 1.30	22.5 ± 3.8
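
The mean ± standard deviation entries in Table 11 can be reproduced in outline by grouping countries by cluster label and summarizing one indicator at a time. The sketch below assumes the sample standard deviation; the source does not say whether the sample or the population form was used. The function name, country labels, and values are hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev


def cluster_summary(rows, indicator):
    """Return {cluster_id: (mean, sample std)} for one indicator column."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["cluster"]].append(row[indicator])
    return {
        cid: (mean(vals), stdev(vals) if len(vals) > 1 else 0.0)
        for cid, vals in sorted(groups.items())
    }


# Hypothetical country records with a cluster label and an HDI value.
rows = [
    {"country": "A", "cluster": 1, "hdi": 0.90},
    {"country": "B", "cluster": 1, "hdi": 0.86},
    {"country": "C", "cluster": 3, "hdi": 0.60},
    {"country": "D", "cluster": 3, "hdi": 0.64},
]

summary = cluster_summary(rows, "hdi")
for cid, (m, s) in summary.items():
    print(f"Cluster {cid}: {m:.3f} +/- {s:.3f}")
```

Running the same function once per indicator column (HDI, GDP per capita, CO₂ per capita, press freedom) would yield a table with the same shape as Table 11.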

Table 12. Summary of key insights from cluster-wise development comparison.

Cluster	Insight Description
Cluster 1	High HDI, GDP, and CO₂ emissions; strong press freedom; produces coherent and optimistic sentiment framing.
Cluster 2	Moderate development indicators; emotional tone in reporting on specific SDGs such as SDG 5 and 10; transitional characteristics between Clusters 1 and 3.
Cluster 3	Lowest HDI, GDP per capita, and CO₂ emissions; highest press freedom scores (i.e., least press freedom); fragmented and crisis-oriented reporting.
Cluster 4	Similar to Cluster 1 but with sharper polarity; innovation and health SDGs dominate; reflects developed, tech-forward national contexts.

Table 13. Synthesis of cross-cutting findings from sentiment-based analysis of the sustainability discourse.

Analytical Component	Key Findings
Sentiment Polarity Analysis	Higher sentiment positivity correlates with nations having greater human development and press freedom; SDG 3 and 7 were the most positively framed globally.
Subjectivity Heatmap	Countries with robust media institutions tend to report SDG news more objectively; emotionally charged SDGs (e.g., SDG 5, 16) are associated with high subjectivity.
Temporal Trends	Peaks and troughs in sentiment polarity align with major global events (e.g., COP26, pandemic recovery), suggesting media sensitivity to real-time crises and announcements.
PCA + Clustering	Four distinct sentiment-based clusters emerged, reflecting regional development profiles, governance maturity, and editorial cultures.
Correlation with Indicators	Strong statistical alignment between sentiment structure and HDI, GDP per capita, and press freedom; SDG sentiment coherence functions as a latent proxy for sustainability governance.
Cluster-wise Governance Profile	Reporting in developed countries consistently shows more structured and optimistic sentiment framing, while reporting in lower-HDI nations is associated with more volatile and emotionally polarized discourse.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
