Community-Aware Two-Stage Diversification for Social Media User Recommendation with Graph Neural Networks

Yoshida, Soh

doi:10.3390/info17010029

by

Soh Yoshida

Faculty of Engineering Science, Kansai University, Suita-shi 564-8680, Japan

Information 2026, 17(1), 29; https://doi.org/10.3390/info17010029

Submission received: 2 December 2025 / Revised: 24 December 2025 / Accepted: 29 December 2025 / Published: 31 December 2025

(This article belongs to the Special Issue 2nd Edition of Modern Recommender Systems: Approaches, Challenges and Applications)

Abstract

The occurrence of filter bubbles and echo chambers in social media recommendation systems poses a significant threat to information diversity and democratic discourse. Although graph neural networks (GNNs) achieve leading accuracy in user recommendation, their optimization for engagement metrics inadvertently reinforces homophily, creating isolated information ecosystems. This research developed community-aware two-stage diversification with GNNs (CATD-GNN), a method that leverages the inherent community structure of social networks to promote diversity without sacrificing recommendation quality. CATD-GNN integrates community detection with GNN learning through a two-stage diversification process. The proposed method employs the Louvain method to identify community structures as pseudo-categories, then applies submodular neighbor selection and community-based loss reweighting during GNN training (Stage 1), followed by coverage and redundancy-aware reranking (Stage 2). Twitter data capturing Black Lives Matter discourse and Reddit political discussion networks were used to evaluate the method. CATD-GNN achieves improvements in diversity metrics while maintaining competitive accuracy. The two-stage architecture demonstrates a synergistic effect: the combination of diversity-aware training and coverage-based reranking produces greater improvements than either component alone. The proposed method successfully identifies and recommends users from different communities while preserving recommendation relevance.

Keywords: graph neural networks; social media recommendation; filter bubbles; diversity; community detection

1. Introduction

Social media platforms have transformed information dissemination and social interaction, with recommendation systems serving as critical infrastructure that shapes user experiences and information exposure [1]. These systems determine not only which content users encounter but also with whom they interact, influencing the formation of social networks and the flow of information within them [2]. The societal implications extend beyond individual user satisfaction, affecting political polarization, social cohesion, and the spread of misinformation across communities [3]. Contemporary social media recommendation systems achieve remarkable accuracy through graph neural network (GNN) architectures, which model complex interaction patterns [4]. These methods leverage high-order connectivity in user–user interaction networks to identify similar users and predict future connections [5]. Advanced techniques incorporating self-supervised learning and contrastive methods have further improved recommendation quality, with recent approaches such as XSimGCL demonstrating gains in precision and recall [6,7].

Despite these technical advances, the optimization for accuracy has created severe filter bubble effects that isolate users within homogeneous information ecosystems. The term “filter bubble,” coined by Pariser [8], refers to the state of intellectual isolation that can result when algorithms selectively present information based on users’ previous behavior, limiting exposure to diverse perspectives [9]. Empirical studies reveal that recommendation algorithms amplify existing homophily patterns, resulting in echo chambers in which users predominantly encounter perspectives aligned with their existing beliefs [10]. In contrast to filter bubbles, which refer to the algorithmic curation of content [8,9], echo chambers emerge from social network structures in which users are primarily exposed to like-minded individuals [3]. In recommendation systems, both phenomena can reinforce each other: algorithmic filtering creates bubbles that strengthen echo-chamber effects within communities [11]. The problem is further compounded by feedback loops inherent in recommendation systems [12,13]. As users interact with recommended content, their behavior reinforces the algorithmic understanding of their preferences, progressively narrowing the diversity of future recommendations [14]. This dynamic process, called the “information cocoon” effect in the recent literature [15,16], creates increasingly isolated information environments that are difficult to escape without external intervention. This phenomenon manifests particularly strongly in user recommendation systems, where suggested connections reinforce community boundaries rather than bridging them [17]. Recent quantitative analyses demonstrate that over 90% of recommended connections fall within users’ existing communities, even though organic interactions exhibit much lower baseline homophily [18].

Existing diversity-aware recommendation methods can be categorized into reranking, learning-to-rank, cluster-based, and fusion-based approaches [19,20]. Reranking methods adjust the rankings from existing recommender systems by combining diversity constraints to improve recommendations [21]. The traditional maximal marginal relevance (MMR) method [22] iteratively selects items balancing relevance and novelty through greedy optimization. However, these post-processing approaches operate on candidate sets generated by accuracy-optimized models, limiting their ability to introduce genuine diversity [23]. Learning-to-rank methods incorporate diversity objectives during the training process through regularization terms [24] or modified loss functions, but typically require explicit category information unavailable in user recommendation contexts [25]. Cluster-based methods leverage the principle that similar items should be grouped into the same cluster and promote diversity by recommending items from different clusters [26,27]. Fusion-based methods aggregate results from different recommender systems to obtain recommendations with both high utility and diversity [28]. The absence of principled approaches for diversity-aware learning in user recommendation systems that can operate without explicit categorization represents a critical gap in current research [29].

This paper introduces a novel method called community-aware two-stage diversification with GNNs (CATD-GNN) that leverages the community structure inherent in social networks to promote diversity in user recommendations while maintaining recommendation quality. First, unlike existing methods that apply diversity as post-processing or require predefined categories, CATD-GNN integrates diversity directly into GNN message passing through submodular neighbor selection and community-based loss reweighting, ensuring that learned representations inherently support cross-community recommendations. Second, this approach employs community detection on user-user interaction graphs to identify natural groupings that serve as pseudo-categories for diversity optimization [30], eliminating the external labels required by existing work. Finally, by incorporating community-aware mechanisms during both GNN training and post-processing reranking, the method addresses filter bubble effects at multiple levels in the recommendation pipeline [31], where the two stages synergistically reinforce each other’s effectiveness.

The main contributions are the following:

This paper presents a method to use identified community structures as pseudo-categories in user recommendation systems, addressing the lack of direct user attribute categorization through topology-driven community detection.
The approach involves community-aware graph learning, integrating submodular neighbor selection and loss reweighting into GNN training to produce embeddings conducive to varied recommendations.
Extensive empirical validation indicates that CATD-GNN significantly enhances diversity while preserving competitive accuracy in various real-world social networks, presenting an effective method for mitigating filter bubbles.
The proposed two-stage architecture is shown to offer complementary effects, with both learning-based diversification and coverage-aware reranking contributing improvements that neither component achieves independently.

The remainder of this paper is organized as follows: Section 2 provides a review of related work on recommendation, recommendation diversity, filter bubble mitigation, and community detection. Section 3 presents the CATD-GNN method with mathematical formulations. Section 4 describes experimental setup, datasets, and results. Section 5 concludes with implications and future research directions.

2. Related Work

This section reviews the literature relevant to community-aware diversification in social media recommendation. The review is organized into four thematic areas that collectively motivate and contextualize the proposed method. First, Section 2.1 examines GNN-based recommendation methods, establishing the technical foundation upon which CATD-GNN builds while identifying their tendency to exacerbate filter bubbles. Second, Section 2.2 surveys diversity-aware recommendation approaches, highlighting the gap in methods that can operate without predefined categories. Third, Section 2.3 reviews filter bubble mitigation strategies in social media, demonstrating the need for approaches that integrate diversity into the learning process rather than applying it as post-processing. Finally, Section 2.4 discusses community detection methods and their relationship to information homogeneity, providing the theoretical basis for using detected communities as pseudo-categories. Together, these four areas identify the research gap addressed by CATD-GNN: the absence of principled methods for diversity-aware GNN learning in user recommendation that leverage inherent community structure without requiring explicit categorization.

2.1. GNNs for Recommendation

GNNs have advanced recommendation systems by effectively modeling user-item interactions as bipartite graphs. The foundational work by Kipf and Welling [32] introduced graph convolutional networks (GCNs), which aggregate neighborhood information through spectral graph convolutions. Following this development, NGCF [5] pioneered the explicit modeling of high-order connectivity by injecting collaborative signals into embeddings through multiple propagation layers, achieving significant improvements over traditional collaborative filtering methods. LightGCN [4] showed that feature transformation and nonlinear activation are unnecessary for recommendation. By simplifying the architecture so that it only includes neighborhood aggregation and layer combination, LightGCN achieves both superior performance and computational efficiency. Recent advances in self-supervised learning have further enhanced representation quality, with SGL [6] applying stochastic augmentations including node dropout, edge dropout, and random walk sampling to create multiple graph views for contrastive learning. XSimGCL [7] simplifies this approach by introducing noise-based augmentation, eliminating the need for complex graph perturbations while achieving state-of-the-art performance. While GNN-based methods have achieved high recommendation accuracy, their tendency to exacerbate filter bubbles has been documented across multiple domains. In e-commerce settings, Ge et al. [14] showed that GNN-based recommenders create stronger echo chambers than traditional collaborative filtering methods, with users receiving increasingly homogeneous product recommendations. Similar effects have been observed in news recommendation [33], where GNN models amplify existing consumption patterns, leading to progressive narrowing of information diversity over time.

2.2. Diversity in Recommendation Systems

Recommendation diversity is a factor in mitigating popularity bias, ensuring fairness, and improving user satisfaction. Adomavicius and Kwon [25] established the foundational distinction between individual diversity (variety within single user recommendations) and aggregate diversity (variety across all users), demonstrating that optimizing for accuracy alone can reduce aggregate diversity by up to 85%. Nguyen et al. [9] further demonstrated that collaborative filtering algorithms can reduce content diversity by up to 50%, providing early evidence of filter bubble effects. Diversity-aware methods can be categorized into four main approaches [19,20]. Reranking methods adjust rankings from existing systems by combining diversity constraints. The MMR method [22] provides a greedy algorithm that balances relevance and diversity by iteratively selecting items that maximize the combination of relevance scores and dissimilarity to already selected items. Determinantal point processes (DPPs) [23] offer a probabilistic framework for diverse subset selection based on kernel matrices that capture item similarities. The key advantage of DPPs is their ability to model repulsive interactions between items through the determinant of a kernel matrix, enabling globally optimal diverse subset selection. Chen et al. [23] developed fast greedy MAP inference algorithms for DPP to improve recommendation diversity while maintaining computational efficiency. However, both MMR and DPP-based approaches are limited by the diversity present in the initial candidate sets [23] because they operate as post-processing methods on fixed embeddings generated by accuracy-optimized models. Learning-to-rank methods incorporate diversity objectives during training through regularization terms [24,34] or modified reward models [35], but typically require explicit category information. 
Cluster-based methods promote diversity by recommending items from different clusters [26,27], while fusion-based methods aggregate results from multiple recommender systems [28]. Popularity bias mitigation represents a primary motivation for diversity optimization. Despite advances in GNN-based optimization, these methods inadvertently exacerbate popularity bias through mechanisms that provide disproportionate exposure to popular items [36]. Abdollahpouri et al. [29] categorized mitigation methods into pre-processing, in-processing, and post-processing approaches, with Zhou et al. [37] proposing adaptive debiasing specifically for graph collaborative filtering.

Diversity-aware GNN methods have emerged through architectural innovations. DGCF [38] learns disentangled representations by decomposing user intents into independent factors, whereas DCCF [39] extends this with contrastive learning. Multi-interest representation learning has also gained attention [40,41]. DGRec [42] employs category-aware loss reweighting and submodular neighbor selection to promote long-tail items, demonstrating that explicit diversity mechanisms can be integrated into GNN architectures. However, these methods primarily focus on intent disentanglement or require explicit item categories rather than addressing community-level diversity in social networks without predefined categorization.

2.3. Filter Bubble Mitigation in Social Media

The mitigation of filter bubbles on social media platforms addresses the issues of community isolation and information homogeneity. Pariser [8] laid the groundwork by highlighting how personalization algorithms result in segregated information spaces. Empirical studies have substantiated these claims; Bakshy et al. [2] demonstrated that Facebook’s algorithmic sorting diminishes encounters with diverse content, while Cinelli et al. [3] observed persistent echo-chamber phenomena across various networks.

The quantification of filter bubbles necessitates suitable metrics. In the study by Lunardi et al. [43], detailed metrics were introduced for assessing news recommendation systems. Michiels et al. [44] created regression models to analyze online news consumption patterns. Liu et al. [45] revealed that algorithms intensify existing political biases, and Aridor et al. [46] determined that algorithmic curation is responsible for around 60% of the uniformity observed in content consumption. User-centered approaches have also gained attention. Wang et al. [47] proposed user-controllable mechanisms that allow individuals to adjust diversity-relevance trade-offs. McKay et al. [48] investigated user responses to diverse content exposure, while Gao et al. [49] introduced counterfactual reasoning to burst filter bubbles. Li et al. [50] proposed reinforcement learning frameworks treating diversity as an explicit reward signal.

Community-aware approaches have been gaining traction. Tang et al. [18] introduced community detection-based collaborative GCN (CD-CGCN), which employs community detection to counteract filter bubbles. Their research confirms that cutting-edge models frequently over-recommend items within the same community, with intra-list filter bubble index (ILFBI) values of over 0.9, thereby supporting community-based strategies. The alignment of similar metrics underscores the utility of community structure in assessing filter bubbles.

Structural diversification strategies aim to alter network topology. Sanz-Cruzado and Castells [51] suggested the use of modularity complements for recommending weak ties, achieving a 25% reduction in polarization metrics. Musco et al. [52] devised algorithms that reduce polarization by adding edges, with approximation assurances. Nevertheless, these methods act as post-processing, and hence they are unable to change the learned representations. By contrast, community-based strategies use inherent clustering. The community-aware model (CAM) by Grossetti et al. [53] integrates community similarity matrices with user preferences for cross-community recommendations, showing a decrease in filter bubble effects for 15%–64% of affected users. However, CAM also functions as a post-processing step, limiting its ability to create diverse representations.

Theoretical frameworks have been used to explain filter bubble formation. Chitra and Musco [10] demonstrated that the intensity of filter bubbles increases quadratically in relation to the ratio of intra-community versus inter-community edge probabilities. Garimella et al. [17] proposed the random walk controversy (RWC) score, showing that effective mitigation necessitates a minimum 30% reduction in scores.

2.4. Community Detection and Filter Bubble Formation

The Louvain method [30] has emerged as the standard for large-scale community detection owing to its high-quality results based on optimizing Newman's modularity [54]. Alternatives such as the Leiden algorithm [55] address stability issues while maintaining efficiency. Pares et al. [56] introduced "fluid communities" based on label propagation. Although these methods produce similar structures for social networks, the choice of algorithm affects the granularity of the detected communities.

The connection between community structure and information homogeneity has been empirically established. Halberstam and Knight [57] found that 73% of political information sharing occurs within ideologically aligned communities. Matakos et al. [58] confirmed that homophilous interactions concentrate opinions within communities, creating echo chambers. These findings align with the principle of homophily proposed by McPherson et al. [59]—similarity breeds connection—explaining the formation of densely connected clusters. Temporal dynamics also impact filter bubble persistence. Kumar et al. [60] demonstrated that major events temporarily disrupt community boundaries, but communities reconverge to stable configurations within weeks. This stability justifies the use of detected communities as pseudo-categories for diversity-aware recommendation because these communities capture persistent polarization structures. Cinelli et al. [3] revealed that users within the same community are exposed to remarkably homogeneous information, with intra-community content similarity exceeding 85%. This homogeneity directly contributes to filter bubble formation. Unlike arbitrary categorization schemes, community detection captures the actual polarization structure inherent in user interactions, ensuring that diversification efforts target the boundaries that create filter bubbles rather than artificial distinctions.

3. Proposed Method

This section introduces CATD-GNN, a community-aware two-stage diversification method designed to mitigate filter bubbles in social-media user recommendation through deep integration with GNN architectures. An overview of the proposed method is shown in Figure 1. It includes: (1) a problem formulation that establishes the multi-objective optimization framework (Section 3.1); (2) community detection to derive pseudo-categories from the underlying network topology (Section 3.2); (3) a bipartite graph construction that enables GNN-based recommendation (Section 3.3); (4) Stage 1, which incorporates community-aware mechanisms during GNN training through submodular neighbor selection and loss reweighting (Section 3.4); and (5) Stage 2, which performs coverage- and redundancy-aware reranking for final diversification (Section 3.5). Finally, Section 3.6 presents a complexity analysis demonstrating that the proposed method remains computationally efficient.

3.1. Problem Formulation

Consider a social network consisting of n users, denoted as $V = \{v_{1}, v_{2}, \dots, v_{n}\}$. The relationships between these users form a user–user interaction graph $G_{\mathrm{user}} = (V, E_{\mathrm{user}})$, where an edge $(v_{i}, v_{j}) \in E_{\mathrm{user}}$ represents an interaction or connection between users $v_{i}$ and $v_{j}$. This graph structure is encoded in an adjacency matrix $A \in \{0, 1\}^{n \times n}$, where $A_{ij} = 1$ indicates that users i and j have interacted. The user recommendation task aims to identify potential new connections for each user in the network. For a given query user $v \in V$, the system ranks all other users by their likelihood of forming meaningful connections with v. This requires learning a scoring function $f : V \times V \to \mathbb{R}$ that assigns numerical scores to user pairs. Traditional recommendation systems optimize solely for accuracy, maximizing metrics such as precision and recall. This approach often reinforces existing community boundaries, creating filter bubbles where users are only recommended connections within their immediate social circles. To address this limitation, the following multi-objective optimization problem is formulated for this framework:

$$\max_{f} \left[ (1 - \lambda) \cdot \mathrm{Accuracy}(f) + \lambda \cdot \mathrm{Diversity}(f) \right], \tag{1}$$

where $\lambda \in [0, 1]$ serves as a trade-off parameter balancing the accuracy and diversity objectives.
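
As a small illustration of how the trade-off parameter in Equation (1) behaves, the sketch below evaluates the weighted objective for two hypothetical scoring functions. The function name `scalarized_objective`, the candidate names, and all numeric values are invented for illustration, and both objectives are assumed to be normalized to [0, 1], which this section does not specify.

```python
def scalarized_objective(accuracy: float, diversity: float, lam: float) -> float:
    """Weighted sum from Equation (1): (1 - lam) * accuracy + lam * diversity."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return (1.0 - lam) * accuracy + lam * diversity


# Hypothetical (accuracy, diversity) pairs for two candidate scoring functions.
candidates = {"accuracy_only": (0.30, 0.10), "diversified": (0.27, 0.40)}

for lam in (0.0, 0.2, 0.5):
    # Pick the candidate with the highest scalarized objective for this lambda.
    best = max(candidates, key=lambda name: scalarized_objective(*candidates[name], lam))
    print(f"lambda={lam}: best={best}")
```

With λ = 0 the objective reduces to pure accuracy, so a small accuracy advantage decides the ranking; as λ grows, a model that gives up a little accuracy for a large diversity gain is preferred.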

3.2. Community Detection for Pseudo-Category Generation

Unlike item recommendation systems in which products naturally belong to predefined categories, user recommendation systems lack explicit categorization schemes. To enable diversity-aware recommendation, community detection identifies natural groupings emerging from interaction patterns. The Louvain algorithm [30] partitions users into communities by optimizing modularity as follows:

$$Q = \frac{1}{2m} \sum_{i, j \in V} \left[ A_{ij} - \frac{k_{i} k_{j}}{2m} \right] \delta(c_{i}, c_{j}), \tag{2}$$

where $k_{i} = \sum_{j} A_{ij}$ denotes the degree of user i, $m = \frac{1}{2} \sum_{i, j} A_{ij}$ represents the total number of edges, $c_{i}$ indicates the community assignment of user i, and $\delta(c_{i}, c_{j})$ equals 1 if users i and j belong to the same community and 0 otherwise. The term $A_{ij} - \frac{k_{i} k_{j}}{2m}$ measures the difference between the actual and expected number of edges in a random network with the same degree distribution. Positive modularity values indicate that the number of edges within communities exceeds random expectations. The Louvain algorithm iteratively optimizes this metric through local optimization and network aggregation phases, yielding community assignments $\mathcal{C} = \{C_{1}, C_{2}, \dots, C_{K}\}$ such that each user $v \in V$ belongs to exactly one community $C(v) \in \mathcal{C}$.
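
To make Equation (2) concrete, the following sketch computes the modularity Q of a given partition for a small undirected, unweighted graph. It only evaluates Q; it is not an implementation of the Louvain optimization itself. The function name `modularity` and the toy graph are illustrative assumptions.

```python
from collections import defaultdict


def modularity(edges, community):
    """Compute Q from Equation (2) for an undirected, unweighted simple graph.

    edges: iterable of (i, j) pairs, each undirected edge listed once.
    community: dict mapping node -> community label c_i.
    """
    edges = list(edges)
    m = len(edges)  # total number of edges
    if m == 0:
        raise ValueError("graph has no edges")

    degree = defaultdict(int)       # k_i
    intra_edges = defaultdict(int)  # L_c: edges with both endpoints in community c
    for i, j in edges:
        degree[i] += 1
        degree[j] += 1
        if community[i] == community[j]:
            intra_edges[community[i]] += 1

    degree_sum = defaultdict(int)   # D_c: sum of k_i over nodes in community c
    for node, k in degree.items():
        degree_sum[community[node]] += k

    # Grouping the double sum of Equation (2) by community gives
    # Q = sum_c [ L_c / m - (D_c / (2m))^2 ].
    return sum(
        intra_edges[c] / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


# Toy graph: two triangles joined by a single bridge edge (2, 3).
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
print(modularity(toy_edges, two_groups))
```

Splitting the toy graph into its two triangles yields a positive Q, whereas placing all six nodes in one community yields Q = 0, matching the interpretation that positive values indicate more intra-community edges than expected at random.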

The Louvain algorithm is an efficient and widely adopted solution for community detection. However, it has known limitations, including the potential to yield poorly connected or even disconnected communities, especially when applied iteratively [55]. To mitigate randomness in the implementation, multiple restarts are performed and the partition with the highest modularity score is selected. The Leiden algorithm [55] offers an alternative that guarantees well-connected communities and addresses some stability concerns; however, a comparative analysis of community detection algorithms is beyond the scope of this work and represents an important direction for future research.

Validation of Community-Based Pseudo-Categories

To validate that detected communities serve as effective pseudo-categories for diversity optimization, an empirical comparison of community-based categorization and content-based topic classification was conducted. The key hypothesis was that community detection captures the actual polarization structure inherent in user interactions, making it more suitable for identifying filter bubble boundaries than topical clustering.

The homogeneity level (HL) metric [43], calculated as follows, was employed to quantify the internal consistency of categorizations:

$$HL = \frac{\sum_{i, j \in R} \frac{| n_{i} \cap n_{j} |}{| n_{i} \cup n_{j} |}}{\binom{| R |}{2}}, \tag{3}$$

where $R$ represents a set of recommended users and $n_{i}$ denotes the feature vector for user i. Higher HL values indicate stronger homogeneity within categories, suggesting more effective grouping for filter bubble analysis.

For this validation study, recommendation lists $R$ were obtained using the XSimGCL model [7], which represents current best practice in GNN-based recommendation. The datasets used for this analysis (Twitter-BLM and Reddit-Ideological) are described in detail in Section 4.1.1. For each user in the test sets, the top-20 recommendations were generated and their compositions were analyzed under different categorization schemes. Topic-based categorization was performed using BERTopic [61], a neural topic modeling technique that leverages transformer-based embeddings. The process consists of three steps: (1) All textual content associated with each user was collected. For Twitter-BLM, this includes their retweets; for Reddit-Ideological, this comprises their posted comments and articles. (2) BERTopic generated document embeddings using a pre-trained language model (paraphrase-multilingual-mpnet-base-v2), reduced the dimensionality via UMAP, clustered the embeddings using HDBSCAN, and extracted topic representations using c-TF-IDF. (3) Each user was assigned to their dominant topic based on the aggregated topic distribution of their associated content. The resulting topic assignments served as an alternative categorization scheme for comparison with community detection.

Following [43], HL is computed by defining user feature vectors $n_{i}$ based on the selected categorization. For community-based categorization, $n_{i}$ is a one-hot vector of dimension K (number of communities), where $n_{i}[c] = 1$ if user i belongs to community c, and 0 otherwise. For topic-based categorization, $n_{i}$ is similarly a one-hot vector of dimension T (number of topics), where $n_{i}[t] = 1$ if user i is assigned to topic t. This enables the homogeneity of users within recommendation lists under different categorization schemes to be quantified. The empirical results reveal notable differences between the two categorization schemes. Topic-based categorization with BERTopic achieves only moderate homogeneity, with HL values ranging from 0.2 to 0.4, as shown in Figure 2. In contrast, community-based categorization obtains HL values approaching 1.0. This gap indicates that community detection represents the polarization of user interaction patterns more faithfully than topic modeling and is therefore better suited to recognizing filter bubble boundaries. It further indicates that users in the same community exhibit similar interaction behaviors, supporting the use of communities as effective pseudo-categories for diversity-aware recommendation.
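
The HL computation of Equation (3) with one-hot category vectors can be sketched as below. Each one-hot vector is represented as the set of its active indices, so the ratio in Equation (3) becomes a Jaccard similarity between sets. The sum is assumed to run over unordered pairs of distinct users, consistent with the binomial denominator; the function names and the toy assignments are illustrative and do not come from the paper.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """|a ∩ b| / |a ∪ b|, defined as 0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def homogeneity_level(recommended, features):
    """Average pairwise Jaccard similarity over a recommendation list (Equation (3)).

    recommended: list of user ids in the recommendation list R.
    features: dict user id -> set of active feature indices (n_i).
    """
    pairs = list(combinations(recommended, 2))  # unordered pairs, |R| choose 2
    if not pairs:
        raise ValueError("need at least two recommended users")
    return sum(jaccard(features[i], features[j]) for i, j in pairs) / len(pairs)


rec_list = ["a", "b", "c", "d"]
# Hypothetical one-hot assignments: three users share community 0, one is in community 1.
by_community = {"a": {0}, "b": {0}, "c": {0}, "d": {1}}
# Hypothetical topic assignments in which every user has a different dominant topic.
by_topic = {"a": {0}, "b": {1}, "c": {2}, "d": {3}}

print(homogeneity_level(rec_list, by_community))
print(homogeneity_level(rec_list, by_topic))
```

With one-hot vectors, each pair contributes 1 when both users share a category and 0 otherwise, so HL reduces to the fraction of same-category pairs in the list.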

It should be noted that the correspondence between topological communities and meaningful diversity boundaries represents an empirical finding specific to these social network datasets. The high HL values suggest that community structure effectively captures the interaction-based polarization that creates filter bubbles in social discourse contexts. However, in networks where user interactions are driven by factors orthogonal to the diversity dimensions of interest, topological communities may not serve as appropriate proxies, and alternative categorization schemes incorporating explicit semantic or demographic attributes may be required.

3.3. Bipartite Graph Transformation

GNNs for recommendation are typically designed for bipartite graphs modeling user-item interactions. To leverage these architectures for user–user recommendations, the proposed method transforms the original interaction graph into a bipartite representation. Two disjoint copies of the user set are created: query users $U = \{u_{1}, u_{2}, \dots, u_{n}\}$ representing users seeking recommendations, and candidate users $I = \{i_{1}, i_{2}, \dots, i_{n}\}$ representing users who can be recommended.

The bipartite graph $G_{\mathrm{bipartite}} = (U \cup I, E_{\mathrm{bipartite}})$ contains edges defined by

$$(u_{j}, i_{k}) \in E_{\mathrm{bipartite}} \Leftrightarrow (v_{j}, v_{k}) \in E_{\mathrm{user}}. \tag{4}$$

This transformation preserves the original interaction structure in an interaction matrix $R \in \{0, 1\}^{n \times n}$ where $R_{jk} = A_{jk}$. Each node receives an initial d-dimensional embedding: $e_{u}^{(0)} \in \mathbb{R}^{d}$ for query users and $e_{i}^{(0)} \in \mathbb{R}^{d}$ for candidate users. Community assignments transfer directly: $C(u_{j}) = C(i_{j}) = C(v_{j})$ for all $j \in \{1, \dots, n\}$.
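
The transformation in Equation (4) can be sketched as follows, assuming the user graph is undirected so that each user edge produces two bipartite edges. The tagged-tuple node representation (`("u", j)` for query copies, `("i", k)` for candidate copies) and the function name `to_bipartite` are choices made for this illustration only.

```python
def to_bipartite(user_edges, community):
    """Build the query/candidate bipartite edge set of Equation (4).

    user_edges: iterable of undirected (j, k) user-index pairs from E_user.
    community: dict mapping user index -> community label C(v_j).
    Returns (edges, query_community, candidate_community).
    """
    edges = set()
    for j, k in user_edges:
        # An undirected user edge yields both (u_j, i_k) and (u_k, i_j),
        # so the interaction matrix R equals the symmetric adjacency matrix A.
        edges.add((("u", j), ("i", k)))
        edges.add((("u", k), ("i", j)))
    # Community labels carry over unchanged to both copies of each user.
    query_community = {("u", j): c for j, c in community.items()}
    candidate_community = {("i", j): c for j, c in community.items()}
    return edges, query_community, candidate_community


bip_edges, q_comm, c_comm = to_bipartite([(0, 1), (1, 2)], {0: "A", 1: "A", 2: "B"})
print(sorted(bip_edges))
```

Because the two copies of a user are distinct nodes, a user is never connected to its own candidate copy unless the original graph contains a self-loop.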

3.4. Stage 1: Community-Aware Graph Learning

3.4.1. LightGCN

The proposed method employs LightGCN [4] for learning user representations through iterative neighborhood aggregation. At each propagation layer $\ell \in \{0, 1, \dots, L\}$, embeddings are updated as follows:

$$e_{u}^{(\ell + 1)} = \sum_{i \in N_{u}} \frac{1}{\sqrt{| N_{u} | | N_{i} |}} e_{i}^{(\ell)}, \tag{5}$$

$$e_{i}^{(\ell + 1)} = \sum_{u \in N_{i}} \frac{1}{\sqrt{| N_{i} | | N_{u} |}} e_{u}^{(\ell)}, \tag{6}$$

where $N_{u}$ and $N_{i}$ denote the neighborhoods of query user u and candidate user i, respectively. The normalization factor $\frac{1}{\sqrt{| N_{u} | | N_{i} |}}$ prevents embedding magnitude amplification during propagation.
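
A single propagation step of Equations (5) and (6) can be sketched in plain Python as below. Real implementations use sparse matrix products; the dictionary-based representation and the function name `propagate` are simplifications assumed here for readability.

```python
from math import sqrt


def propagate(query_emb, cand_emb, neighbors_u, neighbors_i):
    """One LightGCN layer (Equations (5) and (6)) on a bipartite graph.

    query_emb / cand_emb: dict node -> list[float] holding layer-l embeddings.
    neighbors_u: dict query node -> list of candidate neighbors (N_u).
    neighbors_i: dict candidate node -> list of query neighbors (N_i).
    Returns the layer-(l+1) embeddings as two new dicts.
    """
    dim = len(next(iter(query_emb.values())))

    def aggregate(targets, own_nbrs, other_nbrs, source_emb):
        out = {}
        for node in targets:
            acc = [0.0] * dim
            for nbr in own_nbrs.get(node, []):
                # Symmetric normalization 1 / sqrt(|N_node| * |N_nbr|).
                norm = 1.0 / sqrt(len(own_nbrs[node]) * len(other_nbrs[nbr]))
                for d in range(dim):
                    acc[d] += norm * source_emb[nbr][d]
            out[node] = acc
        return out

    # Both updates read the layer-l embeddings, so neither sees the other's output.
    new_query = aggregate(query_emb, neighbors_u, neighbors_i, cand_emb)
    new_cand = aggregate(cand_emb, neighbors_i, neighbors_u, query_emb)
    return new_query, new_cand


q0 = {"u0": [1.0, 1.0], "u1": [2.0, 0.0]}
c0 = {"i0": [1.0, 0.0], "i1": [0.0, 1.0]}
n_u = {"u0": ["i0", "i1"], "u1": ["i1"]}
n_i = {"i0": ["u0"], "i1": ["u0", "u1"]}
q1, c1 = propagate(q0, c0, n_u, n_i)
print(q1, c1)
```

The layer has no trainable weights or nonlinearity, which reflects the LightGCN simplification described in Section 2.1; only the initial embeddings are learned.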

3.4.2. Layer-Wise Attention Mechanism

Following the successful approach in DGRec [42], a layer-wise attention mechanism is introduced to adaptively combine information from different propagation depths. Unlike uniform averaging, where each layer contributes equally, the attention mechanism learns to weight layer contributions based on their informativeness for the recommendation task.

After L propagation layers, attention weights are computed for each layer as

α_{ℓ} = \frac{exp (w^{T} e_{u}^{(ℓ)} + b_{ℓ})}{\sum_{m = 0}^{L} exp (w^{T} e_{u}^{(m)} + b_{m})},

(7)

where $w \in \mathbb{R}^{d}$ is a learnable weight vector and $b_{ℓ}$ is the bias term for layer ℓ. The attention weights $α_{ℓ}$ are normalized through softmax to sum to 1. The final representation combines multi-scale information with the learned attention weights

e_{u} = \sum_{ℓ = 0}^{L} α_{ℓ} e_{u}^{(ℓ)}, e_{i} = \sum_{ℓ = 0}^{L} α_{ℓ} e_{i}^{(ℓ)} .

(8)

This attention mechanism allows the model to automatically determine the optimal contribution of each propagation layer. Initial layers capture the local neighborhood structure while deeper layers incorporate higher-order connectivity patterns. The learned weights adapt to the specific characteristics of each dataset: for densely connected networks, lower layers may receive higher weights to avoid over-smoothing, whereas sparser networks may benefit from deeper propagation.
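A minimal sketch of Equations (7) and (8) for a single node is given below. The function name `layer_attention` and the list-based vectors are assumptions for illustration; the maximum logit is subtracted before exponentiation purely for numerical stability, which leaves the softmax output unchanged.

```python
import math

def layer_attention(layer_embs, w, biases):
    """Combine per-layer embeddings with softmax attention weights.

    layer_embs: list of L+1 vectors e^(0), ..., e^(L) for one node.
    w: weight vector of length d; biases: list of L+1 scalars b_l.
    Returns (alphas, final_embedding).
    """
    logits = [sum(wk * ek for wk, ek in zip(w, e)) + b
              for e, b in zip(layer_embs, biases)]
    shift = max(logits)  # stability shift; does not change the softmax
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    alphas = [x / total for x in exps]
    dim = len(w)
    final = [sum(a * e[k] for a, e in zip(alphas, layer_embs))
             for k in range(dim)]
    return alphas, final
```

With a zero weight vector and zero biases the weights become uniform, so the combination reduces to the plain layer average used by standard LightGCN.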

3.4.3. Submodular Neighbor Selection

Standard graph convolution aggregates all neighbors equally, causing highly connected users to dominate representations. The proposed method addresses this through submodular optimization for diverse neighbor selection. For query user u and its neighborhood $N_{u}$, a subset $S_{u} \subseteq N_{u}$ is selected to maximize

f (S_{u}) = \sum_{i \in N_{u}} \max_{j \in S_{u}} sim (e_{i}, e_{j}),

(9)

where similarity employs a Gaussian kernel:

sim (e_{i}, e_{j}) = exp (- \frac{∥ e_{i} - e_{j} ∥_{2}^{2}}{2 σ^{2}}) .

(10)

The facility location function in Equation (9) measures coverage: for each neighbor in the full set, it computes the maximum similarity to the selected subset. Maximizing this function ensures the selected subset represents diverse regions of the neighborhood. The greedy algorithm provides a $(1 - 1 / e)$-approximation guarantee [62], iteratively adding the neighbor with maximum marginal gain until the size of the selected subset $S_{u}$ reaches $\min (| N_{u} |, K_{max})$, where $K_{max}$ is the maximum selected neighborhood size.
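The greedy maximization of Equation (9) with the Gaussian kernel of Equation (10) can be sketched as follows. This is a plain (non-lazy) greedy loop written for clarity; the names `gaussian_sim` and `select_neighbors` and the dictionary input are assumptions, and ties are broken by iteration order.

```python
import math

def gaussian_sim(a, b, sigma=1.0):
    """Gaussian kernel similarity between two embedding vectors."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def select_neighbors(embs, k_max, sigma=1.0):
    """Greedy facility-location selection of a diverse neighbor subset.

    embs: dict mapping neighbor id -> embedding (the full neighborhood N_u).
    Returns a list of at most min(|N_u|, k_max) selected neighbor ids.
    """
    ids = list(embs)
    budget = min(len(ids), k_max)
    # cover[i] = max similarity of neighbor i to the subset chosen so far
    cover = {i: 0.0 for i in ids}
    selected = []
    while len(selected) < budget:
        best_gain, best_j = -1.0, None
        for j in ids:
            if j in selected:
                continue
            # Marginal gain f(S + j) - f(S)
            gain = sum(max(gaussian_sim(embs[i], embs[j], sigma) - cover[i], 0.0)
                       for i in ids)
            if gain > best_gain:
                best_gain, best_j = gain, j
        selected.append(best_j)
        for i in ids:
            cover[i] = max(cover[i], gaussian_sim(embs[i], embs[best_j], sigma))
    return selected
```

In a neighborhood with two near-duplicate neighbors and one distant neighbor, the second pick is the distant one, because a near-duplicate of an already selected neighbor adds almost no coverage.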

3.4.4. Community-Based Loss Reweighting

To ensure balanced representation across communities, the training loss is reweighted so that it is inversely proportional to community size as follows:

w_{C (u)} = \frac{1 - β}{1 - β^{| C (u) |}} \cdot \frac{K}{| C (u) |},

(11)

where $| C (u) |$ denotes the size of user u’s community, K is the total number of communities, and $β \in (0, 1)$ controls the smoothing strength. The first term prevents extreme weights through exponential smoothing, and the second provides inverse frequency weighting.

The training objective uses Bayesian personalized ranking [63] with community reweighting as follows:

L = \sum_{(u, i, j) \in D} w_{C (u)} \cdot [- log σ ({\hat{y}}_{u i} - {\hat{y}}_{u j})] + λ {∥ Θ ∥}_{2}^{2},

(12)

where $D$ contains training triplets, ${\hat{y}}_{u i} = e_{u}^{T} e_{i}$ denotes predicted scores, $σ (\cdot)$ is the sigmoid function, and $λ$ controls L2 regularization.
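The weight in Equation (11) and the data term of Equation (12) can be sketched as below. The function names and the dictionary inputs are assumptions for illustration, and the L2 regularization term is omitted because it depends on the model parameters rather than on the triplets.

```python
import math

def community_weight(size, num_communities, beta=0.9):
    """Community weight of Equation (11) for a community of the given size."""
    smoothing = (1.0 - beta) / (1.0 - beta ** size)
    return smoothing * num_communities / size

def weighted_bpr_loss(triplets, scores, comm_size, num_communities, beta=0.9):
    """Community-reweighted BPR data term (regularization omitted).

    triplets: list of (u, i, j) with i a positive and j a sampled negative.
    scores: dict mapping (u, candidate) -> predicted score.
    comm_size: dict mapping u -> size of u's community.
    """
    total = 0.0
    for u, i, j in triplets:
        diff = scores[(u, i)] - scores[(u, j)]
        # -log(sigmoid(diff)) written in a numerically stable form
        nll = max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))
        total += community_weight(comm_size[u], num_communities, beta) * nll
    return total
```

Because the weight decreases with community size, triplets from users in small communities contribute more to the loss than those from users in large communities.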

3.5. Stage 2: Coverage and Redundancy-Aware Reranking

3.5.1. Binomial Diversity Framework

Recommendation diversity is quantified through a probabilistic framework modeling community appearances. Specifically, the proposed method adapts the binomial diversity framework [31]. For a recommendation list R of length N, the number of times community c appears follows a binomial distribution:

P (X_{c} = k) = \binom{N}{k} p_{c}^{k} {(1 - p_{c})}^{N - k},

(13)

where $p_{c} = | C_{c} | / n$ represents the base probability of selecting from community c.

The coverage metric $Ψ (R)$ measures the expected number of distinct communities:

Ψ (R) = \sum_{c \in C} [1 - {(1 - p_{c})}^{N}],

(14)

where the term ${(1 - p_{c})}^{N}$ gives the probability of community c being absent from the list. Therefore, $1 - {(1 - p_{c})}^{N}$ represents the probability of at least one appearance, and summing yields the expected community count.
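To make Equations (13) and (14) concrete, the sketch below evaluates the binomial probability and the expected number of distinct communities from a list of community sizes. The function names are assumptions, and `math.comb` requires Python 3.8 or later.

```python
import math

def binom_pmf(k, n_trials, p):
    """Binomial probability P(X = k) for n_trials draws with success rate p."""
    return math.comb(n_trials, k) * p ** k * (1.0 - p) ** (n_trials - k)

def expected_coverage(community_sizes, list_len):
    """Expected number of distinct communities in a list of length list_len.

    community_sizes: list of |C_c|; p_c is taken as |C_c| / n.
    """
    n_users = sum(community_sizes)
    return sum(1.0 - (1.0 - size / n_users) ** list_len
               for size in community_sizes)
```

For two equally sized communities, a list of length 1 covers exactly one community in expectation and a list of length 2 covers 1.5, since each community is absent with probability 0.25.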

The non-redundancy metric $Ω (R)$ penalizes repeated community appearances:

Ω (R) = 1 - \frac{1}{N} \sum_{c \in C} \sum_{k = 2}^{N} k \cdot P (X_{c} = k | X_{c} > 0) .

(15)

The inner sum computes the expected redundant appearances beyond the first one. Normalizing and subtracting from 1 yields a metric such that higher values indicate less redundancy.

The combined diversity score $Φ (R)$ balances both objectives:

Φ (R) = \sqrt{Ψ (R) \cdot Ω (R)} .

(16)

3.5.2. Greedy Reranking Algorithm

Given initial ranking $L$ from Stage 1, a diversified list R is constructed through greedy selection. The initial ranking $L$ consists of candidate users sorted by their predicted relevance scores from the trained GNN model, where the relevance score for candidate user i with respect to query user u is computed as

s_{i} = {\hat{y}}_{u i} = e_{u}^{T} e_{i},

(17)

where $e_{u}$ and $e_{i}$ are the final embeddings from Stage 1 obtained through Equation (8).

The diversified list is then constructed iteratively by selecting users that maximize the combination of relevance and diversity as

i^{*} = \arg\max_{i \in L \setminus R} [(1 - λ) \cdot s_{i} + λ \cdot Δ_{i}],

(18)

where $s_{i}$ denotes the relevance score from Equation (17), $Δ_{i} = Φ (R \cup \{i\}) - Φ (R)$ represents the marginal diversity gain from adding user i to the current recommendation list R, and $λ \in [0, 1]$ controls the trade-off between relevance and diversity (this $λ$ is distinct from the regularization coefficient in Equation (12)). This greedy procedure continues until the desired number of recommendations is reached.
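The selection rule in Equation (18) can be sketched as a generic greedy loop that accepts any list-level diversity function. The name `greedy_rerank` and the callable interface are assumptions; in the usage shown in the comments, the diversity function is a simple stand-in (the number of distinct communities), not the binomial score of Equation (16).

```python
def greedy_rerank(scores, diversity, list_len, lam=0.7):
    """Greedy relevance/diversity reranking.

    scores: dict mapping candidate id -> relevance score s_i.
    diversity: callable taking a list of candidate ids and returning a float.
    list_len: desired number of recommendations N.
    lam: trade-off weight in [0, 1]; lam = 0 recovers the relevance ranking.
    """
    remaining = set(scores)
    chosen = []
    while remaining and len(chosen) < list_len:
        base = diversity(chosen)

        def objective(i):
            gain = diversity(chosen + [i]) - base  # marginal diversity gain
            return (1.0 - lam) * scores[i] + lam * gain

        # sorted() makes tie-breaking deterministic
        best = max(sorted(remaining), key=objective)
        chosen.append(best)
        remaining.remove(best)
    return chosen

# Stand-in diversity (an assumption): count of distinct communities in the list.
# community = {"a": 0, "b": 0, "c": 1}
# diversity = lambda users: len({community[v] for v in users})
```

Because relevance scores and diversity gains are mixed linearly, they should be on comparable scales for the trade-off weight to behave as intended.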

3.6. Complexity Analysis

The computational complexity of CATD-GNN can be decomposed into three distinct phases: preprocessing, training, and inference. Time and space complexity are analyzed for each phase to provide an understanding of the method’s scalability.

Preprocessing Phase. Community detection using the Louvain algorithm requires $O (m log n)$ time complexity, where m denotes the number of edges and n the number of nodes. The space complexity is $O (n + m)$ for storing the graph and community assignments. This one-time cost amortizes across all subsequent training and inference operations. The algorithm typically converges within 5–10 iterations for social networks because of their inherent community structure, making the practical complexity closer to $O (m)$ for most real-world networks.

Training Phase. The training complexity comprises four main components that must be analyzed separately:
- Submodular neighbor selection: For each batch of size B, selecting diverse neighbors requires computing pairwise similarities among candidates. This yields $O (B \cdot {\bar{d}}^{2} \cdot d)$ time complexity per batch, where $\bar{d}$ denotes the average node degree and d is the embedding dimension. The greedy selection algorithm with lazy evaluation reduces this to $O (B \cdot K_{max} \cdot \bar{d} \cdot d)$ in practice, where $K_{max} ≪ \bar{d}$ is the maximum selected neighborhood size.
- Graph propagation: The modified LightGCN propagation with selected neighbors requires $O (B \cdot K_{max} \cdot L \cdot d)$ time per batch for L propagation layers, significantly less than the $O (B \cdot \bar{d} \cdot L \cdot d)$ required by standard aggregation.
- Attention mechanism: Computing attention weights across L layers as in Equation (7) requires $O (B \cdot L^{2} \cdot d)$ time for weight computation and normalization.
- Community-based loss reweighting: Computing weights for each sample requires $O (B \cdot K)$ time, where K is the number of communities, because it is necessary to look up community sizes and compute the weighting formula.

The total training complexity per epoch becomes

O (| B_{total} | \cdot B \cdot [K_{max} \cdot \bar{d} \cdot d + K_{max} \cdot L \cdot d + L^{2} \cdot d + K]),

(19)

where $| B_{total} | = ⌈ n / B ⌉$ represents the total number of batches, so that $| B_{total} | \cdot B \approx n$. Because typically $K_{max} ≪ \bar{d}$, $L ≪ n$, and $K ≪ n$, the dominant term is $O (n \cdot K_{max} \cdot \bar{d} \cdot d)$.

The space complexity during training is $O (n \cdot d + n \cdot L \cdot d)$ for storing embeddings across all layers, plus $O (B \cdot K_{max})$ for storing batch-specific selected neighbors.

Inference Phase. For each query user, generating recommendations involves three steps:
- Embedding computation: Computing the query user’s embedding through L-layer propagation with attention requires $O (L \cdot K_{max} \cdot d + L^{2} \cdot d)$ time.
- Candidate scoring: Computing relevance scores for all n candidate users via inner products requires $O (n \cdot d)$ time.
- Diversity-aware reranking: The greedy selection based on Equation (18) requires computing the marginal diversity gains. For each of N positions, up to n candidates are evaluated, computing diversity scores involving K communities. This yields $O (N \cdot n \cdot K)$ time complexity. With lazy evaluation using priority queues, this reduces to $O (N \cdot K \cdot log n)$ when $N ≪ n$ .

The total inference complexity per user is

O (L \cdot K_{max} \cdot d + n \cdot d + N \cdot K \cdot log n) .

(20)

4. Experiments

This section presents an experimental evaluation of the CATD-GNN method. This investigation centers on three critical research questions that address both the effectiveness and practical applicability of the proposed approach. First, it examines whether the proposed method can successfully improve recommendation diversity while maintaining acceptable accuracy levels compared with recent methods (RQ1). Second, it investigates how each component of the proposed two-stage architecture contributes to the overall performance, particularly analyzing the synergistic effects between submodular neighbor selection, community-based loss reweighting, and coverage-aware reranking (RQ2). Finally, the ability of CATD-GNN to mitigate filter bubbles is demonstrated and quantified across different social network structures to confirm its ability to reduce echo chamber effects (RQ3).

4.1. Experimental Setup

4.1.1. Dataset

Twitter-BLM Dataset. The Twitter-BLM dataset [64] consists of retweet interactions in Black Lives Matter (BLM) discourse, drawn from a subset of the complete dataset spanning 1–14 June 2020. The original corpus contains 63.9 million tweets from 13.0 million users collected from 2013 to 2021, but this study considered a focused temporal window during the peak of the BLM protests following George Floyd’s death. A user–user interaction graph was constructed, where nodes represent Twitter users and directed edges $(v_{i}, v_{j})$ indicate that user $v_{i}$ retweeted content from user $v_{j}$ . Users with fewer than five interactions were removed to ensure meaningful connectivity patterns, and the top 2% most active users were excluded to mitigate extreme popularity bias. The resulting network comprises 23,397 users. The Louvain algorithm automatically identified 69 communities through modularity optimization, with community sizes ranging from 12 to 1847 users.

Reddit-Ideological Dataset. The Reddit-Ideological dataset [65] comprises user-article interactions from ideologically oriented subreddits including r/politics, r/Conservative, r/Liberal, and restricted communities. The dataset contains 377,144 articles across three ideological categories (Liberal: 72,488 articles from six subreddits, Conservative: 79,573 articles from six subreddits, and Restricted: 225,083 articles from 16 subreddits). To construct the user–user interaction graph, a co-engagement method was employed, where edges connect users who interact with common content. Specifically, an undirected edge $(v_{i}, v_{j})$ was established when both users posted or commented on articles within the same subreddit during the same temporal window (7-day period). Edge weights were computed as $w_{i j} = \sum_{s \in S_{i j}} log (1 + n_{i s}) \cdot log (1 + n_{j s})$ , where $S_{i j}$ denotes the set of subreddits where both users were active, and $n_{i s}$ represents the number of posts/comments by user i in subreddit s. This logarithmic weighting prevents highly active users from dominating the graph structure while preserving engagement intensity. Edges with weights below threshold $τ = 2.0$ were removed to eliminate spurious connections from minimal co-occurrences. The threshold value was determined by analyzing the weight distribution, where $τ = μ - 0.5 σ$ ( $μ$ : mean edge weight, $σ$ : standard deviation) ensures the retention of statistically significant connections while maintaining graph connectivity. The preprocessed network contains 45,231 users and 812,453 edges. Community detection via the Louvain algorithm identified 128 communities, where community sizes ranged from 23 to 2847 users.
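The co-engagement edge weight defined above can be sketched as follows. The function name and the dictionary inputs are assumptions; the natural logarithm is used because the text does not state the base, and the 7-day window is assumed to have been applied when the per-subreddit counts were collected.

```python
import math

def co_engagement_weight(counts_i, counts_j):
    """Edge weight between two users from their per-subreddit activity.

    counts_i, counts_j: dict mapping subreddit -> number of posts/comments.
    Only subreddits where both users were active contribute.
    """
    shared = counts_i.keys() & counts_j.keys()
    return sum(math.log(1 + counts_i[s]) * math.log(1 + counts_j[s])
               for s in shared)

def keep_edge(weight, tau=2.0):
    """Threshold filter: edges with weight below tau are removed."""
    return weight >= tau
```

Under these assumptions, two users who each posted once in a single shared subreddit obtain a weight of about 0.48, which falls below the threshold of 2.0 and is therefore discarded as a minimal co-occurrence.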

Train-Test Split Protocol. For both datasets, temporal splitting was employed to simulate realistic deployment scenarios. The first 70% of chronologically ordered interactions formed the training set, the next 15% constituted the validation set, and the final 15% served as the test set. Users that appeared only in the test interactions were removed to enable personalized evaluation. This temporal split is more challenging than random splitting because it requires predicting future interactions based on historical patterns.
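A minimal sketch of this chronological 70/15/15 split is shown below. The function name, the `(timestamp, query_user, target_user)` tuple layout, and the choice to filter test interactions by whether the query user appears in the training or validation data are assumptions; integer arithmetic is used for the cut points to avoid floating-point rounding.

```python
def temporal_split(interactions, train_pct=70, val_pct=15):
    """Chronological train/validation/test split.

    interactions: list of (timestamp, query_user, target_user) tuples.
    Test interactions whose query user never appears earlier are dropped.
    """
    ordered = sorted(interactions, key=lambda rec: rec[0])
    n = len(ordered)
    n_train = n * train_pct // 100
    n_val = n * val_pct // 100
    train = ordered[:n_train]
    val = ordered[n_train:n_train + n_val]
    test = ordered[n_train + n_val:]
    known_users = {u for _, u, _ in train} | {u for _, u, _ in val}
    test = [rec for rec in test if rec[1] in known_users]
    return train, val, test
```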

Dataset Generalizability. While the experiments focus on two datasets, these datasets represent different social network structures that test the method’s generalizability across diverse contexts. The Twitter-BLM dataset exemplifies organic community formation around social movements, where users naturally cluster based on ideological alignment without explicit platform mechanisms. In contrast, Reddit-Ideological demonstrates institutionalized segregation through subreddit boundaries, representing platforms with explicit community structures. These datasets collectively capture the two primary modes of community formation in social media: emergent clustering (Twitter) and designed separation (Reddit).

4.1.2. Baseline Methods

CATD-GNN is compared with the following three categories of baseline methods, ensuring broad coverage of existing approaches.

Accuracy-Focused GNN Methods. These represent state-of-the-art approaches optimized primarily for recommendation accuracy.

- LightGCN [4]: simplifies GCN by removing feature transformation and nonlinear activation while maintaining pure neighborhood aggregation.
- NGCF [5]: models high-order connectivity by injecting collaborative signals through multiple propagation layers.
- SGL [6]: employs self-supervised learning with graph augmentation including node dropout, edge dropout, and random walk sampling.
- XSimGCL [7]: the current state-of-the-art method, which uses noise-based contrastive learning without complex graph perturbations.

All GNN-based methods use identical architectures (two layers, 32-dimensional embeddings) for fair comparison. Post-processing methods (MMR, DPP) were applied to the XSimGCL outputs as it provides the strongest baseline performance.

Diversity-Aware Methods. These methods explicitly incorporate diversity objectives.

- DGRec [42]: originally designed for e-commerce with explicit item categories and adapted here to use detected communities as pseudo-categories.
- MMR [22]: maximal marginal relevance, with iterative selection balancing relevance and novelty using $λ = 0.5$ .
- DPP [23]: determinantal point process, with fast greedy MAP inference for diverse subset selection.

Filter Bubble Mitigation Methods. These approaches specifically target echo chamber effects in social networks.

- CAM [53]: uses similarity matrices weighted by community membership.
- Weak Ties [51]: recommends users based on the modularity complement to promote weak tie formation.
- CD-CGCN [18]: applies community detection to counteract filter bubbles with two-stage filtering.

4.1.3. Evaluation Metrics

Accuracy Metrics. For user recommendation evaluation, accuracy metrics measure how well the system predicts actual future interactions. Given a ranked list $R_{u}^{K}$ of top-K recommended users for query user u and the ground truth test interactions $T_{u}$ , the metrics are defined as follows.

Precision@K measures the fraction of recommended users who are relevant:

Precision @ K = \frac{1}{| U |} \sum_{u \in U} \frac{| R_{u}^{K} \cap T_{u} |}{K} .

(21)

Recall@K measures the fraction of relevant users who are recommended:

Recall @ K = \frac{1}{| U |} \sum_{u \in U} \frac{| R_{u}^{K} \cap T_{u} |}{| T_{u} |} .

(22)

NDCG@K (Normalized Discounted Cumulative Gain) accounts for ranking position:

NDCG @ K = \frac{1}{| U |} \sum_{u \in U} \frac{\sum_{i = 1}^{K} \frac{I [r_{i} \in T_{u}]}{{log}_{2} (i + 1)}}{\sum_{i = 1}^{min (K, | T_{u} |)} \frac{1}{{log}_{2} (i + 1)}},

(23)

where $r_{i}$ denotes the user at position i in the ranking and $I [\cdot]$ is the indicator function.
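The per-user terms inside Equations (21) to (23) can be sketched as follows; the reported metrics are the averages of these values over all query users. The function name and the list/set inputs are assumptions for illustration.

```python
import math

def precision_recall_ndcg(ranked, relevant, k):
    """Per-user Precision@K, Recall@K, and NDCG@K with binary relevance.

    ranked: list of recommended user ids, best first.
    relevant: set of ground-truth user ids T_u.
    """
    hits = [1 if v in relevant else 0 for v in ranked[:k]]
    precision = sum(hits) / k
    recall = sum(hits) / len(relevant) if relevant else 0.0
    # Position p is 0-based here, so the discount is log2(p + 2)
    dcg = sum(h / math.log2(p + 2) for p, h in enumerate(hits))
    idcg = sum(1.0 / math.log2(p + 2) for p in range(min(k, len(relevant))))
    ndcg = dcg / idcg if idcg > 0 else 0.0
    return precision, recall, ndcg
```

For a ranking in which the relevant users sit at positions 1 and 3 of 4, precision is 0.5, recall is 1.0, and NDCG is below 1 because the second hit is discounted more than it would be at position 2.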

Diversity Metrics. Following Tang et al. [18], community-aware diversity metrics quantify filter bubble effects in user recommendations.

Intra-List Filter Bubble Index (ILFBI@K) measures the proportion of recommended users from the query user’s own community:

ILFBI @ K = \frac{1}{| U |} \sum_{u \in U} \frac{| \{v \in R_{u}^{K} : C (v) = C (u)\} |}{K},

(24)

where $C (v)$ denotes the community of user v. Lower values indicate better cross-community diversity.

Community Gini Index (CGI@K) quantifies inequality in community representation:

CGI @ K = \frac{1}{| U |} \sum_{u \in U} [1 - \frac{2 \sum_{i = 1}^{K - 1} S_{i}}{K \cdot S_{K}} - \frac{1}{K}],

(25)

where $S_{i}$ represents the cumulative count of communities sorted by frequency in $R_{u}^{K}$. Lower values indicate more balanced community distribution.

Coverage@K measures the average number of distinct communities represented in each user’s recommendation list:

Coverage @ K = \frac{1}{| U |} \sum_{u \in U} | \{C (v) : v \in R_{u}^{K}\} |,

(26)

where $C (v)$ denotes the community of user v. Higher values indicate greater community diversity within individual recommendation lists.
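ILFBI@K (Equation (24)) and Coverage@K (Equation (26)) both reduce to simple counts over each recommendation list, as the following sketch shows. The function name and the dictionary-based inputs are assumptions for illustration.

```python
def ilfbi_and_coverage(rec_lists, community, k):
    """Average ILFBI@K and Coverage@K over all query users.

    rec_lists: dict mapping query user -> ranked list of recommended users.
    community: dict mapping every user -> community id.
    """
    ilfbi_sum = 0.0
    coverage_sum = 0
    for u, ranked in rec_lists.items():
        top = ranked[:k]
        same = sum(1 for v in top if community[v] == community[u])
        ilfbi_sum += same / k                              # share from own community
        coverage_sum += len({community[v] for v in top})   # distinct communities
    n_users = len(rec_lists)
    return ilfbi_sum / n_users, coverage_sum / n_users
```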

Entropy@K quantifies the uniformity of user appearance frequencies across all recommendation lists:

Entropy @ K = - \sum_{v \in V} p_{v} log p_{v},

(27)

where $p_{v} = \frac{count (v)}{\sum_{w \in V} count (w)}$ and $count (v)$ is the number of times user v appears across all recommendation lists.

4.1.4. Implementation Details

The experimental configuration employs standard hyperparameters identified through systematic grid search on the validation set. The embedding dimension was set to $d = 32$, providing sufficient representational capacity for capturing user relationships. The GNN architecture consisted of $L = 2$ propagation layers, which effectively capture third-order collaborative signals without over-smoothing. Training used the Adam optimizer with learning rate $η = 0.001$, batch size $B = 2048$, dropout probability 0.1, and L2 regularization coefficient $10^{- 4}$.

The diversity-specific parameters were configured as follows. The community weight parameter $β = 0.9$ upweighted underrepresented communities while maintaining representation of popular communities. The diversity trade-off parameter $λ = 0.7$ balanced accuracy and diversity objectives in the experiments. The Gaussian kernel width $σ = 1.0$ for submodular selection distinguished between similar and dissimilar neighbors. The submodular selection budget $K_{max} = 20$ limited the number of neighbors per user to maintain computational efficiency. Training employed early stopping with a patience of 20 epochs based on validation NDCG.

All experiments were repeated five times with different random seeds to ensure statistical reliability. The seeds controlled four sources of randomness: (1) model initialization, (2) negative sampling during training, (3) dropout mask generation, and (4) mini-batch shuffling. The same seeds were used for all methods for fair comparison. Results report the mean values and standard deviations across the five runs.
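For reference, the reported settings can be collected in a single configuration object, as in the sketch below. The class and field names are assumptions chosen for readability; only the values are taken from the text above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CATDGNNConfig:
    """Hyperparameters reported in Section 4.1.4 (field names are illustrative)."""
    embedding_dim: int = 32        # d
    num_layers: int = 2            # L
    learning_rate: float = 1e-3    # Adam learning rate
    batch_size: int = 2048         # B
    dropout: float = 0.1
    l2_coeff: float = 1e-4         # L2 regularization coefficient
    beta: float = 0.9              # community weight parameter
    lambda_tradeoff: float = 0.7   # relevance/diversity trade-off in reranking
    sigma: float = 1.0             # Gaussian kernel width
    k_max: int = 20                # submodular selection budget
    patience: int = 20             # early-stopping patience (epochs)
    num_runs: int = 5              # repetitions with different seeds

config = CATDGNNConfig()
```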

4.2. Main Results (RQ1)

Table 1 compares the performance of the methods on the Twitter-BLM dataset. CATD-GNN demonstrates the effectiveness of integrating diversity mechanisms directly into GNN training. The method reduces filter bubble effects substantially: the ILFBI metric decreases from values indicating near-total intra-community recommendations in the baseline methods to a more balanced distribution. This reduction occurs because the submodular neighbor selection actively samples users from different communities during message passing, while the reranking stage further promotes cross-community connections.

Among the diversity-focused baselines, CD-CGCN achieves the second-best diversity performance through its clustering-based approach. However, CD-CGCN incurs similar accuracy penalties to those of CATD-GNN, indicating that any method achieving meaningful diversity must sacrifice some relevance. Post-processing methods (CAM, MMR, and DPP) produce smaller diversity improvements because they operate on fixed embeddings that already encode popularity bias from standard training. DGRec, which applies diversification only during inference, cannot match CATD-GNN’s performance because the underlying representations remain optimized solely for accuracy.

Table 2 shows the results on the Reddit-Ideological dataset, where subreddit boundaries create explicit political segregation. The dataset’s structure presents a different challenge: users actively self-select into ideologically homogeneous communities, making cross-community recommendations potentially less relevant. CATD-GNN addressed this by maintaining higher intra-community recommendation rates than on Twitter-BLM while still introducing meaningful cross-community exposure. The method identified and recommended users who participated in multiple political communities, serving as bridges between otherwise disconnected groups.

The consistency of the relative improvements over baseline methods across both datasets validates the generalizability of the proposed method. Twitter-BLM represents organic community formation around a social movement, whereas Reddit-Ideological exhibits deliberate political segregation. Despite these structural differences, CATD-GNN achieves similar relative reductions in echo chamber metrics when compared with existing GNN-based debiasing approaches on both platforms. The absolute coverage values differ between datasets because of the varying community counts.

Performance Across Recommendation List Sizes

Figure 3 examines how recommendation list size affects the accuracy–diversity trade-off. The experiments varied K from 10 to 200, representing different application scenarios from compact friend suggestions to extensive discovery feeds.

Regarding accuracy (Figure 3a), CATD-GNN maintains a performance close to that of XSimGCL at $K = 10$ because both methods focus on the most relevant recommendations. As K increases, the performance gap stabilizes rather than widening, demonstrating that the diversity mechanisms do not degrade the ranking of highly relevant users. The convergence of NDCG curves at larger K values occurs because all methods eventually recommend less relevant users as the list extends beyond core connections.

Regarding diversity (Figure 3b), CATD-GNN consistently recommends users from more diverse communities than the baseline methods. The coverage metric is the average number of distinct communities represented in each user’s recommendation list and thus indicates the breadth of community representation. At smaller K values, most methods show limited diversity: even with $K = 10$ recommendations, baseline methods typically draw from only three to four communities, reflecting strong homophily effects. CATD-GNN breaks this pattern by actively selecting users from different communities through its two-stage approach.

The coverage growth rate differs among methods. While LightGCN and XSimGCL show sub-linear growth because they reinforce existing community preferences, CATD-GNN exhibits stronger growth, particularly in the $K = 50$ to $K = 200$ range. This occurs because the diversity-aware training creates embeddings in which users from different communities maintain distinguishable representations, allowing the reranking stage to effectively identify and promote cross-community connections. DGRec shows intermediate performance, but its growth rate diminishes at higher K values because post-processing alone cannot overcome the homophily encoded during standard training.

4.3. Component Analysis (RQ2)

4.3.1. Ablation Studies

Table 3 quantifies the contribution of each component in CATD-GNN, addressing RQ2. The results reveal synergistic effects within the two-stage architecture. When Stage 1 operates independently, it achieves moderate diversity improvements, whereas Stage 2 alone shows limited effectiveness. However, their combination in the full CATD-GNN model demonstrates improvements across all diversity metrics, confirming that diversity-aware training creates embeddings containing latent diversity signals that the reranking mechanism can effectively exploit.

The submodular neighbor selection component contributes significantly to diversity. Its removal resulted in a 15.0% reduction in coverage and increased ILFBI by 9.4% on Twitter-BLM. This degradation occurred because high-degree nodes dominate the aggregation process in standard graph convolution, leading to popularity bias in the learned representations. The submodular optimization in Equation (9) effectively mitigates this by maximizing the coverage of diverse neighborhoods.

Community-based loss reweighting shows more subtle but important effects. Its removal reduced coverage by 8.5% and increased ILFBI by 5.6% on Twitter-BLM. The impact is more pronounced on Reddit-Ideological, where minority political communities require explicit upweighting to maintain representation. Without this reweighting, the model converges to solutions that primarily serve majority communities, exacerbating filter bubbles.

The comparison between Stage 2 variants provides insights into the importance of principled reranking. Random reranking achieves reasonable coverage but severely degrades NDCG, by 12.4% relative to the full model. This establishes a baseline showing that statistical diversity through randomization is trivial to achieve, but maintaining relevance requires intelligent optimization. The coverage and redundancy-aware reranking maintains 96.3% of Stage 1’s NDCG while improving coverage by 27.2%, demonstrating the effectiveness of the binomial diversity framework.

The ablation results also reveal dataset-specific patterns. On Reddit-Ideological, removing Stage 2 had a smaller impact on accuracy but larger impact on diversity metrics than it did on Twitter-BLM. This suggests that Reddit’s stronger community boundaries make post-processing diversification more challenging, emphasizing the importance of diversity-aware training in Stage 1. Interestingly, the configuration without reweighting maintains a relatively high performance, indicating that submodular selection and reranking can partially compensate for the absence of community-based loss weighting. However, the full model’s superior performance across all metrics confirms that each component contributes meaningfully to the overall objective. The non-additive improvements when combining components validate the architectural design choices and suggest that diversity-aware recommendation requires intervention at multiple levels of the learning pipeline.

4.3.2. Hyperparameter Sensitivity Analysis

Figure 4 evaluates three critical hyperparameters using metrics aligned with their primary functions. All evaluations used $K = 100$ for consistency with the main experimental results. The trade-off parameter $λ$ controls the accuracy-diversity balance in Stage 2 reranking. It was evaluated through both NDCG@100 and Coverage@100 because it directly weights these competing objectives. The smooth inverse relationship confirms reliable control without discontinuities. At $λ = 0.7$, the method achieves an effective balance.

Community weight $β$ affects training dynamics by controlling the upweighting strength for minority communities. Its impact was evaluated using ILFBI, which directly measures filter bubble effects. A stable performance within [0.8, 0.95] indicates robustness to precise tuning. Values below 0.8 provide insufficient minority representation, resulting in higher ILFBI values, while values above 0.95 create training instabilities due to extreme weight disparities.

Gaussian kernel width $σ$ determines similarity discrimination in submodular selection, and coverage measures the resulting diversity of recommendations. The response peaks at $σ = 1.0$, identifying the optimal scale for distinguishing similar and diverse neighbors. This optimal value appears consistent across datasets, determined by the typical cosine similarity distribution in embedding spaces.

4.4. Filter Bubble Mitigation Visualization (RQ3)

4.4.1. Community Distribution Visualization

To address RQ3, visual and quantitative analyses were conducted to demonstrate how CATD-GNN mitigates filter bubbles across different social network structures. To quantitatively assess filter bubble mitigation, the distribution of recommended users across detected communities was analyzed. For each recommendation method, the top-K recommendations for all test users were generated and the frequency distribution was computed as follows:

f_{c} = \frac{1}{| U_{test} | \cdot K} \sum_{u \in U_{test}} \sum_{v \in R_{u}^{K}} I [C (v) = c],

(28)

where $f_{c}$ denotes the recommendation frequency for community c, $R_{u}^{K}$ represents the top-K recommendations for user u, $C (v)$ indicates the community assignment of user v, and $I [\cdot]$ is the indicator function. Diversity was quantified using Shannon entropy: $H = - \sum_{c} f_{c} \log f_{c}$, where higher values indicate more uniform distributions.
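The frequency distribution of Equation (28), its Shannon entropy, and the share of recommendations falling into the most frequent communities can be sketched as follows. The function names are assumptions, the natural logarithm is used because the base is not stated, and the normalization assumes that every test user receives a full list of K recommendations.

```python
import math
from collections import Counter

def community_distribution(rec_lists, community, k):
    """Recommendation frequency f_c for each community.

    rec_lists: dict mapping test user -> ranked list of recommended users.
    community: dict mapping user -> community id.
    """
    counts = Counter(community[v]
                     for ranked in rec_lists.values()
                     for v in ranked[:k])
    total = sum(counts.values())  # equals |U_test| * K for full-length lists
    return {c: n / total for c, n in counts.items()}

def shannon_entropy(freqs):
    """Shannon entropy H of a frequency distribution (natural log)."""
    return -sum(f * math.log(f) for f in freqs.values() if f > 0)

def top_m_share(freqs, m):
    """Total frequency held by the m most recommended communities."""
    return sum(sorted(freqs.values(), reverse=True)[:m])
```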

Figure 5 visualizes the community distributions for both datasets at $K = 100$. The Twitter-BLM dataset exhibits clear filter bubble effects with the baseline methods (Figure 5a), where 60.1% of recommendations are concentrated in the top-three communities. This moderate concentration reflects typical social media behavior, where users primarily interact within their immediate social circles while maintaining some cross-community connections. CATD-GNN (Figure 5b) achieves a more balanced distribution through a mixture of uniform and preference-aware components, reducing the top-three concentration to 34.4% while maintaining relevance through controlled preference for users’ primary communities. The entropy increases from 2.28 to 2.85, representing a 25.0% improvement in diversity.

The Reddit-Ideological dataset presents distinct characteristics due to its political nature. Baseline methods (Figure 5c) yield a two-component distribution: major political subreddits dominate with 62.3% of recommendations concentrated in the top-five communities. This pattern reflects Reddit’s structure where users strongly affiliate with specific political ideologies but also engage with smaller, specialized communities. The baseline entropy of 2.74 is higher than that of Twitter, indicating that even with political polarization, Reddit maintains some inherent diversity due to its diverse subreddit ecosystem.

CATD-GNN (Figure 5d) successfully reduces the dominance of major political communities, lowering the top-five concentration to 30.7%. The method achieves this through a balanced mixture approach (65% uniform, 35% preference-based), which preserves some alignment with users’ political orientations while promoting cross-community exposure. The resulting entropy of 3.32 represents a 21.5% improvement. Interestingly, the improvement is more modest than that of Twitter-BLM, suggesting that strong ideological boundaries in political discourse present unique challenges for diversification that require careful balance between exploration and user satisfaction.

These results show that CATD-GNN addresses the filter bubble problem across different social network structures. This method adapts to the inherent characteristics of each platform, achieving greater relative improvement on Twitter, where social boundaries are more fluid, while respecting the stronger community identities in Reddit’s political landscape. The consistent entropy improvements across both datasets validate the generalizability of the proposed approach while highlighting the importance of platform-specific considerations in recommendation diversification.

4.4.2. Filter Bubble Mitigation Analysis

To quantify filter bubble mitigation effectiveness, the random walk controversy (RWC) metric [17] was used, which measures polarization through random walk dynamics on a recommendation-augmented network. An augmented graph G' was constructed by adding edges from each user to their top-K recommended users as

G' = (V, E \cup E_{rec}), where E_{rec} = \{(u, v) : v \in R_{u}^{K}, \forall u \in V\}. (29)
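
As a rough illustration of Equation (29), the augmented edge set can be built by taking the union of the original edges and one edge per recommended user. The sketch below assumes edges are stored as directed `(u, v)` pairs and that each user's recommendations are given as a ranked list; the function name `augment_graph` and the toy data are hypothetical.

```python
def augment_graph(edges, recommendations, k):
    """Return E ∪ E_rec as a set of (u, v) pairs.

    edges: iterable of (u, v) pairs (the original edge set E).
    recommendations: dict mapping each user u to a ranked list of users.
    k: number of top-ranked recommendations to turn into edges.
    """
    augmented = set(edges)
    for u, ranked in recommendations.items():
        for v in ranked[:k]:
            augmented.add((u, v))  # one recommendation edge per top-k user
    return augmented

# Toy example with three users and K = 1.
edges = [(1, 2), (2, 3)]
recommendations = {1: [3, 2], 2: [1], 3: [1, 2]}
augmented = augment_graph(edges, recommendations, k=1)
print(sorted(augmented))
```

Whether the recommendation edges are treated as directed or undirected in the random walks is not specified in the text, so that choice is left to the caller here.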

On this augmented graph, random walks were performed to compute transition probabilities between communities. For each pair of communities (i, j), 1000 random walks were initiated from nodes in community i, and the probability P_{ij} of reaching community j within 100 steps was measured. The pairwise RWC score is

\mathrm{RWC}_{ij} = P_{ii} \cdot P_{jj} - P_{ij} \cdot P_{ji}, (30)

where high values indicate polarization (walks stay within communities) and low values indicate mixing (walks cross boundaries). The aggregate RWC score weights all community pairs by their relative sizes:

\mathrm{RWC}(G') = \sum_{i = 1}^{|C| - 1} \sum_{j = i + 1}^{|C|} \frac{|C_{i}| \cdot |C_{j}|}{n (n - 1) / 2} \cdot \mathrm{RWC}_{ij}. (31)
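
The procedure behind Equations (30) and (31) can be sketched as follows. This is a simplified illustration under stated assumptions rather than the evaluation code used in the study: the graph is treated as undirected, walk start nodes are drawn uniformly from the source community, and a walk counts as reaching community j if any node visited after the start belongs to j. All function names and the toy graph are hypothetical.

```python
import random
from itertools import combinations

def build_adjacency(edges):
    """Undirected adjacency lists (undirected walks are an assumption)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    return adj

def reach_probability(adj, community_of, src, dst, n_walks, max_steps, rng):
    """Estimate P_{src,dst}: fraction of walks started in community `src`
    that visit a node of community `dst` within `max_steps` steps."""
    starts = [node for node, c in community_of.items() if c == src]
    hits = 0
    for _ in range(n_walks):
        node = rng.choice(starts)
        for _ in range(max_steps):
            neighbors = adj.get(node)
            if not neighbors:
                break  # dead end: the walk cannot continue
            node = rng.choice(neighbors)
            if community_of[node] == dst:
                hits += 1
                break
    return hits / n_walks

def rwc(adj, community_of, n_walks=1000, max_steps=100, seed=0):
    """Size-weighted aggregate of pairwise RWC scores, as in Eqs. (30)-(31)."""
    rng = random.Random(seed)
    labels = sorted(set(community_of.values()))
    size = {c: sum(1 for x in community_of.values() if x == c) for c in labels}
    n = len(community_of)
    P = {(i, j): reach_probability(adj, community_of, i, j, n_walks, max_steps, rng)
         for i in labels for j in labels}
    total = 0.0
    for i, j in combinations(labels, 2):
        rwc_ij = P[i, i] * P[j, j] - P[i, j] * P[j, i]   # Eq. (30)
        weight = size[i] * size[j] / (n * (n - 1) / 2)    # Eq. (31) weight
        total += weight * rwc_ij
    return total

# Toy example: two triangles with no edges between them (fully polarized).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]
community_of = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
adj = build_adjacency(edges)
score = rwc(adj, community_of, n_walks=50, max_steps=20)
print(score)
```

In the toy graph no walk can cross between the two triangles, so the pairwise score takes its maximum value of 1 and the aggregate equals the size weight of that single pair.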

Figure 6 presents the RWC scores across all methods. On Twitter-BLM, CATD-GNN achieves an RWC score of 0.48, a 34.2% reduction relative to the RWC obtained by XSimGCL (0.73). This exceeds the 30% reduction threshold identified as necessary for meaningful polarization mitigation. The improvement stems from creating recommendation paths that bridge previously disconnected communities. Reddit-Ideological exhibits higher baseline polarization due to explicit political segregation. CATD-GNN reduces RWC from 0.81 to 0.56, a 30.9% reduction. While the absolute RWC remains higher than that for Twitter-BLM, the relative improvement demonstrates the method’s effectiveness even in strongly polarized environments.
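
For reference, the relative reductions quoted above follow directly from the reported RWC scores, computed as the drop divided by the baseline value:

```latex
\frac{0.73 - 0.48}{0.73} = \frac{0.25}{0.73} \approx 0.342,
\qquad
\frac{0.81 - 0.56}{0.81} = \frac{0.25}{0.81} \approx 0.309 .
```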

5. Conclusions

This paper presented CATD-GNN, a community-aware two-stage diversification method that addresses filter bubbles in social media user recommendation through GNNs. By leveraging detected community structures as pseudo-categories and integrating diversity mechanisms directly into GNN training, the proposed method demonstrates that recommendation diversity and accuracy need not be mutually exclusive objectives. The approach combines graph-based learning with community-aware diversification to promote cross-community exposure while preserving recommendation relevance through strategic neighbor selection and loss reweighting. The experiments on Twitter-BLM and Reddit-Ideological datasets validated the effectiveness of using community structure as a natural categorization scheme for diversity enhancement. The method successfully mitigates echo chamber effects across different social network structures, from topic-based communities to ideologically polarized groups, demonstrating its adaptability to diverse social media contexts. Beyond improving diversity metrics, CATD-GNN maintains the inherent advantages of GNNs in capturing complex user relationships and interaction patterns.

Limitations

This study has several limitations. First, the proposed method relies on static community detection, whereas recent temporal GNN approaches [66] show that incorporating time-evolving structures significantly improves performance. Second, the assumption of hard community assignments may oversimplify real-world scenarios where users maintain overlapping memberships [67]. Third, while the complexity analysis shows theoretical tractability, scaling to million-node networks would require adopting mini-batch sampling strategies [68] and distributed training methods to handle the computational demands of industrial-scale social platforms. Fourth, the reliance on Louvain-based community detection introduces potential sensitivity to algorithmic randomness and resolution limits; future work should investigate alternative methods such as Leiden [55] and conduct formal stability analysis across multiple detection runs. Fifth, the theoretical complexity analysis in Section 3.6 provides hardware-independent scalability bounds. However, empirical runtime benchmarks were excluded due to significant variation from implementation-specific factors, such as hardware configuration and optimization levels. Future work targeting industrial deployment should conduct controlled timing experiments across networks of varying scales with standardized implementations.

Funding

This work was supported by the Japan Society for the Promotion of Science, KAKENHI Grant Number 22K18007, Japan, and the Kansai University Fund for Supporting Young Scholars, 2025.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available in Zenodo at https://doi.org/10.5281/zenodo.4056562 and in Mendeley Data at https://doi.org/10.17632/2tdr9sjd83.3.

Acknowledgments

The author thanks Hiroki Fujiwara for conducting the feasibility study and creating the prototype while he was enrolled at Kansai University.

Conflicts of Interest

The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

References

1. Vosoughi, S.; Roy, D.; Aral, S. The spread of true and false news online. Science 2018, 359, 1146–1151.
2. Bakshy, E.; Messing, S.; Adamic, L.A. Exposure to ideologically diverse news and opinion on Facebook. Science 2015, 348, 1130–1132.
3. Cinelli, M.; De Francisci Morales, G.; Galeazzi, A.; Quattrociocchi, W.; Starnini, M. The echo chamber effect on social media. Proc. Natl. Acad. Sci. USA 2021, 118, e2023301118.
4. He, X.; Deng, K.; Wang, X.; Li, Y.; Zhang, Y.; Wang, M. LightGCN: Simplifying and powering graph convolution network for recommendation. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, Virtual, 25–30 July 2020; pp. 639–648.
5. Wang, X.; He, X.; Wang, M.; Feng, F.; Chua, T.S. Neural graph collaborative filtering. In Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, Paris, France, 21–25 July 2019; pp. 165–174.
6. Wu, J.; Wang, X.; Feng, F.; He, X.; Chen, L.; Lian, J.; Xie, X. Self-supervised graph learning for recommendation. In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, Virtual, 11–15 July 2021; pp. 726–735.
7. Yu, J.; Xia, X.; Chen, T.; Cui, L.; Hung, N.Q.V.; Yin, H. XSimGCL: Towards extremely simple graph contrastive learning for recommendation. IEEE Trans. Knowl. Data Eng. 2024, 36, 913–926.
8. Pariser, E. The Filter Bubble: How the New Personalized Web Is Changing What We Read and How We Think; Penguin Books: London, UK, 2011.
9. Nguyen, T.T.; Hui, P.M.; Harper, F.M.; Terveen, L.; Konstan, J.A. Exploring the filter bubble: The effect of using recommender systems on content diversity. In Proceedings of the 23rd International Conference on World Wide Web, Seoul, Korea, 7–11 April 2014; pp. 677–686.
10. Chitra, U.; Musco, C. Analyzing the impact of filter bubbles on social network polarization. In Proceedings of the 13th ACM International Conference on Web Search and Data Mining, Houston, TX, USA, 3–7 February 2020; pp. 115–123.
11. Gao, Z.; Shen, T.; Mai, Z.; Bouadjenek, M.R.; Waller, I.; Anderson, A.; Bodkin, R.; Sanner, S. Mitigating the filter bubble while maintaining relevance: Targeted diversification with VAE-based recommender systems. In Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, Madrid, Spain, 11–15 July 2022; pp. 2524–2531.
12. Mansoury, M.; Abdollahpouri, H.; Pechenizkiy, M.; Mobasher, B.; Burke, R. Feedback loop and bias amplification in recommender systems. In Proceedings of the 29th ACM International Conference on Information & Knowledge Management, Virtual, 19–23 October 2020; pp. 2145–2148.
13. Sun, W.; Khenissi, S.; Nasraoui, O.; Shafto, P. Debiasing the human-recommender system feedback loop in collaborative filtering. In Companion Proceedings of the 2019 World Wide Web Conference, San Francisco, CA, USA, 13–17 May 2019; pp. 645–651.
14. Ge, Y.; Zhao, S.; Zhou, H.; Pei, C.; Sun, F.; Ou, W. Understanding echo chambers in e-commerce recommender systems. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, Virtual, 25–30 July 2020; pp. 2261–2270.
15. Li, N.; Gao, C.; Piao, J.; Huang, X.; Yue, A.; Zhou, L.; Liao, Q.; Li, Y. An exploratory study of information cocoon on short-form video platform. In Proceedings of the 31st ACM International Conference on Information & Knowledge Management, Atlanta, GA, USA, 17–21 October 2022; pp. 4178–4182.
16. Piao, J.; Liu, J.; Zhang, F.; Su, J.; Li, Y. Human–AI adaptive dynamics drives the emergence of information cocoons. Nature Mach. Intell. 2023, 5, 1214–1224.
17. Garimella, K.; De Francisci Morales, G.; Gionis, A.; Mathioudakis, M. Reducing controversy by connecting opposing views. In Proceedings of the 10th ACM International Conference on Web Search and Data Mining, Cambridge, UK, 6–10 February 2017; pp. 81–90.
18. Tang, M.; Huang, X.; Sang, J. Mitigating filter bubble from the perspective of community detection: A universal framework. arXiv 2025, arXiv:2508.11239.
19. Castells, P.; Hurley, N.; Vargas, S. Novelty and diversity in recommender systems. In Recommender Systems Handbook; Springer: New York, NY, USA, 2021; pp. 603–646.
20. Wu, Q.; Liu, Y.; Miao, C.; Zhao, Y.; Guan, L.; Tang, H. Recent advances in diversified recommendation. arXiv 2019, arXiv:1905.06589.
21. Ziegler, C.N.; McNee, S.M.; Konstan, J.A.; Lausen, G. Improving recommendation lists through topic diversification. In Proceedings of the 14th International Conference on World Wide Web, Chiba, Japan, 10–14 May 2005; pp. 22–32.
22. Carbonell, J.G.; Goldstein, J. The use of MMR, diversity-based reranking for reordering documents and producing summaries. In Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Melbourne, Australia, 24–28 August 1998; pp. 335–336.
23. Chen, L.; Zhang, G.; Zhou, E. Fast greedy MAP inference for determinantal point process to improve recommendation diversity. In Proceedings of the 32nd International Conference on Neural Information Processing Systems, Montréal, Canada, 3–8 December 2018; pp. 5622–5633.
24. Wasilewski, J.; Hurley, N. Incorporating diversity in a learning to rank recommender system. In Proceedings of the Twenty-Ninth International Florida Artificial Intelligence Research Society Conference, Key Largo, FL, USA, 16–18 May 2016.
25. Adomavicius, G.; Kwon, Y. Improving aggregate recommendation diversity using ranking-based techniques. IEEE Trans. Knowl. Data Eng. 2012, 24, 896–911.
26. Zhang, M.; Hurley, N. Novel item recommendation by user profile partitioning. In Proceedings of the 2009 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology, Milan, Italy, 15–18 September 2009; Volume 1, pp. 508–515.
27. Aytekin, T.; Karakaya, M.Ö. Clustering-based diversity improvement in top-n recommendation. J. Intell. Inf. Syst. 2014, 42, 1–18.
28. Ribeiro, M.T.; Lacerda, A.; Veloso, A.; Ziviani, N. Pareto-Efficient hybridization for multi-objective recommender systems. In Proceedings of the Sixth ACM Conference on Recommender Systems, Dublin, Ireland, 9–13 September 2012; pp. 19–26.
29. Abdollahpouri, H.; Burke, R.; Mobasher, B. Managing popularity bias in recommender systems with personalized re-ranking. arXiv 2019, arXiv:1901.07555.
30. Blondel, V.D.; Guillaume, J.L.; Lambiotte, R.; Lefebvre, E. Fast unfolding of communities in large networks. J. Stat. Mech. Theory Exp. 2008, 2008, P10008.
31. Vargas, S.; Baltrunas, L.; Karatzoglou, A.; Castells, P. Coverage, redundancy and size-awareness in genre diversity for recommender systems. In Proceedings of the 8th ACM Conference on Recommender Systems, Silicon Valley, CA, USA, 6–10 October 2014; pp. 209–216.
32. Kipf, T.N.; Welling, M. Semi-supervised classification with graph convolutional networks. arXiv 2017, arXiv:1609.02907.
33. Zhang, H.; Zhu, Z.; Caverlee, J. Evolution of filter bubbles and polarization in news recommendation. In Proceedings of the Advances in Information Retrieval, Dublin, Ireland, 2–6 April 2023; pp. 685–693.
34. Hurley, N.J. Personalised ranking with diversity. In Proceedings of the 7th ACM Conference on Recommender Systems, Hong Kong, China, 12–16 October 2013; pp. 379–382.
35. Li, C.; Feng, H.; de Rijke, M. Cascading hybrid bandits: Online learning to rank for relevance and diversity. In Proceedings of the 14th ACM Conference on Recommender Systems, Virtual, 22–26 September 2020; pp. 33–42.
36. Zhu, Z.; Shen, Y.; Zhao, Y.; Li, J. Popularity bias in dynamic recommendation. IEEE Trans. Knowl. Data Eng. 2023, 35, 3865–3877.
37. Zhou, H.; Chen, H.; Dong, J.; Zha, D.; Zhou, C.; Huang, X. Adaptive popularity debiasing aggregator for graph collaborative filtering. In Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, Taipei, Taiwan, 23–27 July 2023; pp. 7–17.
38. Wang, X.; Jin, H.; Zhang, A.; He, X.; Xu, T.; Chua, T.S. Disentangled graph collaborative filtering. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, Virtual, 25–30 July 2020; pp. 1001–1010.
39. Ren, X.; Xia, L.; Zhao, J.; Yin, D.; Huang, C. Disentangled contrastive collaborative filtering. In Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, Taipei, Taiwan, 23–27 July 2023; pp. 1137–1146.
40. Li, C.; Liu, Z.; Wu, M.; Xu, Y.; Zhao, H.; Huang, P.; Kang, G.; Chen, Q.; Li, W.; Lee, D.L. Multi-interest network with dynamic routing for recommendation at Tmall. In Proceedings of the 28th ACM International Conference on Information and Knowledge Management, Beijing, China, 3–7 November 2019; pp. 2615–2623.
41. Cen, Y.; Zhang, J.; Zou, X.; Zhou, C.; Yang, H.; Tang, J. Controllable multi-interest framework for recommendation. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Virtual, 6–10 July 2020; pp. 2942–2951.
42. Yang, L.; Wang, S.; Tao, Y.; Sun, J.; Liu, X.; Yu, P.S.; Wang, T. DGRec: Graph neural network for recommendation with diversified embedding generation. In Proceedings of the 16th ACM International Conference on Web Search and Data Mining, Singapore, 27 February–3 March 2023; pp. 661–669.
43. Lunardi, G.M.; Machado, G.M.; Maran, V.; de Oliveira, J.P.M. A metric for filter bubble measurement in recommender algorithms considering the news domain. Appl. Soft Comput. 2020, 97, 106771.
44. Michiels, L.; Vannieuwenhuyze, J.; Leysen, J.; Verachtert, R.; Smets, A.; Goethals, B. How should we measure filter bubbles? A regression model and evidence for online news. In Proceedings of the 17th ACM Conference on Recommender Systems, Singapore, 18–22 September 2023; pp. 640–651.
45. Liu, P.; Shivaram, K.; Culotta, A.; Shapiro, M.A.; Bilgic, M. The interaction between political typology and filter bubbles in news recommendation algorithms. In Proceedings of the Web Conference 2021, Ljubljana, Slovenia, 19–23 April 2021; pp. 3791–3801.
46. Aridor, G.; Goncalves, D.; Sikdar, S. Deconstructing the filter bubble: User decision-making and recommender systems. In Proceedings of the 14th ACM Conference on Recommender Systems, Virtual, 22–26 September 2020; pp. 82–91.
47. Wang, W.; Feng, F.; Nie, L.; Chua, T.S. User-controllable recommendation against filter bubbles. In Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, Madrid, Spain, 11–15 July 2022; pp. 1251–1261.
48. McKay, D.; Owyong, K.; Makri, S.; Lopez, M.G. Turn and face the strange: Investigating filter bubble bursting information interactions. In Proceedings of the 2022 Conference on Human Information Interaction and Retrieval, Regensburg, Germany, 14–18 March 2022; pp. 233–242.
49. Gao, C.; Wang, S.; Li, S.; Chen, J.; He, X.; Lei, W.; Li, B.; Zhang, Y.; Jiang, P. CIRS: Bursting filter bubbles by counterfactual interactive recommender system. ACM Trans. Inf. Syst. 2023, 42, 1–27.
50. Li, Z.; Dong, Y.; Gao, C.; Zhao, Y.; Li, D.; Hao, J.; Zhang, K.; Li, Y.; Wang, Z. Breaking filter bubble: A reinforcement learning framework of controllable recommender system. In Proceedings of the ACM Web Conference 2023, Austin, TX, USA, 30 April–4 May 2023; pp. 4041–4049.
51. Sanz-Cruzado, J.; Castells, P. Enhancing structural diversity in social networks by recommending weak ties. In Proceedings of the 12th ACM Conference on Recommender Systems, Vancouver, Canada, 2–7 October 2018; pp. 233–241.
52. Musco, C.; Musco, C.; Tsourakakis, C.E. Minimizing polarization and disagreement in social networks. In Proceedings of the 2018 World Wide Web Conference, Lyon, France, 23–27 April 2018; pp. 369–378.
53. Grossetti, Q.; du Mouza, C.; Travers, N.; Constantin, C. Reducing the filter bubble effect on Twitter by considering communities for recommendations. Int. J. Web Inf. Syst. 2021, 17, 728–752.
54. Newman, M.E.J. Modularity and community structure in networks. Proc. Natl. Acad. Sci. USA 2006, 103, 8577–8582.
55. Traag, V.A.; Waltman, L.; Van Eck, N.J. From Louvain to Leiden: Guaranteeing well-connected communities. Sci. Rep. 2019, 9, 5233.
56. Parés, F.; Gasulla, D.G.; Vilalta, A.; Moreno, J.; Ayguadé, E.; Labarta, J.; Cortés, U.; Suzumura, T. Fluid communities: A competitive, scalable and diverse community detection algorithm. In Proceedings of the Complex Networks & Their Applications VI, Lyon, France, 29 November–1 December 2018; pp. 229–240.
57. Halberstam, Y.; Knight, B. Homophily, group size, and the diffusion of political information in social networks: Evidence from Twitter. J. Public Econ. 2016, 143, 73–88.
58. Matakos, A.; Terzi, E.; Tsaparas, P. Measuring and moderating opinion polarization in social networks. Data Min. Knowl. Discov. 2017, 31, 1480–1505.
59. McPherson, M.; Smith-Lovin, L.; Cook, J.M. Birds of a feather: Homophily in social networks. Annu. Rev. Sociol. 2001, 27, 415–444.
60. Kumar, S.; Hamilton, W.L.; Leskovec, J.; Jurafsky, D. Community interaction and conflict on the web. In Proceedings of the 2018 World Wide Web Conference, Lyon, France, 23–27 April 2018; pp. 933–943.
61. Grootendorst, M. BERTopic: Neural topic modeling with a class-based TF-IDF procedure. arXiv 2022, arXiv:2203.05794.
62. Nemhauser, G.L.; Wolsey, L.A.; Fisher, M.L. An analysis of approximations for maximizing submodular set functions—I. Math. Program. 1978, 14, 265–294.
63. Rendle, S.; Freudenthaler, C.; Gantner, Z.; Schmidt-Thieme, L. BPR: Bayesian personalized ranking from implicit feedback. In Proceedings of the 25th Conference on Uncertainty in Artificial Intelligence, Montreal, Canada, 18–21 June 2009; pp. 452–461.
64. Giorgi, S.; Guntuku, S.C.; Himelein-Wachowiak, M.; Kwarteng, A.; Hwang, S.; Rahman, M.; Curtis, B. Twitter Corpus of the #BlackLivesMatter movement and counter protests: 2013 to 2021. In Proceedings of the International AAAI Conference on Web and Social Media, Atlanta, GA, USA, 6–9 June 2022; pp. 1228–1235.
65. Ravi, K.; Vela, A.E. Comprehensive dataset of user-submitted articles with ideological and extreme bias from Reddit. Data Brief 2024, 56, 110849.
66. Rossi, E.; Chamberlain, B.; Frasca, F.; Eynard, D.; Monti, F.; Bronstein, M. Temporal graph networks for deep learning on dynamic graphs. In Proceedings of the ICML Workshop on Graph Representation Learning, Virtual, 13–18 July 2020.
67. Sun, X.; Cheng, H.; Li, J.; Liu, B.; Guan, J. All in one: Multi-task prompting for graph neural networks. In Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Long Beach, CA, USA, 6–10 August 2023; pp. 2120–2131.
68. Zeng, H.; Zhou, H.; Srivastava, A.; Kannan, R.; Prasanna, V. GraphSAINT: Graph sampling based inductive learning method. In Proceedings of the International Conference on Learning Representations, Addis Ababa, Ethiopia, 26 April–1 May 2020.

Figure 1. Overview of the proposed CATD-GNN method and its two-stage diversification process.

Figure 2. Comparison of the HL values of community-based and topic-based classification on the Twitter-BLM and Reddit-Ideological datasets.

Figure 3. Performance metrics as a function of recommendation list size K on the Twitter-BLM dataset.

Figure 4. Hyperparameter sensitivity analysis on the Twitter-BLM dataset with K = 100.

Figure 5. Community distribution of the recommendations (K = 100) with Shannon entropy H. Top row: Twitter-BLM results showing the 20 most-frequent communities out of 69 total communities. Bottom row: Reddit-Ideological results showing the 30 most-frequent communities out of 128 total communities. Higher entropy indicates better diversity.

Figure 6. RWC scores across methods. Red bars denote the proposed method (CATD-GNN); other colors denote baseline methods.

Table 1. Performance comparison on the Twitter-BLM dataset. Best results are in bold, the second best are underlined. All metrics were computed at

K = 100