Community-Aware Two-Stage Diversification for Social Media User Recommendation with Graph Neural Networks
Abstract
1. Introduction
- This paper presents a method to use identified community structures as pseudo-categories in user recommendation systems, addressing the lack of direct user attribute categorization through topology-driven community detection.
- The approach involves community-aware graph learning, integrating submodular neighbor selection and loss reweighting into GNN training to produce embeddings conducive to varied recommendations.
- Extensive empirical validation indicates that CATD-GNN significantly enhances diversity while preserving competitive accuracy in various real-world social networks, presenting an effective method for mitigating filter bubbles.
- The proposed two-stage architecture is shown to offer complementary effects, with both learning-based diversification and coverage-aware reranking contributing improvements that neither component achieves independently.
2. Related Work
2.1. GNNs for Recommendation
2.2. Diversity in Recommendation Systems
2.3. Filter Bubble Mitigation in Social Media
2.4. Community Detection and Filter Bubble Formation
3. Proposed Method
3.1. Problem Formulation
3.2. Community Detection for Pseudo-Category Generation
Validation of Community-Based Pseudo-Categories
3.3. Bipartite Graph Transformation
3.4. Stage 1: Community-Aware Graph Learning
3.4.1. LightGCN
3.4.2. Layer-Wise Attention Mechanism
3.4.3. Submodular Neighbor Selection
3.4.4. Community-Based Loss Reweighting
3.5. Stage 2: Coverage and Redundancy-Aware Reranking
3.5.1. Binomial Diversity Framework
3.5.2. Greedy Reranking Algorithm
3.6. Complexity Analysis
- Preprocessing Phase. Community detection using the Louvain algorithm requires time complexity, where m denotes the number of edges and n the number of nodes. The space complexity is for storing the graph and community assignments. This one-time cost amortizes across all subsequent training and inference operations. The algorithm typically converges within 5–10 iterations for social networks because of their inherent community structure, making the practical complexity closer to for most real-world networks.
- Training Phase. The training complexity comprises four main components that must be analyzed separately:
- Submodular neighbor selection: For each batch of size B, selecting diverse neighbors requires computing pairwise similarities among candidates. This yields time complexity per batch, where denotes the average node degree and d is the embedding dimension. The greedy selection algorithm with lazy evaluation reduces this to in practice, where is the maximum selected neighborhood size.
- Graph propagation: The modified LightGCN propagation with selected neighbors requires time per batch for L propagation layers, significantly less than the required by standard aggregation.
- Attention mechanism: Computing attention weights across L layers as in Equation (7) requires time for weight computation and normalization.
- Community-based loss reweighting: Computing weights for each sample requires time, where K is the number of communities, because it is necessary to look up community sizes and compute the weighting formula.
- Inference Phase. For each query user, generating recommendations involves three steps:
- Embedding computation: Computing the query user’s embedding through L-layer propagation with attention requires time.
- Candidate scoring: Computing relevance scores for all n candidate users via inner products of d-dimensional embeddings requires O(nd) time.
- Diversity-aware reranking: The greedy selection based on Equation (18) requires computing the marginal diversity gains. For each of N positions, up to n candidates are evaluated, computing diversity scores involving K communities. This yields O(NnK) time complexity. Lazy evaluation using priority queues can reduce this cost further.
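The reranking step in the inference phase can be illustrated with a minimal sketch. The marginal gain used here (relevance plus a fixed bonus `lam` for a community not yet covered) is an assumed stand-in for the diversity gain of Equation (18), which is not reproduced in this section; the function name `greedy_rerank` and its parameters are hypothetical. The loop shows where the cost comes from: for each of the N positions, every remaining candidate is rescored.

```python
from typing import Dict, List


def greedy_rerank(
    relevance: Dict[str, float],
    community: Dict[str, int],
    top_n: int,
    lam: float = 0.5,
) -> List[str]:
    """Greedily build a list of top_n users, trading relevance for coverage.

    Assumed gain (illustrative only, not the paper's Equation (18)):
        gain(c) = relevance[c] + lam if community[c] is not yet covered
        gain(c) = relevance[c]       otherwise
    """
    # Sort once by descending relevance so that ties in gain are broken
    # in favour of the more relevant candidate (max keeps the first maximum).
    remaining = sorted(relevance, key=lambda c: -relevance[c])
    covered = set()
    selected: List[str] = []

    def gain(c: str) -> float:
        bonus = lam if community[c] not in covered else 0.0
        return relevance[c] + bonus

    while remaining and len(selected) < top_n:
        best = max(remaining, key=gain)  # rescans all remaining candidates
        selected.append(best)
        remaining.remove(best)
        covered.add(community[best])
    return selected


scores = {"a": 0.9, "b": 0.8, "c": 0.7, "d": 0.4}
groups = {"a": 0, "b": 0, "c": 1, "d": 2}
diverse = greedy_rerank(scores, groups, top_n=3, lam=0.5)
plain = greedy_rerank(scores, groups, top_n=3, lam=0.0)
```

With `lam=0.0` the sketch falls back to pure relevance ordering; with `lam=0.5` the second pick switches from user `b` (same community as `a`) to user `c`, which is the qualitative behaviour that coverage-aware reranking aims for.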
4. Experiments
4.1. Experimental Setup
4.1.1. Dataset
- Twitter-BLM Dataset. The Twitter-BLM dataset [64] consists of retweet interactions from Black Lives Matter (BLM) discourse, taken from a subset of the complete dataset spanning 1–14 June 2020. The original corpus contains 63.9 million tweets from 13.0 million users collected from 2013 to 2021, but this study considered a focused temporal window during the peak of the BLM protests following George Floyd’s death. A user–user interaction graph was constructed, where nodes represent Twitter users and a directed edge indicates that one user retweeted content from another. Users with fewer than five interactions were removed to ensure meaningful connectivity patterns, and the top 2% most active users were excluded to mitigate extreme popularity bias. The resulting network comprises 23,397 users. The Louvain algorithm automatically identified 69 communities through modularity optimization, with community sizes ranging from 12 to 1847 users.
- Reddit-Ideological Dataset. The Reddit-Ideological dataset [65] comprises user-article interactions from ideologically oriented subreddits including r/politics, r/Conservative, r/Liberal, and restricted communities. The dataset contains 377,144 articles across three ideological categories (Liberal: 72,488 articles from six subreddits, Conservative: 79,573 articles from six subreddits, and Restricted: 225,083 articles from 16 subreddits). To construct the user–user interaction graph, a co-engagement method was employed, where edges connect users who interact with common content. Specifically, an undirected edge was established when both users posted or commented on articles within the same subreddit during the same temporal window (7-day period). Edge weights were computed as , where denotes the set of subreddits where both users were active, and represents the number of posts/comments by user i in subreddit s. This logarithmic weighting prevents highly active users from dominating the graph structure while preserving engagement intensity. Edges with weights below threshold were removed to eliminate spurious connections from minimal co-occurrences. The threshold value was determined by analyzing the weight distribution, where (: mean edge weight, : standard deviation) ensures the retention of statistically significant connections while maintaining graph connectivity. The preprocessed network contains 45,231 users and 812,453 edges. Community detection via the Louvain algorithm identified 128 communities, where community sizes ranged from 23 to 2847 users.
- Train-Test Split Protocol. For both datasets, temporal splitting was employed to simulate realistic deployment scenarios. The first 70% of chronologically ordered interactions formed the training set, the next 15% constituted the validation set, and the final 15% served as the test set. Users that appeared only in the test interactions were removed to enable personalized evaluation. This temporal split is more challenging than random splitting because it requires predicting future interactions based on historical patterns.
- Dataset Generalizability. While the experiments focus on two datasets, these datasets represent different social network structures that test the method’s generalizability across diverse contexts. The Twitter-BLM dataset exemplifies organic community formation around social movements, where users naturally cluster based on ideological alignment without explicit platform mechanisms. In contrast, Reddit-Ideological demonstrates institutionalized segregation through subreddit boundaries, representing platforms with explicit community structures. These datasets collectively capture the two primary modes of community formation in social media: emergent clustering (Twitter) and designed separation (Reddit).
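The train-test split protocol described above can be sketched as follows. This is a minimal illustration, not the author's preprocessing code: the tuple layout `(source, target, timestamp)`, the function name `temporal_split`, and the reading of "users that appeared only in the test interactions" as "users absent from both training and validation data" are assumptions.

```python
from typing import List, Tuple

Interaction = Tuple[str, str, float]  # (source user, target user, timestamp)


def temporal_split(
    interactions: List[Interaction],
    train_frac: float = 0.70,
    val_frac: float = 0.15,
) -> Tuple[List[Interaction], List[Interaction], List[Interaction]]:
    """Split chronologically ordered interactions into train/validation/test."""
    ordered = sorted(interactions, key=lambda x: x[2])
    n = len(ordered)
    n_train = round(n * train_frac)
    n_val = round(n * val_frac)
    train = ordered[:n_train]
    val = ordered[n_train:n_train + n_val]
    test = ordered[n_train + n_val:]

    # Assumption: drop test interactions that involve a user never seen
    # before the test period, since no personalized history exists for them.
    seen = {u for s, t, _ in train + val for u in (s, t)}
    test = [(s, t, ts) for s, t, ts in test if s in seen and t in seen]
    return train, val, test


# Toy data: 19 interactions among four known users, then one by a new user.
toy = [(f"u{i % 4}", f"u{(i + 1) % 4}", float(i)) for i in range(19)]
toy.append(("new", "u0", 19.0))
train, val, test = temporal_split(list(reversed(toy)))
```

Sorting before slicing guarantees that every training timestamp precedes every validation and test timestamp, which is what makes the evaluation a prediction of future interactions rather than an interpolation.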
4.1.2. Baseline Methods
- Accuracy-Focused GNN Methods. These represent state-of-the-art approaches optimized primarily for recommendation accuracy.
- LightGCN [4]: simplifies GCN by removing feature transformation and nonlinear activation while maintaining pure neighborhood aggregation.
- NGCF [5]: models high-order connectivity by injecting collaborative signals through multiple propagation layers.
- SGL [6]: employs self-supervised learning with graph augmentation including node dropout, edge dropout, and random walk sampling.
- XSimGCL [7]: the current state-of-the-art method, which uses noise-based contrastive learning without complex graph perturbations.
- Diversity-Aware Methods. These methods explicitly incorporate diversity objectives.
- Filter Bubble Mitigation Methods. These approaches specifically target echo chamber effects in social networks.
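As a point of reference for the LightGCN baseline listed above, the following sketch shows its propagation rule in plain Python: each layer is a symmetrically normalized sum over neighbors with no feature transformation or nonlinearity, and the final embedding is the uniform average over all layers, as in the original LightGCN formulation. The dictionary-based graph representation and the function name `lightgcn_propagate` are assumptions made for illustration; the sketch expects an undirected graph whose adjacency lists are symmetric and contain every node as a key. It does not include the layer-wise attention, neighbor selection, or loss reweighting of the proposed method.

```python
from math import sqrt
from typing import Dict, List


def lightgcn_propagate(
    adj: Dict[int, List[int]],
    emb: Dict[int, List[float]],
    num_layers: int,
) -> Dict[int, List[float]]:
    """Return layer-averaged embeddings after num_layers propagation steps."""
    deg = {u: len(nbrs) for u, nbrs in adj.items()}
    dim = len(next(iter(emb.values())))
    layers = [emb]  # layer 0 is the input embedding
    for _ in range(num_layers):
        prev = layers[-1]
        nxt = {}
        for u, nbrs in adj.items():
            vec = [0.0] * dim
            for v in nbrs:
                # Symmetric normalization 1 / sqrt(deg(u) * deg(v)).
                w = 1.0 / sqrt(deg[u] * deg[v])
                for j in range(dim):
                    vec[j] += w * prev[v][j]
            nxt[u] = vec
        layers.append(nxt)
    # Uniform average over layers 0..num_layers.
    return {
        u: [sum(layer[u][j] for layer in layers) / len(layers) for j in range(dim)]
        for u in emb
    }


graph = {0: [1], 1: [0]}
init = {0: [1.0, 0.0], 1: [0.0, 1.0]}
out = lightgcn_propagate(graph, init, num_layers=1)
```

On this two-node graph a single layer swaps the two embeddings, so the layer average mixes each node's own vector with its neighbor's in equal parts.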
4.1.3. Evaluation Metrics
- Accuracy Metrics. For user recommendation evaluation, accuracy metrics measure how well the system predicts actual future interactions. Given a ranked list of top-K recommended users for query user u and the ground truth test interactions ,
- Diversity Metrics. Following Tang et al. [18], community-aware diversity metrics quantify filter bubble effects in user recommendations.
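The accuracy metrics named above can be sketched with their standard top-K definitions under binary relevance. The two community-level functions are assumed readings of the "Cov." and "Ent." columns (number of distinct communities in the list, and Shannon entropy of the community distribution); they are illustrative and may differ from the exact definitions used in the paper. ILFBI and CGI are not reproduced here because their formulas are not given in this section.

```python
from collections import Counter
from math import log, log2
from typing import Dict, List, Set


def recall_at_k(ranked: List[str], truth: Set[str], k: int) -> float:
    return len(set(ranked[:k]) & truth) / len(truth) if truth else 0.0


def precision_at_k(ranked: List[str], truth: Set[str], k: int) -> float:
    return len(set(ranked[:k]) & truth) / k


def ndcg_at_k(ranked: List[str], truth: Set[str], k: int) -> float:
    # Binary relevance: a hit at 0-based rank i contributes 1 / log2(i + 2).
    dcg = sum(1.0 / log2(i + 2) for i, u in enumerate(ranked[:k]) if u in truth)
    idcg = sum(1.0 / log2(i + 2) for i in range(min(k, len(truth))))
    return dcg / idcg if idcg > 0 else 0.0


def community_coverage(ranked: List[str], community: Dict[str, int], k: int) -> int:
    # Assumed definition: number of distinct communities in the top-k list.
    return len({community[u] for u in ranked[:k]})


def community_entropy(ranked: List[str], community: Dict[str, int], k: int) -> float:
    # Assumed definition: Shannon entropy (natural log) of community shares.
    counts = Counter(community[u] for u in ranked[:k])
    total = sum(counts.values())
    return -sum((c / total) * log(c / total) for c in counts.values())


ranked = ["a", "b", "c", "d"]
truth = {"a", "c", "z"}
groups = {"a": 0, "b": 0, "c": 1, "d": 2}
```

In this toy case two of the three ground-truth users are retrieved, and the four recommended users span three communities with shares 1/2, 1/4, and 1/4.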
4.1.4. Implementation Details
4.2. Main Results (RQ1)
Performance Across Recommendation List Sizes
4.3. Component Analysis (RQ2)
4.3.1. Ablation Studies
4.3.2. Hyperparameter Sensitivity Analysis
4.4. Filter Bubble Mitigation Visualization (RQ3)
4.4.1. Community Distribution Visualization
4.4.2. Filter Bubble Mitigation Analysis
5. Conclusions
Limitations
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
1. Vosoughi, S.; Roy, D.; Aral, S. The spread of true and false news online. Science 2018, 359, 1146–1151.
2. Bakshy, E.; Messing, S.; Adamic, L.A. Exposure to ideologically diverse news and opinion on Facebook. Science 2015, 348, 1130–1132.
3. Cinelli, M.; De Francisci Morales, G.; Galeazzi, A.; Quattrociocchi, W.; Starnini, M. The echo chamber effect on social media. Proc. Natl. Acad. Sci. USA 2021, 118, e2023301118.
4. He, X.; Deng, K.; Wang, X.; Li, Y.; Zhang, Y.; Wang, M. LightGCN: Simplifying and powering graph convolution network for recommendation. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, Virtual, 25–30 July 2020; pp. 639–648.
5. Wang, X.; He, X.; Wang, M.; Feng, F.; Chua, T.S. Neural graph collaborative filtering. In Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, Paris, France, 21–25 July 2019; pp. 165–174.
6. Wu, J.; Wang, X.; Feng, F.; He, X.; Chen, L.; Lian, J.; Xie, X. Self-supervised graph learning for recommendation. In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, Virtual, 11–15 July 2021; pp. 726–735.
7. Yu, J.; Xia, X.; Chen, T.; Cui, L.; Hung, N.Q.V.; Yin, H. XSimGCL: Towards extremely simple graph contrastive learning for recommendation. IEEE Trans. Knowl. Data Eng. 2024, 36, 913–926.
8. Pariser, E. The Filter Bubble: How the New Personalized Web Is Changing What We Read and How We Think; Penguin Books: London, UK, 2011.
9. Nguyen, T.T.; Hui, P.M.; Harper, F.M.; Terveen, L.; Konstan, J.A. Exploring the filter bubble: The effect of using recommender systems on content diversity. In Proceedings of the 23rd International Conference on World Wide Web, Seoul, Korea, 7–11 April 2014; pp. 677–686.
10. Chitra, U.; Musco, C. Analyzing the impact of filter bubbles on social network polarization. In Proceedings of the 13th ACM International Conference on Web Search and Data Mining, Houston, TX, USA, 3–7 February 2020; pp. 115–123.
11. Gao, Z.; Shen, T.; Mai, Z.; Bouadjenek, M.R.; Waller, I.; Anderson, A.; Bodkin, R.; Sanner, S. Mitigating the filter bubble while maintaining relevance: Targeted diversification with VAE-based recommender systems. In Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, Madrid, Spain, 11–15 July 2022; pp. 2524–2531.
12. Mansoury, M.; Abdollahpouri, H.; Pechenizkiy, M.; Mobasher, B.; Burke, R. Feedback loop and bias amplification in recommender systems. In Proceedings of the 29th ACM International Conference on Information & Knowledge Management, Virtual, 19–23 October 2020; pp. 2145–2148.
13. Sun, W.; Khenissi, S.; Nasraoui, O.; Shafto, P. Debiasing the human-recommender system feedback loop in collaborative filtering. In Proceedings of the Companion Proceedings of The 2019 World Wide Web Conference, San Francisco, CA, USA, 13–17 May 2019; pp. 645–651.
14. Ge, Y.; Zhao, S.; Zhou, H.; Pei, C.; Sun, F.; Ou, W. Understanding echo chambers in e-commerce recommender systems. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, Virtual, 25–30 July 2020; pp. 2261–2270.
15. Li, N.; Gao, C.; Piao, J.; Huang, X.; Yue, A.; Zhou, L.; Liao, Q.; Li, Y. An exploratory study of information cocoon on short-form video platform. In Proceedings of the 31st ACM International Conference on Information & Knowledge Management, Atlanta, GA, USA, 17–21 October 2022; pp. 4178–4182.
16. Piao, J.; Liu, J.; Zhang, F.; Su, J.; Li, Y. Human–AI adaptive dynamics drives the emergence of information cocoons. Nature Mach. Intell. 2023, 5, 1214–1224.
17. Garimella, K.; De Francisci Morales, G.; Gionis, A.; Mathioudakis, M. Reducing controversy by connecting opposing views. In Proceedings of the 10th ACM International Conference on Web Search and Data Mining, Cambridge, UK, 6–10 February 2017; pp. 81–90.
18. Tang, M.; Huang, X.; Sang, J. Mitigating filter bubble from the perspective of community detection: A universal framework. arXiv 2025, arXiv:2508.11239.
19. Castells, P.; Hurley, N.; Vargas, S. Novelty and diversity in recommender systems. In Recommender Systems Handbook; Springer: New York, NY, USA, 2021; pp. 603–646.
20. Wu, Q.; Liu, Y.; Miao, C.; Zhao, Y.; Guan, L.; Tang, H. Recent advances in diversified recommendation. arXiv 2019, arXiv:1905.06589.
21. Ziegler, C.N.; McNee, S.M.; Konstan, J.A.; Lausen, G. Improving recommendation lists through topic diversification. In Proceedings of the 14th International Conference on World Wide Web, Chiba, Japan, 10–14 May 2005; pp. 22–32.
22. Carbonell, J.G.; Goldstein, J. The use of MMR, diversity-based reranking for reordering documents and producing summaries. In Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Melbourne, Australia, 24–28 August 1998; pp. 335–336.
23. Chen, L.; Zhang, G.; Zhou, E. Fast greedy MAP inference for determinantal point process to improve recommendation diversity. In Proceedings of the 32nd International Conference on Neural Information Processing Systems, Montréal, Canada, 3–8 December 2018; pp. 5622–5633.
24. Wasilewski, J.; Hurley, N. Incorporating diversity in a learning to rank recommender system. In Proceedings of the Twenty-Ninth International Florida Artificial Intelligence Research Society Conference, Key Largo, FL, USA, 16–18 May 2016.
25. Adomavicius, G.; Kwon, Y. Improving aggregate recommendation diversity using ranking-based techniques. IEEE Trans. Knowl. Data Eng. 2012, 24, 896–911.
26. Zhang, M.; Hurley, N. Novel item recommendation by user profile partitioning. In Proceedings of the 2009 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology, Milan, Italy, 15–18 September 2009; Volume 1, pp. 508–515.
27. Aytekin, T.; Karakaya, M.Ö. Clustering-based diversity improvement in top-n recommendation. J. Intell. Inf. Syst. 2014, 42, 1–18.
28. Ribeiro, M.T.; Lacerda, A.; Veloso, A.; Ziviani, N. Pareto-Efficient hybridization for multi-objective recommender systems. In Proceedings of the Sixth ACM Conference on Recommender Systems, Dublin, Ireland, 9–13 September 2012; pp. 19–26.
29. Abdollahpouri, H.; Burke, R.; Mobasher, B. Managing popularity bias in recommender systems with personalized re-ranking. arXiv 2019, arXiv:1901.07555.
30. Blondel, V.D.; Guillaume, J.L.; Lambiotte, R.; Lefebvre, E. Fast unfolding of communities in large networks. J. Stat. Mech. Theory Exp. 2008, 2008, P10008.
31. Vargas, S.; Baltrunas, L.; Karatzoglou, A.; Castells, P. Coverage, redundancy and size-awareness in genre diversity for recommender systems. In Proceedings of the 8th ACM Conference on Recommender Systems, Silicon Valley, CA, USA, 6–10 October 2014; pp. 209–216.
32. Kipf, T.N.; Welling, M. Semi-supervised classification with graph convolutional networks. arXiv 2017, arXiv:1609.02907.
33. Zhang, H.; Zhu, Z.; Caverlee, J. Evolution of filter bubbles and polarization in news recommendation. In Proceedings of the Advances in Information Retrieval, Dublin, Ireland, 2–6 April 2023; pp. 685–693.
34. Hurley, N.J. Personalised ranking with diversity. In Proceedings of the 7th ACM Conference on Recommender Systems, Hong Kong, China, 12–16 October 2013; pp. 379–382.
35. Li, C.; Feng, H.; de Rijke, M. Cascading hybrid bandits: Online learning to rank for relevance and diversity. In Proceedings of the 14th ACM Conference on Recommender Systems, Virtual, 22–26 September 2020; pp. 33–42.
36. Zhu, Z.; Shen, Y.; Zhao, Y.; Li, J. Popularity bias in dynamic recommendation. IEEE Trans. Knowl. Data Eng. 2023, 35, 3865–3877.
37. Zhou, H.; Chen, H.; Dong, J.; Zha, D.; Zhou, C.; Huang, X. Adaptive popularity debiasing aggregator for graph collaborative filtering. In Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, Taipei, Taiwan, 23–27 July 2023; pp. 7–17.
38. Wang, X.; Jin, H.; Zhang, A.; He, X.; Xu, T.; Chua, T.S. Disentangled graph collaborative filtering. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, Virtual, 25–30 July 2020; pp. 1001–1010.
39. Ren, X.; Xia, L.; Zhao, J.; Yin, D.; Huang, C. Disentangled contrastive collaborative filtering. In Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, Taipei, Taiwan, 23–27 July 2023; pp. 1137–1146.
40. Li, C.; Liu, Z.; Wu, M.; Xu, Y.; Zhao, H.; Huang, P.; Kang, G.; Chen, Q.; Li, W.; Lee, D.L. Multi-interest network with dynamic routing for recommendation at tmall. In Proceedings of the 28th ACM International Conference on Information and Knowledge Management, Beijing, China, 3–7 November 2019; pp. 2615–2623.
41. Cen, Y.; Zhang, J.; Zou, X.; Zhou, C.; Yang, H.; Tang, J. Controllable multi-interest framework for recommendation. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Virtual, 6–10 July 2020; pp. 2942–2951.
42. Yang, L.; Wang, S.; Tao, Y.; Sun, J.; Liu, X.; Yu, P.S.; Wang, T. DGRec: Graph neural network for recommendation with diversified embedding generation. In Proceedings of the 16th ACM International Conference on Web Search and Data Mining, Singapore, 27 February–3 March 2023; pp. 661–669.
43. Lunardi, G.M.; Machado, G.M.; Maran, V.; de Oliveira, J.P.M. A metric for filter bubble measurement in recommender algorithms considering the news domain. Appl. Soft Comput. 2020, 97, 106771.
44. Michiels, L.; Vannieuwenhuyze, J.; Leysen, J.; Verachtert, R.; Smets, A.; Goethals, B. How should we measure filter bubbles? A regression model and evidence for online news. In Proceedings of the 17th ACM Conference on Recommender Systems, Singapore, 18–22 September 2023; pp. 640–651.
45. Liu, P.; Shivaram, K.; Culotta, A.; Shapiro, M.A.; Bilgic, M. The interaction between political typology and filter bubbles in news recommendation algorithms. In Proceedings of the Web Conference 2021, Ljubljana, Slovenia, 19–23 April 2021; pp. 3791–3801.
46. Aridor, G.; Goncalves, D.; Sikdar, S. Deconstructing the filter bubble: User decision-making and recommender systems. In Proceedings of the 14th ACM Conference on Recommender Systems, Virtual, 22–26 September 2020; pp. 82–91.
47. Wang, W.; Feng, F.; Nie, L.; Chua, T.S. User-controllable recommendation against filter bubbles. In Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, Madrid, Spain, 11–15 July 2022; pp. 1251–1261.
48. McKay, D.; Owyong, K.; Makri, S.; Lopez, M.G. Turn and face the strange: Investigating filter bubble bursting information interactions. In Proceedings of the 2022 Conference on Human Information Interaction and Retrieval, Regensburg, Germany, 14–18 March 2022; pp. 233–242.
49. Gao, C.; Wang, S.; Li, S.; Chen, J.; He, X.; Lei, W.; Li, B.; Zhang, Y.; Jiang, P. CIRS: Bursting filter bubbles by counterfactual interactive recommender system. ACM Trans. Inf. Syst. 2023, 42, 1–27.
50. Li, Z.; Dong, Y.; Gao, C.; Zhao, Y.; Li, D.; Hao, J.; Zhang, K.; Li, Y.; Wang, Z. Breaking filter bubble: A reinforcement learning framework of controllable recommender system. In Proceedings of the ACM Web Conference 2023, Austin, TX, USA, 30 April–4 May 2023; pp. 4041–4049.
51. Sanz-Cruzado, J.; Castells, P. Enhancing structural diversity in social networks by recommending weak ties. In Proceedings of the 12th ACM Conference on Recommender Systems, Vancouver, Canada, 2–7 October 2018; pp. 233–241.
52. Musco, C.; Musco, C.; Tsourakakis, C.E. Minimizing polarization and disagreement in social networks. In Proceedings of the 2018 World Wide Web Conference, Lyon, France, 23–27 April 2018; pp. 369–378.
53. Grossetti, Q.; du Mouza, C.; Travers, N.; Constantin, C. Reducing the filter bubble effect on Twitter by considering communities for recommendations. Int. J. Web Inf. Syst. 2021, 17, 728–752.
54. Newman, M.E.J. Modularity and community structure in networks. Proc. Natl. Acad. Sci. USA 2006, 103, 8577–8582.
55. Traag, V.A.; Waltman, L.; Van Eck, N.J. From Louvain to Leiden: Guaranteeing well-connected communities. Sci. Rep. 2019, 9, 5233.
56. Parés, F.; Gasulla, D.G.; Vilalta, A.; Moreno, J.; Ayguadé, E.; Labarta, J.; Cortés, U.; Suzumura, T. Fluid communities: A competitive, scalable and diverse community detection algorithm. In Proceedings of the Complex Networks & Their Applications VI, Lyon, France, 29 November–1 December 2018; pp. 229–240.
57. Halberstam, Y.; Knight, B. Homophily, group size, and the diffusion of political information in social networks: Evidence from Twitter. J. Public Econ. 2016, 143, 73–88.
58. Matakos, A.; Terzi, E.; Tsaparas, P. Measuring and moderating opinion polarization in social networks. Data Min. Knowl. Discov. 2017, 31, 1480–1505.
59. McPherson, M.; Smith-Lovin, L.; Cook, J.M. Birds of a feather: Homophily in social networks. Annu. Rev. Sociol. 2001, 27, 415–444.
60. Kumar, S.; Hamilton, W.L.; Leskovec, J.; Jurafsky, D. Community interaction and conflict on the web. In Proceedings of the 2018 World Wide Web Conference, Lyon, France, 23–27 April 2018; pp. 933–943.
61. Grootendorst, M. BERTopic: Neural topic modeling with a class-based TF-IDF procedure. arXiv 2022, arXiv:2203.05794.
62. Nemhauser, G.L.; Wolsey, L.A.; Fisher, M.L. An analysis of approximations for maximizing submodular set functions—I. Math. Program. 1978, 14, 265–294.
63. Rendle, S.; Freudenthaler, C.; Gantner, Z.; Schmidt-Thieme, L. BPR: Bayesian personalized ranking from implicit feedback. In Proceedings of the 25th Conference on Uncertainty in Artificial Intelligence, Montreal, Canada, 18–21 June 2009; pp. 452–461.
64. Giorgi, S.; Guntuku, S.C.; Himelein-Wachowiak, M.; Kwarteng, A.; Hwang, S.; Rahman, M.; Curtis, B. Twitter Corpus of the #BlackLivesMatter movement and counter protests: 2013 to 2021. In Proceedings of the International AAAI Conference on Web and Social Media, Atlanta, GA, USA, 6–9 June 2022; pp. 1228–1235.
65. Ravi, K.; Vela, A.E. Comprehensive dataset of user-submitted articles with ideological and extreme bias from Reddit. Data Brief 2024, 56, 110849.
66. Rossi, E.; Chamberlain, B.; Frasca, F.; Eynard, D.; Monti, F.; Bronstein, M. Temporal graph networks for deep learning on dynamic graphs. In Proceedings of the ICML Workshop on Graph Representation Learning, Virtual, 13–18 July 2020.
67. Sun, X.; Cheng, H.; Li, J.; Liu, B.; Guan, J. All in one: Multi-task prompting for graph neural networks. In Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Long Beach, CA, USA, 6–10 August 2023; pp. 2120–2131.
68. Zeng, H.; Zhou, H.; Srivastava, A.; Kannan, R.; Prasanna, V. GraphSAINT: Graph sampling based inductive learning method. In Proceedings of the International Conference on Learning Representations, Addis Ababa, Ethiopia, 26 April–1 May 2020.






Performance comparison on the Twitter-BLM dataset. Accuracy metrics: Recall, Prec., NDCG. Diversity metrics: ILFBI, CGI, Cov., Ent. (↓: lower is better).

| Method | Recall | Prec. | NDCG | ILFBI↓ | CGI↓ | Cov. | Ent. |
|---|---|---|---|---|---|---|---|
| LightGCN | 0.268 ± 0.012 | 0.048 ± 0.002 | 0.108 ± 0.005 | 0.982 ± 0.007 | 0.819 ± 0.015 | 11.78 ± 1.5 | 1.43 ± 0.10 |
| NGCF | 0.261 ± 0.014 | 0.047 ± 0.003 | 0.105 ± 0.006 | 0.975 ± 0.012 | 0.807 ± 0.021 | 12.34 ± 1.8 | 1.51 ± 0.13 |
| SGL | 0.274 ± 0.013 | 0.049 ± 0.002 | 0.110 ± 0.005 | 0.968 ± 0.009 | 0.795 ± 0.017 | 13.21 ± 1.6 | 1.59 ± 0.11 |
| XSimGCL | 0.285 ± 0.011 | 0.051 ± 0.002 | 0.114 ± 0.004 | 0.956 ± 0.006 | 0.783 ± 0.014 | 14.12 ± 1.4 | 1.67 ± 0.09 |
| CAM | 0.259 ± 0.015 | 0.046 ± 0.003 | 0.104 ± 0.007 | 0.867 ± 0.018 | 0.724 ± 0.022 | 16.89 ± 2.1 | 1.89 ± 0.14 |
| Weak Ties | 0.252 ± 0.016 | 0.045 ± 0.003 | 0.101 ± 0.008 | 0.852 ± 0.020 | 0.712 ± 0.025 | 17.56 ± 2.3 | 1.97 ± 0.16 |
| CD-CGCN | 0.256 ± 0.014 | 0.046 ± 0.003 | 0.103 ± 0.007 | 0.756 ± 0.022 | 0.641 ± 0.028 | 20.23 ± 2.6 | 2.15 ± 0.18 |
| DGRec | 0.249 ± 0.013 | 0.044 ± 0.002 | 0.100 ± 0.006 | 0.843 ± 0.019 | 0.706 ± 0.023 | 18.12 ± 2.2 | 2.03 ± 0.15 |
| MMR | 0.246 ± 0.018 | 0.044 ± 0.004 | 0.098 ± 0.009 | 0.867 ± 0.024 | 0.724 ± 0.028 | 16.89 ± 2.5 | 1.89 ± 0.16 |
| DPP | 0.243 ± 0.019 | 0.043 ± 0.004 | 0.097 ± 0.010 | 0.859 ± 0.026 | 0.716 ± 0.030 | 17.34 ± 2.7 | 1.94 ± 0.18 |
| CATD-GNN | 0.257 ± 0.012 | 0.046 ± 0.002 | 0.105 ± 0.005 | 0.678 ± 0.024 | 0.587 ± 0.029 | 21.45 ± 2.8 | 2.28 ± 0.19 |
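
Performance comparison on the Reddit-Ideological dataset. Accuracy metrics: Recall, Prec., NDCG. Diversity metrics: ILFBI, CGI, Cov., Ent. (↓: lower is better).

| Method | Recall | Prec. | NDCG | ILFBI↓ | CGI↓ | Cov. | Ent. |
|---|---|---|---|---|---|---|---|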
| LightGCN | 0.216 ± 0.009 | 0.039 ± 0.002 | 0.091 ± 0.004 | 0.974 ± 0.009 | 0.812 ± 0.019 | 8.21 ± 0.9 | 0.93 ± 0.06 |
| NGCF | 0.209 ± 0.012 | 0.037 ± 0.003 | 0.088 ± 0.005 | 0.968 ± 0.014 | 0.798 ± 0.025 | 8.89 ± 1.3 | 1.01 ± 0.09 |
| SGL | 0.220 ± 0.010 | 0.040 ± 0.002 | 0.093 ± 0.004 | 0.961 ± 0.011 | 0.786 ± 0.018 | 9.62 ± 1.1 | 1.15 ± 0.08 |
| XSimGCL | 0.227 ± 0.008 | 0.041 ± 0.002 | 0.095 ± 0.003 | 0.952 ± 0.008 | 0.774 ± 0.016 | 10.34 ± 1.0 | 1.22 ± 0.07 |
| CAM | 0.211 ± 0.011 | 0.038 ± 0.003 | 0.089 ± 0.005 | 0.856 ± 0.017 | 0.718 ± 0.024 | 12.89 ± 1.7 | 1.44 ± 0.11 |
| Weak Ties | 0.203 ± 0.013 | 0.036 ± 0.003 | 0.086 ± 0.006 | 0.841 ± 0.021 | 0.705 ± 0.029 | 13.93 ± 2.1 | 1.58 ± 0.14 |
| CD-CGCN | 0.207 ± 0.012 | 0.037 ± 0.003 | 0.087 ± 0.006 | 0.743 ± 0.023 | 0.631 ± 0.031 | 16.56 ± 2.4 | 1.88 ± 0.16 |
| DGRec | 0.201 ± 0.011 | 0.036 ± 0.002 | 0.085 ± 0.005 | 0.834 ± 0.019 | 0.701 ± 0.023 | 14.34 ± 1.9 | 1.66 ± 0.13 |
| MMR | 0.199 ± 0.015 | 0.035 ± 0.003 | 0.084 ± 0.007 | 0.856 ± 0.025 | 0.718 ± 0.030 | 12.89 ± 2.0 | 1.44 ± 0.12 |
| DPP | 0.195 ± 0.016 | 0.035 ± 0.004 | 0.082 ± 0.008 | 0.849 ± 0.027 | 0.709 ± 0.032 | 13.56 ± 2.2 | 1.52 ± 0.14 |
| CATD-GNN | 0.214 ± 0.010 | 0.039 ± 0.002 | 0.090 ± 0.005 | 0.671 ± 0.020 | 0.579 ± 0.025 | 17.67 ± 2.1 | 1.99 ± 0.15 |
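
Ablation results on the Twitter-BLM and Reddit-Ideological datasets (↓: lower is better).

| Configuration | Twitter-BLM: NDCG | Twitter-BLM: Coverage | Twitter-BLM: ILFBI↓ | Reddit-Ideological: NDCG | Reddit-Ideological: Coverage | Reddit-Ideological: ILFBI↓ |
|---|---|---|---|---|---|---|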
| Full CATD-GNN | 0.105 ± 0.005 | 21.45 ± 2.8 | 0.678 ± 0.024 | 0.090 ± 0.005 | 17.67 ± 2.1 | 0.671 ± 0.020 |
| w/o Submodular | 0.103 ± 0.006 | 18.23 ± 2.6 | 0.742 ± 0.021 | 0.088 ± 0.005 | 15.89 ± 2.0 | 0.723 ± 0.018 |
| w/o Reweighting | 0.101 ± 0.007 | 19.62 ± 2.7 | 0.716 ± 0.020 | 0.087 ± 0.006 | 16.56 ± 2.0 | 0.695 ± 0.017 |
| w/o Stage 2 | 0.109 ± 0.004 | 16.87 ± 2.3 | 0.823 ± 0.019 | 0.093 ± 0.004 | 14.34 ± 1.8 | 0.784 ± 0.016 |
| Random rerank | 0.092 ± 0.011 | 18.78 ± 3.1 | 0.756 ± 0.032 | 0.079 ± 0.009 | 15.45 ± 2.5 | 0.738 ± 0.029 |
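
The size of each component's contribution can be read off the ablation table with simple arithmetic. The short sketch below computes, for the Twitter-BLM ILFBI column, how much ILFBI rises (gets worse) relative to the full model when a component is removed. It uses only the mean values from the table and ignores the reported ± ranges, so it is a rough reading aid rather than a significance test; the helper name `relative_increase` is hypothetical.

```python
# Mean ILFBI values on Twitter-BLM, copied from the ablation table.
full_ilfbi = 0.678
ablations = {
    "w/o Submodular": 0.742,
    "w/o Reweighting": 0.716,
    "w/o Stage 2": 0.823,
    "Random rerank": 0.756,
}


def relative_increase(value: float, baseline: float) -> float:
    """Percentage increase of value over baseline."""
    return 100.0 * (value - baseline) / baseline


increase = {
    name: round(relative_increase(value, full_ilfbi), 1)
    for name, value in ablations.items()
}

for name, pct in increase.items():
    print(f"{name}: ILFBI +{pct}% relative to full CATD-GNN")
```

On these means, removing Stage 2 raises ILFBI by about 21.4%, removing submodular selection by about 9.4%, and removing loss reweighting by about 5.6%, while replacing the reranker with random reranking raises it by about 11.5%.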
© 2025 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Yoshida, S. Community-Aware Two-Stage Diversification for Social Media User Recommendation with Graph Neural Networks. Information 2026, 17, 29. https://doi.org/10.3390/info17010029

