Bias Reduction News Recommendation System
Abstract
1. Introduction
2. Literature Overview
3. Materials and Methods
3.1. Data
- Textual Embeddings (BERT): we converted the text content of news articles into numerical vector representations using the BERT (base-uncased) model to encapsulate their semantic content.
- Topic Modeling (LDA): articles were categorized into specific genres or themes using Latent Dirichlet Allocation (LDA).
- Sentiment Analysis (VADER): we employed the VADER [32] tool to analyze the emotional tone of articles, classifying them as positive, negative, or neutral.
- Unified Input Vectors: each user–article interaction was represented as a unified vector, consolidating user behavioral data with the extracted article content features.
- Time Series Formation: we structured the data into time series to capture the temporal dynamics of user interactions, a crucial aspect of LSTM processing.
- User and Article Identifiers: unique IDs for users and articles to track interactions.
- Interaction Timestamps: capture the timing of each interaction, crucial for time series analysis.
- Interaction Types: categorized as clicks, views, and engagement duration.
- Content Features: textual embeddings, topic categories, and sentiment scores for articles from MIND-small.
- Sequential Interaction History: chronological sequence of user interactions, vital for learning user behavior patterns over time.
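The time series formation and sequential interaction history described above can be illustrated with a minimal sketch. It assumes each interaction is a record with the hypothetical keys `user_id`, `article_id`, `timestamp`, and `features` (a stand-in for the unified input vector); the function name `build_sequences` and the `max_len` cutoff are illustrative assumptions, not details taken from the paper.

```python
from collections import defaultdict

def build_sequences(interactions, max_len=50):
    """Group interaction records by user and order them chronologically.

    `interactions` is an iterable of dicts with the hypothetical keys
    'user_id', 'article_id', 'timestamp', and 'features' (a list of floats
    standing in for the unified input vector).
    """
    by_user = defaultdict(list)
    for rec in interactions:
        by_user[rec["user_id"]].append(rec)

    sequences = {}
    for user_id, recs in by_user.items():
        recs.sort(key=lambda r: r["timestamp"])  # oldest interaction first
        recent = recs[-max_len:]                 # keep only the latest steps
        sequences[user_id] = [r["features"] for r in recent]
    return sequences

# Toy log: u1's records arrive out of order and are sorted by timestamp.
log = [
    {"user_id": "u1", "article_id": "a2", "timestamp": 20, "features": [0.0, 1.0]},
    {"user_id": "u1", "article_id": "a1", "timestamp": 10, "features": [1.0, 0.0]},
    {"user_id": "u2", "article_id": "a1", "timestamp": 15, "features": [1.0, 0.0]},
]
sequences = build_sequences(log)
```

Each value in `sequences` is a per-user list of feature vectors in time order, which is the shape a sequence model such as an LSTM would consume.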
3.2. Contextual Dual Bias Reduction Recommendation System
- Input Gate controls how much new information flows into the cell state.
- Forget Gate determines which information is removed from the cell state.
- Cell State Update generates new candidate values for updating the cell state.
- Output Gate determines how much of the updated cell state is exposed as the next hidden state.
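As an illustration of how the four gates interact, the following sketch implements one time step of the textbook LSTM formulation in plain Python. The names `lstm_step`, `affine`, and the `params` layout (one `(W, U, b)` triple per gate, with `W` acting on the input and `U` on the previous hidden state) are assumptions made for this example; the paper's exact parameterization may differ.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(W, U, b, x, h):
    """Return W x + U h + b for list-of-lists matrices and list vectors."""
    return [
        sum(w * xi for w, xi in zip(W_row, x))
        + sum(u * hi for u, hi in zip(U_row, h))
        + b_k
        for W_row, U_row, b_k in zip(W, U, b)
    ]

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step; params[g] holds the (W, U, b) triple for gate g."""
    i_t = [sigmoid(z) for z in affine(*params["i"], x_t, h_prev)]    # input gate
    f_t = [sigmoid(z) for z in affine(*params["f"], x_t, h_prev)]    # forget gate
    g_t = [math.tanh(z) for z in affine(*params["g"], x_t, h_prev)]  # candidate values
    o_t = [sigmoid(z) for z in affine(*params["o"], x_t, h_prev)]    # output gate
    # New cell state: keep part of the old state, add part of the candidate.
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, g_t)]
    # Hidden state: the output gate scales the squashed cell state.
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t

# Toy check with all-zero parameters (input size 3, hidden size 2):
# every gate evaluates to 0.5 and the candidate to 0, so the cell state halves.
zeros = ([[0.0] * 3 for _ in range(2)], [[0.0] * 2 for _ in range(2)], [0.0, 0.0])
params = {gate: zeros for gate in "ifgo"}
h_t, c_t = lstm_step([1.0, 2.0, 3.0], [0.0, 0.0], [1.0, -1.0], params)
```

A trained model would learn the weight matrices and biases; the zero parameters here only make the arithmetic easy to follow by hand.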
- Accuracy Loss: the mean squared error (MSE) between the predicted and actual user interactions. It measures how accurately the system predicts user preferences from their interaction history and content features.
- Item Bias Loss: reduces the bias towards frequently recommended items. It is computed as the deviation of the item distribution in the recommendations from a desired distribution, such as a uniform distribution.
- Exposure Bias Loss: ensures that all items receive a fair amount of exposure in the recommendations. It is measured as the variance in the number of times different items are recommended, which penalizes the model when certain items are consistently under-represented.
- Weighting hyperparameters: balance the three loss terms against one another. They are determined through experimentation and tuning, based on the specific characteristics of the data.
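The weighted combination of the three loss terms can be sketched as follows. This is a minimal illustration under stated assumptions: the item bias term is taken as the squared deviation of recommendation shares from a uniform distribution, the exposure term as the population variance of recommendation counts, and the weight names `w_acc`, `w_item`, and `w_exp` (with defaults picked arbitrarily from the tabulated ranges) are hypothetical. It operates on raw counts, so a trainable model would need a differentiable surrogate for the two bias terms.

```python
def mse(predicted, actual):
    """Mean squared error between predicted and observed interactions."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)

def item_bias_loss(counts):
    """Squared deviation of recommendation shares from a uniform distribution."""
    total = sum(counts)
    uniform = 1.0 / len(counts)
    return sum((c / total - uniform) ** 2 for c in counts)

def exposure_bias_loss(counts):
    """Population variance of how often each item is recommended."""
    mean = sum(counts) / len(counts)
    return sum((c - mean) ** 2 for c in counts) / len(counts)

def combined_loss(predicted, actual, counts, w_acc=0.5, w_item=0.3, w_exp=0.2):
    """Weighted sum of the accuracy, item bias, and exposure bias terms."""
    return (w_acc * mse(predicted, actual)
            + w_item * item_bias_loss(counts)
            + w_exp * exposure_bias_loss(counts))

# Same (perfect) predictions, different exposure patterns over four items.
balanced = combined_loss([1.0, 0.0], [1.0, 0.0], counts=[2, 2, 2, 2])
skewed = combined_loss([1.0, 0.0], [1.0, 0.0], counts=[8, 0, 0, 0])
```

With identical prediction accuracy, the skewed exposure pattern is penalized while the balanced one is not, which is the behavior the two bias terms are meant to encourage.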
Algorithm 1. Training Procedure for the Contextual Dual Bias Reduction Recommendation System (C-DBRRS)
4. Experimental Setup
4.1. Baseline Methods
- Popularity-based Recommendation (POP): this method ranks news articles based on their overall popularity, measured by the total number of user clicks.
- Content-based Recommendation (CB): this method suggests articles to users by aligning the content of articles with their past preferences.
- Collaborative Filtering (CF): this method utilizes user behavior patterns, recommending items favored by similar users.
- Matrix Factorization (MF) [3]: this method decomposes the user–item interaction matrix into lower-dimensional latent factors for inferring user interests.
- Neural Collaborative Filtering (NCF) [33]: this method combines neural network architectures with collaborative filtering to enhance recommendation accuracy.
- BERT4Rec [34]: this model employs the Bidirectional Encoder Representations from Transformers (BERT) architecture, specifically designed for sequential recommendation. It captures complex item interaction patterns and user preferences from sequential data. We used the BERT-base-uncased model.
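The simplest of these baselines, POP, can be sketched in a few lines. The function name `popularity_ranking` and the `(user_id, article_id)` click format are assumptions for illustration only.

```python
from collections import Counter

def popularity_ranking(clicks, k=5):
    """Return the k most-clicked articles; `clicks` is a list of (user_id, article_id)."""
    counts = Counter(article for _, article in clicks)
    return [article for article, _ in counts.most_common(k)]

clicks = [
    ("u1", "a1"), ("u2", "a1"), ("u3", "a2"),
    ("u1", "a3"), ("u2", "a1"), ("u3", "a3"),
]
top = popularity_ranking(clicks, k=2)
```

Every user receives the same list under this baseline, which is why it tends to concentrate exposure on a few articles.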
4.2. Evaluation Metrics
- Precision@K measures the proportion of relevant articles in the top-K recommendations, reflecting accuracy.
- Recall@K indicates the fraction of relevant articles captured in the top-K recommendations, highlighting the model’s retrieval ability.
- Normalized Discounted Cumulative Gain (NDCG)@K assesses ranking quality, prioritizing the placement of relevant articles higher in the recommendation list.
- Gini Index evaluates the fairness of recommendation distribution, with lower values indicating more equitable distribution across items.
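The four metrics can be computed as in the sketch below. It assumes binary relevance (an article is either relevant or not) and uses one common closed form of the Gini coefficient over per-item recommendation counts; the function names are illustrative, and the paper's exact evaluation code may differ.

```python
import math

def precision_at_k(recommended, relevant, k):
    """Share of the top-k recommendations that are relevant."""
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / k

def recall_at_k(recommended, relevant, k):
    """Share of the relevant items that appear in the top-k recommendations."""
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / len(relevant) if relevant else 0.0

def ndcg_at_k(recommended, relevant, k):
    """Binary-relevance NDCG: DCG of the list divided by the ideal DCG."""
    dcg = sum(1.0 / math.log2(rank + 2)
              for rank, item in enumerate(recommended[:k]) if item in relevant)
    ideal = sum(1.0 / math.log2(rank + 2) for rank in range(min(k, len(relevant))))
    return dcg / ideal if ideal > 0 else 0.0

def gini_index(counts):
    """Gini coefficient of recommendation counts (0 means perfectly even)."""
    values = sorted(counts)
    n, total = len(values), sum(values)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum((i + 1) * v for i, v in enumerate(values))
    return 2.0 * weighted / (n * total) - (n + 1) / n

recommended = ["a", "b", "c", "d", "e"]
relevant = {"a", "c", "f"}
p5 = precision_at_k(recommended, relevant, 5)
r5 = recall_at_k(recommended, relevant, 5)
n5 = ndcg_at_k(recommended, relevant, 5)
g_even = gini_index([5, 5, 5, 5])
g_skewed = gini_index([0, 0, 0, 10])
```

In this toy case two of the five recommended articles are relevant, and the Gini index rises from 0 for an even spread to 0.75 when a single item receives all recommendations.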
4.3. Settings and Hyperparameters
5. Results
5.1. Overall Results
5.2. Recommendation Distribution across Different News Categories
5.3. Analysis of Relevance and Fairness Trade-Off
6. Discussion
6.1. Practical and Theoretical Impact
6.2. Limitations
6.3. Recommender Systems Fairness in the Era of Large Language Models
7. Conclusions
Funding
Institutional Review Board Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
NRS | News Recommender Systems |
LSTM | Long Short-Term Memory |
C-DBRRS | Contextual Dual Bias Reduction Recommendation System |
BERT | Bidirectional Encoder Representations from Transformers |
POP | Popularity-Based Recommendation |
CB | Content-Based Recommendation |
CF | Collaborative Filtering |
MF | Matrix Factorization |
NCF | Neural Collaborative Filtering |
NDCG | Normalized Discounted Cumulative Gain |
References
1. Raza, S.; Ding, C. News recommender system: A review of recent progress, challenges, and opportunities. Artif. Intell. Rev. 2022, 55, 749–800.
2. Wu, F.; Qiao, Y.; Chen, J.H.; Wu, C.; Qi, T.; Lian, J.; Liu, D.; Xie, X.; Gao, J.; Wu, W.; et al. MIND: A large-scale dataset for news recommendation. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Online, 5–10 July 2020; pp. 3597–3606.
3. Raza, S.; Ding, C. News recommender system considering temporal dynamics and news taxonomy. In Proceedings of the 2019 IEEE International Conference on Big Data (Big Data), Los Angeles, CA, USA, 9–12 December 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 920–929.
4. Wang, Y.; Ma, W.; Zhang, M.; Liu, Y.; Ma, S. A survey on the fairness of recommender systems. ACM Trans. Inf. Syst. 2023, 41, 1–43.
5. Raza, S.; Ding, C. Deep neural network to tradeoff between accuracy and diversity in a news recommender system. In Proceedings of the 2021 IEEE International Conference on Big Data (Big Data), Orlando, FL, USA, 15–18 December 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 5246–5256.
6. Raza, S.; Garg, M.; Reji, D.J.; Bashir, S.R.; Ding, C. Nbias: A natural language processing framework for BIAS identification in text. Expert Syst. Appl. 2024, 237, 121542.
7. Zheng, G.; Zhang, F.; Zheng, Z.; Xiang, Y.; Yuan, N.J.; Xie, X.; Li, Z. DRN: A deep reinforcement learning framework for news recommendation. In Proceedings of the 2018 World Wide Web Conference, Lyon, France, 23–27 April 2018; pp. 167–176.
8. Wang, X.; Wang, W.H. Providing Item-side Individual Fairness for Deep Recommender Systems. In Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency, Seoul, Republic of Korea, 21–24 June 2022; pp. 117–127.
9. Wu, H.; Mitra, B.; Ma, C.; Diaz, F.; Liu, X. Joint multisided exposure fairness for recommendation. In Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, Madrid, Spain, 11–15 July 2022; pp. 703–714.
10. Helberger, N. On the democratic role of news recommenders. In Algorithms, Automation, and News; Routledge: London, UK, 2021; pp. 14–33.
11. Dwork, C.; Hardt, M.; Pitassi, T.; Reingold, O.; Zemel, R. Fairness through awareness. In Proceedings of the 3rd Innovations in Theoretical Computer Science Conference, Cambridge, MA, USA, 8–10 January 2012; pp. 214–226.
12. Dolata, M.; Feuerriegel, S.; Schwabe, G. A sociotechnical view of algorithmic fairness. Inf. Syst. J. 2022, 32, 754–818.
13. Wu, H.; Ma, C.; Mitra, B.; Diaz, F.; Liu, X. A multi-objective optimization framework for multi-stakeholder fairness-aware recommendation. ACM Trans. Inf. Syst. 2022, 41, 1–29.
14. Li, L.; Wang, D.D.; Zhu, S.Z.; Li, T. Personalized news recommendation: A review and an experimental investigation. J. Comput. Sci. Technol. 2011, 26, 754–766.
15. Devooght, R.; Bersini, H. Collaborative filtering with recurrent neural networks. arXiv 2016, arXiv:1608.07400.
16. Herlocker, J.L.; Konstan, J.A.; Terveen, L.G.; Riedl, J.T. Evaluating collaborative filtering recommender systems. ACM Trans. Inf. Syst. (TOIS) 2004, 22, 5–53.
17. Adomavicius, G.; Tuzhilin, A. Context-aware recommender systems. In Recommender Systems Handbook; Springer: Berlin/Heidelberg, Germany, 2010; pp. 217–253.
18. Burke, R. Hybrid recommender systems: Survey and experiments. User Model. User-Adapt. Interact. 2002, 12, 331–370.
19. Fu, Z.; Xian, Y.; Gao, R.; Zhao, J.; Huang, Q.; Ge, Y.; Xu, S.; Geng, S.; Shah, C.; Zhang, Y.; et al. Fairness-aware explainable recommendation over knowledge graphs. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, Virtual Event, 25–30 July 2020; pp. 69–78.
20. Beutel, A.; Chen, J.; Doshi, T.; Qian, H.; Wei, L.; Wu, Y.; Heldt, L.; Zhao, Z.; Hong, L.; Chi, E.H.; et al. Fairness in recommendation ranking through pairwise comparisons. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Anchorage, AK, USA, 4–8 August 2019; pp. 2212–2220.
21. Yao, S.; Huang, B. Beyond parity: Fairness objectives for collaborative filtering. Adv. Neural Inf. Process. Syst. 2017, 30.
22. Zehlike, M.; Bonchi, F.; Castillo, C.; Hajian, S.; Megahed, M.; Baeza-Yates, R. FA*IR: A fair top-k ranking algorithm. In Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, Singapore, 6–10 November 2017; pp. 1569–1578.
23. Farnadi, G.; Kouki, P.; Thompson, S.K.; Srinivasan, S.; Getoor, L. A fairness-aware hybrid recommender system. arXiv 2018, arXiv:1809.09030.
24. Sonboli, N.; Smith, J.J.; Cabral Berenfus, F.; Burke, R.; Fiesler, C. Fairness and transparency in recommendation: The users perspective. In Proceedings of the 29th ACM Conference on User Modeling, Adaptation and Personalization, Utrecht, The Netherlands, 21–25 June 2021; pp. 274–279.
25. Mehrotra, R.; McInerney, J.; Bouchard, H.; Lalmas, M.; Diaz, F. Towards a fair marketplace: Counterfactual evaluation of the trade-off between relevance, fairness & satisfaction in recommendation systems. In Proceedings of the 27th ACM International Conference on Information and Knowledge Management, Torino, Italy, 22–26 October 2018; pp. 2243–2251.
26. Cui, L.; Ou, P.; Fu, X.; Wen, Z.; Lu, N. A Novel Multi-objective Evolutionary Algorithm for Recommendation Systems. J. Parallel Distrib. Comput. 2017, 103, 53–63.
27. Li, H.; Zhong, Z.; Shi, J.; Li, H.; Zhang, Y. Multi-objective optimization-based recommendation for massive online learning resources. IEEE Sens. J. 2021, 21, 25274–25281.
28. Xu, C. A big-data oriented recommendation method based on multi-objective optimization. Knowl.-Based Syst. 2019, 177, 11–21.
29. Raza, S.; Ding, C. A Regularized Model to Trade-off Between Accuracy and Diversity in a News Recommender System. In Proceedings of the 2020 IEEE International Conference on Big Data (Big Data), Atlanta, GA, USA, 10–13 December 2020; pp. 551–560.
30. Wu, Y.; Xie, R.; Zhu, Y.; Zhuang, F.; Xiang, A.; Zhang, X.; Lin, L.; He, Q. Selective fairness in recommendation via prompts. In Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, Madrid, Spain, 11–15 July 2022; pp. 2657–2662.
31. Outbrain Click Prediction. 2023. Available online: https://www.kaggle.com/competitions/outbrain-click-prediction/data (accessed on 4 September 2023).
32. Hutto, C.; Gilbert, E. VADER: A parsimonious rule-based model for sentiment analysis of social media text. In Proceedings of the International AAAI Conference on Web and Social Media, Ann Arbor, MI, USA, 1–4 June 2014; Volume 8, pp. 216–225.
33. He, X.; Liao, L.; Zhang, H.; Nie, L.; Hu, X.; Chua, T.S. Neural collaborative filtering. In Proceedings of the 26th International Conference on World Wide Web, Perth, Australia, 3–7 May 2017; pp. 173–182.
34. Sun, F.; Liu, J.; Wu, J.; Pei, C.; Lin, X.; Ou, W.; Jiang, P. BERT4Rec: Sequential recommendation with bidirectional encoder representations from transformer. In Proceedings of the 28th ACM International Conference on Information and Knowledge Management, Beijing, China, 3–7 November 2019; pp. 1441–1450.
35. Pan, S.; Erfahrungen, P.K. A survey on fairness-aware recommender systems. Inf. Fusion 2023.
36. Fan, W.; Zhao, Z.; Li, J.; Liu, Y.; Mei, X.; Wang, Y.; Wen, Z.; Wang, F.; Zhao, X.; Tang, J.; et al. Recommender Systems in the Era of Large Language Models (LLMs). arXiv 2023, arXiv:2307.02046.
37. Guo, H. Fairness Testing for Recommender Systems. In Proceedings of the International Symposium on Software Testing and Analysis, Seattle, WA, USA, 17–21 July 2023.
38. Wu, Y.; Cao, J.; Xu, G. Fairness in Recommender Systems: Evaluation Approaches and Assurance Strategies. ACM Trans. Knowl. Discov. Data 2023, 18, 1–37.
39. Carranza, A.G.; Farahani, R.; Ponomareva, N.; Kurakin, A.; Jagielski, M.; Nasr, M. Privacy-Preserving Recommender Systems with Synthetic Query Generation using Differentially Private Large Language Models. arXiv 2023, arXiv:2305.05973.
Symbol | Description |
---|---|
 | Sequence of input data representing user interactions and news features |
t | Time step in the input sequence |
 | Input gate at time step t |
 | Forget gate at time step t |
 | Candidate values for cell state update at time step t |
 | Cell state at time step t |
 | Output gate at time step t |
 | Hidden state at time step t |
 | Weight matrices connecting input to gates |
 | Weight matrices connecting hidden state to gates |
 | Bias terms for gates |
 | Hyperparameters for balancing loss terms |
 | Hyperparameter for tuning fairness in recommendations |
 | Loss term for accuracy (mean squared error) |
 | Loss terms for item bias and exposure bias |
Hyperparameter | Description | Values/Range |
---|---|---|
Learning Rate | Step size for updating weights | 0.001, 0.01, 0.1 |
Batch Size | Samples processed before update | 32, 64, 128 |
Num. of Epochs | Passes through entire dataset | 10, 20, 30 |
LSTM Units | Number of LSTM units | 50, 100, 150 |
Dropout Rate | Fraction of units to drop | 0.2, 0.5 |
 | Weight for accuracy loss | 0.3, 0.5, 0.7 |
 | Weight for item bias loss | 0.1, 0.3, 0.5 |
 | Weight for exposure bias loss | 0.2, 0.4, 0.6, default is 0.5 |
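If the tabulated values were searched exhaustively, the candidate configurations could be enumerated as in the sketch below. The source does not state which search strategy was used, so the grid search itself, the key names, and the helper `iter_configs` are assumptions for illustration; the value lists are copied from the table.

```python
from itertools import product

# Value lists copied from the hyperparameter table; key names are hypothetical.
grid = {
    "learning_rate": [0.001, 0.01, 0.1],
    "batch_size": [32, 64, 128],
    "num_epochs": [10, 20, 30],
    "lstm_units": [50, 100, 150],
    "dropout_rate": [0.2, 0.5],
    "w_accuracy": [0.3, 0.5, 0.7],
    "w_item_bias": [0.1, 0.3, 0.5],
    "w_exposure_bias": [0.2, 0.4, 0.6],
}

def iter_configs(grid):
    """Yield one dict per combination of hyperparameter values."""
    keys = list(grid)
    for values in product(*(grid[key] for key in keys)):
        yield dict(zip(keys, values))

configs = list(iter_configs(grid))
```

The full Cartesian product is large, so a random or staged search over the same ranges may be a more practical choice.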
Dataset | Method | Precision@5 | Recall@5 | NDCG@5 | Gini Index |
---|---|---|---|---|---|
MIND-small | POP | 0.35 ± 0.05 | 0.25 ± 0.04 | 0.30 ± 0.05 | 0.45 ± 0.06 |
MIND-small | CB | 0.40 ± 0.05 | 0.30 ± 0.04 | 0.35 ± 0.05 | 0.40 ± 0.06 |
MIND-small | CF | 0.45 ± 0.05 | 0.35 ± 0.04 | 0.40 ± 0.05 | 0.35 ± 0.06 |
MIND-small | MF | 0.50 ± 0.05 | 0.40 ± 0.04 | 0.45 ± 0.05 | 0.30 ± 0.06 |
MIND-small | NCF | 0.55 ± 0.05 | 0.45 ± 0.04 | 0.50 ± 0.05 | 0.25 ± 0.06 |
MIND-small | BERT4Rec | 0.57 ± 0.05 | 0.47 ± 0.04 | 0.52 ± 0.05 | 0.22 ± 0.06 |
MIND-small | C-DBRRS | 0.65 ± 0.04 | 0.55 ± 0.04 | 0.60 ± 0.04 | 0.18 ± 0.05 |
Outbrain | POP | 0.32 ± 0.05 | 0.24 ± 0.04 | 0.29 ± 0.05 | 0.43 ± 0.06 |
Outbrain | CB | 0.38 ± 0.05 | 0.29 ± 0.04 | 0.33 ± 0.05 | 0.39 ± 0.06 |
Outbrain | CF | 0.42 ± 0.05 | 0.33 ± 0.04 | 0.38 ± 0.05 | 0.34 ± 0.06 |
Outbrain | MF | 0.48 ± 0.05 | 0.37 ± 0.04 | 0.42 ± 0.05 | 0.30 ± 0.06 |
Outbrain | NCF | 0.52 ± 0.05 | 0.42 ± 0.04 | 0.47 ± 0.05 | 0.26 ± 0.06 |
Outbrain | BERT4Rec | 0.54 ± 0.05 | 0.44 ± 0.04 | 0.49 ± 0.05 | 0.23 ± 0.06 |
Outbrain | C-DBRRS | 0.62 ± 0.04 | 0.52 ± 0.04 | 0.57 ± 0.04 | 0.19 ± 0.05 |
Category | Gini (before) | Gini (after) |
---|---|---|
Politics | 0.55 | 0.25 |
Sports | 0.45 | 0.23 |
Arts | 0.60 | 0.28 |
Science | 0.50 | 0.26 |
Environment | 0.65 | 0.30 |
© 2023 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Raza, S. Bias Reduction News Recommendation System. Digital 2024, 4, 92-103. https://doi.org/10.3390/digital4010003