Article

TB-BCG: Topic-Based BART Counterfeit Generator for Fake News Detection

1
State Key Laboratory of Communication Content Cognition, People’s Daily Online, Beijing 100733, China
2
School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
*
Author to whom correspondence should be addressed.
Academic Editors: Andrea Prati, Carlos A. Iglesias, Luis Javier García Villalba and Vincent A. Cicirello
Mathematics 2022, 10(4), 585; https://doi.org/10.3390/math10040585
Received: 22 December 2021 / Revised: 6 February 2022 / Accepted: 10 February 2022 / Published: 14 February 2022
(This article belongs to the Topic Machine and Deep Learning)
Fake news is spread intentionally and misleads society into believing unconfirmed information, which makes it challenging to identify fake news from shared content alone. Fake news is not only a current issue; it has circulated for centuries. Dealing with it is difficult because it spreads on a massive scale, so automatic fake news detection is urgently needed. We introduce TB-BCG, the Topic-Based BART Counterfeit Generator, to increase detection accuracy using deep learning. The approach plays an essential role in selecting impactful data rows and adding more training data. We apply Latent Dirichlet Allocation (the topic-based component), Bidirectional and Auto-Regressive Transformers (BART), and cosine document similarity as the main tools on the Constraint @ AAAI2021-COVID19 Fake News Detection shared-task dataset. The idea is simple yet powerful: select data by topic and sort it by distinctiveness, generate counterfeit training data with BART, and compare each counterfeit-generated text with its source text using cosine similarity. If the similarity between a counterfeit-generated text and its source text is more than 95%, the counterfeit-generated text is added to the dataset. To test how well precision holds up and how robust the method is across different amounts of training data, we used 30%, 50%, 80%, and 100% of the total dataset and trained simple Long Short-Term Memory (LSTM) and Convolutional Neural Network (CNN) models. Compared to the baseline, our method improved testing performance for both LSTM and CNN, and the resulting scores are only slightly different.
Keywords: fake news detection; Latent Dirichlet Allocation (LDA); Bidirectional and Auto-Regressive Transformers (BART); cosine document similarity; AAAI2021-COVID19 Fake News Detection dataset
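
The similarity filter described in the abstract (keep a counterfeit-generated text only if its cosine similarity to the source text is more than 95%) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how texts are vectorized, so the raw term-count vectors, the regular-expression tokenizer, and the helper names `term_counts`, `cosine_similarity`, and `keep_counterfeits` are all assumptions made for the example. The BART generation step is omitted; the candidates are assumed to be strings produced elsewhere.

```python
import math
import re
from collections import Counter


def term_counts(text):
    """Lowercase the text and count word tokens (assumed bag-of-words model)."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two texts using raw term-count vectors."""
    va, vb = term_counts(a), term_counts(b)
    # Dot product over the tokens the two texts share.
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


def keep_counterfeits(source, candidates, threshold=0.95):
    """Return the candidates whose similarity to the source exceeds the threshold."""
    return [c for c in candidates if cosine_similarity(source, c) > threshold]


if __name__ == "__main__":
    source = "Masks reduce the spread of the virus"
    candidates = [
        "masks reduce the spread of the virus",  # same tokens as the source
        "stock prices fell sharply today",       # no tokens in common
    ]
    print(keep_counterfeits(source, candidates))
```

With term-count vectors the 95% threshold is strict: a candidate that differs from a short source by even one extra word can fall below it, so a different vectorization (for example TF-IDF or sentence embeddings) would accept a different set of counterfeits.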

MDPI and ACS Style

Karnyoto, A.S.; Sun, C.; Liu, B.; Wang, X. TB-BCG: Topic-Based BART Counterfeit Generator for Fake News Detection. Mathematics 2022, 10, 585. https://doi.org/10.3390/math10040585
