Article

Web Scraping Chilean News Media: A Dataset for Analyzing Social Unrest Coverage (2019–2023)

1 Department of Systems and Computing Engineering, Universidad Católica del Norte, Antofagasta 1270398, Chile
2 School of Journalism, Universidad Católica del Norte, Antofagasta 1270398, Chile
* Author to whom correspondence should be addressed.
Data 2025, 10(11), 174; https://doi.org/10.3390/data10110174 (registering DOI)
Submission received: 10 September 2025 / Revised: 28 October 2025 / Accepted: 30 October 2025 / Published: 31 October 2025

Abstract

This paper presents a dataset of Chilean news media coverage during the social unrest and constitutional processes from 2019 to 2023. Using Python-based web scraping with BeautifulSoup and Selenium, we collected articles from 15 Chilean news outlets between 15 November 2019 and 17 December 2023. The initial collection of 1254 articles was filtered to 931 usable data points after removing non-relevant content, duplicates, and articles unrelated to the Chilean social outburst. Each news outlet required specific extraction approaches due to varying HTML structures, with some outlets inaccessible due to paywalls or anti-scraping mechanisms. The dataset is structured in JSON format with standardized fields including title, content, date, author, and source metadata. This resource supports research on media coverage during political events and provides data for Spanish-language processing tasks. The dataset and extraction code are publicly available on GitHub.
Keywords: web scraping; Chilean social outburst; news media dataset; data collection; estallido social
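
The filtering and JSON structuring described in the abstract can be illustrated with a minimal sketch. It is not the authors' extraction code: the key names (`title`, `content`, `date`, `author`, `source`) are assumed from the fields listed above, the `Article` class, `normalize`, and `filter_articles` are hypothetical helpers, the relevance keywords are placeholders, and the three sample records are invented for illustration only.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Article:
    # Key names are assumptions modeled on the fields named in the abstract.
    title: str
    content: str
    date: str    # ISO 8601 date string, e.g. "2019-11-15"
    author: str
    source: str  # name of the news outlet


def normalize(text: str) -> str:
    """Collapse whitespace and lowercase so near-identical copies compare equal."""
    return " ".join(text.split()).lower()


def filter_articles(articles, keywords):
    """Drop duplicates (same outlet and normalized title) and off-topic articles."""
    seen = set()
    kept = []
    for art in articles:
        key = (art.source, normalize(art.title))
        if key in seen:
            continue  # duplicate of an article that was already kept
        body = normalize(art.title + " " + art.content)
        if not any(k in body for k in keywords):
            continue  # mentions none of the relevance keywords
        seen.add(key)
        kept.append(art)
    return kept


# Invented sample records, not taken from the dataset.
raw = [
    Article("Protestas en Santiago", "El estallido social continúa.",
            "2019-11-15", "A. Pérez", "outlet-a"),
    Article("Protestas  en Santiago", "El estallido social continúa.",
            "2019-11-15", "A. Pérez", "outlet-a"),
    Article("Resultados del fútbol", "Resumen de la jornada.",
            "2019-11-16", "B. Soto", "outlet-b"),
]

# Placeholder keywords; the paper's actual relevance criteria are not given here.
kept = filter_articles(raw, keywords=["estallido social", "plebiscito"])

# ensure_ascii=False keeps Spanish accented characters readable in the output.
dataset_json = json.dumps([asdict(a) for a in kept], ensure_ascii=False, indent=2)
print(dataset_json)
```

Keying duplicates on the outlet plus the normalized title is only one possible choice; a hash of the article body or the canonical URL would serve equally well if titles are reused across stories.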

Share and Cite

MDPI and ACS Style

Molina, I.; Morales, J.; Keith, B. Web Scraping Chilean News Media: A Dataset for Analyzing Social Unrest Coverage (2019–2023). Data 2025, 10, 174. https://doi.org/10.3390/data10110174

