Data, Volume 10, Issue 11

November 2025 - 26 articles

Cover Story: The Cramér–von Mises (CM) statistic tests whether data follow a theoretical distribution. Accurate probabilities for it are obtained from Monte Carlo (MC) simulation. Here, for sample sizes from 2 to 30, 21 replicates of large size (5,120,000,000) were generated, yielding accurate permilles of the CM statistic. The variability of the MC estimates increases from smaller to larger values of the CM statistic. However, the standard deviation shows that the estimation noise is below 10⁻⁴ most of the time. The permille-level precision enables precise critical values and p-values, improving hypothesis-testing confidence in fields such as quality control, bioinformatics, and financial modeling.
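
As a rough illustration of the Monte Carlo approach described in the cover story, the sketch below estimates a critical value of the CM statistic under the null hypothesis. It is not the authors' code: the function names, the default replicate count, and the simple order-statistic quantile rule are assumptions chosen for brevity, and the replicate count is far smaller than the 5,120,000,000 reported above.

```python
import random


def cm_statistic(u):
    """Cramér–von Mises statistic for values already passed through the
    hypothesised CDF (uniform on [0, 1] if the null hypothesis holds)."""
    n = len(u)
    ordered = sorted(u)
    return 1.0 / (12 * n) + sum(
        ((2 * i - 1) / (2 * n) - x) ** 2
        for i, x in enumerate(ordered, start=1)
    )


def mc_critical_value(n, level=0.95, replicates=20000, seed=0):
    """Estimate the `level` quantile of the CM statistic for sample size n.

    Assumption: a plain empirical quantile over `replicates` simulated
    uniform samples; the paper uses far more replicates than this sketch.
    """
    rng = random.Random(seed)
    stats = sorted(
        cm_statistic([rng.random() for _ in range(n)])
        for _ in range(replicates)
    )
    # Index of the empirical quantile, clamped to the last element.
    return stats[min(replicates - 1, int(level * replicates))]


if __name__ == "__main__":
    # Estimated 95% critical value for a sample of size 10.
    print(mc_critical_value(10, level=0.95))
```

With so few replicates the estimate is only stable to about two decimal places; the permille-level precision discussed in the cover story depends on the much larger simulation sizes reported there.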
  • Issues are regarded as officially published after their release is announced to the table of contents alert mailing list.
  • You may sign up for email alerts to receive the tables of contents of newly released issues.
  • PDF is the official format for papers published in both HTML and PDF forms. To view the papers in PDF format, click on the "PDF Full-text" link, and use the free Adobe Reader to open them.

Articles (26)

  • Data Descriptor
  • Open Access
589 Views
13 Pages

20 November 2025

When the Colebrook equation is used in its original implicit form, the unknown pipe flow friction factor can only be obtained through time-consuming and computationally demanding iterative calculations. The empirical Colebrook equation relates the un...

  • Article
  • Open Access
839 Views
16 Pages

A Data-Driven Analysis of Cognitive Learning and Illusion Effects in University Mathematics

  • Rodolfo Bojorque,
  • Fernando Moscoso,
  • Miguel Arcos-Argudo and
  • Fernando Pesántez

19 November 2025

The increasing adoption of video-based instruction and digital assessment in higher education has reshaped how students interact with learning materials. However, it also introduces cognitive and behavioral biases that challenge the accuracy of self-...

  • Data Descriptor
  • Open Access
694 Views
26 Pages

A Mexican Enhanced Dataset of Pollutant Releases and Transfers (2004 to 2022) with IARC Cancer Classifications

  • Hugo G. Reyes-Anastacio,
  • Ivan Lopez-Arevalo,
  • Jose L. Gonzalez-Compean,
  • Melesio Crespo-Sanchez,
  • Jaqueline Calderon and
  • Heriberto Aguirre-Meneses

19 November 2025

As a member of the North American Free Trade Agreement, the Mexican Ministry of Environment and Natural Resources publishes the Pollutant Releases and Transfers Registry of Substances annually, in accordance with the Official Mexican Norm Standard NO...

  • Data Descriptor
  • Open Access
1,490 Views
16 Pages

Increasing the Usability of the American Time Use Survey: IPUMS ATUS

  • Kari C. W. Williams,
  • Sarah M. Flood,
  • Liana C. Sayer and
  • Julia A. Rivera Drew

14 November 2025

This paper describes IPUMS ATUS, which simplifies the use of time diary data by disseminating a harmonized and enhanced version of the American Time Use Survey (ATUS). The ATUS time diary data capture the detailed activities over a 24 h period for th...

  • Article
  • Open Access
917 Views
22 Pages

Building Data Literacy for Sustainable Development: A Framework for Effective Training

  • Raed A. T. Said,
  • Kassim S. Mwitondi,
  • Leila Benseddik and
  • Laroussi Chemlali

11 November 2025

As the transformative influence of novel technologies sweeps across industries, organisations are called upon to position their staff in the equally dynamic operational environment, which includes embedding technical and legal communication skills in...

  • Article
  • Open Access
693 Views
19 Pages

Perspectives on Research and Personalized Healthcare in the Context of Federated FAIR Data Based on an Exploratory Study by Medical Researchers

  • Elena Poenaru,
  • Monica Dugăeşescu,
  • Călin Poenaru,
  • Iulia Andrei-Bitere,
  • Livia-Cristiana Băicoianu-Niţescu,
  • Traian-Vasile Constantin,
  • Aurelian Zugravu,
  • Brandusa Bitel,
  • Maria Magdalena Constantin and
  • Smaranda Stoleru

11 November 2025

Background: Research in personalized medicine, with applications in oncology, dermatology, cardiology, urology, and general healthcare, requires facile and safe access to accurate data. Due to its particularly sensitive character, obtaining health-re...

  • Data Descriptor
  • Open Access
1,589 Views
11 Pages

11 November 2025

We present a real-world dataset capturing thirty consecutive days of malicious HTTP traffic filtered and blocked by the OWASP ModSecurity Web Application Firewall (WAF) on a live production server. Each entry corresponds to a request that triggered o...

  • Article
  • Open Access
733 Views
13 Pages

Photodissociation Processes Involving the SiH+ Molecular Ion: New Datasets for Modeling

  • V. A. Srećković,
  • H. Delibašić-Marković,
  • L. M. Ignjatović,
  • V. Petrović and
  • V. Vujčić

7 November 2025

This paper investigates the photodissociation of the SiH+ molecular ion, a non-symmetric diatomic species composed of silicon and hydrogen. We provide calculated molecular data and characterize electronic states, deriving cross-sections and spectral...

  • Article
  • Open Access
982 Views
29 Pages

Resilience of Scientific Collaboration Networks in Young Universities Based on Bibliometric and Network Analysis

  • Oleksandr Kuchanskyi,
  • Yurii Andrashko,
  • Andrii Biloshchytskyi,
  • Aidos Mukhatayev,
  • Svitlana Biloshchytska and
  • Firuza Numanova

7 November 2025

The resilience of scientific collaboration networks is a key factor in ensuring the long-term academic development of young universities. This study examines the resilience of scientific collaboration networks among young universities based on biblio...

  • Data Descriptor
  • Open Access
927 Views
11 Pages

6 November 2025

Diabetes is a global and local epidemic, with an exponential growth trend in prevalence rates. This article presents data collected through a survey administered to a probabilistic sample of patients enrolled in a diabetes control program within a ne...

  • Data Descriptor
  • Open Access
486 Views
7 Pages

5 November 2025

Along with other order statistics, the Cramér–von Mises (CM) statistic can assess the goodness of fit. CM does not have an explicit formula for the cumulative distribution function and the alternate way is to obtain its critical value fr...

  • Article
  • Open Access
900 Views
22 Pages

NutritionVerse3D2D: Large 3D Object and 2D Image Food Dataset for Dietary Intake Estimation

  • Chi-en Amy Tai,
  • Matthew Keller,
  • Saeejith Nair,
  • Yuhao Chen,
  • Yifan Wu,
  • Olivia Markham,
  • Krish Parmar,
  • Pengcheng Xi and
  • Alexander Wong

4 November 2025

Elderly populations often face significant challenges when it comes to dietary intake tracking, often exacerbated by health complications. Unfortunately, conventional diet assessment techniques such as food frequency questionnaires, food diaries, and...

  • Data Descriptor
  • Open Access
3,129 Views
13 Pages

CBIS-DDSM-R: A Curated Radiomic Feature Dataset for Breast Cancer Classification

  • Erika Sánchez-Femat,
  • Carlos E. Galván-Tejada,
  • Jorge I. Galván-Tejada,
  • Hamurabi Gamboa-Rosales,
  • Huizilopoztli Luna-García,
  • Luis Alberto Flores-Chaires,
  • Javier Saldívar-Pérez,
  • Rafael Reveles-Martínez and
  • José M. Celaya-Padilla

4 November 2025

Early and accurate breast cancer detection is critical for patient outcomes. The Curated Breast Imaging Subset of the Digital Database for Screening Mammography (CBIS-DDSM) has been instrumental for computer-aided diagnosis (CAD) systems. However, th...

  • Technical Note
  • Open Access
645 Views
10 Pages

ncPick: A Lightweight Toolkit for Extracting, Analyzing, and Visualizing ECMWF ERA5 NetCDF Data

  • Sreten Jevremović,
  • Filip Arnaut,
  • Aleksandra Kolarski and
  • Vladimir A. Srećković

2 November 2025

The European Centre for Medium-Range Weather Forecasts (ECMWF) Reanalysis v5 (ERA5) datasets provide a rich source of climatological data. However, their Network Common Data Form (NetCDF) structure can be a barrier for researchers who are not experie...

  • Article
  • Open Access
1 Citation
805 Views
26 Pages

Machine and Deep Learning Framework for Sargassum Detection and Fractional Cover Estimation Using Multi-Sensor Satellite Imagery

  • José Manuel Echevarría-Rubio,
  • Guillermo Martínez-Flores and
  • Rubén Antelmo Morales-Pérez

1 November 2025

Over the past decade, recurring influxes of pelagic Sargassum have posed significant environmental and economic challenges in the Caribbean Sea. Effective monitoring is crucial for understanding bloom dynamics and mitigating their impacts. This study...

  • Data Descriptor
  • Open Access
488 Views
7 Pages

1 November 2025

Extracted peatlands experience strong hydrological fluctuations due to drainage, vegetation succession, and climatic variability, yet long-term, high-frequency groundwater data remain scarce in Northern Europe. Our dataset presents two years (June 20...

  • Data Descriptor
  • Open Access
1,192 Views
11 Pages

1 November 2025

Electroencephalography (EEG) provides insights into the neural mechanisms underlying attention, response inhibition, and distraction in cognitive tasks. This dataset was collected to examine neural activity in young drivers and non-drivers performing...

  • Article
  • Open Access
1 Citation
1,249 Views
19 Pages

31 October 2025

This paper presents a dataset of Chilean news media coverage during the social unrest and constitutional processes from 2019 to 2023. Using Python-based web scraping with BeautifulSoup and Selenium, we collected articles from 15 Chilean news outlets...

  • Article
  • Open Access
1,064 Views
55 Pages

Method for Detecting Low-Intensity DDoS Attacks Based on a Combined Neural Network and Its Application in Law Enforcement Activities

  • Serhii Vladov,
  • Oksana Mulesa,
  • Victoria Vysotska,
  • Petro Horvat,
  • Nataliia Paziura,
  • Oleksandra Kolobylina,
  • Oleh Mieshkov,
  • Oleksandr Ilnytskyi and
  • Oleh Koropatov

30 October 2025

The article presents a method for detecting low-intensity DDoS attacks, focused on identifying difficult-to-detect “low-and-slow” scenarios that remain undetectable by traditional defence systems. The key feature of the developed method i...

  • Article
  • Open Access
7,899 Views
31 Pages

30 October 2025

The rapid adoption of generative AI raises questions not only about its transformative potential but also about its cognitive and societal risks. This study contributes to the debate by presenting cross-country experimental data (n = 150; Germany, Sw...

  • Article
  • Open Access
942 Views
27 Pages

28 October 2025

Drought events exacerbated by global climate change occur frequently in China. Currently, high-spatiotemporal-resolution gridded meteorological drought index datasets are generally available for single time scales (e.g., 30, 60, 90, and 150 days) and...

  • Data Descriptor
  • Open Access
1 Citation
1,354 Views
14 Pages

Electrical Measurement Dataset from a University Laboratory for Smart Energy Applications

  • Sergio D. Saldarriaga-Zuluaga,
  • José Ricardo Velasco-Méndez,
  • Carlos Mario Moreno-Paniagua,
  • Bayron Alvarez-Arboleda and
  • Sergio Andres Estrada-Mesa

26 October 2025

Continuous monitoring of electrical parameters is essential for understanding energy consumption, assessing power quality, and analyzing load behavior. This paper presents a dataset comprising measurements of three-phase voltages and currents, active...

  • Data Descriptor
  • Open Access
1,003 Views
12 Pages

24 October 2025

Modern engineering increasingly operates within socio-technical networks, such as the interdependence of energy grids, transport systems, and building codes, where decisions must be reliable and transparent. Large language models (LLMs) such as GPT p...

  • Article
  • Open Access
1 Citation
1,367 Views
28 Pages

23 October 2025

Understanding visitor sentiment is essential for developing effective tourism strategies, particularly as Google Maps reviews have become a key channel for public feedback on tourist attractions. Yet, the unstructured format and dialectal diversity o...

Data - ISSN 2306-5729