Data, Volume 9, Issue 9 (September 2024) – 8 articles

Cover Story: As chatbots and voice assistants become integral to human–computer interaction, understanding conversational dynamics, particularly interruptions, is crucial. Despite advancements, there is a notable lack of datasets focused on audio-based interruptions. This study addresses this gap by presenting a dataset of 200 annotated interruptions from a broader set of overlapping utterances. Expanding the Group Affect and Performance dataset, this collection includes audio and transcript data. Our dataset aims to support the development of interruption prediction models, fostering research into multi-modal classification and the dynamics of interruptions.
17 pages, 1246 KB  
Data Descriptor
Data on Economic Analysis: 2017 Social Accounting Matrices (SAMs) for South Africa
by Ramigo Pfunzo, Yonas T. Bahta and Henry Jordaan
Data 2024, 9(9), 109; https://doi.org/10.3390/data9090109 - 20 Sep 2024
Cited by 1 | Viewed by 1907
Abstract
The purpose of the Social Accounting Matrix (SAM) is to improve the quality of the database for modelling, including, but not limited to, policy analysis, multiplier analysis, price analysis, and Computable General Equilibrium. This article contributes to constructing the 2017 national SAM for South Africa, incorporating regional accounts. Only in Limpopo Province of South Africa are agricultural industries, labour, and households captured at the district level, while agricultural industry, labour, and household accounts in other provinces remain unchanged. The data for constructing the SAM come from several sources, such as Supply and Use Tables, National Accounts, Census of Commercial Agriculture, Quarterly Labour Force Survey, South African Revenue Service, Global Insight (regional explorer), and South African Reserve Bank. The dataset recorded that land returns for irrigation agriculture were highest (18.2%) in the Northern Cape Province of South Africa compared to other provinces, whereas rainfed agriculture in the Free State Province of South Africa had the largest share (22%) of payments to land. Regarding intermediate inputs, rainfed agriculture in the Western Cape, Free State, and KwaZulu-Natal Provinces paid approximately 0.4% for using intermediate inputs. In terms of the districts, land returns for irrigation were highest in the Vhembe district of Limpopo Province of South Africa, at 0.3%. Although the Mopani district of Limpopo Province of South Africa has the lowest land returns for irrigation agriculture, it has the highest share (1.6%) of payments to land from rainfed agriculture. The manufacturing and community service sectors had a trade deficit, whereas other sectors experienced a trade surplus. The main challenges in developing a SAM are the scarcity of data needed to disaggregate the sub-matrices and insufficient information across data sources for estimating missing entries so that the row and column totals of the SAM are consistent and complete.
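Illustrative sketch (not taken from the article): the consistency requirement mentioned above means that each account's row total (receipts) should equal its column total (payments). The account names, the matrix values, and the helper `sam_imbalances` are all hypothetical and serve only to show how such a check might look.

```python
# Hypothetical 3-account SAM; the figures are made up for illustration.
# Convention assumed here: rows are receipts, columns are payments.
accounts = ["activities", "households", "rest_of_world"]
sam = [
    [0.0, 70.0, 30.0],
    [80.0, 0.0, 20.0],
    [20.0, 30.0, 0.0],
]


def sam_imbalances(matrix, names, tol=1e-9):
    """Return {account: row_total - column_total} for accounts that do not balance."""
    imbalances = {}
    for i, name in enumerate(names):
        row_total = sum(matrix[i])                  # total receipts of account i
        col_total = sum(row[i] for row in matrix)   # total payments of account i
        diff = row_total - col_total
        if abs(diff) > tol:
            imbalances[name] = diff
    return imbalances


# An empty result means every row total matches its column total.
print(sam_imbalances(sam, accounts))
```

A non-empty result lists the accounts whose totals disagree, which is where missing or inconsistent source data would need to be estimated or reconciled.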

20 pages, 1147 KB  
Data Descriptor
Dataset on the Validation and Standardization of the Questionnaire for the Self-Assessment of Service-Learning Experiences in Higher Education (QaSLu)
by Roberto Sánchez-Cabrero, Elena López-de-Arana Prado, Pilar Aramburuzabala and Rosario Cerrillo
Data 2024, 9(9), 108; https://doi.org/10.3390/data9090108 - 19 Sep 2024
Cited by 1 | Viewed by 2317
Abstract
This dataset shows the original validation and standardization of the Questionnaire for the Self-Assessment of Service-Learning Experiences in Higher Education (QaSLu). The QaSLu is the first instrument to measure university service-learning (USL), validated following a strict qualitative and quantitative process by a sample of experts in USL and generating rating scales for different profiles of professors. The Delphi method was used for the qualitative validation by 16 academic experts, who evaluated the relevance and clarity of the items. After two consultation rounds, 45 items were qualitatively validated, generating the QaSLu-45. Then, 118 instructors from 43 universities formed the sample for the quantitative validation procedure. Quantitative validation was carried out through goodness-of-fit measures using confirmatory factor analysis, and the final configuration was optimized using one-factor robust exploratory factor analysis, determining the optimal version of the questionnaire under the law of parsimony, the QaSLu-27, with only 27 items and better psychometric properties. Finally, rating scales were calculated to compare different profiles of USL professors. These findings offer a valid, strong, and trustworthy instrument. The QaSLu-27 may be helpful for the design of USL experiences, in addition to facilitating the assessment of such programs to enhance teaching and learning processes.

13 pages, 3866 KB  
Data Descriptor
OSBA: An Open Neonatal Neuroimaging Atlas and Template for Spina Bifida Aperta
by Anna Speckert, Hui Ji, Kelly Payette, Patrice Grehten, Raimund Kottke, Samuel Ackermann, Beth Padden, Luca Mazzone, Ueli Moehrlen, Spina Bifida Study Group Zurich and Andras Jakab
Data 2024, 9(9), 107; https://doi.org/10.3390/data9090107 - 17 Sep 2024
Cited by 1 | Viewed by 2325
Abstract
We present the Open Spina Bifida Aperta (OSBA) atlas, an open atlas and set of neuroimaging templates for spina bifida aperta (SBA). Traditional brain atlases may not adequately capture anatomical variations present in pediatric or disease-specific cohorts. The OSBA atlas fills this gap by representing the computationally averaged anatomy of the neonatal brain with SBA after fetal surgical repair. The OSBA atlas was constructed using structural T2-weighted and diffusion tensor MRIs of 28 newborns with SBA who underwent prenatal surgical correction. The corrected gestational age at MRI was 38.1 ± 1.1 weeks (mean ± SD). The OSBA atlas consists of T2-weighted and fractional anisotropy templates, along with nine tissue prior maps and region of interest (ROI) delineations. The OSBA atlas offers a standardized reference space for spatial normalization and anatomical ROI definition. Our image segmentation and cortical ribbon definition are based on a human-in-the-loop approach, which includes manual segmentation. The precise alignment of the ROIs was achieved by a combination of manual image alignment and automated, non-linear image registration. From the clinical and neuroimaging perspective, the OSBA atlas enables more accurate spatial standardization and ROI-based analyses and supports advanced analyses such as diffusion tractography and connectomic studies in newborns affected by this condition.

13 pages, 8984 KB  
Data Descriptor
Analysis of Split-System Air Conditioner Faults through Electrical Measurement Data
by Anderson Carlos de Oliveira, Abel Cavalcante Lima Filho, Francisco Antonio Belo and André Victor Oliveira Cadena
Data 2024, 9(9), 106; https://doi.org/10.3390/data9090106 - 13 Sep 2024
Viewed by 2069
Abstract
This work presents an electrical measurement dataset from a split-system air conditioner in normal operating conditions and with specific faults, such as incrustation in the condenser and evaporator air inlet with different levels of blocking, which often occur in this type of equipment. We also added compressor capacitor degradation, which is a very common fault in this type of equipment, although it is scarcely addressed in research. The data, which contain the current and voltage values of equipment installed in the field and tested at different levels of these fault conditions, were obtained through a non-invasive current sensor and a grain-oriented voltage sensor. This work not only explains how the entire data collection process was carried out but also presents two examples of fast Fourier transform (FFT) applications for the detection and diagnosis of faults through the electrical measurements, which proved effective in our studies.
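Illustrative sketch (not the authors' pipeline): one way to inspect a current waveform with an FFT is to compute its amplitude spectrum and look for harmonic content. The signal below is synthetic, and the sampling rate, the 60 Hz fundamental, the 180 Hz harmonic, and the amplitudes are all assumptions chosen so that the components fall exactly on FFT bins; the recursive radix-2 `fft` helper is written out only to keep the example dependency-free.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


# Synthetic current signal (assumed values): 10 A at 60 Hz plus 2 A at 180 Hz.
fs = 960   # sampling rate in Hz (hypothetical)
n = 64     # number of samples, giving a bin spacing of fs / n = 15 Hz
current = [
    10.0 * math.sin(2 * math.pi * 60 * i / fs)
    + 2.0 * math.sin(2 * math.pi * 180 * i / fs)
    for i in range(n)
]

spectrum = fft(current)
# Single-sided amplitude estimate for the bins below the Nyquist frequency.
amplitudes = [2 * abs(c) / n for c in spectrum[: n // 2]]

# Skip the DC bin when searching for the dominant component.
peak_bin = max(range(1, n // 2), key=lambda k: amplitudes[k])
dominant_hz = peak_bin * fs / n
print(dominant_hz, round(amplitudes[peak_bin], 3))
```

With measured data, a change in the relative amplitude of specific harmonics between normal and faulty operation is the kind of feature such a spectrum would expose; which harmonics matter is not stated in the abstract.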

17 pages, 6352 KB  
Data Descriptor
Experimental Data in a Greenhouse with and without Cultivation of Stringless Blue Lake Beans
by Sebastian-Camilo Vanegas-Ayala, Julio Barón-Velandia, Oscar-Mauricio Garcia-Chavez, Adrian Romero-Palencia and Daniel-David Leal-Lara
Data 2024, 9(9), 105; https://doi.org/10.3390/data9090105 - 4 Sep 2024
Cited by 1 | Viewed by 1795
Abstract
Greenhouse cultivation is one of the current strategies to address the challenges of food production, sustainability, and food quality. Similarly, the use of technological tools to automate greenhouse environments through a set of sensors and actuators allows for the control and improvement of processes within this environment. This document presents data collected from the sensors and actuators of two identical greenhouse environments, one with the cultivation of stringless blue lake beans and the other without cultivation. The aim is that this dataset will provide a broader characterization of the behavior of climatic variables inside greenhouse environments and how they are impacted by control actions, subsequently contributing to the development of new research on implementations of or improvements to control, supervision, management, and automation actions in greenhouse environments.

8 pages, 339 KB  
Data Descriptor
Interruption Audio & Transcript: Derived from Group Affect and Performance Dataset
by Daniel Doyle and Ovidiu Şerban
Data 2024, 9(9), 104; https://doi.org/10.3390/data9090104 - 31 Aug 2024
Cited by 1 | Viewed by 2233
Abstract
Despite the widespread development and use of chatbots, there is a lack of audio-based interruption datasets. This study provides a dataset of 200 manually annotated interruptions from a broader set of 355 data points of overlapping utterances. The dataset is derived from the Group Affect and Performance dataset managed by the University of the Fraser Valley, Canada. It includes both audio files and transcripts, allowing for multi-modal analysis. Given the extensive literature and the varied definitions of interruptions, it was necessary to establish precise definitions. The study aims to provide a comprehensive dataset for researchers to build and improve interruption prediction models. The findings demonstrate that classification models can generalize well to identify interruptions based on this dataset’s audio. This opens up research avenues with respect to interruption-related topics, ranging from multi-modal interruption classification using text and audio modalities to the analysis of group dynamics.

10 pages, 1662 KB  
Data Descriptor
TM–IoV: A First-of-Its-Kind Multilabeled Trust Parameter Dataset for Evaluating Trust in the Internet of Vehicles
by Yingxun Wang, Adnan Mahmood, Mohamad Faizrizwan Mohd Sabri and Hushairi Zen
Data 2024, 9(9), 103; https://doi.org/10.3390/data9090103 - 31 Aug 2024
Cited by 2 | Viewed by 2454
Abstract
The emerging and promising paradigm of the Internet of Vehicles (IoV) employs vehicle-to-everything communication to enable vehicles to communicate not only with one another but also with the supporting roadside infrastructure, vulnerable pedestrians, and the backbone network, primarily to address a number of safety-critical vehicular applications. Nevertheless, IoV networks are susceptible to a number of malicious attacks owing to their inherent characteristics, in particular, being (a) highly dynamic in nature, which results in a continual change in the network topology, and (b) non-deterministic owing to the intricate nature of their entities and interrelationships. Such attacks, if and when they materialize, jeopardize the entire IoV network, thereby putting human lives at risk. Whilst cryptographic-based mechanisms are capable of mitigating external attacks, internal attacks are extremely hard to tackle. Trust, therefore, is an indispensable tool since it facilitates the timely identification and eradication of malicious entities responsible for launching internal attacks in an IoV network. To date, there is no dataset pertinent to trust management in the context of IoV networks, and this has proven to be a bottleneck for conducting in-depth research in this domain. The manuscript-at-hand, accordingly, presents a first-of-its-kind trust-based IoV dataset encompassing 96,707 interactions amongst 79 vehicles at different time instances. The dataset involves nine salient trust parameters, i.e., packet delivery ratio, similarity, external similarity, internal similarity, familiarity, external familiarity, internal familiarity, reward/punishment, and context, which play a considerable role in ascertaining the trust of a vehicle within an IoV network.

9 pages, 878 KB  
Article
An Expected Goals on Target (xGOT) Metric as a New Metric for Analyzing Elite Soccer Player Performance
by Anselmo Ruiz-de-Alarcón-Quintero and Blanca De-la-Cruz-Torres
Data 2024, 9(9), 102; https://doi.org/10.3390/data9090102 - 28 Aug 2024
Cited by 7 | Viewed by 11466
Abstract
Introduction: Football analysis is an applied research area that has seen a huge upsurge in recent years. More complex analysis is required to understand the performances of soccer players and teams during matches. The objective of this study was to prove the usefulness of the expected goals on target (xGOT) metric as a good indicator of a soccer team’s performance in professional Spanish football leagues, both in the women’s and men’s categories. Method: The data for the Spanish teams were collected from the statistical website Football Reference. The 2023/24 season was analyzed for Spanish leagues, both in the women’s and men’s categories (LigaF and LaLiga, respectively). For all teams, the following variables were calculated: goals, possession value (PV), expected goals (xG) and xGOT. All data obtained for each variable were normalized by match (90 min). A descriptive and correlational statistical analysis was carried out. Results: In the men’s league, this study found a high correlation between goals per match and xGOT (R² = 0.9248), while in the women’s league, there was a high correlation between goals per match and xG (R² = 0.9820) and between goals per match and xGOT (R² = 0.9574). Conclusions: In LaLiga, the xGOT was the metric that best represented the match result, while in LigaF, the xG and the xGOT were the metrics that best represented the match score.
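Illustrative sketch (not the authors' analysis): the abstract describes normalizing each variable by match and reporting R² between goals per match and xGOT. The team totals below are invented, the 38-match season length is an assumption, and `r_squared` computes the squared Pearson correlation, which equals the R² of a simple linear fit; whether the article used exactly this definition is not stated in the abstract.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)


# Hypothetical season totals per team: (goals, xGOT). Values are made up.
matches_played = 38  # assumed number of matches per team
season_totals = {
    "Team A": (87, 84.1),
    "Team B": (61, 63.5),
    "Team C": (45, 47.9),
    "Team D": (33, 36.2),
}

# Normalize every variable by match, as the abstract describes.
goals_per_match = [g / matches_played for g, _ in season_totals.values()]
xgot_per_match = [x / matches_played for _, x in season_totals.values()]

print(round(r_squared(xgot_per_match, goals_per_match), 4))
```

Because the correlation is scale-invariant, dividing both variables by the same number of matches does not change R²; the normalization matters when teams have played different numbers of matches.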
