Journal Description
Data is a peer-reviewed, open access journal on data in science, with the aim of enhancing data transparency and reusability. The journal publishes in two sections: a section on the collection, treatment and analysis methods of data in science; and a section publishing descriptions of scientific and scholarly datasets (one dataset per paper). The journal is published monthly online by MDPI.
- Open Access: free for readers, with article processing charges (APC) paid by authors or their institutions.
- High Visibility: indexed within Scopus, ESCI (Web of Science), dblp, Inspec, RePEc, and other databases.
- Journal Rank: CiteScore - Q2 (Information Systems and Management)
- Rapid Publication: manuscripts are peer-reviewed and a first decision is provided to authors approximately 22 days after submission; acceptance to publication is undertaken in 3.9 days (median values for papers published in this journal in the second half of 2023).
- Recognition of Reviewers: reviewers who provide timely, thorough peer-review reports receive vouchers entitling them to a discount on the APC of their next publication in any MDPI journal, in appreciation of the work done.
Impact Factor: 2.6 (2022); 5-Year Impact Factor: 3.0 (2022)
Latest Articles
WEA-Acceptance Data—A Dataset of Acoustic, Meteorological, and Operational Wind Turbine Measurements
Data 2024, 9(3), 46; https://doi.org/10.3390/data9030046 - 15 Mar 2024
Abstract
In this article, a dataset is described which combines wind turbine supervisory control and data acquisition (SCADA), meteorological and acoustical data and thus gives a detailed description of a wind farm and its atmospheric and acoustic environment. The data were collected during different seasons for several weeks at a time, such that a multitude of environmental and operational conditions are covered. In five measurement campaigns, in total three different locations with similar surroundings were captured. The raw data were enhanced with derived values such as atmospheric stability or direction of sound propagation. Data of one month including all time series measurements as well as monophonic audio recordings are now published. The dataset also contains three exemplary use cases along with documents that describe the data pre-processing.
Open Access Data Descriptor
A Dataset of Benthic Species from Mesophotic Bioconstructions on the Apulian Coast (Southeastern Italy, Mediterranean Sea)
by
Maria Mercurio, Guadalupe Giménez, Giorgio Bavestrello, Frine Cardone, Giuseppe Corriero, Jacopo Giampaoletti, Maria Flavia Gravina, Cataldo Pierri, Caterina Longo, Adriana Giangrande and Carlotta Nonnis Marzano
Data 2024, 9(3), 45; https://doi.org/10.3390/data9030045 - 08 Mar 2024
Abstract
Marine bioconstructions are complex habitats that represent a hotspot of biodiversity. Among Mediterranean bioconstructions, those thriving on mesophotic bottoms on southeastern Italian coasts are of particular interest due to their horizontal and vertical extension. In general, the communities that develop in the Mediterranean twilight zone encompassed within the first 30 m of depth are better known, while relatively few data are available on those at greater depths. By further investigating the diversity and structure of mesophotic bioconstructions in the southern Adriatic, we can improve our understanding of Mediterranean biodiversity while developing effective conservation strategies to preserve these habitats of particular interest. The dataset reported here comprises records of benthic marine taxa from algae and invertebrate mesophotic bioconstructions investigated at six sites along the southern Adriatic coast of Italy, at depths between approximately 25 and 65 m. The dataset contains a total of 1718 records, covering 11 phyla and 648 benthic taxa, of which 580 were recognized at the species level. These data could provide a reference point for further investigations with descriptive or management purposes, including the possible assessment of mesophotic bioconstructions as refuges for shallow-water species.
Open Access Data Descriptor
Subjective Well-Being and Mental Health among College Students: Two Datasets for Diagnosis and Program Evaluation
by
Lina Martínez, Esteban Robles, Valeria Trofimoff, Nicolás Vidal, Andrés David Espada, Nayith Mosquera, Bryan Franco, Víctor Sarmiento and María Isabel Zafra
Data 2024, 9(3), 44; https://doi.org/10.3390/data9030044 - 06 Mar 2024
Abstract
This paper presents two datasets about college students’ subjective well-being and mental health in a developing country. The first dataset offers a diagnosis of the prevalence of self-reported symptoms associated with stress, anxiety, and depression, and an overall evaluation of subjective well-being. The study uses validated scales to measure self-reported symptoms related to mental health conditions: the Perceived Stress Scale (PSS-10) for stress, the 7-item Generalized Anxiety Disorder Scale (GAD-7) for symptoms associated with anxiety, and the 9-item Patient Health Questionnaire (PHQ-9) for symptoms associated with depression. These data were collected from a sample of 3052 undergraduate students in 2022 at a medium-sized university in Colombia. The second dataset reports the evaluation of a positive education intervention implemented in the same university. The Colombian Minister of Science and Technology financed the intervention to promote strategies to mitigate the consequences of the pandemic for college students’ well-being and mental health. The program evaluation data cover two years (2020–2022) with 193 college students in the treatment group (students enrolled in a class teaching evidence-based interventions to promote well-being and mental health awareness) and 135 students in the control group. Data for evaluation include a broad array of variables on life satisfaction, happiness, negative emotions, COVID-19 effects, relationship valuations, and habits, together with measurements on three scales: the Satisfaction with Life Scale (SWLS), a brief measure of depressive symptomatology (CESD-7), and the Brief Strengths Scale (BSS).
Open Access Data Descriptor
Pupil Data Upon Stimulation by Auditory Stimuli
by
Davide La Rosa, Luca Bruschini, Maria Paola Tramonti Fantozzi, Paolo Orsini, Mario Milazzo and Antonino Crivello
Data 2024, 9(3), 43; https://doi.org/10.3390/data9030043 - 05 Mar 2024
Abstract
Evaluating hearing in newborns and uncooperative patients can pose a considerable challenge. One potential solution might be to employ the Pupil Dilation Response (PDR) as an objective physiological metric. In this dataset descriptor paper, we present a collection of data showing changes in pupil dimension and shape upon presentation of auditory stimuli. In particular, we collected pupil data from 16 subjects with no known hearing loss, under different lighting conditions, measured in response to a series of 60–100 audible tones, all of the same frequency and amplitude. These data may serve to further investigate any relationship between hearing capabilities and PDRs.
Open Access Data Descriptor
A Set of Ground Penetrating Radar Measures from Quarries
by
Stefano Bonduà, André Monteiro Klen, Massimiliano Pilone, Laurentiu Asimopolos and Natalia-Silvia Asimopolos
Data 2024, 9(3), 42; https://doi.org/10.3390/data9030042 - 03 Mar 2024
Abstract
This paper presents a set of Ground Penetrating Radar (GPR) data obtained from in situ measurements conducted in four ornamental stone quarries located in Italy (Botticino quarry) and Romania (Ruschita, Carpinis, and Pietroasa quarries). GPR is a Non-Destructive Testing (NDT) technique that enables, among other capabilities, the detection and localization of fractures without damage to the surface. In this study, two ground-coupled GPR instruments were used to detect and locate fractures, discontinuities, and weakened zones. The GPR data contain radargrams for discontinuity and fracture detection, along with the geographic locations of the measurements. For each measurement site, a set of radargrams has been acquired in two orthogonal directions, allowing for a 3D reconstruction of the investigated site.
Open Access Article
Defining the Balearic Islands’ Tourism Data Space: An Approach to Functional and Data Requirements
by
Dolores Ordóñez-Martínez, Joana M. Seguí-Pons and Maurici Ruiz-Pérez
Data 2024, 9(3), 41; https://doi.org/10.3390/data9030041 - 29 Feb 2024
Abstract
The definition of a tourism data space (TDS) in the Balearic Islands is a complex process that involves identifying the types of questions to be addressed, including analytical tools, and determining the type of information to be incorporated. This study delves into the functional requirements of a Balearic Islands’ TDS based on the study of scientific research carried out in the field of tourism in the Balearic Islands and drawing comparisons with international scientific research in the field of tourism information. Utilizing a bibliometric analysis of the scientific literature, this study identifies the scientific requirements that should be met for the development of a robust, rigorous, and efficient TDS. The goal is to support excellent scientific research in tourism and facilitate the transfer of research results to the productive sector to maintain and improve the competitiveness of the Balearic Islands as a tourist destination. The results of the analysis provide a structured framework for the construction of the Balearic Islands’ TDS, outlining objectives, methods to be implemented, and information to be considered.
(This article belongs to the Section Information Systems and Data Management)
Open Access Data Descriptor
Draft Genome Sequence of Bacillus thuringiensis INTA 103-23 Reveals Its Insecticidal Properties: Insights from the Genomic Sequence
by
Leopoldo Palma, Leila Ortiz, José Niz, Marcelo Berretta and Diego Sauka
Data 2024, 9(3), 40; https://doi.org/10.3390/data9030040 - 28 Feb 2024
Abstract
The genome of Bacillus thuringiensis strain INTA 103-23 was sequenced, revealing a high-quality draft assembly comprising 243 contigs with a total size of 6.30 Mb and a completeness of 99%. Phylogenetic analysis classified INTA 103-23 within the Bacillus cereus sensu stricto cluster. Genome annotation identified 6993 genes, including 2476 hypothetical proteins. Screening for pesticidal proteins unveiled 10 coding sequences with significant similarity to known pesticidal proteins, showcasing a potential efficacy against various insect orders. AntiSMASH analysis predicted 13 biosynthetic gene clusters (BGCs), including clusters with 100% similarity to petrobactin and anabaenopeptin NZ857/nostamide A. Notably, fengycin exhibited a 40% similarity within the identified clusters. Further exploration involved a comparative genomic analysis with ten phylogenetically closest genomes. The ANI values, calculated using fastANI, confirmed the closest relationships with strains classified under Bacillus cereus sensu stricto. This comprehensive genomic analysis of B. thuringiensis INTA 103-23 provides valuable insights into its genetic makeup, potential pesticidal activity, and biosynthetic capabilities. The identified BGCs and pesticidal proteins contribute to our understanding of the strain’s biocontrol potential against diverse agricultural pests.
(This article belongs to the Special Issue Genome Sequence of Novel Bacteria Showing Potential Biotechnological Applications)
Open Access Article
CybAttT: A Dataset of Cyberattack News Tweets for Enhanced Threat Intelligence
by
Huda Lughbi, Mourad Mars and Khaled Almotairi
Data 2024, 9(3), 39; https://doi.org/10.3390/data9030039 - 23 Feb 2024
Cited by 1
Abstract
The continuous developments in information technologies have resulted in a significant rise in security concerns, including cybercrimes, unauthorized access, and cyberattacks. Recently, researchers have increasingly turned to social media platforms like X to investigate cyberattacks. Collecting and analyzing news about cyberattacks from tweets can efficiently provide crucial insights into the attacks themselves, including their impacts, occurrence regions, and potential mitigation strategies. However, there is a shortage of labeled datasets related to cyberattacks. This paper describes CybAttT, a dataset of 36,071 English cyberattack-related tweets. These tweets are manually labeled into three classes: high-risk news, normal news, and not news. Our final overall inter-annotator agreement was 0.99 (Fleiss kappa), which represents high agreement. To ensure dataset reliability and accuracy, we conducted rigorous experiments using different supervised machine learning algorithms and various fine-tuned language models to assess its quality and suitability for its intended purpose. A high F1-score of 87.6% achieved using the CybAttT dataset not only demonstrates the potential of our approach but also validates the high quality and thoroughness of its annotations. We have made our CybAttT dataset accessible to the public for research purposes.
Open Access Article
Multimodal Hinglish Tweet Dataset for Deep Pragmatic Analysis
by
Pratibha, Amandeep Kaur, Meenu Khurana and Robertas Damaševičius
Data 2024, 9(2), 38; https://doi.org/10.3390/data9020038 - 15 Feb 2024
Abstract
Wars, conflicts, and peace efforts have become inherent characteristics of regions, and understanding the prevailing sentiments related to these issues is crucial for finding long-lasting solutions. Twitter/‘X’, with its vast user base and real-time nature, provides a valuable source to assess the raw emotions and opinions of people regarding war, conflict, and peace. This paper focuses on collecting and curating Hinglish tweets specifically related to wars, conflicts, and associated taxonomy. The creation of this dataset addresses an existing gap in contemporary literature, which lacks comprehensive datasets capturing the emotions and sentiments expressed by individuals regarding wars, conflicts, and peace efforts. This dataset holds significant value and application in deep pragmatic analysis as it enables future researchers to identify the flow of sentiments, analyze the information architecture surrounding war, conflict, and peace effects, and delve into the associated psychology in this context. To ensure the dataset’s quality and relevance, a meticulous selection process was employed, resulting in the inclusion of 500 carefully chosen, explainable search filters. The dataset currently has 10,040 tweets that have been validated with the help of a human expert to make sure they are correct and accurate.
(This article belongs to the Special Issue Sentiment Analysis in Social Media Data)
Open Access Data Descriptor
Digital Elevation Models and Orthomosaics of the Dutch Noordwest Natuurkern Foredune Restoration Project
by
Gerben Ruessink, Dick Groenendijk and Bas Arens
Data 2024, 9(2), 37; https://doi.org/10.3390/data9020037 - 15 Feb 2024
Abstract
Coastal dunes worldwide are increasingly under pressure from the adverse effects of human activities. Therefore, more and more restoration measures are being taken to create conditions that help disturbed coastal dune ecosystems regenerate or recover naturally. However, many projects lack the (open-access) monitoring observations needed to signal whether further actions are needed, and hence lack the opportunity to “learn by doing”. This submission presents an open-access data set of 37 high-resolution digital elevation models and 24 orthomosaics collected before and after the excavation of five artificial foredune trough blowouts (“notches”) in winter 2012/2013 in the Dutch Zuid-Kennemerland National Park, one of the largest coastal dune restoration projects in northwest Europe. These high-resolution data provide a valuable resource for improving understanding of the biogeomorphic processes that determine the evolution of restored dune systems as well as developing guidelines to better design future restoration efforts with foredune notching.
Open Access Data Descriptor
AriAplBud: An Aerial Multi-Growth Stage Apple Flower Bud Dataset for Agricultural Object Detection Benchmarking
by
Wenan Yuan
Data 2024, 9(2), 36; https://doi.org/10.3390/data9020036 - 11 Feb 2024
Abstract
As one of the most important topics in contemporary computer vision research, object detection has received wide attention from the precision agriculture community for diverse applications. While state-of-the-art object detection frameworks are usually evaluated against large-scale public datasets containing mostly non-agricultural objects, a specialized dataset that reflects unique properties of plants would aid researchers in investigating the utility of newly developed object detectors within agricultural contexts. This article presents AriAplBud: a close-up apple flower bud image dataset created using an unmanned aerial vehicle (UAV)-based red–green–blue (RGB) camera. AriAplBud contains 3600 images of apple flower buds at six growth stages, with 110,467 manual bounding box annotations as positive samples and 2520 additional empty orchard images containing no apple flower bud as negative samples. AriAplBud can be directly deployed for developing object detection models that accept Darknet annotation format without additional preprocessing steps, serving as a potential benchmark for future agricultural object detection research. A demonstration of developing YOLOv8-based apple flower bud detectors is also presented in this article.
Open Access Data Descriptor
COVID-19 Lockdown Effects on Sleep, Immune Fitness, Mood, Quality of Life, and Academic Functioning: Survey Data from Turkish University Students
by
Pauline A. Hendriksen, Sema Tan, Evi C. van Oostrom, Agnese Merlo, Hilal Bardakçi, Nilay Aksoy, Johan Garssen, Gillian Bruce and Joris C. Verster
Data 2024, 9(2), 35; https://doi.org/10.3390/data9020035 - 10 Feb 2024
Abstract
Previous studies from the Netherlands, Germany, and Argentina revealed that the 2019 coronavirus disease (COVID-19) pandemic and associated lockdown periods had a significant negative impact on the wellbeing and quality of life of students. The negative impact of lockdown periods on health correlates such as immune fitness, alcohol consumption, and mood was reflected in their academic functioning. As both the duration and intensity of lockdown measures differed between countries, it is important to replicate these findings in different countries and cultures. Therefore, the purpose of the current study was to examine the impact of the COVID-19 pandemic on immune fitness, mood, academic functioning, sleep, smoking, alcohol consumption, healthy diet, and quality of life among Turkish students. Turkish students in the age range of 18 to 30 years old were invited to complete an online survey. Data were collected from n = 307 participants and included retrospective assessments for six time periods: (1) BP (before the COVID-19 pandemic, 1 January 2020–10 March 2020), (2) NL1 (the first no lockdown period, 11 March 2020–28 April 2021), (3) the lockdown period (29 April 2021–17 May 2021), (4) NL2 (the second no lockdown period, 18 May 2021–31 December 2021), (5) NL3 (the third no lockdown period, 1 January 2022–December 2022), and (6) the past month. In this data descriptor article, the content of the survey and the dataset are described.
Open Access Data Descriptor
Draft Genome Sequencing of the Bacillus thuringiensis var. Thuringiensis Highly Insecticidal Strain 800/15
by
Anton E. Shikov, Iuliia A. Savina, Maria N. Romanenko, Anton A. Nizhnikov and Kirill S. Antonets
Data 2024, 9(2), 34; https://doi.org/10.3390/data9020034 - 10 Feb 2024
Abstract
The Bacillus thuringiensis serovar thuringiensis strain 800/15 has been actively used as an agent in biopreparations with high insecticidal activity against the larvae of the Colorado potato beetle Leptinotarsa decemlineata and gypsy moth Lymantria dispar. In the current study, we present the first draft genome of the 800/15 strain coupled with a comparative genomic analysis of its closest reference strains. The raw sequence data were obtained by Illumina technology on the HiSeq X platform and de novo assembled with the SPAdes v3.15.4 software. The genome reached 6,524,663 bp in size and carried 6771 coding sequences, 3 of which represented loci encoding insecticidal toxins, namely, Spp1Aa1, Cry1Ab9, and Cry1Ba8, active against the orders Lepidoptera, Blattodea, Hemiptera, Diptera, and Coleoptera. We also revealed the biosynthetic gene clusters responsible for the synthesis of secondary metabolites, including fengycin, bacillibactin, and petrobactin, with predicted antibacterial, fungicidal, and growth-promoting properties. Further comparative genomics suggested the strain is not enriched with genes linked with biological activities, implying that agriculturally important properties rely more on the composition of loci than on their abundance. The obtained genomic sequence of the strain with the experimental metadata could facilitate the computational prediction of bacterial isolates’ potency from genomic data.
Open Access Data Descriptor
Conflicting Marks Archive Dataset: A Dataset of Conflicting Marks from the Brazilian Intellectual Property Office
by
Igor Bezerra Reis, Rafael Ângelo Santos Leite, Mateus Miranda Torres, Alcides Gonçalves da Silva Neto, Francisco José da Silva e Silva and Ariel Soares Teles
Data 2024, 9(2), 33; https://doi.org/10.3390/data9020033 - 09 Feb 2024
Abstract
A registered trademark represents one of a company’s most valuable intellectual assets, acting as a safeguard against possible reputational damage and financial losses resulting from infringements of this intellectual property. To be registered, a mark must be unique and distinctive in relation to other trademarks which are already registered. In this paper, we describe the CMAD, an acronym for Conflicting Marks Archive Dataset. This dataset has been meticulously organized into pairs of marks (Number of pairs = 18,355) involved in copyright infringement across word, figurative and mixed marks. Organizations sought to register these marks with the National Institute of Industrial Property (INPI) in Brazil, and had their applications denied after analysis by intellectual property specialists. The robustness of this dataset is ensured by the intrinsic similarity of the conflicting marks, since the decisions were made by INPI specialists. This characteristic provides a reliable basis for the development and testing of tools designed to analyze similarity between marks, thus contributing to the evolution of practices and computer-based solutions in the field of intellectual property.
Open Access Editorial
Data in Astrophysics and Geophysics: Novel Research and Applications
by
Vladimir A. Srećković, Milan S. Dimitrijević and Zoran R. Mijić
Data 2024, 9(2), 32; https://doi.org/10.3390/data9020032 - 08 Feb 2024
Abstract
Rapid development of communication technologies and constant technological improvements as a result of scientific discoveries require the establishment of specific databases [...]
(This article belongs to the Section Spatial Data Science and Digital Earth)
Open Access Article
The Yinshan Mountains Record over 10,000 Landslides
by
Jingjing Sun, Chong Xu, Liye Feng, Lei Li, Xuewei Zhang and Wentao Yang
Data 2024, 9(2), 31; https://doi.org/10.3390/data9020031 - 08 Feb 2024
Abstract
China boasts a vast expanse of mountainous terrain, characterized by intricate geological conditions and structural features, resulting in frequent geological disasters. Among these, landslides, as prototypical geological hazards, pose significant threats to both lives and property. Consequently, conducting a comprehensive landslide inventory in mountainous regions is imperative for current research. This study concentrates on the Yinshan Mountains, an ancient fault-block mountain range spanning east–west in the central Inner Mongolia Autonomous Region, extending from Langshan Mountains in the west to Damaqun Mountains in the east, with the narrow sense Xiao–Yin Mountains District in between. Employing multi-temporal high-resolution remote sensing images from Google Earth, this study conducted visual interpretation, identifying 10,968 landslides in the Yinshan area, encompassing a total area of 308.94 km². The largest landslide occupies 2.95 km², while the smallest covers 84.47 m². Specifically, the Langshan area comprises 331 landslides with a total area of 11.96 km², the narrow sense Xiao–Yin Mountains include 3393 landslides covering 64.13 km², and the Manhan Mountains, Damaqun Mountains, and adjacent areas account for 7244 landslides over a total area of 232.85 km². This research not only contributes to global landslide cataloging initiatives but also serves as a robust foundation for future geohazard prevention and management efforts.
(This article belongs to the Section Spatial Data Science and Digital Earth)
Open Access Data Descriptor
Expanded Brain CT Dataset for the Development of AI Systems for Intracranial Hemorrhage Detection and Classification
by
Anna N. Khoruzhaya, Tatiana M. Bobrovskaya, Dmitriy V. Kozlov, Dmitriy Kuligovskiy, Vladimir P. Novik, Kirill M. Arzamasov and Elena I. Kremneva
Data 2024, 9(2), 30; https://doi.org/10.3390/data9020030 - 06 Feb 2024
Abstract
Intracranial hemorrhage (ICH) is a dangerous life-threatening condition leading to disability. Timely and high-quality diagnosis plays a huge role in the course and outcome of this disease. The gold standard in determining ICH is computed tomography. This method requires a prompt involvement of highly qualified personnel, which is not always possible, for example, in case of a staff shortage or increased workload. In such a situation, every minute counts, and time can be lost. The solution to this problem seems to be a set of diagnostic decisions, including the use of artificial intelligence, which will help to identify patients with ICH in a timely manner and provide prompt and quality medical care. However, the main obstacle to the development of artificial intelligence is a lack of high-quality datasets for training and testing. In this paper, we present a dataset including 800 brain CT scans consisting of multiple series of DICOM images with and without signs of ICH, enriched with clinical and technical parameters, as well as the methodology of its generation utilizing natural language processing tools. The dataset is publicly available, which contributes to increased competition in the development of artificial intelligence systems and their advancement and quality improvement.
(This article belongs to the Section Computational Biology, Bioinformatics, and Biomedical Data Science)
Open Access Article
A Comprehensive Data Pipeline for Comparing the Effects of Momentum on Sports Leagues
by
Jordan Truman Paul Noel, Vinicius Prado da Fonseca and Amilcar Soares
Data 2024, 9(2), 29; https://doi.org/10.3390/data9020029 - 01 Feb 2024
Abstract
Momentum has been a consistently studied aspect of sports science for decades. Among the established literature, there has, at times, been a discrepancy between conclusions. However, if momentum is indeed an actual phenomenon, it would affect all aspects of sports, from player evaluation to pre-game prediction and betting. Therefore, using momentum-based features that quantify a team’s linear trend of play, we develop a data pipeline that uses a small sample of recent games to assess teams’ quality of play and measure the predictive power of momentum-based features versus the predictive power of more traditional frequency-based features across several leagues using several machine learning techniques. More precisely, we use our pipeline to determine the differences in the predictive power of momentum-based features and standard statistical features for the National Hockey League (NHL), National Basketball Association (NBA), and five major first-division European football leagues. Our findings show little evidence that momentum has superior predictive power in the NBA. Still, we found some instances of the effects of momentum on the NHL that produced better pre-game predictors, whereas we view a similar trend in European football/soccer. Our results indicate that momentum-based features combined with frequency-based features could improve pre-game prediction models and that, in the future, momentum should be studied more from a feature/performance indicator point-of-view and less from the view of the dependence of sequential outcomes, thus attempting to distance momentum from the binary view of winning and losing.
Open Access Data Descriptor
Organ-On-A-Chip (OOC) Image Dataset for Machine Learning and Tissue Model Evaluation
by
Valērija Movčana, Arnis Strods, Karīna Narbute, Fēlikss Rūmnieks, Roberts Rimša, Gatis Mozoļevskis, Maksims Ivanovs, Roberts Kadiķis, Kārlis Gustavs Zviedris, Laura Leja, Anastasija Zujeva, Tamāra Laimiņa and Arturs Abols
Data 2024, 9(2), 28; https://doi.org/10.3390/data9020028 - 01 Feb 2024
Abstract
Organ-on-a-chip (OOC) technology has emerged as a groundbreaking approach for emulating the physiological environment, revolutionizing biomedical research, drug development, and personalized medicine. OOC platforms offer more physiologically relevant microenvironments, enabling real-time monitoring of tissue, to develop functional tissue models. Imaging methods are the most common approach for daily monitoring of tissue development. Image-based machine learning serves as a valuable tool for enhancing and monitoring OOC models in real-time. This involves the classification of images generated through microscopy contributing to the refinement of model performance. This paper presents an image dataset, containing cell images generated from OOC setup with different cell types. There are 3072 images generated by an automated brightfield microscopy setup. For some images, parameters such as cell type, seeding density, time after seeding and flow rate are provided. These parameters along with predefined criteria can contribute to the evaluation of image quality and identification of potential artifacts. This dataset can be used as a basis for training machine learning classifiers for automated data analysis generated from an OOC setup providing more reliable tissue models, automated decision-making processes within the OOC framework and efficient research in the future.
(This article belongs to the Section Computational Biology, Bioinformatics, and Biomedical Data Science)
Open Access Article
Understanding Data Breach from a Global Perspective: Incident Visualization and Data Protection Law Review
by
Gabriel Arquelau Pimenta Rodrigues, André Luiz Marques Serrano, Amanda Nunes Lopes Espiñeira Lemos, Edna Dias Canedo, Fábio Lúcio Lopes de Mendonça, Robson de Oliveira Albuquerque, Ana Lucila Sandoval Orozco and Luis Javier García Villalba
Data 2024, 9(2), 27; https://doi.org/10.3390/data9020027 - 31 Jan 2024
Abstract
Data breaches result in data loss, including personal, health, and financial information that is crucial, sensitive, and private. A breach is a security incident in which personal and sensitive data are exposed to unauthorized individuals, with the potential to incur several privacy concerns. As an example, the French newspaper Le Figaro exposed approximately 7.4 billion records that included full names, passwords, and e-mail and physical addresses. To reduce the likelihood and impact of such breaches, it is fundamental to strengthen the security efforts against this type of incident and, for that, it is first necessary to identify patterns of its occurrence, primarily related to the number of data records leaked, the affected geographical region, and its regulatory aspects. To advance the discussion in this regard, we study a dataset comprising 428 worldwide data breaches between 2018 and 2019, providing a visualization of the related statistics, such as the most affected countries, the predominant economic sector targeted in different countries, and the median number of records leaked per incident in different countries, regions, and sectors. We then discuss the data protection regulation in effect in each country represented in the dataset, correlating key elements of the legislation with the statistical findings. As a result, we have identified an extensive disclosure of medical records in India and government data in Brazil in the time range. Based on the analysis and visualization, we find some interesting insights that researchers have seldom focused on, and it is apparent that the real dangers of data leaks are beyond the ordinary imagination. Finally, this paper contributes to the discussion regarding data protection laws and compliance regarding data breaches, supporting, for example, the decision process of data storage location in the cloud.
Topics
Topic in
Data, Future Internet, Information, Mathematics, Symmetry
Application of Deep Learning Method in 6G Communication Technology
Topic Editors: Mohamed Abouhawwash, K. Venkatachalam
Deadline: 31 March 2024
Topic in
Applied Sciences, Batteries, Buildings, Data, Electricity, Electronics, Energies, Smart Cities
Smart Energy Systems, 2nd Edition
Topic Editors: Hugo Morais, Rui Castro, Cindy Guzman
Deadline: 31 May 2024
Topic in
Algorithms, Data, Information, Mathematics, Symmetry
Decision-Making and Data Mining for Sustainable Computing
Topic Editors: Sunil Jha, Malgorzata Rataj, Xiaorui Zhang
Deadline: 30 November 2024
Topic in
BDCC, Data, MAKE, Mathematics
Big Data Intelligence: Methodologies and Applications
Topic Editors: Liang Zhao, Liang Zou, Boxiang Dong
Deadline: 31 December 2024
Special Issues
Special Issue in
Data
Advances in Text Mining Techniques and Applications for Knowledge Discovery
Guest Editors: Edson Talamini, Letícia De Oliveira, Filipe Portela
Deadline: 31 March 2024
Special Issue in
Data
Data Mining and Computational Intelligence for E-Learning and Education—2nd Edition
Guest Editor: Antonio Sarasa Cabezuelo
Deadline: 20 April 2024
Special Issue in
Data
Genome Sequence of Novel Bacteria Showing Potential Biotechnological Applications
Guest Editors: Leopoldo Palma, Diego Herman Sauka, Baltasar Escriche
Deadline: 5 July 2024
Special Issue in
Data
Machine Learning and Data Mining in Exercise, Sports and Health Research
Guest Editor: Daniel Rojas-Valverde
Deadline: 31 October 2024
Topical Collections
Topical Collection in
Data
Modern Geophysical and Climate Data Analysis: Tools and Methods
Collection Editors: Vladimir Sreckovic, Zoran Mijic