Search Results (124)

Search Parameters:
Keywords = manual data curation

23 pages, 527 KB  
Systematic Review
Knowledge Graph Applications in Cultural Heritage: A ROSES-Based Systematic Review
by Liangbing Zhu, Safawi Abdul Rahman and Hazila Timan
Information 2026, 17(3), 269; https://doi.org/10.3390/info17030269 - 9 Mar 2026
Viewed by 381
Abstract
Knowledge Graphs (KGs) are increasingly adopted in cultural heritage research to address challenges of semantic heterogeneity, data fragmentation, and cross-institutional knowledge integration. Despite the rapid growth of KG-based heritage systems, a comprehensive and methodologically rigorous synthesis of existing applications remains limited. To address this gap, this study conducts a ROSES-based systematic review of KG applications in cultural heritage, aiming to examine prevailing application domains, methodological patterns, and emerging research trends. Following the Reporting Standards for Systematic Evidence Syntheses (ROSES), a structured search was conducted in Scopus, Web of Science, and IEEE Xplore. After duplicate removal, screening, eligibility assessment, and quality appraisal, 248 peer-reviewed studies published between 2015 and 2024 were retained for final synthesis. A mixed-method approach combining descriptive analysis and thematic synthesis was employed to analyze KG construction strategies, technological components, application contexts, and reported outcomes. The results indicate that KGs are primarily applied in five interconnected areas: digital recording and preservation, knowledge management and integration, protection and restoration support, cultural transmission and education, and research and innovation. Methodologically, the literature reveals a transition from ontology-driven and manually curated knowledge models toward hybrid approaches integrating artificial intelligence techniques such as natural language processing and machine learning. However, persistent challenges remain, including ontology alignment, scalability, evaluation inconsistency, and limited cross-project interoperability. 
This review contributes a consolidated and transparent evidence base for KG applications in cultural heritage and advances a conceptual understanding of KGs as socio-technical infrastructures that mediate cultural knowledge representation and interpretation. The findings offer methodological insights and practical implications for researchers, heritage professionals, and system designers, while highlighting directions for future interdisciplinary research. Full article
(This article belongs to the Section Information Applications)

24 pages, 2114 KB  
Article
An Integrated Framework for Automated Identification of Workers’ Safety Violation Based on Knowledge Graph
by Yifan Zhu, Yewei Ouyang, Rui Pan, Zhanhui Sun, Yang Zhou, Rui Ma, Baoquan Cheng and Wen Wang
Buildings 2026, 16(5), 1037; https://doi.org/10.3390/buildings16051037 - 6 Mar 2026
Viewed by 264
Abstract
Automatic identification of worker safety violations can substantially strengthen construction-site safety management by enabling continuous, real-time monitoring. Although recent advances have made automated detection feasible, many existing systems still suffer from poor adaptability and limited extensibility. To address these limitations, this study proposes an integrated, knowledge graph-based framework for automatic identification of workers’ safety violations. The framework comprises two principal components: (1) a knowledge graph construction module that encodes domain knowledge (safety regulations, task–hazard relationships, and contextual constraints) into a machine-readable graph structure and (2) a graph-enabled violation identification module that maps structured scene descriptions of worker and environmental states to the knowledge graph and performs semantic inference to detect violations. In this study, these structured scene descriptions are manually specified and simulated as subject–predicate–object triplets; integration with raw sensing data is left for future work. For validation, we construct a knowledge graph containing 1200 safety rules and evaluate the violation identification module on 500 annotated examples representing realistic worker scenarios. Using this curated knowledge graph and structured inputs, the proposed approach achieves an identification accuracy of 97.6% for unsafe worker behaviors. Experimental analysis shows that the knowledge graph representation substantially improves the system’s expandability and interpretability compared with traditional hard-coded rules, facilitating easier incorporation of new rules and multimodal sensing inputs. The results indicate that knowledge graph-driven reasoning offers a practical, scalable pathway for robust, context-aware safety violation detection in varied construction environments. Full article
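The matching step this abstract describes — scene descriptions and safety rules both expressed as subject–predicate–object triplets — can be sketched minimally as follows. The rule texts, IDs, and the `detect_violations` helper are illustrative assumptions, not the paper's actual knowledge graph.

```python
# Hypothetical sketch of graph-enabled violation identification: safety
# rules and observed scenes are both subject-predicate-object triplets,
# and a violation fires when a scene triplet matches a prohibited pattern.
# Rule IDs and predicates below are invented for illustration.

RULES = {
    # (subject, predicate, object) -> rule describing the prohibited state
    ("worker", "lacks", "helmet"): "R-101: PPE required on site",
    ("worker", "inside", "crane_radius"): "R-214: keep clear of lifting zone",
}

def detect_violations(scene_triplets):
    """Return the safety rules matched by the observed scene triplets."""
    return [RULES[t] for t in scene_triplets if t in RULES]

scene = [("worker", "wears", "vest"), ("worker", "lacks", "helmet")]
print(detect_violations(scene))  # -> ['R-101: PPE required on site']
```

A real system would replace exact tuple lookup with semantic inference over the graph (e.g., subclass and context reasoning), which is what makes new rules easy to add without touching the matching code.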

30 pages, 2658 KB  
Article
Sustainable Smart Urban Governance Enabled by Context-Aware QR Codes: A Scalable Framework for Property Visualisation in Saudi Arabia
by Mohammed Ali R. Alzahrani
Sustainability 2026, 18(5), 2374; https://doi.org/10.3390/su18052374 - 28 Feb 2026
Viewed by 306
Abstract
The digitisation of urban governance requires a context-sensitive method that balances operational efficiency, data security and transparency. This study proposes a context-sensitive QR code system as a conceptual framework for smart urban governance and real estate visualisation in Saudi Arabia, aligned with the strategic objectives of Vision 2030. Unlike traditional static QR code applications, the proposed system acts as a smart urban interface dynamically linking physical buildings to structured digital records and delivering role-specific information through a single scan. This system enables municipal authorities to retrieve compliance and regulatory data and allows emergency response teams to access real-time occupancy data with geographic coordinates. The proposed system enables visitors to explore curated heritage and site-based information, with each interface subject to policy-defined access rules. The proposed QR code system is evaluated by using a scenario-based computational simulation across three representative Saudi cities (Riyadh, Jeddah, and Dammam), and the results show that it significantly reduces service response time compared to manual processes while maintaining data integrity through role-based dynamic filtering. The proposed system enhances administrative efficiency and supports heritage preservation in sensitive areas such as the Al-Balad district in Jeddah city. By integrating governance, visualisation, and cultural sustainability within a simple, scalable and interactive model, the study provides an important framework for emerging smart cities in Saudi Arabia. Full article
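The role-based dynamic filtering described above — one scan, different views per role — can be sketched as a policy table over a property record. Field names, roles, and the `scan` helper are assumptions for illustration only.

```python
# Illustrative sketch of policy-defined, role-specific access for a
# scanned QR record: each role sees only its allowed subset of fields.
# All field names, roles, and values below are invented.

RECORD = {
    "building_id": "RYD-0042",
    "compliance_status": "valid",
    "occupancy": 37,
    "coordinates": (24.7136, 46.6753),
    "heritage_notes": "Restored facade, 1952",
}

POLICY = {
    "municipal": {"building_id", "compliance_status"},
    "emergency": {"building_id", "occupancy", "coordinates"},
    "visitor": {"building_id", "heritage_notes"},
}

def scan(record, role):
    """Return only the fields the scanning role is allowed to see."""
    allowed = POLICY.get(role, set())
    return {k: v for k, v in record.items() if k in allowed}

print(scan(RECORD, "emergency"))
```

Keeping the policy as data rather than code is what makes the scheme scalable: adding a role or tightening access is a table edit, not a redeployment.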

22 pages, 1247 KB  
Article
An Integrated Text Mining Approach for Discovering Pharmacological Effects, Drug Combinations, and Repurposing Opportunities of ACE Inhibitors
by Nadezhda Yu. Biziukova, Polina I. Savosina, Dmitry S. Druzhilovskiy, Olga A. Tarasova and Vladimir V. Poroikov
Int. J. Mol. Sci. 2026, 27(4), 2044; https://doi.org/10.3390/ijms27042044 - 22 Feb 2026
Viewed by 307
Abstract
The rapidly expanding body of biomedical literature encompasses a wealth of information concerning the pharmacological effects, mechanisms of action, adverse reactions, and repurposing potential of small-molecule therapeutics. Nevertheless, the systematic extraction and integration of this knowledge continue to pose substantial challenges. In this study, we propose an integrated text-mining framework for the automated extraction and structured representation of information on the biological activities of low-molecular-weight compounds, exemplified by angiotensin-converting enzyme (ACE) inhibitors as a representative pharmacological class. A corpus comprising over 20,000 PubMed titles and abstracts reporting in vitro, in vivo, and clinical investigations of ACE inhibitors was assembled. Chemical compounds, proteins/genes, and diseases were recognized using a previously developed named entity recognition model based on conditional random fields. Entity-level associations were extracted at the sentence level through a rule-based approach employing manually curated pattern phrases, followed by normalization via automated queries to PubChem, UniProt, and the Human Disease Ontology. The proposed methodology facilitated the extraction of approximately 22,000 unique and normalized associations encompassing drug-target, drug-disease, and drug-drug relationships. In addition to confirming well-established therapeutic effects and clinically recognized drug combinations, the analysis identified underexplored pharmacological activities of ACE inhibitors, including antineoplastic, antifibrotic, and neuropsychiatric properties, along with mechanistic associations involving matrix metalloproteinases and neurotrophic signaling pathways. Collectively, these findings underscore the potential of automated literature mining to advance systematic knowledge integration and data-driven hypothesis generation in the contexts of drug repurposing and safety evaluation. Full article
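The sentence-level, rule-based association extraction with pattern phrases can be sketched in a few lines. The two regex patterns and the example entities are invented stand-ins for the paper's manually curated pattern set, not its actual rules.

```python
# Minimal sketch of rule-based relation extraction at the sentence level:
# each pattern phrase maps a pair of recognized entities to a typed
# association. Patterns and entities here are illustrative only.
import re

PATTERNS = [
    (re.compile(r"(\w+) inhibits (\w+)"), "drug-target"),
    (re.compile(r"(\w+) is used to treat (\w+)"), "drug-disease"),
]

def extract(sentence):
    """Return (entity1, relation, entity2) triples found in one sentence."""
    triples = []
    for pattern, relation in PATTERNS:
        for e1, e2 in pattern.findall(sentence):
            triples.append((e1, relation, e2))
    return triples

print(extract("captopril inhibits ACE; captopril is used to treat hypertension"))
```

In the full pipeline the raw surface forms would then be normalized against PubChem, UniProt, and the Human Disease Ontology so that synonyms collapse into the ~22,000 unique associations reported.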

9 pages, 1859 KB  
Brief Report
The Ultimate Micro-Exon: A Single Nucleotide Exon Is Required to Assemble Cytochrome P450 CYP621A Orthologs from Fusarium Species
by David R. Nelson and Khajamohiddin Syed
Int. J. Mol. Sci. 2026, 27(4), 1979; https://doi.org/10.3390/ijms27041979 - 19 Feb 2026
Viewed by 280
Abstract
Cytochrome P450 monooxygenases (CYPs/P450s) play a key role in organisms’ primary and secondary metabolism in species across all domains of life. Accurate annotation of P450 genes is crucial for identifying their functions, evolution, and, consequently, their biotechnological potential. In this study, we report the identification of an unprecedented one-nucleotide exon required for the correct assembly of CYP621A P450 genes from multiple Fusarium species. Through comparative genomic analysis of 20 orthologous CYP621A genes, supported by an intronless CYP621B1 gene from Aspergillus clavatus, we demonstrate that omission of this single-nucleotide exon disrupts exon phase compatibility and prevents reconstruction of a full-length, functional P450 protein. The micro-exon encodes the central nucleotide of the glycine codon in the highly conserved PKG motif, which is essential for maintaining the structural integrity between the EXXR and PERF motifs, a characteristic of P450 enzymes. Importantly, transcriptomic evidence from sequence read archive (SRA) data confirms accurate splicing of this one-nucleotide exon in Fusarium solani and F. acuminatum under multiple growth conditions. This work presents the second example of the smallest exon reported to date for a gene, and the first for a P450 gene or a fungal gene. The study’s findings have broad implications for genome annotation pipelines, underscoring the need for careful manual curation and improved algorithms to detect ultra-small exons in functionally constrained regions of eukaryotic genes. Full article
(This article belongs to the Section Molecular Genetics and Genomics)
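The reading-frame argument above can be made concrete with a toy assembly: the single-nucleotide exon supplies the middle base of the glycine codon, and omitting it leaves a truncated final codon. The sequences and mini codon table are invented for the demonstration and are not the real Fusarium sequence.

```python
# Toy illustration of why the 1-nt micro-exon matters: it contributes the
# central nucleotide of the glycine codon (GGT) in the conserved PKG
# motif. Sequences and the tiny codon table are made up for this demo.

CODON = {"CCT": "P", "AAG": "K", "GGT": "G"}  # only the codons we need

def assemble(exons):
    """Concatenate spliced exons into a coding sequence."""
    return "".join(exons)

def translate(cds):
    """Translate in-frame codons; 'X' marks unknown/incomplete codons."""
    return "".join(CODON.get(cds[i:i + 3], "X") for i in range(0, len(cds), 3))

exons = ["CCTAAGG", "G", "T"]         # middle "G" is the 1-nt micro-exon
print(translate(assemble(exons)))     # -> "PKG"
print(translate(assemble(["CCTAAGG", "T"])))  # -> "PKX" (broken codon)
```

Annotation pipelines that impose a minimum exon length would silently produce the second, broken outcome, which is the paper's point about needing manual curation and better ultra-small-exon detection.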

27 pages, 3347 KB  
Article
Generative AI Accelerates Genotype–Phenotype Characterization of a 1600-Case Leigh Syndrome Virtual Cohort from Published Literature
by Lishuang Shen
Biology 2026, 15(4), 334; https://doi.org/10.3390/biology15040334 - 14 Feb 2026
Viewed by 479
Abstract
Leigh Syndrome Spectrum (LSS) is a rare and heterogeneous disease continuum, and most published cohorts are too small to provide adequate statistical power. Large-scale meta-analyses using case-level clinical data extracted from the literature are essential for robust population analysis but are hindered by the burden of manually standardizing unstructured, heterogeneous, and sparse case-level data. We developed a novel workflow, among the first to combine Generative AI (GenAI) with human-in-the-loop curation, to overcome this barrier. The pipeline used Google’s Gemini-2.5-pro to process over 2300 cases from published case data tables in two weeks, achieving >90% accuracy in mapping raw clinical data to Human Phenotype Ontology (HPO) terms. This process yielded a harmonized virtual cohort of 1679 data-rich cases, the largest LSS virtual cohort reported so far, enabling characterization of LSS phenotypic and genetic architectures: autosomal recessive (932 cases) and mitochondrial (752 cases) inheritance are the most common. The most frequently mutated genes were SURF1 (240 cases), MT-ATP6 (199), and MT-ND3 (183). HPO term consolidation identified common hallmark phenotypes, including lactic acidosis, hypotonia, bilateral basal ganglia lesions, and mitochondrial respiratory chain deficiency. The cohort’s scale enabled large-scale survival analysis, revealing that defects in mitochondrial translation are associated with the poorest prognosis (84% mortality in this group) and early onset (0.23 years). Among the deceased group, patients with Complex V mutations were linked to a significantly shorter mean survival time (1.77 years) than those with Complex I (3.70 years) or IV (3.57 years) mutations.
This GenAI-driven methodology establishes a scalable framework for rapidly creating analysis-ready virtual cohorts from heterogeneous literature and accelerating population-level study for rare diseases including Leigh Syndrome and other mitochondrial diseases. Full article
(This article belongs to the Section Bioinformatics)
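The human-in-the-loop step this abstract describes can be sketched as a triage gate: AI-proposed HPO mappings above a confidence threshold are accepted, the rest go to a curator queue. The terms, confidences, threshold, and `triage` helper are illustrative assumptions (HP:0003128 and HP:0001252 are real HPO IDs for lactic acidosis and hypotonia, but the scores are invented).

```python
# Hedged sketch of GenAI + human-in-the-loop curation: low-confidence
# ontology mappings are routed to a human review queue rather than
# accepted automatically. All confidences and the cutoff are invented.

def triage(mappings, threshold=0.90):
    """Split (raw_term, hpo_id, confidence) tuples into accepted vs review."""
    accepted, review = [], []
    for raw, hpo_id, conf in mappings:
        (accepted if conf >= threshold else review).append((raw, hpo_id))
    return accepted, review

proposed = [
    ("lactic acidosis", "HP:0003128", 0.97),
    ("floppy infant", "HP:0001252", 0.82),  # hypotonia phrasing, low confidence
]
accepted, review = triage(proposed)
print(accepted)  # -> [('lactic acidosis', 'HP:0003128')]
print(review)    # -> [('floppy infant', 'HP:0001252')]
```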

18 pages, 6606 KB  
Data Descriptor
Annotated IoT Dataset of Waste Collection Events
by Peter Tarábek, Andrej Michalek, Roman Hriník, Ľubomír Králik and Karol Decsi
Data 2026, 11(2), 38; https://doi.org/10.3390/data11020038 - 11 Feb 2026
Viewed by 439
Abstract
This work presents a curated dataset of multimodal sensor measurements from Internet of Things (IoT) units mounted on waste collection vehicles. Each unit records multiple data streams including GPS position, vehicle velocity, radar-based container presence, accelerometer readings of the lifting arm, and RFID tag identifiers of the bins. The dataset provides two complementary forms of annotation: (1) algorithmically generated events that were manually cleaned through visual inspection of sensor signals, offering large-scale coverage across 5 vehicles over a total of 25 collection days, and (2) manually validated events derived from synchronized video recordings, representing ground truth for 3 vehicles over 8 collection days. In total, the dataset contains 12,391 annotated waste collection events. The dataset spans diverse operational conditions with varying container sizes and includes both RFID-equipped and non-RFID bins. It can be used to train and evaluate machine learning models for event detection, anomaly recognition, or explainability studies, and to support practical applications such as Pay-as-you-throw (PAYT) waste management schemes. By combining multimodal sensor signals with reliable annotations, the dataset represents a unique resource for advancing research in smart waste collection and the broader field of IoT-enabled urban services. Full article
(This article belongs to the Section Information Systems and Data Management)
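The algorithmically generated events mentioned above can be sketched as a simple multimodal rule: flag a bin-lift when the lifting-arm acceleration spikes while the vehicle is nearly stationary. The thresholds, sample format, and `detect_events` helper are invented for illustration, not the dataset's actual annotation algorithm.

```python
# Simplified sketch of algorithmic event generation from two of the
# dataset's streams: arm accelerometer and vehicle velocity. Thresholds
# and the sample tuples are illustrative assumptions.

def detect_events(samples, accel_thresh=2.5, speed_thresh=1.0):
    """samples: list of (timestamp, arm_accel, vehicle_speed) tuples.

    Returns timestamps where the arm acceleration exceeds the threshold
    while the vehicle is (nearly) stationary."""
    return [t for t, accel, speed in samples
            if accel > accel_thresh and speed < speed_thresh]

stream = [(0, 0.2, 8.4), (5, 3.1, 0.3), (9, 0.4, 0.0), (14, 2.9, 0.2)]
print(detect_events(stream))  # -> [5, 14]
```

Events produced this way would then be cleaned by visual inspection (annotation form 1) or checked against synchronized video (annotation form 2), as the descriptor explains.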

10 pages, 12951 KB  
Proceeding Paper
A Forest Mapping Model for Algeria Using Noisy Labels and Few Clean Data
by Lilia Ammar Khodja, Meziane Iftene and Mohammed El Amin Larabi
Eng. Proc. 2026, 124(1), 19; https://doi.org/10.3390/engproc2026124019 - 6 Feb 2026
Viewed by 326
Abstract
This study proposes a forest mapping framework for Algeria that addresses the challenge of limited clean data and noisy global land cover labels. The approach combines a small set of manually curated annotations with noisy ESA WorldCover data, leveraging Sentinel-2 multispectral imagery and Digital Elevation Model (DEM) features such as slope, aspect, and the Normalized Difference Vegetation Index (NDVI). A modified ResNet-18 architecture was fine-tuned using both clean and pseudo-labeled noisy data, enabling the model to effectively mitigate label noise. The framework achieved an overall accuracy of 98.5%, demonstrating strong generalization across Algeria’s diverse forest ecosystems. These results highlight the potential of semi-supervised deep learning to improve large-scale forest monitoring, with applications in conservation, sustainable resource management, and climate change mitigation. Full article
(This article belongs to the Proceedings of The 6th International Electronic Conference on Applied Sciences)
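One common way to combine a small clean set with noisy labels, in the spirit of the pseudo-labeling described above, is to keep a noisy label only when a model trained on the clean data agrees with it confidently. The `build_training_set` helper, toy model, and cutoff are assumptions for illustration, not the paper's exact procedure.

```python
# Schematic of clean + pseudo-labeled training data construction: a noisy
# (global land cover) label is promoted to a pseudo-label only when a
# clean-data model agrees with it above a confidence cutoff. All names
# and numbers are illustrative.

def build_training_set(clean, noisy, predict, cutoff=0.8):
    """clean/noisy: lists of (sample, label); predict(sample) -> (label, conf)."""
    pseudo = []
    for sample, noisy_label in noisy:
        pred_label, conf = predict(sample)
        if pred_label == noisy_label and conf >= cutoff:
            pseudo.append((sample, noisy_label))
    return clean + pseudo

# toy "model": forest if an NDVI-like value exceeds 0.5
predict = lambda x: ("forest" if x > 0.5 else "other", 0.9)
clean = [(0.8, "forest")]
noisy = [(0.7, "forest"), (0.2, "forest"), (0.1, "other")]
print(build_training_set(clean, noisy, predict))
```

The middle noisy sample (0.2 labeled "forest") is rejected because the model disagrees, which is exactly the label noise the framework aims to filter out.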

25 pages, 11437 KB  
Article
Enhancing the Extraction of GHG Emission-Reduction Targets from Sustainability Reports Using Vision Language Models
by Lars Wilhelmi, Christian Bruns and Matthias Schumann
Mach. Learn. Knowl. Extr. 2026, 8(2), 37; https://doi.org/10.3390/make8020037 - 5 Feb 2026
Viewed by 642
Abstract
This study investigates how Vision Language Models (VLMs) can be used and methodically configured to extract Environmental, Social, and Governance (ESG) metrics from corporate sustainability reports, addressing the limitations of existing text-only and manual ESG data-extraction approaches. Using the Design Science Research Methodology, we developed an extraction artifact comprising a curated page-level dataset containing greenhouse gas (GHG) emission-reduction targets, an automated evaluation pipeline, model and text-preprocessing comparisons, and iterative prompt and few-shot refinement. Pages from oil and gas sustainability reports were processed directly by VLMs to preserve visual–textual structure, enabling a controlled comparison of text, image, and combined input modalities, with extraction quality assessed at page and attribute level using F1-scores. Among tested models, Mistral Small 3.2 demonstrated the most stable performance and was used to evaluate image, text, and combined modalities. Combined text + image modality performed best (F1 = 0.82), particularly on complex page layouts. The findings demonstrate how to effectively integrate visual and textual cues for ESG metric extraction with VLMs, though challenges remain for visually dense layouts and avoiding inference-based hallucinations. Full article
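The attribute-level F1 scoring used to compare modalities can be sketched by treating a page's extraction as a set of (attribute, value) pairs against a gold set. The attribute names below are invented examples, not the paper's schema.

```python
# Minimal sketch of attribute-level F1 for extraction quality: predicted
# vs gold (attribute, value) pairs for one page. Attribute names are
# illustrative stand-ins for the GHG-target schema.

def f1(predicted, gold):
    """F1 over two sets of (attribute, value) pairs."""
    if not predicted or not gold:
        return 0.0
    tp = len(predicted & gold)
    if tp == 0:
        return 0.0
    precision = tp / len(predicted)
    recall = tp / len(gold)
    return 2 * precision * recall / (precision + recall)

gold = {("target_year", "2030"), ("reduction_pct", "45"), ("scope", "1+2")}
pred = {("target_year", "2030"), ("reduction_pct", "45"), ("scope", "3")}
print(round(f1(pred, gold), 2))  # -> 0.67
```

Requiring exact value matches is deliberately strict; it penalizes the inference-based hallucinations the study flags, where a model outputs a plausible but unsupported value.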

41 pages, 2850 KB  
Article
Automated Classification of Humpback Whale Calls Using Deep Learning: A Comparative Study of Neural Architectures and Acoustic Feature Representations
by Jack C. Johnson and Yue Rong
Sensors 2026, 26(2), 715; https://doi.org/10.3390/s26020715 - 21 Jan 2026
Viewed by 449
Abstract
Passive acoustic monitoring (PAM) using hydrophones enables acoustic data to be collected in large and diverse quantities, necessitating a reliable automated classification system. This paper presents a data-processing pipeline and a set of neural networks designed for a humpback-whale-detection system. A collection of audio segments is compiled using publicly available audio repositories and extensively curated via manual methods, with thorough examination, editing, and clipping to produce a dataset that minimizes bias and categorization errors. An array of standard data-augmentation techniques is applied to the collected audio, diversifying and expanding the original dataset. Multiple neural networks are designed and trained using the TensorFlow 2.20.0 and Keras 3.13.1 frameworks, resulting in a custom architecture refined through research and iterative improvements. The pre-trained model MobileNetV2 is also included for further analysis. Model performance demonstrates a strong dependence on both feature representation and network architecture. Mel spectrogram inputs consistently outperformed MFCC (Mel-Frequency Cepstral Coefficients) features across all model types. The highest performance was achieved by the pretrained MobileNetV2 using mel spectrograms without augmentation, reaching a test accuracy of 99.01% with balanced precision and recall of 99% and a Matthews correlation coefficient of 0.98. The custom CNN with mel spectrograms also achieved strong performance, with 98.92% accuracy and a false negative rate of only 0.75%. In contrast, models trained with MFCC representations exhibited consistently lower robustness and higher false negative rates. These results highlight the comparative strengths of the evaluated feature representations and network architectures for humpback whale detection. Full article
(This article belongs to the Section Sensor Networks)
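For reference, the Matthews correlation coefficient the abstract reports (0.98) is computed from confusion-matrix counts; the counts below are made up purely to demonstrate the formula.

```python
# MCC from confusion-matrix counts. The formula is standard; the counts
# are invented to illustrate a value near the paper's reported 0.98.
import math

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient in [-1, 1]; 0 if undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

print(round(mcc(tp=495, tn=495, fp=5, fn=5), 2))  # -> 0.98
```

Unlike raw accuracy, MCC stays informative on imbalanced detection data, which is why it is a sensible companion metric for a whale-call detector.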

18 pages, 1651 KB  
Article
The Penetration of Digital Methods into Historical Scholarship: A Text-Mining Analysis of Russian Publications
by Zinaida Sokova, Valery Kruzhinov and Anna Glazkova
Publications 2026, 14(1), 8; https://doi.org/10.3390/publications14010008 - 20 Jan 2026
Viewed by 676
Abstract
The integration of digital technologies into historical research is a global trend; however, its manifestation varies across national academic traditions. This study investigates the explicit articulation and terminological adoption of digital methods in Russian historical science by analyzing the prevalence and dynamics of specific technological terms in a large corpus of publications. We first constructed a controlled thesaurus of 166 digital technologies by manually curating keyphrases from Russia’s primary specialized journal in the field (“Istoricheskaya Informatika”, Historical Informatics). This vocabulary was then used to perform text-mining on two distinct corpora: a broad sample of 95K Russian-language history articles from various journals (2004–2024) and a focused sample of publications on the Great Patriotic War History from the Russian Science Citation Index (RSCI, 2014–2023). Our quantitative analysis reveals the frequency, trends, and thematic context of digital method mentions. The findings highlight a significant disparity between the specialized discourse of “Istoricheskaya Informatika” and the mainstream historical publications, while also identifying specific areas (such as archaeological studies) where certain technologies have gained traction. This research offers a novel, data-driven perspective on the “digital turn” in Russian historiography and contributes to the comparative study of digital humanities’ global development. Full article
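The thesaurus-based counting step can be sketched as tallying which controlled-vocabulary terms appear in each article's text. The four terms and the toy corpus below are illustrative stand-ins for the 166-entry thesaurus; a real implementation would lemmatize and tokenize rather than use substring matching.

```python
# Sketch of controlled-vocabulary text mining: count mentions of
# thesaurus terms across a corpus of article texts. Terms and corpus
# are invented; matching is naive substring search for brevity.
from collections import Counter

THESAURUS = ["gis", "ocr", "topic modeling", "3d reconstruction"]

def term_counts(articles):
    """Count, per thesaurus term, how many articles mention it."""
    counts = Counter()
    for text in articles:
        lowered = text.lower()
        counts.update(term for term in THESAURUS if term in lowered)
    return counts

corpus = [
    "We applied GIS and OCR to digitized archival maps.",
    "Topic modeling of war-era newspapers; GIS for troop movements.",
]
print(term_counts(corpus))
```

Aggregating such counts by year and by journal is what reveals the frequency trends and the specialist-vs-mainstream disparity the study reports.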

25 pages, 6664 KB  
Article
CornViT: A Multi-Stage Convolutional Vision Transformer Framework for Hierarchical Corn Kernel Analysis
by Sai Teja Erukude, Jane Mascarenhas and Lior Shamir
Computers 2026, 15(1), 2; https://doi.org/10.3390/computers15010002 - 20 Dec 2025
Viewed by 630
Abstract
Accurate grading of corn kernels is critical for seed certification, directional seeding, and breeding, yet it is still predominantly performed by manual inspection. This work introduces CornViT, a three-stage Convolutional Vision Transformer (CvT) framework that emulates the hierarchical reasoning of human seed analysts for single-kernel evaluation. Three sequential CvT-13 classifiers operate on 384×384 RGB images: Stage 1 distinguishes pure from impure kernels; Stage 2 categorizes pure kernels into flat and round morphologies; and Stage 3 determines the embryo orientation (up vs. down) for pure, flat kernels. Starting from a public corn seed image collection, we manually relabeled and filtered images to construct three stage-specific datasets: 7265 kernels for purity, 3859 pure kernels for morphology, and 1960 pure–flat kernels for embryo orientation, all released as benchmarks. Head-only fine-tuning of ImageNet-22k pretrained CvT-13 backbones yields test accuracies of 93.76% for purity, 94.11% for shape, and 91.12% for embryo-orientation detection. Under identical training conditions, ResNet-50 reaches only 76.56 to 81.02 percent, whereas DenseNet-121 attains 86.56 to 89.38 percent accuracy. These results highlight the advantages of convolution-augmented self-attention for kernel analysis. To facilitate adoption, we deploy CornViT in a Flask-based web application that performs stage-wise inference and exposes interpretable outputs through a browser interface. Together, the CornViT framework, curated datasets, and web application provide a deployable solution for automated corn kernel quality assessment in seed quality workflows. Source code and data are publicly available. Full article
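The three-stage cascade above can be sketched as conditional inference: each stage runs only when the previous one admits the kernel. The lambda classifiers stand in for the three fine-tuned CvT-13 models and are purely illustrative.

```python
# Schematic of CornViT's stage-wise inference: purity -> shape ->
# embryo orientation, with later stages skipped for kernels the earlier
# stages rule out. Classifier stubs are toy stand-ins.

def cascade(image, is_pure, shape_of, embryo_orientation):
    if not is_pure(image):
        return {"purity": "impure"}
    result = {"purity": "pure", "shape": shape_of(image)}
    if result["shape"] == "flat":
        # orientation is only defined for pure, flat kernels (Stage 3)
        result["orientation"] = embryo_orientation(image)
    return result

out = cascade({"id": 1},
              is_pure=lambda im: True,
              shape_of=lambda im: "flat",
              embryo_orientation=lambda im: "up")
print(out)  # -> {'purity': 'pure', 'shape': 'flat', 'orientation': 'up'}
```

This mirrors why the paper builds three stage-specific datasets of decreasing size (7265, 3859, 1960): each downstream stage only ever sees the kernels its predecessors passed along.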

14 pages, 2983 KB  
Article
Lightweight Multimodal Fusion for Urban Tree Health and Ecosystem Services
by Abror Buriboev, Djamshid Sultanov, Ilhom Rahmatullaev, Ozod Yusupov, Erali Eshonqulov, Dilshod Bekmuradov, Nodir Egamberdiev and Andrew Jaeyong Choi
Sensors 2026, 26(1), 7; https://doi.org/10.3390/s26010007 - 19 Dec 2025
Viewed by 611
Abstract
Rapid urban expansion has heightened the demand for accurate, scalable, and real-time methods to assess tree health and the provision of ecosystem services. Urban trees are major contributors to air-quality improvement and climate change mitigation; however, their monitoring is mostly limited to inherently subjective and inefficient manual inspections. To overcome this barrier, we put forward a lightweight multimodal deep-learning framework that fuses RGB imagery with environmental and biometric sensor data for combined evaluation of tree-health condition and estimation of daily oxygen production and CO2 absorption. The proposed architecture features an EfficientNet-B0 vision encoder upgraded with Mobile Inverted Bottleneck Convolutions (MBConv) and a squeeze-and-excitation attention mechanism, along with a small multilayer perceptron for sensor processing. A common multimodal representation facilitates a three-task learning set-up, allowing simultaneous classification and regression within a single model. Our experiments with a carefully curated dataset of segmented tree images accompanied by synchronized sensor measurements show that our method attains a health-classification accuracy of 92.03% while also lowering the regression error for O2 (MAE = 1.28) and CO2 (MAE = 1.70) in comparison with unimodal and multimodal baselines. The proposed architecture, with its 5.4 million parameters and an inference latency of 38 ms, can be readily deployed on edge devices and real-time monitoring platforms. Full article

12 pages, 697 KB  
Data Descriptor
Computational Dataset for Polymer–Pharmaceutical Interactions: MD/MM-PBSA and DFT Resources for Molecularly Imprinted Polymer (MIP) Design
by David Visentin, Mario Lovrić, Dejan Milenković, Robert Vianello, Željka Maglica, Kristina Tolić Čop and Dragana Mutavdžić Pavlović
Data 2025, 10(12), 205; https://doi.org/10.3390/data10120205 - 10 Dec 2025
Cited by 1 | Viewed by 860
Abstract
Molecularly imprinted polymers (MIPs) are promising sorbents for selectively capturing pharmaceutically active compounds (PhACs), but design remains slow because candidate screening is largely experimental or based on computationally expensive methods. We present MIP–PhAC, an open, curated resource of polymer–pharmaceutical interaction energies generated from molecular dynamics (MD) followed by MM/PBSA analysis, with a small DFT subset for cross-method comparison. This resource comprises two complementary datasets: MIP–PhAC-Calibrated, a benchmark set with manually verified pH-7 microstates that reports both monomeric (pre-polymerized) and polymeric (short-chain) MD/MM-PBSA energies and includes a DFT subset; and MIP–PhAC-Screen, a broader, high-throughput collection produced under a uniform automated workflow (including automated protonation) for rapid within-polymer ranking and machine-learning development. For each MIP–PhAC pair we provide ΔG* components (electrostatics, van der Waals, polar and non-polar solvation; −TΔS omitted), summary statistics from post-convergence frames, simulation inputs, and chemical metadata. To our knowledge, MIP–PhAC is the largest open, curated dataset of polymer–pharmaceutical interaction energies to date. It enables benchmarking of end-point methods, reproducible protocol evaluation, data-driven ranking of polymer–pharmaceutical combinations, and training/validation of machine learning (ML) models for MIP design on modest compute budgets.
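The ΔG* decomposition reported per MIP–PhAC pair can be illustrated with a minimal sketch: each post-convergence frame contributes the four end-point terms (electrostatics, van der Waals, polar and non-polar solvation), and the summary statistics are taken over the per-frame totals. The numbers below are made up for illustration and are not values from the dataset.

```python
import numpy as np

def mmpbsa_summary(frames):
    """Summarize end-point binding estimates over MD frames.

    `frames` is an (n_frames, 4) array whose columns are the four
    components reported in the dataset: electrostatics, van der
    Waals, polar solvation, non-polar solvation.  As in MIP-PhAC,
    the -T*dS entropy term is omitted, so the total is the reported
    dG* rather than a full binding free energy.
    """
    per_frame_total = frames.sum(axis=1)          # dG* per frame
    return per_frame_total.mean(), per_frame_total.std(ddof=1)

# Three illustrative post-convergence frames (kcal/mol, invented):
# columns: elec, vdW, polar solv., non-polar solv.
frames = np.array([
    [-12.1, -20.3, 18.5, -2.1],
    [-11.8, -21.0, 19.1, -2.0],
    [-12.5, -19.8, 18.2, -2.2],
])
mean_dg, sd_dg = mmpbsa_summary(frames)  # mean ~ -16.0, sd ~ 0.3
```

Reporting the mean together with the frame-to-frame spread is what makes the dataset usable for reproducible protocol comparison: two workflows can be judged on both their central estimate and its stability.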
28 pages, 20766 KB  
Article
CAFE-Dance: A Culture-Aware Generative Framework for Chinese Folk and Ethnic Dance Synthesis via Self-Supervised Cultural Learning
by Bin Niu, Rui Yang, Qiuyu Zhang, Yani Zhang and Ying Fan
Big Data Cogn. Comput. 2025, 9(12), 307; https://doi.org/10.3390/bdcc9120307 - 2 Dec 2025
Viewed by 704
Abstract
As a vital carrier of human intangible culture, dance plays an important role in cultural transmission through digital generation. However, existing dance generation methods rely heavily on high-precision motion capture and manually annotated datasets, and they fail to effectively model the culturally distinctive movements of Chinese ethnic folk dance, resulting in semantic distortion and cross-modal mismatch. Building on Helou Dance, a traditional Chinese ethnic dance, this paper proposes a culture-aware Chinese ethnic folk dance generation framework, CAFE-Dance, which dispenses with manual annotation and automatically generates dance sequences with high cultural fidelity, precise music synchronization, and natural, fluent motion. To address the high cost and poor scalability of cultural annotation, we introduce a Zero-Manual-Label Cultural Data Construction Module (ZDCM) that performs self-supervised cultural learning from raw dance videos, using cross-modal semantic alignment and a knowledge-base-guided automatic annotation mechanism to construct a high-quality dataset of Chinese ethnic folk dance covering 108 classes of curated cultural attributes without any frame-level manual labels. To address the difficulty of modeling cultural semantics and the weak interpretability of existing models, we propose a Culture-Aware Attention Mechanism (CAAM) that incorporates cultural gating and co-attention to adaptively enhance culturally key movements. To address the challenge of aligning the music–motion–culture tri-modalities, we propose a Tri-Modal Alignment Network (TMA-Net) that achieves dynamic coupling and temporal synchronization of tri-modal semantics under weak supervision. Experimental results show that our framework improves Beat Alignment and Cultural Accuracy by 4.0–5.0 percentage points and over 30 percentage points, respectively, compared with the strongest baseline (Music2Dance), and it reveals an intrinsic coupling between cultural embedding density and motion stability. The code and the curated Helouwu dataset are publicly available.
(This article belongs to the Topic Generative AI and Interdisciplinary Applications)
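The cultural-gating idea behind CAAM can be sketched in a few lines: a gate derived from a culture embedding rescales attention weights over motion frames so that culturally salient frames dominate. This is a toy NumPy version under assumed shapes and randomly initialized weights, not the paper's architecture; the weight matrices `Wq`, `Wk`, `Wg` are hypothetical stand-ins.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def culture_gated_attention(motion, culture, Wq, Wk, Wg):
    """Toy cultural gating over motion frames.

    motion: (T, d) per-frame motion features; culture: (d,) culture
    embedding.  A sigmoid gate per frame, driven by the culture
    embedding, modulates softmax attention scores before pooling.
    """
    q = culture @ Wq                        # (d,) query from culture
    k = motion @ Wk                         # (T, d) keys from motion
    scores = k @ q / np.sqrt(k.shape[-1])   # (T,) attention scores
    gate = sigmoid(motion @ Wg @ culture)   # (T,) per-frame cultural gate
    weights = np.exp(scores - scores.max()) * gate
    weights /= weights.sum()                # gated, normalized attention
    return weights @ motion                 # (d,) culture-weighted summary

rng = np.random.default_rng(1)
T, d = 6, 8
motion = rng.normal(size=(T, d))
culture = rng.normal(size=d)
Wq, Wk, Wg = (0.1 * rng.normal(size=(d, d)) for _ in range(3))
out = culture_gated_attention(motion, culture, Wq, Wk, Wg)
```

The multiplicative gate lets culturally distinctive frames keep high attention even when their raw music–motion scores are modest, which is the intuition behind enhancing "culturally key movements."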
