Search Results (135)

Search Parameters:
Keywords = manual data curation

24 pages, 6730 KB  
Article
TCN-AE with CUSUM Control Chart for Online Anomaly Detection in Hydraulic Support Pressure Data
by Cong Wang, Wei Xin, Jun Li, Xigui Zheng, Yu Zhao and Zhongguo He
Mathematics 2026, 14(13), 2285; https://doi.org/10.3390/math14132285 (registering DOI) - 26 Jun 2026
Abstract
Hydraulic supports in coal mining faces require continuous pressure monitoring to detect anomalies indicative of roof instability or equipment failure. Existing reconstruction-based methods rely on standard convolutional or recurrent encoders whose limited receptive fields or coarse temporal representations restrict detection accuracy; static per-window thresholding further discards temporal continuity during online deployment. This study proposes a temporal convolutional network autoencoder (TCN-AE) coupled with a Cumulative Sum (CUSUM) control chart for online anomaly detection in hydraulic support pressure data. The TCN encoder uses dilated convolutions with symmetric padding and residual connections, producing an exponentially expanding receptive field that captures temporal patterns at multiple scales. The CUSUM chart accumulates sustained positive deviations in the reconstruction error sequence, improving detection sensitivity while suppressing isolated false alarms. Component analysis experiments on synthetic anomalies show TCN-AE achieves an AUC of 0.811, outperforming CNN, LSTM, GRU, and fully connected autoencoder variants, along with Isolation Forest and One-Class SVM. On a manually curated real fault test set, where per-window reconstruction scores carry negligible discriminative information (AUC = 0.586, near chance), the CUSUM strategy exploits temporal continuity to improve F1 from 0.213 to 0.905 for TCN-AE. This +0.692 gain is driven entirely by temporal accumulation rather than model discriminability, demonstrating that the CUSUM framework is most valuable precisely when per-window signals are weakest.
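The accumulation step this abstract attributes to the CUSUM chart can be illustrated with a minimal one-sided sketch. It assumes the textbook upper CUSUM recursion, S_t = max(0, S_{t-1} + e_t - mu - k), with an alarm when S_t exceeds h. The reference mean `mu`, slack `k`, threshold `h`, and the toy error sequence are hypothetical values chosen for illustration; none of them come from the paper, and the authors' actual formulation may differ.

```python
def cusum_alarms(errors, mu, k, h):
    """One-sided (upper) CUSUM over a sequence of reconstruction errors.

    mu, k and h are illustrative assumptions: the in-control mean error,
    the slack (allowance) and the decision threshold.
    Returns the indices at which the CUSUM statistic exceeds h.
    """
    s = 0.0
    alarms = []
    for i, e in enumerate(errors):
        # Accumulate only positive deviations beyond mu + k; floor at zero.
        s = max(0.0, s + (e - mu - k))
        if s > h:
            alarms.append(i)
    return alarms


# Toy error sequence: one isolated spike (index 2), then a sustained small shift.
errors = [1.0, 1.0, 3.0, 1.0, 1.0, 2.0, 2.0, 2.0, 2.0, 2.0]
alarms = cusum_alarms(errors, mu=1.0, k=0.5, h=2.0)
print(alarms)
```

In this toy run the isolated spike decays without raising an alarm, while the sustained shift accumulates until the threshold is crossed, which is the behaviour the abstract describes qualitatively.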

30 pages, 42422 KB  
Article
Bi-Level Meta-Learning for Reliable Remote Sensing Image Registration
by Lin Shi, Renzhen Wang, Xiaofeng Zhu, Cong An, Kai Zhao, Jun Shu, Dongfang Yang and Deyu Meng
Remote Sens. 2026, 18(12), 2007; https://doi.org/10.3390/rs18122007 - 16 Jun 2026
Viewed by 144
Abstract
Unmanned aerial vehicle (UAV) visual navigation relies critically on robust image matching between UAV-acquired aerial imagery and pre-existing satellite reference maps. However, extreme cross-domain heterogeneity—encompassing temporal, radiometric, viewpoint, and sensor variations—causes severe performance degradation in existing deep learning-based matchers trained on conventional benchmarks. Furthermore, manual annotation of ground-truth correspondences is prohibitively expensive. This paper proposes a semi-supervised saliency-aware image matching framework with bi-level meta-learning. Our approach comprises two synergistic stages: (1) automated dense correspondence generation via parameterized geometric synthesis, which constructs a large-scale coarse dataset Dc (approximately 50,000 pairs) without dense manual point annotation, serving as the primary training corpus for the feature matching network; (2) expert-validated meta-data curation producing a high-quality meta-dataset Dm (500 pairs) that supervises the training of a Saliency Judgment Network through bi-level meta-optimization, enabling the network to identify and prioritize geometrically reliable correspondences. Experimental results on the proposed RS-Hetero-50K benchmark and cross-domain FuJian-Mountain dataset demonstrate substantial improvements over representative sparse and detector-free matchers, including LoFTR, SuperGlue, and LightGlue. The complete CNN-attention and saliency-aware framework achieves 95.4% matching precision, which is consistent with the best result reported in the experimental section. The plug-and-play experiments further confirm that the proposed saliency module consistently improves representative sparse and detector-free matchers, indicating that the performance gain stems from both stronger feature representation and saliency-guided correspondence selection. 
The largest terrain-specific gain is observed in gobi scenes, where the AUC@5 px improves by 16.8% relative to the LoFTR baseline, demonstrating improved robustness in weakly textured remote sensing environments.

28 pages, 19369 KB  
Article
Nomenclatural Noise Amplifies Linnean and Wallacean Shortfalls in an Endemic Chilean Plant: The Case of Lardizabala biternata
by Jaime Herrera, Janice Zúñiga and Leonardo D. Fernández
Diversity 2026, 18(6), 331; https://doi.org/10.3390/d18060331 - 31 May 2026
Viewed by 580
Abstract
Nomenclatural noise is a major source of uncertainty in biodiversity research and can affect the accuracy of species distribution data. This study examines the nomenclatural history and geographical distribution of the Chilean endemic Lardizabala biternata, a culturally significant climbing plant that has accumulated numerous synonyms, orthographic variants and misapplied names since the 19th century. This accumulation has fragmented available information, limiting current knowledge of its ecology, spatial distribution patterns and conservation status. We reviewed historical literature and digital repositories to clarify its taxonomic identity and compiled occurrence records from Global Biodiversity Information Facility (GBIF), iNaturalist and virtual herbaria. Records were manually verified based on concordance between coordinates, locality descriptions and the known distributional range, excluding duplicates, spatial inconsistencies and likely misidentifications. From an initial dataset of 1320 records, 632 (48%) were removed, resulting in 688 validated occurrences used to generate a distribution map and a Kernel Density Estimation (KDE) heatmap in Quantum Geographic Information System (QGIS). Nomenclatural inconsistencies (sixteen names identified) can introduce erroneous records and distort spatial patterns, generating apparent gaps and potentially spurious clusters in raw data. After curation, we found that the distribution of L. biternata is restricted to central Chile, with concentrations in coastal and pre-Andean areas. These results show how nomenclatural noise contributes to Linnean and Wallacean shortfalls and provide a baseline for future research on this species.
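The exclusion criteria named in this abstract (duplicates and spatial inconsistencies) could be pre-screened automatically before manual review. The sketch below is only an illustration under stated assumptions: the study verified records manually, and the `curate` function, the record fields, the bounding box, and the toy records are all hypothetical and not taken from the paper.

```python
def curate(records, lat_range, lon_range):
    """Drop exact duplicates and records outside a bounding box.

    Hypothetical pre-filter: a record is a dict with 'name', 'lat', 'lon'.
    The bounding box is an assumed stand-in for a known distributional range.
    """
    seen = set()
    kept = []
    for rec in records:
        key = (rec["name"], rec["lat"], rec["lon"])
        if key in seen:
            continue  # exact duplicate of an earlier record
        seen.add(key)
        if not (lat_range[0] <= rec["lat"] <= lat_range[1]):
            continue  # latitude outside the assumed range
        if not (lon_range[0] <= rec["lon"] <= lon_range[1]):
            continue  # longitude outside the assumed range
        kept.append(rec)
    return kept


# Made-up records for illustration only.
records = [
    {"name": "Lardizabala biternata", "lat": -33.5, "lon": -71.2},
    {"name": "Lardizabala biternata", "lat": -33.5, "lon": -71.2},  # duplicate
    {"name": "Lardizabala biternata", "lat": 40.0, "lon": -3.7},    # outside box
    {"name": "Lardizabala biternata", "lat": -36.8, "lon": -72.9},
]
kept = curate(records, lat_range=(-40.0, -30.0), lon_range=(-75.0, -69.0))
print(len(kept))
```

A filter like this cannot catch misidentifications or mismatches between coordinates and locality descriptions, which is why the abstract's manual verification step remains necessary.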
(This article belongs to the Section Plant Diversity)

27 pages, 1161 KB  
Article
From PDF to RAG-Ready: Evaluating Document Conversion Frameworks for Domain-Specific Question Answering
by José Guilherme Marques dos Santos, Ricardo Yang, Rui Humberto Pereira, Alexandre Sousa, Brígida Mónica Faria, Henrique Lopes-Cardoso, José Duarte, José Luís Reis, Luís Paulo Reis, Pedro Pimenta and José Paulo Marques dos Santos
Appl. Sci. 2026, 16(10), 5069; https://doi.org/10.3390/app16105069 - 19 May 2026
Viewed by 515
Abstract
Retrieval-Augmented Generation (RAG) systems depend critically on the quality of document preprocessing, yet no prior study has evaluated PDF processing frameworks by their impact on downstream question-answering accuracy. We address this gap through a systematic comparison of four open-source PDF-to-Markdown conversion frameworks, Docling, MinerU, Marker, and DeepSeek OCR, across 21 pipeline configurations, varying the conversion tool, cleaning transformations, splitting strategy, and metadata enrichment. Evaluation was performed using a 50-question benchmark over a corpus of 36 Portuguese administrative documents (1706 pages, ~492K words), with LLM-as-judge scoring over 50 independent runs per configuration. Statistical significance was assessed via Wilcoxon signed-rank tests with Cohen’s d effect sizes. Two baselines bounded the results: naïve PDFLoader (86.2%) and manually curated Markdown (91.3%). Docling with hierarchical splitting and image descriptions achieved the highest automated accuracy (94.1 ± 1.6%), surpassing even manual curation. A per-question-type analysis revealed that table-dependent questions drive the largest accuracy differences, with a 33-percentage-point gap between basic and hierarchical splitting. Metadata enrichment and hierarchy-aware chunking contributed more to accuracy than the conversion framework alone. An exploratory GraphRAG implementation underperformed basic RAG (82% vs. 94.1%). These findings demonstrate that data preparation quality is the dominant factor in RAG system performance.

13 pages, 2012 KB  
Article
YoyoMut: Interactive Exploration of SARS-CoV-2 Mutation Fixation and Reversion Through Time
by Jana Penic, Tommaso Alfonsi, Giovanni Chillemi, Ingrid Guarnetti Prandi, Fabrizio Maggi, Anna Bernasconi and Daniele Focosi
Life 2026, 16(5), 776; https://doi.org/10.3390/life16050776 - 6 May 2026
Viewed by 425
Abstract
Reversion of amino acid mutations in structural proteins is common in viral evolution. SARS-CoV-2 provides an unprecedented opportunity for ecological studies, thanks to the abundance of available whole genome sequences. YoyoMut allows regular scanning of open SARS-CoV-2 data, reporting on all cyclic and reverting mutations within all proteins (including Spike), with fine-grained trend visualization distinguishing non-mutated from mutated positions (either fixated or cyclically reversed). In the whole CoVSpectrum database, on the order of 100 reverting and 50 fixated mutations were identified on Spike. Classification is determined using alternative algorithms (based on threshold or slope inversion); finally, a 3D-protein structure allows us to identify spatial clustering of adjacent mutated positions. Systematic, automated monitoring of these behaviors aids immunologists and structuralists in their manual curation. By generating informative reports, our tool supports daily activities that have practical implications for vaccine and therapeutic anti-Spike monoclonal antibody design: prioritizing analysis of cyclic mutation and reversion models could help avoid the recent failures in their development and inform future strategies.
(This article belongs to the Section Biodiversity, Ecology and Evolution)

27 pages, 4488 KB  
Article
A Neuro-Symbolic Bioinformatics Framework for Unlocking Chordate Physiological Dark Data and Validating Allometric Scaling
by Zhiyao Duan, Guihu Zhao, Changyun Li and Bo Liu
Biology 2026, 15(9), 708; https://doi.org/10.3390/biology15090708 - 30 Apr 2026
Viewed by 555
Abstract
Animal functional trait data are essential for macroecology, but massive datasets remain locked in unstructured scientific literature. Traditional manual extraction is inefficient, and general-purpose artificial intelligence (AI) systems struggle with complex biological tables and numerical accuracy. To address this bioinformatics challenge, we propose a multimodal neuro-symbolic framework combining visual-language perception and code-based reasoning. This approach reconstructs complex document layouts and delegates biostatistical calculations, such as unit normalization and thermodynamic energy conversion, to an isolated programming environment to ensure mathematical and statistical consistency. By mining literature spanning 117 years, we constructed a high-fidelity physiological database for 1632 chordate species. Our method achieved a macro-averaged F1 score of 0.935 in extracting biophysical fields. External benchmarking against a curated mammalian trait database showed strong concordance for shared body-mass and metabolic-rate traits, while our database retained record-level provenance and physiological context. Furthermore, the extracted data reproduced classic allometric scaling relationships for basal metabolic rate and brain volume while preserving physiological adaptations, supporting the biological plausibility of the dataset. This study validates a reproducible bioinformatics pipeline that minimizes extraction artifacts and substantially reduces downstream mathematical and statistical conversion errors, while providing a scalable, complementary resource for building physiology-oriented trait databases from historical literature.
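Allometric scaling relationships such as those mentioned in this abstract are commonly written as a power law, Y = a * M^b, and checked by fitting a straight line in log-log space. The sketch below shows that standard procedure with ordinary least squares. The function name `fit_power_law` and the synthetic data (generated with a = 3.0 and b = 0.75) are assumptions for illustration; they are not values or methods reported by the paper.

```python
import math


def fit_power_law(masses, values):
    """Ordinary least squares on log-log data for Y = a * M**b.

    Returns (a, b). All inputs must be strictly positive.
    """
    xs = [math.log(m) for m in masses]
    ys = [math.log(v) for v in values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                         # slope = scaling exponent
    a = math.exp(mean_y - b * mean_x)     # intercept back-transformed
    return a, b


# Synthetic, noise-free data generated with a = 3.0 and b = 0.75.
masses = [0.01, 0.1, 1.0, 10.0, 100.0, 1000.0]
values = [3.0 * m ** 0.75 for m in masses]
a, b = fit_power_law(masses, values)
print(round(a, 6), round(b, 6))
```

On real trait data the fit would carry noise and phylogenetic structure, so a recovered exponent is a plausibility check on the extracted values and not a validation of each record.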
(This article belongs to the Section Bioinformatics)

14 pages, 7605 KB  
Article
Automated Morphological Profiling via Deep Learning-Based Segmentation for High-Throughput Phenotypic Screening
by Bendegúz H. Zováthi and Philipp Kainz
J. Imaging 2026, 12(4), 179; https://doi.org/10.3390/jimaging12040179 - 21 Apr 2026
Viewed by 534
Abstract
Reproducible morphological profiling, particularly for drug discovery, has become an important tool for compound evaluation. Established workflows such as CellProfiler provide a widely adopted foundation for Cell Painting analysis. However, conventional pipelines often require substantial manual configuration and technical expertise, which can limit scalability and accessibility. In this study, a fully automated deep learning-based workflow is presented for segmentation-driven morphological profiling from raw microscopy data. Using a curated subset of the JUMP Cell Painting pilot dataset, ground-truth masks were generated and used to train a U-net–based segmentation model in the IKOSA platform. Post-processing strategies were introduced to improve instance separation and reduce segmentation artifacts. The final model achieved strong segmentation performance (precision/recall/AP up to 0.98/0.94/0.92 for nuclei), with an average runtime of 2.2 s per 1080 × 1080 image. Segmentation outputs enabled large-scale feature extraction, yielding 3664 morphological descriptors that showed high correlation with CellProfiler-derived measurements (normalized MAE: 0.0298). Feature prioritization further reduced redundancy to 1145 informative descriptors. These results demonstrate that automated deep learning pipelines can complement established Cell Painting workflows by reducing configuration overhead while maintaining compatibility with validated morphological profiling standards. The proposed workflow may help improve resource efficiency in drug discovery and personalized medicine.
(This article belongs to the Special Issue Imaging in Healthcare: Progress and Challenges)

18 pages, 5095 KB  
Article
Evaluation of MassFrontier, MetFrag, MS-FINDER, and SIRIUS for Metabolite Annotation Using an Experimental LC–HRMS Dataset
by Dmitrii A. Leonov, Irina A. Mednova and Alexander A. Chernonosov
Biomedicines 2026, 14(4), 872; https://doi.org/10.3390/biomedicines14040872 - 10 Apr 2026
Viewed by 883
Abstract
Background: Untargeted metabolomics enables comprehensive profiling of biological systems, but accurate metabolite annotation remains a critical bottleneck due to incomplete spectral libraries and structural isomerism. The use of in silico annotation tools can increase the coverage of annotated compounds, but it remains unclear whether these tools, in the absence of reference standards, can reliably annotate real-world experimental LC-HRMS data and whether they are sufficient for this task. Methods: This study assesses the performance and limitations of four widely used in silico structure prediction tools (MassFrontier, MetFrag, MS-FINDER, and SIRIUS/CSI:FingerID) when applied to an experimentally acquired feature set previously used to differentiate patients with depressive disorders from healthy controls. To ensure uniform evaluation across tools under realistic but optimized conditions, the quality of MS/MS data was improved using a parallel reaction monitoring method, allowing acquisition of interpretable fragmentation spectra for 26 of the 28 detected features. Results: For most features, all tools were able to suggest structure candidates. However, none of the tools proved sufficient as a standalone solution for reliable metabolite annotation. Due to their different algorithms, each tool had strengths and weaknesses in fragmentation interpretation, candidate generation, and ranking, resulting in incomplete or inconsistent annotations. While the combined application of all four tools provided a substantial improvement in putative annotation over conventional spectral library matching, the in silico structure prediction tools often prioritized chemically implausible, biologically irrelevant, or artifactual candidates. Consequently, manual expert evaluation was required to assess the chemical plausibility and biological relevance of the proposed structures. 
This ultimately reduced the number of biologically plausible metabolites putatively associated with disease to ten. Conclusions: Overall, these results demonstrate that existing in silico annotation tools can substantially support the annotation of experimental metabolomics data, but are insufficient on their own. Reliable identification of metabolites in complex biological matrices still depends on high-quality MS/MS data acquisition, the combined use of complementary tools, and mandatory post-annotation expert curation.
(This article belongs to the Special Issue Applications of Mass Spectrometry in Biomedical Research)

12 pages, 1857 KB  
Article
PEPlife2: An Updated Repository of the Half-Life of Peptides and Proteins
by Urooj Alam, Kunal Chaudhary, Nishant Kumar, Ritu Tomer, Sumeet Patiyal and Gajendra P. S. Raghava
Immuno 2026, 6(2), 26; https://doi.org/10.3390/immuno6020026 - 8 Apr 2026
Viewed by 1923
Abstract
This manuscript presents an updated version of PEPlife, a manually curated database that provides extensive information on peptide half-life. The updated version, PEPlife2, contains 4500 total entries, including 2300 newly curated entries and 2200 entries from the previous PEPlife database. These entries correspond to 1673 unique peptide sequences and 257 unique protein sequences; different entries may refer to the same peptide or protein sequence whose half-life was evaluated using different experimental assays. Each entry contains detailed information, including experimental methods used to determine half-life, chemical modifications, biological activity, routes of administration, and other relevant data. In addition to unmodified peptide sequences, PEPlife2 includes cyclic peptides and chemically modified peptides, such as those with N- and C-terminal modifications. To provide structural insights, peptide and protein structures were sourced from the Protein Data Bank (PDB) or predicted using PEPstrMOD. PEPlife2 integrates advanced analytical tools including BLAST (version 2.7.1), Smith–Waterman and CLUSTALW. This database provides a valuable resource for peptide and protein therapeutics research, particularly in the design of immunotherapeutics and vaccines.

35 pages, 11787 KB  
Article
A Data-Driven Framework for Predicting PHBV Biodegradation-Induced Weight Loss Based on Laboratory and Real-Environment Condition Tests
by Marianna I. Kotzabasaki, Leonidas Mindrinos, Nikolaos P. Sotiropoulos, Konstantina V. Filippou and Chrysanthos Maraveas
Polymers 2026, 18(7), 897; https://doi.org/10.3390/polym18070897 - 7 Apr 2026
Cited by 2 | Viewed by 650
Abstract
Polyhydroxyalkanoates (PHAs) emerge as promising biodegradable polymers for sustainable applications, yet predicting their biodegradation behavior under different environmental conditions remains challenging. In this study, we propose a novel data-driven computational framework for predicting biodegradation-induced weight/mass loss in PHA-based materials. A comprehensive database of poly(3-hydroxybutyrate-co-3-hydroxyvalerate) (PHBV)-based formulations was manually curated by systematically collecting and harmonizing material descriptors, environmental parameters, and experimental biodegradation outcomes from laboratory- and large-scale studies conducted in soil, marine, freshwater, and compost environments. Multiple regression-based quantitative structure–activity relationship (QSAR) models were developed and rigorously validated, demonstrating high predictive performance and strong correlations between polymer structure, environmental conditions and degradation behavior. “Exposure time”, “degradation environment” and “hydroxybutyrate (HB) ratio” were identified as the most important features for weight loss. Finally, the predictive model was integrated into the Jaqpot computational platform, enabling open access and facilitating data-driven assessment and design of biodegradable polymer systems.
(This article belongs to the Special Issue Advances in Modeling and Simulations of Polymers)

23 pages, 49319 KB  
Article
iLog 2.2: Volume and Nutrition Estimation for Mixed Foods via Mask R-CNN and Federated Learning
by Indira Devi Siripurapu, Laavanya Rachakonda, Saraju P. Mohanty and Elias Kougianos
Electronics 2026, 15(7), 1460; https://doi.org/10.3390/electronics15071460 - 1 Apr 2026
Viewed by 502
Abstract
Accurately estimating calorie intake and nutrient composition from what we eat remains one of the most practical challenges in maintaining a healthy lifestyle. Manual food logging and database-based estimations are often inaccurate because ingredient proportions and preparation styles vary widely. This paper presents a lightweight, privacy-preserving framework that estimates calories and detailed nutrient values from a single image. The model uses a Mask R-CNN-based segmentation network to identify visible food components, measure their area, estimate their volume using preset height values, and map them to nutritional information obtained from reliable datasets such as USDA and Food-a-pedia. The system integrates federated learning (FL) to ensure privacy by allowing the model to improve collaboratively without sharing raw user data. The proposed architecture achieved a mean Average Precision (mAP) of 96% for detection and 92% for segmentation, confirming its precision and efficiency. The model is trained and evaluated on a curated pizza dataset consisting of 1107 images across 50 topping categories, using a standard train-validation-test split (666/219/222) to ensure reliable performance assessment. The proposed system also achieves low nutrition estimation error, with calorie and nutrient deviations remaining within approximately 3.8% to 11.1% across evaluated metrics. A lightweight mobile interface is demonstrated through a Figma-based prototype mockup to illustrate potential real-world deployment and user interaction.

23 pages, 527 KB  
Systematic Review
Knowledge Graph Applications in Cultural Heritage: A ROSES-Based Systematic Review
by Liangbing Zhu, Safawi Abdul Rahman and Hazila Timan
Information 2026, 17(3), 269; https://doi.org/10.3390/info17030269 - 9 Mar 2026
Viewed by 1519
Abstract
Knowledge Graphs (KGs) are increasingly adopted in cultural heritage research to address challenges of semantic heterogeneity, data fragmentation, and cross-institutional knowledge integration. Despite the rapid growth of KG-based heritage systems, a comprehensive and methodologically rigorous synthesis of existing applications remains limited. To address this gap, this study conducts a ROSES-based systematic review of KG applications in cultural heritage, aiming to examine prevailing application domains, methodological patterns, and emerging research trends. Following the Reporting Standards for Systematic Evidence Syntheses (ROSES), a structured search was conducted in Scopus, Web of Science, and IEEE Xplore. After duplicate removal, screening, eligibility assessment, and quality appraisal, 248 peer-reviewed studies published between 2015 and 2024 were retained for final synthesis. A mixed-method approach combining descriptive analysis and thematic synthesis was employed to analyze KG construction strategies, technological components, application contexts, and reported outcomes. The results indicate that KGs are primarily applied in five interconnected areas: digital recording and preservation, knowledge management and integration, protection and restoration support, cultural transmission and education, and research and innovation. Methodologically, the literature reveals a transition from ontology-driven and manually curated knowledge models toward hybrid approaches integrating artificial intelligence techniques such as natural language processing and machine learning. However, persistent challenges remain, including ontology alignment, scalability, evaluation inconsistency, and limited cross-project interoperability. 
This review contributes a consolidated and transparent evidence base for KG applications in cultural heritage and advances a conceptual understanding of KGs as socio-technical infrastructures that mediate cultural knowledge representation and interpretation. The findings offer methodological insights and practical implications for researchers, heritage professionals, and system designers, while highlighting directions for future interdisciplinary research.
(This article belongs to the Section Information Applications)

24 pages, 2114 KB  
Article
An Integrated Framework for Automated Identification of Workers’ Safety Violation Based on Knowledge Graph
by Yifan Zhu, Yewei Ouyang, Rui Pan, Zhanhui Sun, Yang Zhou, Rui Ma, Baoquan Cheng and Wen Wang
Buildings 2026, 16(5), 1037; https://doi.org/10.3390/buildings16051037 - 6 Mar 2026
Viewed by 582
Abstract
Automatic identification of worker safety violations can substantially strengthen construction-site safety management by enabling continuous, real-time monitoring. Although recent advances have made automated detection feasible, many existing systems still suffer from poor adaptability and limited extensibility. To address these limitations, this study proposes an integrated, knowledge graph-based framework for automatic identification of workers’ safety violations. The framework comprises two principal components: (1) a knowledge graph construction module that encodes domain knowledge (safety regulations, task–hazard relationships, and contextual constraints) into a machine-readable graph structure and (2) a graph-enabled violation identification module that maps structured scene descriptions of worker and environmental states to the knowledge graph and performs semantic inference to detect violations. In this study, these structured scene descriptions are manually specified and simulated as subject–predicate–object triplets; integration with raw sensing data is left for future work. For validation, we construct a knowledge graph containing 1200 safety rules and evaluate the violation identification module on 500 annotated examples representing realistic worker scenarios. Using this curated knowledge graph and structured inputs, the proposed approach achieves an identification accuracy of 97.6% for unsafe worker behaviors. Experimental analysis shows that the knowledge graph representation substantially improves the system’s expandability and interpretability compared with traditional hard-coded rules, facilitating easier incorporation of new rules and multimodal sensing inputs. The results indicate that knowledge graph-driven reasoning offers a practical, scalable pathway for robust, context-aware safety violation detection in varied construction environments.

30 pages, 2658 KB  
Article
Sustainable Smart Urban Governance Enabled by Context-Aware QR Codes: A Scalable Framework for Property Visualisation in Saudi Arabia
by Mohammed Ali R. Alzahrani
Sustainability 2026, 18(5), 2374; https://doi.org/10.3390/su18052374 - 28 Feb 2026
Viewed by 739
Abstract
The digitisation of urban governance requires a context-sensitive method that balances operational efficiency, data security and transparency. This study proposes a context-sensitive QR code system as a conceptual framework for smart urban governance and real estate visualisation in Saudi Arabia, aligned with the strategic objectives of Vision 2030. Unlike traditional static QR code applications, the proposed system acts as a smart urban interface dynamically linking physical buildings to structured digital records and delivering role-specific information through a single scan. This system enables municipal authorities to retrieve compliance and regulatory data and allows emergency response teams to access real-time occupancy data with geographic coordinates. The proposed system enables visitors to explore curated heritage and site-based information, with each interface subject to policy-defined access rules. The proposed QR code system is evaluated by using a scenario-based computational simulation across three representative Saudi cities (Riyadh, Jeddah, and Dammam), and the results show that it significantly reduces service response time compared to manual processes while maintaining data integrity through role-based dynamic filtering. The proposed system enhances administrative efficiency and supports heritage preservation in sensitive areas such as the Al-Balad district in Jeddah city. By integrating governance, visualisation, and cultural sustainability within a simple, scalable and interactive model, the study provides an important framework for emerging smart cities in Saudi Arabia.

22 pages, 1247 KB  
Article
An Integrated Text Mining Approach for Discovering Pharmacological Effects, Drug Combinations, and Repurposing Opportunities of ACE Inhibitors
by Nadezhda Yu. Biziukova, Polina I. Savosina, Dmitry S. Druzhilovskiy, Olga A. Tarasova and Vladimir V. Poroikov
Int. J. Mol. Sci. 2026, 27(4), 2044; https://doi.org/10.3390/ijms27042044 - 22 Feb 2026
Viewed by 548
Abstract
The rapidly expanding body of biomedical literature encompasses a wealth of information concerning the pharmacological effects, mechanisms of action, adverse reactions, and repurposing potential of small-molecule therapeutics. Nevertheless, the systematic extraction and integration of this knowledge continue to pose substantial challenges. In this study, we propose an integrated text-mining framework for the automated extraction and structured representation of information on the biological activities of low-molecular-weight compounds, exemplified by angiotensin-converting enzyme (ACE) inhibitors as a representative pharmacological class. A corpus comprising over 20,000 PubMed titles and abstracts reporting in vitro, in vivo, and clinical investigations of ACE inhibitors was assembled. Chemical compounds, proteins/genes, and diseases were recognized using a previously developed named entity recognition model based on conditional random fields. Entity-level associations were extracted at the sentence level through a rule-based approach employing manually curated pattern phrases, followed by normalization via automated queries to PubChem, UniProt, and the Human Disease Ontology. The proposed methodology facilitated the extraction of approximately 22,000 unique and normalized associations encompassing drug-target, drug-disease, and drug-drug relationships. In addition to confirming well-established therapeutic effects and clinically recognized drug combinations, the analysis identified underexplored pharmacological activities of ACE inhibitors, including antineoplastic, antifibrotic, and neuropsychiatric properties, along with mechanistic associations involving matrix metalloproteinases and neurotrophic signaling pathways. Collectively, these findings underscore the potential of automated literature mining to advance systematic knowledge integration and data-driven hypothesis generation in the contexts of drug repurposing and safety evaluation.
