Search Results (3,696)

Search Parameters:
Keywords = synthetic data generation

22 pages, 318 KB  
Article
The Semantic Web of Retail: A Taxonomic Integration of Web 3.0, Decentralized E-Commerce, and Agentic Commerce
by Arturs Bernovskis and Deniss Sceulovs
J. Risk Financial Manag. 2026, 19(5), 330; https://doi.org/10.3390/jrfm19050330 - 3 May 2026
Abstract
This is a conceptual paper on next-generation digital trade that proposes a multi-layered taxonomic integration of Web 3.0, decentralized e-commerce, and the emerging paradigm of Agentic Commerce. While the current literature often conflates technological infrastructure with institutional governance, this paper utilizes bibliometric diagnostics and Natural Language Processing (NLP) BERT clustering of 25 core empirical studies to delineate these boundaries. We introduce the “Semantic Web of Retail” as a foundational data layer, arguing that it is a structural necessity for the Machine-to-Machine (M2M) economy, where autonomous AI agents, or “synthetic shoppers,” execute transactions on behalf of human principals. Our results indicate that while Web 3.0 provides the technological toolkit for programmable ownership, decentralized e-commerce dictates the institutional logic required for trustless verification. Furthermore, we identify a “Shopper Schism” in consumer behavior, where the delegation of economic power to algorithms introduces novel financial risks, including oracle vulnerabilities and principal–agent moral hazards. The study concludes that integrating semantic interoperability with decentralized transaction rails is essential for mitigating systemic risks and enabling secure, autonomous digital markets, and it formalizes the “Shopper Schism” as a novel principal–agent configuration unique to agentic markets. Full article
27 pages, 3290 KB  
Article
Neural Network Copulas for Generating Synthetic Test Data Preserving Psychometric Properties
by Juyoung Jung, Minho Lee and Won-Chan Lee
J. Intell. 2026, 14(5), 77; https://doi.org/10.3390/jintelligence14050077 - 2 May 2026
Abstract
In intelligence research, the sharing of item response data from cognitive ability assessments is often restricted by privacy concerns, while traditional parametric simulation methods frequently fail to capture complex response dependencies. This study proposes a neural network copula (NNC) framework for generating synthetic dichotomous item response data that preserves essential psychometric properties without revealing sensitive examinee information. By decoupling the modeling of marginal item probabilities from the dependence structure using a deep autoencoder and kernel density estimation, the framework accommodates the discrete nature of binary item response data while minimizing distributional assumptions. Validation against large-scale empirical data demonstrated high correspondence across multiple facets. At the data consistency level, the NNC-based synthetic data reproduced total score distributions and inter-item correlations. Psychometrically, the method yielded consistent item characteristic curve parameter estimates, item fit statistics, and test information functions. Furthermore, Monte Carlo replications demonstrated algorithmic stability and inferential precision. Full article
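As a rough illustration of the marginal-versus-dependence separation this abstract describes (not the paper's neural network copula, which uses a deep autoencoder and kernel density estimation), the sketch below samples correlated dichotomous item responses from a plain Gaussian copula; the item probabilities and latent correlation matrix are invented toy values.

```python
# Hypothetical sketch: correlated dichotomous item responses via a Gaussian
# copula. The paper's NNC replaces the parametric copula with a deep autoencoder
# plus kernel density estimation; this stand-in only illustrates separating
# marginal item probabilities from the dependence structure.
import numpy as np
from scipy.stats import norm

def sample_binary_copula(p, R, n, seed=0):
    """p: item endorsement probabilities (J,); R: latent correlation matrix (J, J)."""
    rng = np.random.default_rng(seed)
    z = rng.multivariate_normal(np.zeros(len(p)), R, size=n)  # latent scores
    thresholds = norm.ppf(p)                                   # item thresholds
    return (z <= thresholds).astype(int)                       # dichotomize

# toy usage: 4 items with a mildly correlated latent structure
p = np.array([0.7, 0.5, 0.4, 0.3])
R = 0.3 * np.ones((4, 4)) + 0.7 * np.eye(4)
synthetic = sample_binary_copula(p, R, n=1000)
print(synthetic.mean(axis=0))  # should approximate p
```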
47 pages, 2688 KB  
Article
Integrating Veterinary Public Health Data into EPCIS-Based Digital Traceability for Dairy Supply Chains
by Stavroula Chatzinikolaou, Giannis Vassiliou, Mary Gianniou, Michalis Vassalos and Nikolaos Papadakis
Foods 2026, 15(9), 1566; https://doi.org/10.3390/foods15091566 - 1 May 2026
Viewed by 36
Abstract
Dairy foods—particularly cheeses produced from raw or minimally processed milk—remain vulnerable to hazards such as Listeria monocytogenes, where delayed laboratory confirmation can expand recalls, increase food waste, and delay outbreak containment. This study proposes a veterinary-aware digital traceability framework that embeds herd health data, milk-quality testing, and inspection outcomes directly into batch-level EPCIS event records. By representing veterinary public health controls as structured, machine-actionable traceability elements, the framework enables automatic logging of mandatory control points, systematic compliance verification, and rule-based risk state transitions within standard EPCIS infrastructures. Using regulation-consistent dairy simulations modeling delayed Listeria detection during maturation, we evaluate the operational impact of event-level causal traceability within the proposed architecture. Compared with conventional time-window recall strategies, provenance-based trace-forward queries reduced recall scope under the evaluated synthetic scenarios. Integrating structured veterinary controls into EPCIS-based traceability systems supports automated regulatory evidence generation and more targeted recall decisions, contributing to improved auditability and reduced food waste in dairy supply chains. Full article
(This article belongs to the Section Food Security and Sustainability)
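A minimal sketch of the provenance-based trace-forward idea contrasted above with time-window recalls: batch-level events record which output batches were derived from which inputs, and a contaminated batch is propagated forward through that derivation graph. The event fields and batch identifiers are hypothetical and far simpler than real EPCIS records with embedded veterinary control data.

```python
# Hypothetical trace-forward over simplified batch-level events. Field names are
# illustrative only; real EPCIS events carry event types, business steps, read
# points, and (in the proposed framework) veterinary control attributes.
from collections import defaultdict

events = [
    {"inputs": ["raw-milk-042"], "outputs": ["curd-107"]},
    {"inputs": ["curd-107"], "outputs": ["cheese-310", "cheese-311"]},
    {"inputs": ["raw-milk-043"], "outputs": ["curd-108"]},
    {"inputs": ["curd-108"], "outputs": ["cheese-312"]},
]

def trace_forward(contaminated_batch, events):
    """Return every batch derived, directly or indirectly, from the given batch."""
    children = defaultdict(set)
    for ev in events:
        for src in ev["inputs"]:
            children[src].update(ev["outputs"])
    affected, frontier = set(), {contaminated_batch}
    while frontier:
        batch = frontier.pop()
        for child in children[batch] - affected:
            affected.add(child)
            frontier.add(child)
    return affected

print(trace_forward("raw-milk-042", events))  # {'curd-107', 'cheese-310', 'cheese-311'}
```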
27 pages, 2447 KB  
Article
A Sequential Cooperative Inversion Framework of DC Resistivity and Frequency-Domain Electromagnetic Data to Enhance Subsurface Imaging in Geoscience and Engineering
by Ramin Varfinezhad, Saeed Parnow, Francois Daniel Fourie and Fabio Tosti
Remote Sens. 2026, 18(9), 1404; https://doi.org/10.3390/rs18091404 - 1 May 2026
Viewed by 142
Abstract
The characterisation of subsurface electrical resistivity is a fundamental requirement for geoscientific and engineering applications, including groundwater exploration and structural assessments. This study examines the sequential cooperative inversion of direct current resistivity and frequency-domain electromagnetic data and compares the results to the inverse models obtained from separate (individual) inversions of the datasets. The proposed cooperative framework is applied to both synthetic datasets generated through forward modelling and field data acquired at the Morgenzon Farm site, South Africa, to delineate a dolerite dyke of hydrogeological significance. Individual inversions identified distinct features but exhibited limitations: direct current resistivity highlights a two-layered medium with minor anomalies, while frequency-domain electromagnetic data identify a resistive anomaly. In contrast, the sequential cooperative inversion approach, which uses the output of one dataset to constrain the other, provides improved subsurface imaging results, reduces ambiguity, and enables the integration of complementary information from both methods. The results indicate that resistivity models constrained by inverse frequency-domain electromagnetic data provide improved representation of subsurface geometry and amplitude compared to individual approaches. These findings support the use of a non-destructive testing approach for improved subsurface imaging, facilitating better-informed decision-making in infrastructure projects and resource management. Full article
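The "output of one dataset constrains the other" step can be pictured as reference-model regularization: the model recovered from one method becomes the prior for the next inversion. The toy damped least-squares sketch below is a generic illustration of that idea under a linear stand-in operator, not the authors' inversion code; real DC and FDEM inversions are nonlinear and physics-based.

```python
# Toy illustration of constraining one inversion with the model from another:
# minimize ||G m - d||^2 + lam^2 ||m - m_ref||^2, where m_ref is the model from
# the previously inverted dataset. Operator, data, and lam are placeholders.
import numpy as np

rng = np.random.default_rng(1)
G = rng.normal(size=(20, 5))                    # stand-in linear forward operator
m_true = np.array([1.0, 0.2, 3.0, 0.5, 2.0])
d = G @ m_true + 0.05 * rng.normal(size=20)     # noisy synthetic data
m_ref = m_true + 0.3 * rng.normal(size=5)       # model from the "other" dataset
lam = 1.0

# Stack the data misfit and the reference-model constraint into one system.
A = np.vstack([G, lam * np.eye(5)])
b = np.concatenate([d, lam * m_ref])
m_constrained, *_ = np.linalg.lstsq(A, b, rcond=None)
print(m_constrained)
```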
38 pages, 888 KB  
Article
Data-Centric AI Manifesto: How Data Quality Drives Modern AI
by Donato Malerba, Antonella Poggi, Mario Alviano, Tommaso Boccali, Maria Teresa Camerlingo, Roberto Maria Delfino, Domenico Diacono, Domenico Elia, Vincenzo Pasquadibisceglie, Mara Sangiovanni, Vincenzo Spinoso and Gioacchino Vino
Electronics 2026, 15(9), 1913; https://doi.org/10.3390/electronics15091913 - 1 May 2026
Viewed by 149
Abstract
Artificial Intelligence (AI) has traditionally been developed according to a model-centric paradigm, in which progress is driven by increasingly sophisticated learning architectures applied to largely fixed datasets. However, this paradigm exhibits well-known limitations, including sensitivity to label noise, distribution shifts, adversarial perturbations, and limited transparency and reproducibility. These issues indicate that many of the current bottlenecks of AI systems arise from deficiencies in data rather than from model design. In this paper, we adopt and formalize the Data-Centric Artificial Intelligence (DCAI) paradigm, which places data quality, semantic consistency, and representativeness at the core of the AI lifecycle. From this perspective, performance, robustness, interpretability, and regulatory compliance are primarily achieved through systematic data engineering, including data curation, enrichment, validation, and continuous monitoring, rather than through repeated model re-engineering. The contributions of this work are threefold. First, a conceptual framework is provided to clarify the epistemic and methodological foundations of DCAI and distinguish it from traditional model-centric approaches. Second, a data-centric lifecycle is presented, covering training data development, inference data design, and data maintenance and integrating techniques such as semantic data representation, active learning, synthetic data generation, and drift-aware quality control. Third, the role of DCAI in the context of Generative AI is analyzed, showing how data-centric practices are essential to ensure robustness, accountability, and responsible deployment of large-scale generative models. Overall, this work positions DCAI as a coherent methodological and technological framework for the development of trustworthy, resilient, and sustainable AI systems, making a research contribution and providing a reference model for industrial and regulatory contexts. Full article
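One concrete piece of the "drift-aware quality control" mentioned in this abstract can be sketched as a two-sample test comparing a monitored feature against its training-time distribution; the feature, data, and significance threshold below are arbitrary placeholders rather than a recommendation from the paper.

```python
# Minimal sketch of drift-aware data quality monitoring: compare the incoming
# distribution of one feature with its training reference using a KS test.
# The 0.05 threshold is an arbitrary placeholder.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
train_feature = rng.normal(loc=0.0, scale=1.0, size=5000)   # reference data
live_feature = rng.normal(loc=0.4, scale=1.0, size=1000)    # shifted production data

stat, p_value = ks_2samp(train_feature, live_feature)
if p_value < 0.05:
    print(f"Possible drift detected (KS statistic {stat:.3f}, p = {p_value:.2g})")
else:
    print("No significant drift detected")
```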
15 pages, 739 KB  
Technical Note
Large Language Models for Clinical Narrative Processing: Methods, Applications, and Challenges
by Achilleas Livieratos, Junjing Lin, Paraskevi Chasani, Mina Gaga, Fotios S. Fousekis, Charalambos Gogos, Karolina Akinosoglou, Konstantinos H. Katsanos and Margaret Gamalo
Methods Protoc. 2026, 9(3), 69; https://doi.org/10.3390/mps9030069 - 1 May 2026
Viewed by 158
Abstract
Large language models (LLMs) have rapidly advanced natural language processing and are increasingly used to analyze clinical narratives. Their ability to extract information, summarize records, and support clinical workflows makes them potential tools for enhancing documentation efficiency and supporting the secondary use of electronic health record (EHR) data. The aim of this work is to synthesize recent evidence on methodological approaches and applications of LLMs for clinical narrative processing, and to assess their performance, benefits, limitations, and implications for clinical practice. Across 2022–2026 studies, LLMs demonstrated strong performance in information extraction, summarization, triage prediction, section classification, and synthetic text generation, often surpassing traditional machine-learning models. Overall, LLMs improved the conversion of unstructured notes into actionable clinical insights, reduced documentation burden, and supported decision-making tasks. Key challenges included hallucinations, variable reproducibility, sensitivity to prompting, domain adaptation gaps, and limited transparency. Our findings indicate that LLMs show substantial promise for transforming clinical narrative processing, but safe adoption requires rigorous evaluation and continuous model auditing. This work provides a structured, non-systematic synthesis of representative studies and is intended as a high-level overview of emerging applications rather than a comprehensive systematic review. Full article
(This article belongs to the Section Public Health Research)
25 pages, 3787 KB  
Review
Implementation of Generative AI in Biomedical Research and Healthcare
by Anastasios Nikolopoulos and Vangelis D. Karalis
Appl. Biosci. 2026, 5(2), 34; https://doi.org/10.3390/applbiosci5020034 - 1 May 2026
Viewed by 87
Abstract
Artificial intelligence has evolved into generative AI (GenAI), a paradigm shift that moves the emphasis away from the evaluation of existing patterns to the generation of novel biological and medical material. This study examines GenAI achievements in the biosciences and medical fields over the last five years, using databases such as PubMed and Scopus. The paper highlights the recent evolution in biomedical research from virtual screening to de novo design. It illustrates how models like RFdiffusion and ProteinMPNN leverage “inverse folding” to assemble novel proteins and drugs. Ultimately, these generative methods yield candidates with enhanced binding affinity and structural stability. For example, exploratory studies suggest GenAI has the potential to address inefficiencies via automatic documentation in the therapeutic sector, and it may enhance research capabilities by using Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs) to generate synthetic clinical trial data that preserves confidentiality. In addition, the review argues that though GenAI democratizes medical education through scalable simulations, it raises questions about long-term knowledge retention. Finally, GenAI also offers a transformative “write” capability for biology, but its responsible application will require addressing model “hallucinations” and building Explainable AI (XAI) and robust ethical frameworks. Full article
(This article belongs to the Special Issue Feature Reviews for Applied Biosciences)
9 pages, 1210 KB  
Data Descriptor
Preferred Colleague Dataset: A Human-Annotated Dataset of Perceived Colleague Preference
by Deepu Krishnareddy, Bakir Hadžić, Hamid Gazerpour, Michael Danner, Zhuoqi Zeng and Matthias Rätsch
Data 2026, 11(5), 100; https://doi.org/10.3390/data11050100 - 1 May 2026
Viewed by 124
Abstract
Recruitment is a time-consuming process, and AI systems are increasingly being used to support decision-making. However, machine learning models used in such systems can inherit bias if the underlying training data reflects biased human preferences. It is essential to analyze and quantify these biases in order to develop fairer AI systems. To address this issue, we collected human judgments of colleague preference for 2200 face images. The face image set includes images of different ethnicities and genders, as well as both real and synthetically generated faces. The images were annotated by humans from diverse backgrounds in terms of age, gender, and ethnicity. Annotators were shown a series of pairs of face images and asked to select which individual they would prefer as a colleague. We gathered responses from 451 annotators and aggregated the annotations to compute a preference score for each image. This dataset provides a basis for understanding human bias in colleague preference and can support the development of fair and unbiased AI models for use in recruitment settings. Full article
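The abstract does not spell out how pairwise choices are aggregated into a per-image preference score; a simple win-rate over the comparisons each image appears in is one plausible form, sketched below with made-up annotation records.

```python
# Hypothetical aggregation of pairwise colleague-preference annotations into a
# per-image score (share of comparisons won). The record format is invented and
# the paper may use a different aggregation rule.
from collections import Counter

# each record: (image shown left, image shown right, image chosen)
annotations = [
    ("img_001", "img_002", "img_001"),
    ("img_001", "img_003", "img_003"),
    ("img_002", "img_003", "img_002"),
    ("img_001", "img_002", "img_001"),
]

wins, appearances = Counter(), Counter()
for left, right, chosen in annotations:
    appearances[left] += 1
    appearances[right] += 1
    wins[chosen] += 1

preference_scores = {img: wins[img] / appearances[img] for img in appearances}
print(preference_scores)  # e.g. {'img_001': 0.67, 'img_002': 0.33, 'img_003': 0.5}
```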
27 pages, 1898 KB  
Article
Parallel Bilingual Datasets: A Multimodal Deep Learning Framework for Proficiency and Style Classification
by Padmavathi Kesavan, Miranda Lakshmi Travis, Martin Aruldoss and Martin Wynn
Multimodal Technol. Interact. 2026, 10(5), 47; https://doi.org/10.3390/mti10050047 - 30 Apr 2026
Viewed by 89
Abstract
This study presents a multimodal deep learning framework for automatic proficiency and style classification of parallel Bilingual Tamil–Hindi learner data. The proposed system employs a dual-headed neural architecture to simultaneously predict proficiency levels (Basic, Advanced) and stylistic categories (Formal, Literary) using shared feature representations. A curated dataset of bilingual text samples is utilized, along with synthetic speech generated through text-to-speech (TTS) to enable controlled multimodal experimentation. Five deep learning architectures are evaluated under text-only, audio-only, and learnable fusion settings. Experimental findings indicate that text-based models consistently achieve strong performance in both proficiency and style classification tasks. In contrast, the audio-only model demonstrates limited effectiveness, highlighting the constraints of synthetic acoustic features in capturing meaningful linguistic information. The fusion models provide only marginal improvements over text-based approaches, suggesting that textual representations play a dominant role in proficiency and stylistic classification within controlled datasets. These results emphasize the importance of linguistic features over acoustic signals for automated language assessment in low-resource settings. The proposed framework provides a scalable and reproducible approach and offers a foundation for future work incorporating real speech data and more diverse linguistic inputs. Full article
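A dual-headed architecture with shared feature representations, as described above, can be sketched in a few lines of PyTorch; the layer sizes and the assumption of a single fused input vector are placeholders for illustration, not the paper's configuration.

```python
# Illustrative dual-headed classifier: one shared encoder, two output heads
# (proficiency and style). Dimensions are placeholders, not the paper's setup.
import torch
import torch.nn as nn

class DualHeadClassifier(nn.Module):
    def __init__(self, input_dim=512, hidden_dim=128):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, hidden_dim), nn.ReLU(), nn.Dropout(0.2)
        )
        self.proficiency_head = nn.Linear(hidden_dim, 2)  # Basic vs. Advanced
        self.style_head = nn.Linear(hidden_dim, 2)        # Formal vs. Literary

    def forward(self, x):
        shared = self.encoder(x)                  # shared feature representation
        return self.proficiency_head(shared), self.style_head(shared)

model = DualHeadClassifier()
features = torch.randn(4, 512)                    # e.g. fused text/audio features
prof_logits, style_logits = model(features)
print(prof_logits.shape, style_logits.shape)      # torch.Size([4, 2]) twice
```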
28 pages, 2998 KB  
Article
SHAP-Value-Weighted Case-Based Reasoning Model with Improved Mixup Data Augmentation for Software Effort Estimation
by Jing Li, Han Zhang, Shengxiang Sun, Mingchi Lin, Sishi Liu, Chen Zhu and Kai Li
Information 2026, 17(5), 431; https://doi.org/10.3390/info17050431 - 30 Apr 2026
Viewed by 152
Abstract
Software effort estimation (SEE) serves as a cornerstone of effective software project management, and case-based reasoning (CBR) stands out as one of the most extensively adopted approaches within this domain. Nevertheless, CBR-based SEE models are still plagued by two critical challenges: conventional case retrieval mechanisms lack the ability to differentiate the relative importance of various features, and data scarcity remains a persistent bottleneck. Both issues significantly compromise the estimation accuracy and interpretability of the models. To address these limitations, we propose a SHAP–Mixup synergistic framework that enhances both feature-aware similarity learning and data distribution modeling. Specifically, we introduce (1) a stability-aware SHAP-weighted similarity metric that integrates both the magnitude and variance of feature contributions to improve retrieval robustness, and (2) a density-aware Mixup augmentation strategy that generates synthetic samples guided by local data manifold structure rather than random interpolation. Experimental results on seven benchmark datasets demonstrate that the proposed method reduces MAE and MSE by up to 20.2% on average compared to baseline CBR models, while consistently improving Pred(0.25). Furthermore, by enhancing model interpretability, the proposed method equips project managers with actionable insights into the key drivers of software effort, thereby facilitating more informed and efficient resource allocation. Building on these findings, this study provides a novel and effective pathway for developing SEE models that are more accurate, robust, and transparent. Full article
(This article belongs to the Special Issue Artificial Intelligence and Decision Support Systems)
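The two ingredients named in this abstract can be sketched generically: a feature-weighted similarity for case retrieval and a Mixup-style interpolation for augmentation. The weighting rule below (mean |SHAP| discounted by its variance) and the plain Beta-interpolated Mixup are simplified stand-ins for the paper's stability-aware and density-aware variants.

```python
# Simplified stand-ins for the abstract's two components; the paper's exact
# weighting and density-aware sampling are not reproduced here.
import numpy as np

def shap_weighted_similarity(query, case, shap_values):
    """Weighted inverse-distance similarity; weights favour features with large,
    stable SHAP contributions across the training set."""
    magnitude = np.abs(shap_values).mean(axis=0)
    stability = 1.0 / (1.0 + shap_values.std(axis=0))   # discount unstable features
    w = magnitude * stability
    w = w / w.sum()
    dist = np.sqrt(np.sum(w * (query - case) ** 2))
    return 1.0 / (1.0 + dist)

def mixup(X, y, n_new, alpha=0.4, seed=0):
    """Plain Mixup: convex combinations of random sample pairs (not density-aware)."""
    rng = np.random.default_rng(seed)
    i, j = rng.integers(len(X), size=n_new), rng.integers(len(X), size=n_new)
    lam = rng.beta(alpha, alpha, size=(n_new, 1))
    return lam * X[i] + (1 - lam) * X[j], lam[:, 0] * y[i] + (1 - lam[:, 0]) * y[j]

X = np.random.rand(50, 6)                 # toy project features
y = np.random.rand(50) * 1000             # toy effort values
shap_values = np.random.randn(50, 6)      # placeholder SHAP matrix
X_aug, y_aug = mixup(X, y, n_new=100)
print(shap_weighted_similarity(X[0], X[1], shap_values), X_aug.shape)
```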
34 pages, 2208 KB  
Review
Next-Generation Artificial Intelligence Strategies for Mechanistic Cancer Target Discovery and Drug Development: A State-of-the-Art Review
by Muhammad Sohail Khan, Muhammad Saeed, Muhammad Arham, Imran Zafar, Majid Hussian, Adil Jamal, Muhammad Usman, Fayez Saeed Bahwerth, Gabsik Yang and Ki Sung Kang
Int. J. Mol. Sci. 2026, 27(9), 4028; https://doi.org/10.3390/ijms27094028 - 30 Apr 2026
Viewed by 122
Abstract
Artificial intelligence (AI) is increasingly used in cancer research, enabling integrative analysis of complex biomedical data to identify actionable therapeutic vulnerabilities. This review specifically examines how AI advances mechanistic cancer target discovery and translational drug development, focusing on: (1) the processing of large-scale genomics, transcriptomics, proteomics, metabolomics, single-cell profiling, spatial, and clinical datasets using machine learning (ML) and deep learning (DL) algorithms; (2) the identification of candidate biomarkers, driver genes, dysregulated pathways, tumor dependencies, and molecular targets that traditional methods often miss; (3) the integration of multi-omics data, network biology, causal inference, and systems-level modeling to refine mechanistic understanding of cancer progression and separate functional driver events from passengers; and (4) applications in drug development, including virtual screening, molecular modeling, structure-informed target validation, drug repurposing, synthetic lethality prediction, and de novo drug design, which collectively may enhance early-stage drug discovery efficiency. The review underscores that AI serves as both a predictive tool and a platform for linking molecular mechanisms to hypothesis generation, target prioritization, and rational treatment design. Challenges such as data heterogeneity, algorithmic bias, interpretability, reproducibility, regulatory requirements, and patient privacy must be addressed for robust translation and clinical use. Future directions may focus on hybrid approaches that integrate causal modeling, explainable AI, multimodal data, and experimental validation to yield mechanistically grounded, clinically actionable insights. AI-driven approaches ultimately aim to accelerate mechanism-based cancer target discovery and enable more precise, biologically informed anticancer therapies. Full article
30 pages, 4825 KB  
Article
Constructing a Ship Collision Accident Dataset Using Template-Based Corpus and Named Entity Recognition
by Xinsheng Zhang, Liwen Huang, Shiyong Huang, Pengfei Chen and Junmin Mou
J. Mar. Sci. Eng. 2026, 14(9), 832; https://doi.org/10.3390/jmse14090832 - 30 Apr 2026
Viewed by 158
Abstract
Ship collisions pose substantial risks to maritime safety, causing vessel damage, casualties, and environmental impacts. Efficient extraction and analysis of key navigational and causal information from accident reports are important for risk assessment and decision support. This study proposes a framework for synthetic data generation, DistilBERT-based named entity recognition, and structured dataset construction for ship collision accidents. Using a template-based method, 56,000 annotated sentences were generated, covering navigational elements and causal factor trigger phrases. The fine-tuned DistilBERT model showed good performance on both synthetic and real accident reports. Statistical and co-occurrence analyses further indicated that failure to maintain proper lookout, failure to take effective evasive action, and failure to maintain safe speed were the main contributing factors across different environments and accident severity levels. Based on the extraction results, a standardized structured dataset was constructed to support subsequent causal analysis, dynamic risk modeling, and collision risk prediction. The study shows that combining template-based data synthesis with Transformer-based named entity recognition is a feasible approach for extracting information from maritime accident reports and transforming unstructured text into structured datasets. Full article
(This article belongs to the Section Ocean Engineering)
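A template-based annotated corpus of the kind described can be sketched as slot-filling: sentence templates with placeholders for navigational entities are filled from small vocabularies, and the filled spans double as entity labels for NER training. The templates, slot values, and tag names below are invented examples, not the paper's annotation scheme.

```python
# Hypothetical template-based sentence generation with automatic entity spans.
# Templates, vocabularies, and tag names are invented for illustration only.
import random
import re

templates = [
    "Vessel {VESSEL} was proceeding at {SPEED} in {AREA} when visibility dropped.",
    "{VESSEL} failed to maintain a proper lookout while crossing {AREA}.",
]
slots = {
    "VESSEL": ["MV Ocean Star", "bulk carrier Hai Feng"],
    "SPEED": ["12 knots", "8 knots"],
    "AREA": ["the Singapore Strait", "a fog-bound fairway"],
}

def generate(template, slots, rng):
    """Fill placeholders left to right and record (start, end, tag) entity spans."""
    sentence, entities = template, []
    pattern = re.compile(r"\{(" + "|".join(slots) + r")\}")
    match = pattern.search(sentence)
    while match:
        tag = match.group(1)
        value = rng.choice(slots[tag])
        start = match.start()
        sentence = sentence[:start] + value + sentence[match.end():]
        entities.append((start, start + len(value), tag))
        match = pattern.search(sentence, start + len(value))
    return sentence, entities

rng = random.Random(42)
for t in templates:
    print(generate(t, slots, rng))
```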
21 pages, 3383 KB  
Article
A Synthetic Data Generation Framework for the Development of Computer Vision Applications in Manufacturing
by Kosmas Alexopoulos, Christos Manettas, Dimitrios Tsikos and Nikolaos Nikolakis
Appl. Sci. 2026, 16(9), 4388; https://doi.org/10.3390/app16094388 - 30 Apr 2026
Viewed by 111
Abstract
Machine learning (ML) techniques are increasingly used for computer vision (CV) applications in manufacturing. Synthetic data, generated through realistic simulations, are utilized to accelerate the data collection process while optimizing the accuracy and precision of ML models. However, in manufacturing there is usually a need to develop several CV applications that support different production steps. Meeting this need requires a systematic approach for generating synthetic datasets that can be used for developing effective CV systems. Hence, this work presents a pipeline for generating photorealistic synthetic datasets, using a set of digital tools such as 3D modeling, photorealistic rendering, automated labeling, and ML training tools. The proposed framework is tested and validated in a robot-assisted packaging case in the dairy industry. The industrial use case provides a pilot-level demonstration that the synthetic dataset generation framework can support the development of CV modules across several production steps and thus can aid in accelerating commissioning and reconfiguration of industrial automation setups. Moreover, the pilot validation indicates that object detection and recognition models trained on synthetic data can provide sufficient performance for the specific requirements of the examined packaging scenario. Full article
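One step of such a pipeline that is easy to make concrete is automated labeling: when the renderer can export a per-object binary mask, the bounding-box annotation follows directly from the mask. The sketch below assumes a mask is already available as a NumPy array and says nothing about the specific rendering tools the authors used.

```python
# Sketch of automatic bounding-box labeling from a rendered binary object mask.
# Assumes the rendering tool can export such masks; the mask here is synthetic.
import numpy as np

def mask_to_bbox(mask):
    """Return (x_min, y_min, x_max, y_max) of the nonzero region, or None."""
    ys, xs = np.nonzero(mask)
    if len(xs) == 0:
        return None
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())

mask = np.zeros((120, 160), dtype=np.uint8)
mask[30:80, 50:110] = 1          # pretend this is the rendered object's silhouette
print(mask_to_bbox(mask))        # (50, 30, 109, 79)
```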
18 pages, 1689 KB  
Article
Biogas Prediction Enhancement for a Swine Farm Bio-Digester Using a Lag-Based Surrogate Machine Learning Model
by María Estela Montes-Carmona, Ivan Andres Burgos-Castro, Rogelio de Jesús Portillo-Vélez, Pedro Javier García-Ramírez, Luis Felipe Marín-Urías and Miguel Ángel Hernández-Pérez
Processes 2026, 14(9), 1452; https://doi.org/10.3390/pr14091452 - 30 Apr 2026
Viewed by 132
Abstract
Biogas production estimation has been one of the most important and challenging objectives for anaerobic digestion processes due to the complexity of its dynamics and the lack of high-quality open-access datasets. This study presents a hybrid modeling framework that combines a mechanistic model, based on ordinary differential equations (ODEs), with a machine learning model. Rather than relying exclusively on experimental data, the proposed approach leverages physics-informed synthetic data generation, complemented by a lag-based feature engineering to capture inherent temporal dependencies in the process dynamics available in operational data of a bio-digester. Two configurations were evaluated: a baseline model and an enhanced version incorporating lag features and a simplified temperature profile. This specific computational enhancement provides a robust predictive core that successfully avoids the severe predictive degradation observed in purely mechanistic approaches at high spatial discretizations. While the improved surrogate model achieved high predictive performance (R2=0.9788, RMSE=131.80 [L/d]), additional analyses reveal that this resilience is driven by temporal memory and remains sensitive to noise and feature composition. Instead of presenting the model as a final independent physical validation, this work is rigorously framed as a proof-of-concept digital twin core, acknowledging the gap that still exists between simulation-based ODE emulation and unstructured real-world reliability. Full article
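The lag-based feature engineering mentioned above can be pictured as shifting the target series to form autoregressive inputs for the surrogate model; the column names, toy series, and lag depth below are placeholders, not the paper's configuration.

```python
# Minimal sketch of lag-based feature engineering for a surrogate model:
# previous biogas readings become predictors of the current one.
import numpy as np
import pandas as pd

df = pd.DataFrame({"biogas_l_per_day": 500 + 50 * np.sin(np.arange(60) / 5)})

for lag in (1, 2, 3):
    df[f"biogas_lag_{lag}"] = df["biogas_l_per_day"].shift(lag)

df = df.dropna()                       # first rows have no lag history
X = df[[c for c in df.columns if c.startswith("biogas_lag_")]]
y = df["biogas_l_per_day"]
print(X.head(), y.head(), sep="\n")
```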
24 pages, 1395 KB  
Article
Decision Support Framework for Post-War Infrastructure Revitalization Using a Hybrid Fuzzy–Simulation–ANN Model
by Roman Trach, Iurii Chupryna, Ruslan Tormosov, Viktor Leshchynsky, Yuliia Trach, Galyna Ryzhakova, Dmytro Ratnikov and Oleh Onofriichuk
Appl. Sci. 2026, 16(9), 4364; https://doi.org/10.3390/app16094364 - 29 Apr 2026
Viewed by 130
Abstract
Post-war reconstruction requires effective decision-support tools capable of integrating technical, economic, and organizational criteria under conditions of high uncertainty. The evaluation and prioritization of damaged buildings for recovery interventions are critical challenges for reconstruction project management. This study proposes a hybrid decision-support framework for assessing the strategic feasibility of building recovery using a novel Strategic Revitalization Index (SRI). The proposed methodology integrates a hierarchical fuzzy inference system, simulation techniques, and an artificial neural network surrogate model. The fuzzy model aggregates four key evaluation dimensions: technical condition of the building, economic feasibility of recovery actions, project implementation capability, and environmental and social impact. To analyze the model’s behavior and generate training data, a synthetic dataset was created using Latin Hypercube Sampling, covering a wide range of possible reconstruction conditions. The generated dataset was subsequently used to train an artificial neural network capable of approximating the nonlinear mapping implemented by the fuzzy decision model. The obtained results demonstrate high predictive performance of the surrogate model, with R2 = 0.976, RMSE = 0.0266, MAE = 0.0133, and MAPE = 4.95%. Scenario analysis further illustrates how different recovery strategies influence SRI values and enables comparison of alternative reconstruction approaches. The proposed framework provides a flexible analytical tool for supporting strategic decision-making in post-war reconstruction projects. By combining fuzzy logic, simulation techniques, and machine learning, the model enables systematic prioritization of recovery strategies and may support large-scale reconstruction planning in post-conflict environments. Full article
(This article belongs to the Section Civil Engineering)
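The synthetic training set described above rests on Latin Hypercube Sampling over the input criteria; a generic version of that sampling step is sketched below, with four made-up dimensions scaled to [0, 1] standing in for the framework's evaluation dimensions and a toy weighted average standing in for the fuzzy SRI model.

```python
# Generic Latin Hypercube Sampling sketch for building a synthetic training set.
# The four dimensions loosely mirror the abstract's evaluation dimensions, but the
# ranges, weights, and scoring function are placeholders.
import numpy as np
from scipy.stats import qmc

sampler = qmc.LatinHypercube(d=4, seed=7)      # technical, economic, capability, impact
samples = sampler.random(n=1000)               # values in [0, 1)^4

def toy_sri(x):
    """Placeholder stand-in for the hierarchical fuzzy Strategic Revitalization Index."""
    return float(np.average(x, weights=[0.4, 0.3, 0.2, 0.1]))

X_train = samples
y_train = np.array([toy_sri(row) for row in samples])
print(X_train.shape, y_train[:5])               # inputs/targets for the ANN surrogate
```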