MDPI - Publisher of Open Access Journals

32 pages, 3925 KB

Open Access Article

Expert-Based Evaluation and Simulation Validation of a Smart Emergency Response System for Urban Settings in Resource-Constrained Environments

by Milliam Maxime Zekeng Ndadji, Mahamat Abdel Aziz Assoul, Baudoin Nguimeya Tsofack, Garrik Brel Jagho Mdemaya, Abakar Mahamat Tahir and Taibi Mahmoud

Information 2026, 17(6), 582; https://doi.org/10.3390/info17060582 - 11 Jun 2026

Viewed by 293

Abstract

The present study provides a multi-faceted validation and refinement of a distributed system architecture designed to improve emergency response in resource-constrained urban areas. The architecture integrates IoT sensors, edge computing, field-programmable gate arrays and distributed shortest-path algorithms to enhance resilience and operational efficiency. As a primary validation strategy, a survey of 78 Cameroonian experts in software engineering, distributed systems, urban planning and emergency technologies was conducted. The survey yielded quantitative and qualitative data across multiple analytical dimensions, including subgroup analysis and a transferability assessment covering Nigeria, Senegal, and Kenya. The statistical analysis confirmed that the architecture is technically feasible, adaptable to local constraints, and has the potential to reduce response times. As a secondary validation strategy, a simulation-based study was conducted using iFogSim on smart-city models ranging from 25 to 100 nodes, encompassing five experiments: result consistency, geographic sensitivity, concurrent incident management, path-caching efficiency, and scalability analysis. The simulation results quantitatively corroborate the expert assessments, demonstrating low end-to-end latency and sustained throughput with realistic urban load conditions. Key challenges identified include interoperability, urban data structuring, financial sustainability and inter-institutional coordination. Experts have proposed a hierarchical structure of priority actions and concrete recommendations for engineers, researchers and policymakers. The combined findings validate the architecture and establish a replicable expert-simulation evaluation framework applicable to analogous distributed emergency-response systems in comparable resource-constrained contexts. The empirical results further constitute a reference baseline for the design and implementation of similar architectures. Full article

(This article belongs to the Special Issue Internet of Things (IoT) and Cloud/Edge Computing)
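The abstract above lists path-caching efficiency among the simulated experiments. As a rough illustration of that idea only, the sketch below wraps a plain Dijkstra search in a dictionary cache keyed by (source, target), so a repeated request is answered without a new search. The distributed shortest-path algorithm and the iFogSim setup from the paper are not reproduced here; the `ROADS` graph, the `PathCache` class, and all node names are hypothetical.

```python
import heapq


def dijkstra(graph, source, target):
    """Return (cost, path) for the cheapest route, or (inf, []) if unreachable."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    visited = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in visited:
            continue
        visited.add(node)
        if node == target:
            break
        for nxt, weight in graph.get(node, {}).items():
            candidate = d + weight
            if candidate < dist.get(nxt, float("inf")):
                dist[nxt] = candidate
                prev[nxt] = node
                heapq.heappush(heap, (candidate, nxt))
    if target not in dist:
        return float("inf"), []
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return dist[target], path[::-1]


class PathCache:
    """Hypothetical cache: stores each computed route under its (source, target) key."""

    def __init__(self, graph):
        self.graph = graph
        self.cache = {}
        self.hits = 0
        self.misses = 0

    def route(self, source, target):
        key = (source, target)
        if key in self.cache:
            self.hits += 1
        else:
            self.misses += 1
            self.cache[key] = dijkstra(self.graph, source, target)
        return self.cache[key]

    def invalidate(self):
        """Drop all cached routes, e.g. after edge weights change."""
        self.cache.clear()


# Hypothetical road graph: edge weights stand for travel times.
ROADS = {
    "station": {"a": 4.0, "b": 1.0},
    "b": {"a": 2.0, "c": 5.0},
    "a": {"incident": 1.0},
    "c": {"incident": 3.0},
}

router = PathCache(ROADS)
first = router.route("station", "incident")   # computed by Dijkstra
second = router.route("station", "incident")  # served from the cache
```

A real deployment would also need an invalidation policy tied to traffic or road-state updates; `invalidate()` here simply clears everything.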

21 pages, 1968 KB

Open Access Article

A Decoupled Access Control Framework for Secure and Scalable PLM Systems in Industry 4.0

by Xiaoda Li, Xianghui Zhan, Jingde Huang and Zhichao Gong

Electronics 2026, 15(12), 2570; https://doi.org/10.3390/electronics15122570 - 10 Jun 2026

Viewed by 134

Abstract

In the current Industrial Internet of Things (IIoT) environment, data security for product lifecycle management (PLM) faces significant challenges, particularly in scenarios involving deeply nested vertical multi-level Bills of Materials (BOMs) and dynamic lifecycle evolution. In large-scale deployments, the traditional case-bounding model easily leads to rule expansion and increased database I/O overhead, causing authorization lag, ambiguous authority boundaries, and other problems. To address these limitations, this paper proposes a Decoupled Hybrid Access Resolution (DHAR) framework. The framework separates static organizational roles from dynamic lifecycle constraints, restructuring the complexity of authorization configuration from case-dependent growth into a bounded structure that is independent of object instances. Combined with a state-based pre-filtering mechanism and an in-memory cache strategy, it reduces redundant recursive queries. Experiments on increasing BOM depths show that, under a 20-layer topology, DHAR reduces average access latency from 285.8 ms to 1.3 ms. Under a 20-layer BOM with 1000 concurrent requests, DHAR maintains an average latency of 5.2 ms, while compressing the authorization rule set from millions to hundreds. These results indicate that, within the studied vertical multi-level BOM setting, DHAR improves response performance while preserving data consistency and strengthening protection against unauthorized modification. Full article

(This article belongs to the Special Issue Advances in Data Security: Challenges, Technologies, and Applications)

42 pages, 1713 KB

Open Access Article

Multimodal Environment-Aware 3D Adaptive Scheduling for UAV-Enabled Fluid Antenna Systems

by Siying Ding and Yue Hu

Electronics 2026, 15(11), 2330; https://doi.org/10.3390/electronics15112330 - 27 May 2026

Viewed by 219

Abstract

To mitigate 3D spatial blockages and channel uncertainty in VHF/low-UHF UAV emergency networks, this paper presents a multimodal environment-aware framework for 3D virtual fluid antenna port scheduling within an Integrated Sensing, Computing, and Communication (ISCCC) architecture. Under rigorously verified spatial resolution and channel stationarity conditions, UAV micro-mobility is mapped onto a discrete 3D virtual port array, transforming continuous flight space into a controllable fluid antenna system (FAS). We define a spatial efficiency metric that quantifies the Pareto trade-off between spatial degrees of freedom and estimation error, parameterized by an error-sensitivity index, and prove the existence of a unique optimal flight scale. Utilizing a joint spatio-temporal channel model, we derive the irreducible entropy lower bound of channel uncertainty, demonstrating that intrinsic environmental randomness constitutes a fundamental predictability limit regardless of port density—a benchmark independent of any specific scheduling strategy. To ensure real-time viability, we introduce an ISCCC-inspired computation-and-caching strategy that leverages pre-calculated stationary probabilities to drive a multidimensional scoring mechanism incorporating channel entropy-based stability, predictive SNR, and load balancing. The suboptimality gap relative to a perfect-CSI oracle is analytically bounded, and proven to narrow significantly under the high temporal correlation inherent in VHF bands. Numerical results confirm that the proposed strategy attains 10.36 bps/Hz effective throughput and 10.5% outage probability, consistently outperforming rule-based, learning-based, and 2D spatial baselines, particularly under prolonged structural obstructions. Full article

(This article belongs to the Special Issue Wireless Multimodal Communications for Integrated Heterogeneous Networks)

20 pages, 4689 KB

Open Access Article

GPU-Accelerated Signal Processing for Distributed Vibration Sensing Based on OVNA Method

by Alessandro Meoli, Raffaele Vallifuoco, Agnese Coscetta, Luigi Zeni and Aldo Minardo

Sensors 2026, 26(11), 3314; https://doi.org/10.3390/s26113314 - 23 May 2026

Viewed by 453

Abstract

Distributed vibration sensing based on optical vector network analysis (OVNA) is a promising technique for measuring dynamic perturbations in optical fibers, but its practical use is limited by the high computational cost of short-time Fourier transform (STFT) and cross-correlation stages. In this work, we present a GPU-accelerated signal processing pipeline, together with an optimization strategy based on dataflow reduction, mixed-precision arithmetic, and hardware-aware tuning. The proposed implementation reduces the processing time for 200 sweeps from 64.7 s on a single-core CPU to 0.199 s on a modern GPU, while preserving the final shift results, with zero mismatches over 199,199 measurement points. Benchmarking across three GPU generations further shows that STFT benefits more from large on-chip cache resources, whereas cross-correlation scales more closely with memory bandwidth. These results suggest that modern GPUs can significantly reduce the computational burden of OVNA, as well as other distributed sensing methods with a similar processing flow, enabling kHz-rate aggregate throughput from batched processing, supporting real-time-oriented operation on modern GPUs. Full article

(This article belongs to the Special Issue Distributed Sensors: Development and Applications)

24 pages, 1305 KB

Open Access Article

FPCache: A Fingerprint-Rectified Learned Index Cache for Disaggregated Memory

by Chenyang Jia and Miao Cai

Electronics 2026, 15(10), 2210; https://doi.org/10.3390/electronics15102210 - 21 May 2026

Viewed by 206

Abstract

The rapid growth of data-intensive applications has increased the demand for efficient storage in large-scale key-value (KV) stores. Disaggregated memory architectures provide a scalable solution by separating compute and memory resources via RDMA. However, existing indexing schemes in these environments suffer from poor read efficiency, significantly degrading overall system throughput and scalability. Specifically, learned indexes often encounter substantial read amplification during remote data retrieval due to prediction errors. In addition, caching full keys incurs a high cache footprint, limiting the effective cache capacity on compute nodes and leading to additional remote memory accesses. This paper presents FPCache, a fingerprint-rectified learned index cache for disaggregated memory. We propose a fingerprint-assisted two-stage read approach to mitigate read amplification. FPCache first retrieves a compact fingerprint array for local matching. It then converts range reads into precise point accesses and directly reads the corresponding data item, thereby avoiding reading the entire range and reducing extra data transfers. Next, we design a fingerprint-offset compression strategy to maximize cache density. Leveraging fixed-length fingerprints and position offsets enables compute nodes to retain significantly more hotspot data within limited memory resources. Experimental evaluations using various YCSB workloads demonstrate that FPCache consistently outperforms state-of-the-art methods. Compared to systems like CHIME and ROLEX, FPCache improves system throughput by up to 62% and effectively maintains stable access efficiency under diverse data distributions. Full article

(This article belongs to the Special Issue New Challenges in High-Performance Computing and Computer Architecture)
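The two-stage read described in the abstract above (fetch a compact fingerprint array, match locally, then issue a point read) can be sketched as follows. This is a minimal illustration under stated assumptions, not the FPCache implementation: remote memory is simulated by a Python class that only counts transferred bytes, the "learned index" is a toy linear model with a fixed error bound, and `ENTRY_BYTES`, `FP_BYTES`, `RemoteStore`, `predict_range`, and the CRC32-based one-byte fingerprint are all hypothetical choices.

```python
import zlib

ENTRY_BYTES = 64  # assumed size of one key-value entry in remote memory
FP_BYTES = 1      # assumed size of one fingerprint


def fingerprint(key):
    """One-byte fingerprint of a non-negative integer key (CRC32 is an arbitrary choice)."""
    return zlib.crc32(key.to_bytes(8, "little")) & 0xFF


class RemoteStore:
    """Stand-in for a memory node: a sorted entry array plus a parallel fingerprint array."""

    def __init__(self, items):
        self.entries = sorted(items)
        self.fps = [fingerprint(k) for k, _ in self.entries]
        self.bytes_read = 0  # counts simulated remote transfer volume

    def read_entries(self, lo, hi):
        self.bytes_read += (hi - lo) * ENTRY_BYTES
        return self.entries[lo:hi]

    def read_fps(self, lo, hi):
        self.bytes_read += (hi - lo) * FP_BYTES
        return self.fps[lo:hi]


def predict_range(key, n, err=4):
    """Toy learned index: keys grow by about 3 per slot; widen the guess by the error bound."""
    guess = key // 3
    return max(0, guess - err), min(n, guess + err + 1)


def range_read(store, key):
    """Baseline: fetch every entry in the predicted range and search locally."""
    lo, hi = predict_range(key, len(store.entries))
    for k, v in store.read_entries(lo, hi):
        if k == key:
            return v
    return None


def fingerprint_read(store, key):
    """Two-stage read: fetch fingerprints first, then point-read only matching slots."""
    lo, hi = predict_range(key, len(store.entries))
    target = fingerprint(key)
    for offset, fp in enumerate(store.read_fps(lo, hi)):
        if fp == target:
            slot = lo + offset
            k, v = store.read_entries(slot, slot + 1)[0]
            if k == key:  # the full-key check rejects fingerprint collisions
                return v
    return None


items = [(3 * i + i % 2, f"v{i}") for i in range(1000)]
baseline, two_stage = RemoteStore(items), RemoteStore(items)
value_a = range_read(baseline, 1500)
value_b = fingerprint_read(two_stage, 1500)
```

In this toy setup the baseline transfers nine full entries, while the two-stage read transfers nine fingerprint bytes plus one entry per fingerprint match, which is where the reduction in read amplification comes from.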

30 pages, 11018 KB

Open Access Article

A Hybrid Deep Learning Architecture for Content Request Prediction in the Internet of Vehicles

by Assem Rezki, Lyamine Guezouli, Abderrezak Benyahia, Djallel Eddine Boubiche, Mohamed Zohir Mabane, Sohaib Chine, Homero Toral-Cruz, Rafael Martínez-Peláez and Julio Cesar Ramirez-Pacheco

Sensors 2026, 26(10), 3252; https://doi.org/10.3390/s26103252 - 20 May 2026

Viewed by 419

Abstract

Low-latency content delivery is essential in the Internet of Vehicles (IoV) to support autonomous driving, cooperative perception, and infotainment services. However, rapidly changing vehicular mobility and demand patterns limit the effectiveness of existing content prediction and caching strategies, which often capture either short-term temporal trends or long-range dependencies, but not both. This paper proposes a hybrid deep learning architecture that integrates Long Short-Term Memory (LSTM) networks with Transformer encoders to jointly model fine-grained temporal dynamics and global correlations in content requests. The resulting popularity predictions are incorporated into a reinforcement learning (RL)-based caching policy, enabling proactive and adaptive cache placement at roadside units (RSUs) within an end-to-end optimization framework. Simulation results across representative IoV scenarios show that the proposed approach consistently improves cache hit ratio, retrieval latency, and prediction accuracy compared with LSTM-only, Transformer-only, Least Frequently Used (LFU), and Least Recently Used (LRU) baselines. Ablation studies further demonstrate the complementary strengths of the hybrid components, highlighting improved convergence behavior and robustness under varying demand distributions. Full article

(This article belongs to the Section Vehicular Sensing)

21 pages, 1732 KB

Open Access Article

Resource-Aware Deep Reinforcement Learning for Joint Caching and Service Placement in Multi-Access Edge Computing

by Elias Dritsas and Maria Trigka

Electronics 2026, 15(10), 2074; https://doi.org/10.3390/electronics15102074 - 13 May 2026

Viewed by 353

Abstract

Multi-access edge computing (MEC) enables low-latency service provisioning by placing computation closer to mobile users. However, efficient service placement remains challenging due to dynamic user mobility, limited edge resources, and the need to manage service migration as system conditions evolve. This study proposes a resource-aware, cache-enabled service placement framework based on deep reinforcement learning (DRL) to dynamically select edge nodes for hosting services. The approach jointly considers user location, resource availability, and cache status within a unified decision framework, enabling efficient and adaptive service placement in dynamic MEC environments. The problem is formulated as a Markov decision process (MDP) and solved using deep Q-network (DQN)-based methods, with a reward function that balances latency, resource utilization, and cache efficiency. The proposed framework is evaluated in a simulated MEC environment with mobile users and multiple edge nodes. Experimental results demonstrate that the approach achieves lower latency, improved resource utilization, and enhanced cache efficiency compared to baseline strategies. Among the evaluated models, the dueling double deep Q-network (DDDQN) achieves the most balanced overall performance. The proposed framework provides an adaptive and scalable solution for service management in dynamic MEC environments. Full article

(This article belongs to the Special Issue Machine Learning Approach for Prediction: Cross-Domain Applications)

16 pages, 35402 KB

Open Access Article

JefiFast: Accelerating Jefimenko’s Equations with Memory-Centric Optimizations and Multi-GPU Parallelism

by Bing He, Shengyu Peng, Nan Sun, Guoliang Li, Xiaofei Zhu, Peng Xu and Xiaowei Shen

Physics 2026, 8(2), 43; https://doi.org/10.3390/physics8020043 - 7 May 2026

Viewed by 323

Abstract

As a foundation for numerical solvers in computational electromagnetics, particularly for multiphysics and electromagnetic compatibility applications, Jefimenko’s equations offer a generalized solution to Maxwell’s equations, enabling the direct computation of electromagnetic fields from time-dependent source distributions without the boundary-condition artifacts inherent to grid-based methods. However, the numerical integration of these equations is computationally intensive, typically scaling as $O(N_s N_o)$ for $N_s$ source points and $N_o$ observation points. In this paper, we present JefiFast, a highly optimized graphics processing unit (GPU) implementation that significantly outperforms the state-of-the-art JefiGPU algorithm. We identify that previous implementations are strictly memory-bound due to inefficient global memory transactions and a lack of data reuse. JefiFast addresses these bottlenecks through four key optimizations: (i) a packed memory layout (PML) using an array-of-structures approach to ensure coalesced memory access for source densities and their derivatives; (ii) geometry-aware shared memory tiling strategies that maximize L2 (level-2) cache hit rates and on-chip data reuse; (iii) pre-computation of time derivatives to minimize redundant arithmetic operations; and (iv) a robust observation domain decomposition strategy that enables linear scaling across multiple GPUs. Benchmarks demonstrate that JefiFast achieves speedups ranging from 4.08× (for $30^3$ grids on a single NVIDIA V100 graphic processor) to 84.51× (for $50^3$ grids on 4 NVIDIA V100 processors) compared to the baseline. Notably, for a $50^3$ grid on a single GPU, JefiFast reduces execution time from about 51 min to just about 2.6 min (19.54× speedup). These performance advances make high-resolution relativistic heavy-ion collision simulations feasible in near real-time. Full article

(This article belongs to the Topic Advanced Electromagnetic Modeling and Simulation for Multidisciplinary Engineering Systems)
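As a quick sanity check on the scaling and timing figures quoted in the abstract above, the arithmetic can be worked through directly. The pair count assumes, purely for illustration, that source and observation points both lie on the same $50^3$ grid ($N_s = N_o = 50^3$); the abstract does not state this.

```latex
N_s N_o = \left(50^3\right)^2 = 125{,}000^2 \approx 1.56 \times 10^{10}
\quad \text{source--observation pairs per evaluation},
\qquad
\frac{51\ \text{min}}{2.6\ \text{min}} \approx 19.6 .
```

The ratio of the rounded timings, about 19.6, agrees with the reported 19.54× single-GPU speedup to within the rounding of "about 51 min" and "about 2.6 min".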

25 pages, 41994 KB

Open Access Article

Efficient Self-Collision Culling for Real-Time Cloth Simulation Using Discrete Curvature Analysis

by Nak-Jun Sung, Taeheon Kim, Hamin Lee, Sungjin Lee, Jun Ma and Min Hong

Mathematics 2026, 14(9), 1504; https://doi.org/10.3390/math14091504 - 29 Apr 2026

Viewed by 661

Abstract

Self-collision detection has become the dominant computational bottleneck in GPU-accelerated cloth simulation, as modern parallel solvers such as XPBD have drastically reduced the cost of position updates while leaving collision resolution largely unoptimized. Existing spatial partitioning methods treat all cloth regions uniformly, saturating GPU memory bandwidth despite the fact that the vast majority of the mesh surface remains geometrically flat and collision-free at any given frame. We propose a hierarchical self-collision culling framework built upon a resolution-independent discrete curvature metric derived from the $h^2$-normalized Laplace-Beltrami operator, integrated with a discrete Kirchhoff–Love shell model combining distance and dihedral bending constraints within XPBD. Unlike prior cache-dependent acceleration strategies, our method tightly couples curvature-driven geometric pruning with a fused GPU kernel design and shows that this stateless formulation is both faster and physically more reliable. Evaluated on meshes of 512 × 512 and 1024 × 1024 particles, our method achieves a 5.5% and 9.7% FPS improvement alongside a 34.9% and 28.4% reduction in active collision pairs, respectively, with qualitative validation via high-fidelity rendering confirming artifact-free self-contact and strict ground-plane non-penetration. Ablation results further reveal that temporal coherence, conventionally regarded as an optimization standard, strictly degrades both performance (FPS decrease of 1.4%p to 1.9%p) and physical accuracy (penetration depth increase of 36.1% to 100.0% relative to the curvature-only stage) on RTX 3070 GPU, advocating for stateless per-frame geometric evaluation as the preferred design paradigm. Full article

(This article belongs to the Special Issue Mathematical Applications in Computer Graphics)

29 pages, 5890 KB

Open Access Article

A Cooperative Keypoint–Sparse Cache and Improved PPO Framework for Rapid 3D UAV Path Planning

by Yonggang Wang, Genwei Wang, Zehua Chen, Jiang Wang and Pu Huang

Drones 2026, 10(5), 330; https://doi.org/10.3390/drones10050330 - 28 Apr 2026

Cited by 1 | Viewed by 559

Abstract

UAV path planning in complex 3D terrain faces the dual challenges of computational efficiency and reliable obstacle avoidance. To address these issues, this paper proposes a Keypoint–Sparse Cache (KSC) strategy and a hierarchical KSC-PPO (Proximal Policy Optimization) framework for mountainous environments with both static terrain and dynamic obstacles. The KSC strategy reduces search complexity through orthogonal slice-based sparse keypoint extraction and path caching reuse, thereby improving the efficiency of global path planning. On this basis, PPO-based local obstacle avoidance is activated only when safety thresholds are exceeded, while the remaining path is replanned globally after threat clearance, which confines avoidance computation to a local scope while preserving global path quality. Experiments in static mountainous environments show that KSC requires substantially less computation time than RRT* and Informed RRT* while maintaining competitive path efficiency, and it also outperforms four bio-inspired optimization algorithms across terrains of increasing complexity. Hybrid navigation validation experiments further show that KSC-PPO achieves high mission success, low collision rates, and low avoidance overhead in dynamic mountainous environments. Experiments demonstrate that KSC-PPO decomposes exponential global search space into controllable linear subproblems, significantly enhancing efficiency while ensuring path quality, providing an effective solution for UAV navigation in complex terrain. Full article

(This article belongs to the Special Issue Advanced Optimization Strategies for UAV Mission Planning and Operation)

24 pages, 11348 KB

Open Access Article

Intelligent Optimization Methods for Cloud–Edge Collaborative Vehicular Networks via the Integration of Bayesian Decision-Making and Reinforcement Learning

by Youjian Yu, Zhaowei Song, Sifeng Zhu and Qinghua Zhang

Future Internet 2026, 18(4), 215; https://doi.org/10.3390/fi18040215 - 17 Apr 2026

Viewed by 321

Abstract

To improve vehicle user service quality and address data privacy and security issues in intelligent transportation vehicle networking systems, a three-tier communication architecture with cloud-edge-end collaboration was designed in this paper. A Bayesian decision criterion was utilized to divide user data segments into fine-grained slices based on their privacy levels, and differential privacy techniques were applied to protect the offloaded data. To achieve multi-objective optimization between user service quality and data privacy and security, the problem was formulated as a constrained Markov decision process. A communication model, a caching model, a latency model, an energy consumption model, and a data-fragment privacy protection model were designed. Additionally, a deep reinforcement learning algorithm based on the actor–critic approach was proposed for the collaborative and centralized training of multiple intelligent agents (CTMA-AC), enabling multi-objective optimization decision-making for the protection of offloaded private user data. Simulation experiments demonstrate that the proposed multi-agent collaborative privacy data offloading protection strategy can effectively safeguard private user data while ensuring high service quality. Full article

(This article belongs to the Section Network Virtualization and Edge/Fog Computing)

24 pages, 1522 KB

Open Access Article

M-DGNN: Accelerating Large-Scale Dynamic Graph Neural Network Training via PCIe-Interconnected Multiple Computational Storage Devices

by Junhao Zhu, Xiaotong Han, Wenqing Wang, Liang Fang, Xinjie Shi and Junwei Zeng

Electronics 2026, 15(8), 1620; https://doi.org/10.3390/electronics15081620 - 13 Apr 2026

Viewed by 458

Abstract

The explosive growth of temporal graph data has led to significant training overheads for Dynamic Graph Neural Networks (DGNNs), a bottleneck primarily driven by massive data movement between host processors and storage arrays across conventional PCIe I/O buses. While near-data processing with Computational Storage Devices (CSDs) can alleviate this bottleneck, a single CSD is inherently incapable of meeting the terabyte-scale capacity requirements and complex sequence modeling demands of modern large-scale DGNNs. Horizontal scaling with multi-CSD clusters over standard PCIe topologies presents a viable, cost-effective solution, yet our in-depth profiling identifies two critical architectural bottlenecks in naive multi-CSD architectures: host-bounced memory copies significantly compromise inter-device communication efficiency, and sparse graph sampling frequently exceeds the capacity of the tightly constrained local DRAM of CSDs, resulting in excessive flash I/O and performance degradation. To address these interconnected bottlenecks, we propose M-DGNN, a hardware–software co-designed architecture optimized for standard PCIe interconnects. First, M-DGNN orchestrates direct peer-to-peer (P2P) DMA dataflows for inter-CSD hidden state exchange, completely bypassing host operating system intervention and reducing the context-switching overhead. Second, we design a host-assisted caching strategy with a Host-Pinned Memory Extension (HPME) mechanism, which leverages host-pinned memory as an asynchronous DMA extension pool to shield resource-constrained CSDs from high-latency flash I/O during structural subgraph sampling. Extensive experimental evaluations across seven large-scale dynamic graph datasets demonstrate that M-DGNN delivers up to a 6.2× end-to-end speedup over the state-of-the-art DGNN systems. This work establishes an efficient, scalable near-data computing paradigm for large-scale DGNN training. Full article

(This article belongs to the Special Issue High-Performance Computer Architectures: Designs and Applications)

18 pages, 16035 KB

Open Access Article

An Optimized Dual-Path SGM System for Real-Time Stereo Matching on FPGA

by Yang Song, Hongyu Sun, Wenmin Song, Xiangpeng Wang and Fanqiang Lin

Electronics 2026, 15(8), 1549; https://doi.org/10.3390/electronics15081549 - 8 Apr 2026

Viewed by 616

Abstract

Stereo matching constitutes a critical technology in applications such as autonomous driving and robot navigation. Conventional algorithms, however, often encounter limitations in real-time performance and resource efficiency when deployed on embedded platforms. This paper presents a real-time stereo matching system implemented on a Field-Programmable Gate Array (FPGA), which is built around a lightweight, hardware-optimized dual-path Semi-Global Matching (SGM) algorithm. The proposed method simplifies the traditional eight-path cost aggregation into horizontal and vertical dual-path aggregation, substantially reducing hardware resource consumption while preserving matching accuracy. The system employs a pipelined architecture that integrates image capture, DDR3 caching, and HDMI display output. Experimental results demonstrate that under the configuration of a 5 × 5 matching window and a disparity range of 64, the system generates stable disparity maps at 60 frames per second, with total power consumption below 2.2 W and FPGA core logic utilization under 30%. Compared to the conventional eight-path SGM, the dual-path strategy incurs only a marginal 6% increase in average bad pixel rate on standard stereo datasets while reducing Block RAM (BRAM) usage by approximately 30%. This achieves an effective practical balance between accuracy, computational efficiency, and power consumption. Full article

(This article belongs to the Section Circuit and Signal Processing)
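To make the dual-path aggregation in the abstract above concrete, the sketch below applies the standard Semi-Global Matching path recurrence along one horizontal pass and one vertical pass, then sums the two and picks the lowest-cost disparity per pixel. It is a software illustration only, not the paper's FPGA pipeline. The abstract does not say whether each axis is traversed in one or both directions, so a single left-to-right and a single top-to-bottom pass are assumed, and the penalties `p1` and `p2`, the function names, and the toy cost volume are hypothetical.

```python
def aggregate_path(cost_line, p1, p2):
    """Aggregate costs along one scan path using the standard SGM recurrence.

    cost_line[i][d] is the matching cost of the i-th pixel on the path at disparity d.
    """
    n_disp = len(cost_line[0])
    out = [list(cost_line[0])]  # the first pixel has no predecessor
    for costs in cost_line[1:]:
        prev = out[-1]
        prev_min = min(prev)
        row = []
        for d in range(n_disp):
            best = prev[d]                              # same disparity: no penalty
            if d > 0:
                best = min(best, prev[d - 1] + p1)      # disparity change of 1: small penalty
            if d < n_disp - 1:
                best = min(best, prev[d + 1] + p1)
            best = min(best, prev_min + p2)             # larger jump: large penalty
            row.append(costs[d] + best - prev_min)      # subtract prev_min to bound growth
        out.append(row)
    return out


def dual_path_sgm(cost, p1=2, p2=8):
    """cost[y][x][d] -> disparity map from one horizontal plus one vertical path."""
    h, w, n_disp = len(cost), len(cost[0]), len(cost[0][0])
    total = [[[0] * n_disp for _ in range(w)] for _ in range(h)]
    for y in range(h):  # horizontal pass, left to right
        agg = aggregate_path(cost[y], p1, p2)
        for x in range(w):
            for d in range(n_disp):
                total[y][x][d] += agg[x][d]
    for x in range(w):  # vertical pass, top to bottom
        agg = aggregate_path([cost[y][x] for y in range(h)], p1, p2)
        for y in range(h):
            for d in range(n_disp):
                total[y][x][d] += agg[y][d]
    # Winner-takes-all over the summed path costs.
    return [[min(range(n_disp), key=lambda d: total[y][x][d]) for x in range(w)]
            for y in range(h)]


# Toy 3 x 3 cost volume: true disparity is 1 everywhere, with a noisy centre pixel
# whose raw costs slightly favour disparity 2.
normal, noisy = [5, 0, 5], [5, 2, 1]
cost = [[list(noisy if (y, x) == (1, 1) else normal) for x in range(3)] for y in range(3)]
raw_center = min(range(3), key=lambda d: cost[1][1][d])
disparity = dual_path_sgm(cost)
```

In this example the raw winner at the centre pixel is disparity 2, while the aggregated result is disparity 1 everywhere, showing how the smoothness penalties suppress an isolated outlier even with only two paths.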

32 pages, 823 KB

Open Access Article

A Hybrid Temporal Recommender System Based on Sliding-Window Weighted Popularity and Elite Evolutionary Discrete Particle Swarm Optimization

by Shanxian Lin, Yuichi Nagata and Haichuan Yang

Electronics 2026, 15(8), 1544; https://doi.org/10.3390/electronics15081544 - 8 Apr 2026

Viewed by 518

Abstract

This paper proposes a hybrid non-personalized temporal recommendation framework integrating Sliding-Window Weighted Popularity (SWWP) with Elite Evolutionary Discrete Particle Swarm Optimization (EEDPSO) to address the challenges of extreme data sparsity and temporal dynamics in global popularity-based recommendation. We first formally prove the NP-hardness of the temporally constrained recommendation problem, justifying the adoption of a metaheuristic approach. The proposed SWWP model employs a dual-scale sliding-window mechanism to balance short-term trend adaptation with long-term periodicity capture. A novel deep integration mechanism couples SWWP with EEDPSO through a “purchase heat” indicator, which guides temporal-aware particle initialization, position updates, and fitness evaluation. Extensive experiments on the Amazon Reviews dataset with extreme sparsity (density < 0.0005%) demonstrate that SWWP achieves an NDCG@20 of 0.245, outperforming nine temporal baselines by at least 13%. Furthermore, under a unified fitness function incorporating temporal prediction accuracy, the SWWP-EEDPSO framework achieves 5.95% higher fitness than vanilla EEDPSO, while significantly outperforming Differential Evolution and Genetic Algorithms. The temporally informed search strategy enables SWWP-EEDPSO to discover recommendations that better align with future user behavior, while maintaining sub-millisecond online query latency (0.52 ms) through offline precomputation and caching, demonstrating practical feasibility for deployment scenarios where periodic offline updates are acceptable.

(This article belongs to the Special Issue Evolutionary and Swarm Intelligence Approaches for Recommender Systems)

24 pages, 2876 KB

Open Access Article

High-Performance Computing Optimization of the Maxwell–Stefan Diffusion Model in OpenFOAM

by Zixin Chi, Xin Hui and Bosen Wang

Appl. Sci. 2026, 16(7), 3611; https://doi.org/10.3390/app16073611 - 7 Apr 2026

Viewed by 576

Abstract

Multicomponent diffusion modeling based on the Maxwell–Stefan formulation is widely used in high-fidelity combustion simulations due to its superior physical accuracy compared with simplified diffusion models. However, the computational complexity of the Maxwell–Stefan model, which arises from the solution of coupled multicomponent transport equations, becomes a major performance bottleneck in large-scale CFD simulations. In this work, a high-performance computing optimization strategy for the Maxwell–Stefan diffusion model is developed within the OpenFOAM framework. The proposed method improves computational efficiency through block-based computation and vectorization-oriented data organization to better exploit modern CPU architectures and SIMD instruction capabilities. The optimized implementation enhances memory locality, increases data reuse, and reduces cache miss penalties. Numerical validation is performed using two-dimensional laminar counterflow flame cases and ammonia–hydrogen turbulent combustion cases, including both premixed and non-premixed jet flames. Results demonstrate that the optimized Maxwell–Stefan implementation preserves numerical accuracy while significantly improving computational performance. Speedups of 2.5×–4.5× are achieved depending on the number of chemical species. The developed approach provides an efficient solution for detailed combustion simulations involving large chemical mechanisms. The test cases and source code are openly shared.
