Search Results (90)

Search Parameters:
Keywords = dual task paradigm

18 pages, 716 KB  
Article
Metacognitive Modulation of Cognitive-Emotional Dynamics Under Social-Evaluative Stress: An Integrated Behavioural–EEG Study
by Katia Rovelli, Angelica Daffinà and Michela Balconi
Appl. Sci. 2025, 15(19), 10678; https://doi.org/10.3390/app151910678 - 2 Oct 2025
Viewed by 334
Abstract
Background/Objectives: Decision-making under socially evaluative stress engages a dynamic interplay between cognitive control, emotional appraisal, and motivational systems. Contemporary models of multi-level co-regulation posit that these systems operate in reciprocal modulation, redistributing processing resources to prioritise either rapid socio-emotional alignment or deliberate evaluation depending on situational demands. Methods: Adopting a neurofunctional approach, a novel dual-task protocol combining the MetaCognition–Stress Convergence Paradigm (MSCP) and the Social Stress Test Neuro-Evaluation (SST-NeuroEval), a simulated social–evaluative speech task calibrated across progressive emotional intensities, was implemented. Twenty professionals from an HR consultancy firm participated in the study, with concurrent recording of frontal-temporoparietal electroencephalography (EEG) and bespoke psychometric indices: the MetaStress-Insight Index and the TimeSense Scale. Results: Findings revealed that decision contexts with higher socio-emotional salience elicited faster, emotionally guided choices (mean RT difference emotional vs. cognitive: −220 ms, p = 0.026), accompanied by oscillatory signatures (frontal delta: F(1,19) = 13.30, p = 0.002; gamma: F(3,57) = 14.93, p ≤ 0.001) consistent with intensified socio-emotional integration and contextual reconstruction. Under evaluative stress, oscillatory activity shifted across phases, reflecting the transition from anticipatory regulation to reactive engagement, in line with models of phase-dependent stress adaptation. Across paradigms, convergences emerged between decision orientation, subjective stress, and oscillatory patterns, supporting the view that cognitive–emotional regulation operates as a coordinated, multi-level system. 
Conclusions: These results underscore the importance of integrating behavioural, experiential, and neural indices to characterise how individuals adaptively regulate decision-making under socially evaluative stress and highlight the potential of dual-paradigm designs for advancing theory and application in cognitive–affective neuroscience. Full article
(This article belongs to the Special Issue Brain Functional Connectivity: Prediction, Dynamics, and Modeling)
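The reported reaction-time contrast (emotional vs. cognitive decision contexts, mean difference −220 ms) is a paired within-subject comparison. A minimal sketch of such a comparison in Python, using invented illustrative RTs rather than the study's data:

```python
import math

# Hypothetical per-participant mean reaction times in ms; illustrative only,
# not the study's data. Emotional contexts are faster, as the paper reports.
rt_cognitive = [980, 1020, 950, 1010, 990, 1005]
rt_emotional = [760, 810, 720, 795, 770, 785]

def paired_t(a, b):
    """Mean of the per-participant differences b - a, and the paired t statistic."""
    diffs = [y - x for x, y in zip(a, b)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # sample variance
    return mean, mean / math.sqrt(var / n)

mean_diff, t_stat = paired_t(rt_cognitive, rt_emotional)
```

A negative mean difference indicates faster emotional-context choices; the study's actual analysis and effect size are as reported in the abstract.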

12 pages, 768 KB  
Article
ECG Waveform Segmentation via Dual-Stream Network with Selective Context Fusion
by Yongpeng Niu, Nan Lin, Yuchen Tian, Kaipeng Tang and Baoxiang Liu
Electronics 2025, 14(19), 3925; https://doi.org/10.3390/electronics14193925 - 2 Oct 2025
Viewed by 252
Abstract
Electrocardiogram (ECG) waveform delineation is fundamental to cardiac disease diagnosis. This task requires precise localization of key fiducial points, specifically the onset, peak, and offset positions of P-waves, QRS complexes, and T-waves. Current methods exhibit significant performance degradation in noisy clinical environments (baseline drift, electromyographic interference, powerline interference, etc.), compromising diagnostic reliability. To address this limitation, we introduce ECG-SCFNet: a novel dual-stream architecture employing selective context fusion. Our framework is further enhanced by a consistency training paradigm, enabling it to maintain robust waveform delineation accuracy under challenging noise conditions. The network employs a dual-stream architecture: (1) a temporal stream captures dynamic rhythmic features through sequential multi-branch convolution and temporal attention mechanisms; (2) a morphology stream combines parallel multi-scale convolution with feature pyramid integration to extract multi-scale waveform structural features through morphological attention; (3) the Selective Context Fusion (SCF) module adaptively integrates features from the temporal and morphology streams using a dual attention mechanism, which operates across both channel and spatial dimensions to selectively emphasize informative features from each stream, thereby enhancing representation learning for accurate ECG segmentation. On the LUDB and QT datasets, ECG-SCFNet achieves high performance, with F1-scores of 97.83% and 97.80%, respectively. Crucially, it maintains robust performance under challenging noise conditions on these datasets, with F1-scores of 88.49% and 86.25%, significantly outperforming other methods in noise robustness while providing precise boundary localization for clinical ECG analysis. Full article
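Delineation F1-scores of this kind are typically computed by matching predicted fiducial points to reference annotations within a tolerance window. The greedy matching rule and tolerance below are assumptions for illustration, not taken from the paper:

```python
def delineation_f1(pred, ref, tol_ms=150):
    """Precision, recall, and F1 for fiducial-point delineation: each predicted
    point may claim at most one reference point within the tolerance window."""
    unmatched = list(ref)
    tp = 0
    for p in sorted(pred):
        hit = next((r for r in unmatched if abs(p - r) <= tol_ms), None)
        if hit is not None:
            tp += 1
            unmatched.remove(hit)  # one-to-one matching
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(ref) if ref else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```

For example, predictions at 100, 480, 900, and 1500 ms against references at 110, 500, and 1450 ms match three of four predictions, giving precision 0.75 and recall 1.0.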

17 pages, 801 KB  
Article
Dual-Task Interference Increases Variability in Sub-Second Repetitive Motor Timing
by Ivan Šerbetar and Asgeir Mamen
J. Funct. Morphol. Kinesiol. 2025, 10(4), 366; https://doi.org/10.3390/jfmk10040366 - 25 Sep 2025
Viewed by 390
Abstract
Objectives: Sub-second motor timing is critical for skilled performance in domains such as sport, music, and safety-critical multitasking; however, its robustness under cognitive load remains unresolved. Dual-task paradigms offer a method to test whether attentional demands selectively disrupt temporal precision. This study investigated the effects of cognitive load on rhythmic finger tapping at a sub-second interval. Methods: A sample of 103 college students (19–25 years) performed a synchronization–continuation tapping task at 500 ms intervals under single- and dual-task conditions across five trials. The dual-task condition included a distracting letter-span task imposing working memory load. Inter-response intervals (IRIs), their variability (IRI SD), and accuracy (AI) were analyzed using linear mixed-effects models. Results: Tapping intervals were consistently shorter than the 500 ms target by approximately 70 ms in both conditions, indicating anticipatory mechanisms that remained stable under cognitive load. Mean accuracy did not differ between single- and dual-task conditions. By contrast, temporal variability was significantly higher in the dual-task condition, reflecting diminished trial-to-trial consistency. These effects persisted across trials and were supported by model estimates, which indicated robust between-subject variability but selective disruption of consistency rather than mean performance. Conclusions: Dual-tasking selectively impairs temporal stability in sub-second motor timing while leaving mean interval reproduction and accuracy unchanged. This pattern supports dual-process accounts of timing, suggesting distinct roles for predictive control and attentional allocation. The results have applied relevance for situations requiring precise rhythmic performance under cognitive load, including sports, ensemble music, and safety-critical tasks. Full article
(This article belongs to the Section Kinesiology and Biomechanics)
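The IRI measures above follow directly from tap timestamps. A sketch with fabricated tap sequences chosen so the dual-task condition has the same mean IRI but higher variability, mirroring the reported pattern:

```python
import statistics
from itertools import accumulate

def iri_stats(tap_times_ms):
    """Mean and SD of inter-response intervals from a list of tap timestamps."""
    iris = [b - a for a, b in zip(tap_times_ms, tap_times_ms[1:])]
    return statistics.mean(iris), statistics.stdev(iris)

# Illustrative taps (ms), not study data: both conditions undershoot the
# 500 ms target by ~70 ms, but dual-task intervals are less consistent.
single = list(accumulate([0, 430, 425, 435, 430, 430]))
dual = list(accumulate([0, 400, 460, 415, 445, 430]))
```

Here `iri_stats(single)` and `iri_stats(dual)` return identical means but a larger SD for the dual-task sequence, the dissociation the study reports.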

12 pages, 313 KB  
Article
Spatiotemporal Gait Variables and Step-to-Step Variability in Preschool-Aged Children Born Very Preterm at Risk for Developmental Coordination Disorder: A Cohort Study
by Reem A. Albesher, Jennifer L. McGinley, Fiona L. Dobson, Benjamin F. Mentiplay, Tara L. FitzGerald, Kate L. Cameron, Jeanie L. Y. Cheong and Alicia J. Spittle
Children 2025, 12(9), 1261; https://doi.org/10.3390/children12091261 - 19 Sep 2025
Viewed by 364
Abstract
Background/Objective: Children born very preterm show gait decrements compared to their full-term peers during dual-task walking. It is essential to identify children at a higher risk for these gait deficits. The aim of this study was to compare spatiotemporal gait variables in preschool-age children born very preterm at risk for developmental coordination disorder (DCD) with those not at risk. Methods: Participants were preschool-age children born < 30 weeks’ gestation. Risk for DCD was defined as (i) ≤16th percentile on the Movement Assessment Battery for Children—Second Edition, (ii) ≥80 on the Wechsler Preschool and Primary Scale of Intelligence-Fourth Edition, and (iii) without cerebral palsy. Spatiotemporal gait variables and variability were assessed using GAITRite® during preferred speed, cognitive and motor dual-task, and tandem conditions. Variables included speed (cm/s), step time (s), cadence (steps/min), step length (cm), base of support (BOS; cm), and single and double support time (%gait cycle). Results: Of 111 children who were assessed, 26 children were classified as at risk for DCD. Most gait variables were similar between groups at preferred speed walking. Children at risk for DCD had wider BOS and shorter single support time in motor dual-tasking (mean difference [MD] = 0.86 cm, 95% confidence interval [CI] 0.10, 1.61; MD = −1.77%, 95% CI −3.36, −0.19) compared to those not at risk. Similarly, wider BOS and higher cadence were found when tandem walking (MD = 0.63 cm, 95% CI 0.07, 1.20; MD = 0.63 steps/min, 95% CI 0.07, 1.20). Conclusions: Children born very preterm at risk for DCD had poorer walking performance than those not at risk for DCD at preschool age, especially during dual-task situations. Clinicians may incorporate complex gait assessments into early evaluations to detect subtle impairments in children. 
Future research is needed to investigate the impact of gait variability on children’s daily lives and participation in sports activities. Full article
(This article belongs to the Special Issue Physical Therapy in Pediatric Developmental Disorders)
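The group contrasts above are reported as mean differences with 95% confidence intervals. A minimal normal-approximation version is sketched below; the study's estimates would come from its statistical models, and the numbers here are invented:

```python
import math

def mean_diff_ci95(a, b):
    """Difference in group means (a - b) with a normal-approximation 95% CI."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    va = sum((x - ma) ** 2 for x in a) / (len(a) - 1)  # sample variances
    vb = sum((x - mb) ** 2 for x in b) / (len(b) - 1)
    se = math.sqrt(va / len(a) + vb / len(b))          # SE of the difference
    md = ma - mb
    return md, (md - 1.96 * se, md + 1.96 * se)
```

A CI that excludes zero, as for the BOS differences reported above, indicates a statistically significant group difference at the 5% level.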

24 pages, 5065 KB  
Article
Benchmark Dataset and Deep Model for Monocular Camera Calibration from Single Highway Images
by Wentao Zhang, Wei Jia and Wei Li
Sensors 2025, 25(18), 5815; https://doi.org/10.3390/s25185815 - 18 Sep 2025
Viewed by 432
Abstract
Single-image based camera auto-calibration holds significant value for improving perception efficiency in traffic surveillance systems. However, existing approaches face dual challenges: scarcity of real-world datasets and poor adaptability to multi-view scenarios. This paper presents a systematic solution framework. First, we constructed a large-scale synthetic dataset containing 36 highway scenarios using the CARLA 0.9.15 simulation engine, generating approximately 336,000 virtual frames with precise calibration parameters. The dataset achieves statistical consistency with real-world scenes by incorporating diverse view distributions, complex weather conditions, and varied road geometries. Second, we developed DeepCalib, a deep calibration network that explicitly models perspective projection features through the triplet attention mechanism. This network simultaneously achieves road direction vanishing point localization and camera pose estimation using only a single image. Finally, we adopted a progressive learning paradigm: robust pre-training on synthetic data establishes universal feature representations in the first stage, followed by fine-tuning on real-world datasets in the second stage to enhance practical adaptability. Experimental results indicate that DeepCalib attains an average calibration precision of 89.6%. Compared to conventional multi-stage algorithms, our method achieves a single-frame processing speed of 10 frames per second, showing robust adaptability to dynamic calibration tasks across diverse surveillance views. Full article
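Road-direction vanishing-point localization, which DeepCalib learns end-to-end, has a simple geometric counterpart: parallel lane markings on the road plane intersect at the vanishing point in the image. A stand-alone sketch with hypothetical pixel coordinates:

```python
def line_intersection(p1, p2, p3, p4):
    """Intersection of the line through p1, p2 with the line through p3, p4.
    For two imaged lane markings, this is the road-direction vanishing point."""
    x1, y1 = p1; x2, y2 = p2; x3, y3 = p3; x4, y4 = p4
    d = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    if d == 0:
        return None  # lines are parallel in the image
    a = x1 * y2 - y1 * x2
    b = x3 * y4 - y3 * x4
    return ((a * (x3 - x4) - (x1 - x2) * b) / d,
            (a * (y3 - y4) - (y1 - y2) * b) / d)

# Two converging lane edges in a hypothetical 1280x720 highway frame:
vp = line_intersection((100, 600), (300, 300), (700, 600), (500, 300))
```

Given the vanishing point and known camera intrinsics, classical methods recover camera pitch and yaw; the paper's network instead regresses pose directly alongside this localization.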

24 pages, 2616 KB  
Article
Symmetric Affix–Context Co-Attention: A Dual-Gating Framework for Robust POS Tagging in Low-Resource MRLs
by Yuan Qi, Samat Ali and Alim Murat
Symmetry 2025, 17(9), 1561; https://doi.org/10.3390/sym17091561 - 18 Sep 2025
Viewed by 436
Abstract
Part-of-speech (POS) tagging in low-resource, morphologically rich languages (LRLs/MRLs) remains challenging due to extensive affixation, high out-of-vocabulary (OOV) rates, and pervasive polysemy. We propose MRL-POS, a unified Transformer-CRF framework that dynamically selects informative affix features and integrates them with deep contextual embeddings via a novel dual-gating co-attention mechanism. First, a Dynamic Affix Selector adaptively adjusts n-gram ranges and frequency thresholds based on word length to ensure high-precision affix segmentation. Second, the Affix–Context Co-Attention Module employs two gating functions that conditionally amplify contextual dimensions with affix cues and vice versa, enabling robust disambiguation of complex and ambiguous forms. Third, Layer-Wise Attention Pooling aggregates multi-layer XLM-RoBERTa representations, emphasizing those most relevant for morphological and syntactic tagging. Evaluations on Uyghur, Kyrgyz, and Uzbek show that MRL-POS achieves an average F1 of 84.10%, OOV accuracy of 84.24%, and Poly-F1 of 72.14%, outperforming strong baselines by up to 8 F1 points. By explicitly modeling the symmetry between morphological affix cues and sentence-level context through a dual-gating co-attention mechanism, MRL-POS achieves a balanced fusion that both preserves local structure and captures global dependencies. Interpretability analyses confirm that 89.1% of the selected affixes align with linguistic expectations. This symmetric design not only enhances robustness in low-resource and agglutinative settings but also offers a general paradigm for symmetry-aware sequence labeling tasks. Full article
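The dual-gating idea, each stream conditionally amplified by a gate computed from the other, can be caricatured in a few lines. The gate here is an unlearned sigmoid of the mean activation, purely an assumption for illustration; the paper's module learns its gating functions:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dual_gate(context, affix):
    """Toy dual-gating fusion: each feature vector is scaled by a gate derived
    from the other stream, then the gated streams are summed elementwise."""
    g_ctx = sigmoid(sum(affix) / len(affix))      # affix cues gate the context
    g_aff = sigmoid(sum(context) / len(context))  # context gates the affix cues
    return [g_ctx * c + g_aff * a for c, a in zip(context, affix)]
```

With neutral affix cues (all zeros) the context stream is simply halved, showing how uninformative cues damp rather than distort the fused representation.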

24 pages, 8898 KB  
Article
Performance and Efficiency Gains of NPU-Based Servers over GPUs for AI Model Inference
by Youngpyo Hong and Dongsoo Kim
Systems 2025, 13(9), 797; https://doi.org/10.3390/systems13090797 - 11 Sep 2025
Viewed by 1772
Abstract
The exponential growth of AI applications has intensified the demand for efficient inference hardware capable of delivering low-latency, high-throughput, and energy-efficient performance. This study presents a systematic, empirical comparison of GPU- and NPU-based server platforms across key AI inference domains: text-to-text, text-to-image, multimodal understanding, and object detection. We configure representative models—Llama-family for text generation, Stable Diffusion variants for image synthesis, LLaVA-NeXT for multimodal tasks, and YOLO11 series for object detection—on a dual NVIDIA A100 GPU server and an eight-chip RBLN-CA12 NPU server. Performance metrics including latency, throughput, power consumption, and energy efficiency are measured under realistic workloads. Results demonstrate that NPUs match or exceed GPU throughput in many inference scenarios while consuming 35–70% less power. Moreover, optimization with the vLLM library on NPUs nearly doubles tokens per second and yields a 92% increase in power efficiency. Our findings validate the potential of NPU-based inference architectures to reduce operational costs and energy footprints, offering a viable alternative to the prevailing GPU-dominated paradigm. Full article
(This article belongs to the Special Issue Data-Driven Analysis of Industrial Systems Using AI)
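The throughput and energy-efficiency comparison reduces to two ratios: tokens per second and tokens per joule. A sketch with hypothetical numbers, not the paper's measurements:

```python
def efficiency(tokens, seconds, avg_watts):
    """Throughput (tokens/s) and energy efficiency (tokens/J) for one run."""
    joules = avg_watts * seconds  # energy = average power x time
    return tokens / seconds, tokens / joules

# Invented illustrative workload: an accelerator matching throughput at 40%
# of the power is 2.5x more energy-efficient for the same job.
gpu = efficiency(tokens=120_000, seconds=60, avg_watts=400)
npu = efficiency(tokens=120_000, seconds=60, avg_watts=160)
```

Equal throughput with lower draw is exactly the regime in which tokens-per-joule, not tokens-per-second, becomes the deciding metric.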

26 pages, 62819 KB  
Article
Low-Light Image Dehazing and Enhancement via Multi-Feature Domain Fusion
by Jiaxin Wu, Han Ai, Ping Zhou, Hao Wang, Haifeng Zhang, Gaopeng Zhang and Weining Chen
Remote Sens. 2025, 17(17), 2944; https://doi.org/10.3390/rs17172944 - 25 Aug 2025
Viewed by 876
Abstract
The acquisition of nighttime remote-sensing visible-light images is often accompanied by low-illumination effects and haze interference, resulting in significant image quality degradation and greatly affecting subsequent applications. Existing low-light enhancement and dehazing algorithms can handle each problem individually, but their simple cascade cannot effectively address unknown real-world degradations. Therefore, we design a joint processing framework, WFDiff, which fully exploits the advantages of Fourier–wavelet dual-domain features and innovatively integrates the inverse diffusion process through differentiable operators to construct a multi-scale degradation collaborative correction system. Specifically, in the reverse diffusion process, a dual-domain feature interaction module is designed, and the joint probability distribution of the generated image and real data is constrained through differentiable operators: on the one hand, a global frequency-domain prior is established by jointly constraining Fourier amplitude and phase, effectively maintaining the radiometric consistency of the image; on the other hand, wavelets are used to capture high-frequency details and edge structures in the spatial domain to improve the prediction process. On this basis, a cross-overlapping-block adaptive smoothing estimation algorithm is proposed, which achieves dynamic fusion of multi-scale features through a differentiable weighting strategy, effectively solving the problem of restoring images of different sizes and avoiding local inconsistencies. In view of the current lack of remote-sensing data for low-light haze scenarios, we constructed the Hazy-Dark dataset. Physical experiments and ablation experiments show that the proposed method outperforms existing single-task or simple cascade methods in terms of image fidelity, detail recovery capability, and visual naturalness, providing a new paradigm for remote-sensing image processing under coupled degradations. 
Full article
(This article belongs to the Section AI Remote Sensing)

21 pages, 728 KB  
Article
Resolving Linguistic Asymmetry: Forging Symmetric Multilingual Embeddings Through Asymmetric Contrastive and Curriculum Learning
by Lei Meng, Yinlin Li, Wei Wei and Caipei Yang
Symmetry 2025, 17(9), 1386; https://doi.org/10.3390/sym17091386 - 25 Aug 2025
Viewed by 783
Abstract
The pursuit of universal, symmetric semantic representations within large language models (LLMs) faces a fundamental challenge: the inherent asymmetry of natural languages. Different languages exhibit vast disparities in syntactic structures, lexical choices, and cultural nuances, making the creation of a truly shared, symmetric embedding space a non-trivial task. This paper aims to address this critical problem by introducing a novel framework to forge robust and symmetric multilingual sentence embeddings. Our approach, named DACL (Dynamic Asymmetric Contrastive Learning), is anchored in two powerful asymmetric learning paradigms: Contrastive Learning and Dynamic Curriculum Learning (DCL). We extend Contrastive Learning to the multilingual context, where it asymmetrically treats semantically equivalent sentences from different languages (positive pairs) and sentences with distinct meanings (negative pairs) to enforce semantic symmetry in the target embedding space. To further refine this process, we incorporate Dynamic Curriculum Learning, which introduces a second layer of asymmetry by dynamically scheduling training instances from easy to hard. This dual-asymmetric strategy enables the model to progressively master complex cross-lingual relationships, starting with more obvious semantic equivalences and advancing to subtler ones. Our comprehensive experiments on benchmark cross-lingual tasks, including sentence retrieval and cross-lingual classification (XNLI, PAWS-X, MLDoc, MARC), demonstrate that DACL significantly outperforms a wide range of established baselines. The results validate our dual-asymmetric framework as a highly effective approach for forging robust multilingual embeddings, particularly excelling in tasks involving complex linguistic asymmetries. Ultimately, this work contributes a novel dual-asymmetric learning framework that effectively leverages linguistic asymmetry to achieve robust semantic symmetry across languages. 
It offers valuable insights for developing more capable, fair, and interpretable multilingual LLMs, emphasizing that deliberately leveraging asymmetry in the learning process is a highly effective strategy. Full article
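The contrastive component can be sketched as an InfoNCE-style loss over sentence embeddings: the positive is a translation pair, the negatives are unrelated sentences, and a curriculum then orders training pairs from easy to hard. A toy pure-Python version; the embeddings, temperature, and difficulty measure are all invented for illustration:

```python
import math

def info_nce(anchor, positive, negatives, temp=1.0):
    """Negative log-softmax of the positive pair's similarity among candidates."""
    def dot(u, v):
        return sum(x * y for x, y in zip(u, v))
    logits = [dot(anchor, positive) / temp] + [dot(anchor, n) / temp for n in negatives]
    m = max(logits)  # subtract max for numerical stability
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_z - logits[0]

# A curriculum might sort training pairs by current loss, easy first:
pairs = [([1, 0], [1, 0], [[0, 1]]), ([1, 0], [0, 1], [[1, 0]])]
curriculum = sorted(pairs, key=lambda p: info_nce(*p))
```

A well-aligned pair yields a small loss and is scheduled early; a subtler equivalence yields a larger loss and arrives later, which is the easy-to-hard asymmetry DACL exploits.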

17 pages, 1594 KB  
Article
TransMODAL: A Dual-Stream Transformer with Adaptive Co-Attention for Efficient Human Action Recognition
by Majid Joudaki, Mehdi Imani and Hamid R. Arabnia
Electronics 2025, 14(16), 3326; https://doi.org/10.3390/electronics14163326 - 21 Aug 2025
Viewed by 1073
Abstract
Human Action Recognition has seen significant advances through transformer-based architectures, yet achieving a nuanced understanding often requires fusing multiple data modalities. Standard models relying solely on RGB video can struggle with actions defined by subtle motion cues rather than appearance. This paper introduces TransMODAL, a novel dual-stream transformer that synergistically fuses spatiotemporal appearance features from a pre-trained VideoMAE (Video Masked AutoEncoders) backbone with explicit skeletal kinematics from a state-of-the-art pose estimation pipeline (RT-DETR (Real-Time DEtection Transformer) + ViTPose++). We propose two key architectural innovations to enable effective and efficient fusion: a CoAttentionFusion module that facilitates deep, iterative cross-modal feature exchange between the RGB and pose streams, and an efficient AdaptiveSelector mechanism that dynamically prunes less informative spatiotemporal tokens to reduce computational overhead. Evaluated on three challenging benchmarks, TransMODAL demonstrates robust generalization, achieving accuracies of 98.5% on KTH, 96.9% on UCF101, and 84.2% on HMDB51. These results significantly outperform a strong VideoMAE-only baseline and are competitive with state-of-the-art methods, demonstrating the profound impact of explicit pose guidance. With a fully reproducible implementation and strong benchmark results, TransMODAL offers a powerful and efficient paradigm for composing pre-trained foundation models to tackle complex video understanding tasks. Full article
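Token pruning of the AdaptiveSelector kind can be approximated as keeping the top-scoring fraction of tokens while preserving their temporal order. The scores below are invented stand-ins for a learned saliency head:

```python
def prune_tokens(tokens, scores, keep_ratio=0.5):
    """Keep the highest-scoring spatiotemporal tokens, preserving original
    order. A toy stand-in for a learned token selector."""
    k = max(1, int(len(tokens) * keep_ratio))
    top = sorted(range(len(tokens)), key=lambda i: -scores[i])[:k]
    return [tokens[i] for i in sorted(top)]  # re-sort indices to keep order
```

Halving the token count roughly halves the cost of the subsequent attention layers, which is where the efficiency gain comes from.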

31 pages, 2542 KB  
Article
ECR-MobileNet: An Imbalanced Largemouth Bass Parameter Prediction Model with Adaptive Contrastive Regression and Dependency-Graph Pruning
by Hao Peng, Cheng Ouyang, Lin Yang, Jingtao Deng, Mingyu Tan, Yahui Luo, Wenwu Hu, Pin Jiang and Yi Wang
Animals 2025, 15(16), 2443; https://doi.org/10.3390/ani15162443 - 20 Aug 2025
Viewed by 551
Abstract
The precise, non-destructive monitoring of fish length and weight is a core technology for advancing intelligent aquaculture. However, this field faces dual challenges: traditional contact-based measurements induce stress and yield loss. In addition, existing computer vision methods are hindered by prediction biases from imbalanced data and the deployment bottleneck of balancing high accuracy with model lightweighting. This study aims to overcome these challenges by developing an efficient and robust deep learning framework. We propose ECR-MobileNet, a lightweight framework built on MobileNetV3-Small. It features three key innovations: an efficient channel attention (ECA) module to enhance feature discriminability, an original adaptive multi-scale contrastive regression (AMCR) loss function that extends contrastive learning to multi-dimensional regression for length and weight simultaneously to mitigate data imbalance, and a dependency-graph-based (DepGraph) structured pruning technique that synergistically optimizes model size and performance. On our multi-scene largemouth bass dataset, the pruned ECR-MobileNet-P model comprehensively outperformed 14 mainstream benchmarks. It achieved an R2 of 0.9784 and a root mean square error (RMSE) of 0.4296 cm for length prediction, as well as an R2 of 0.9740 and an RMSE of 0.0202 kg for weight prediction. The model’s parameter count is only 0.52 M, with a computational load of 0.07 giga floating-point operations per second (GFLOPs) and a CPU latency of 10.19 ms, achieving Pareto optimality. This study provides an edge-deployable solution for stress-free biometric monitoring in aquaculture and establishes an innovative methodological paradigm for imbalanced regression and task-oriented model compression. Full article
(This article belongs to the Section Aquatic Animals)

24 pages, 6883 KB  
Article
A Human-in-the-Loop Study of Eye-Movement-Based Control for Workload Reduction in Delayed Teleoperation of Ground Vehicles
by Qiang Zhang, Aiping Zhao, Feng Zhao and Wangyu Wu
Machines 2025, 13(8), 735; https://doi.org/10.3390/machines13080735 - 18 Aug 2025
Viewed by 824
Abstract
Teleoperated ground vehicles (TGVs) are widely applied in hazardous and dynamic environments, where communication delay and low transparency increase operator workload and reduce control performance. This study explores the cognitive and physiological workload associated with such conditions and evaluates the effectiveness of an eye-movement-based predicted trajectory guidance control (ePTGC) framework in alleviating operator burden. A human-in-the-loop teleoperation experiment was conducted using a 2 × 2 within-subject design, incorporating subjective ratings (NASA-TLX), objective performance metrics from a dual-task paradigm (one-back memory task), and multimodal physiological indicators (ECG and EDA). Results show that delay and low transparency significantly elevated subjective, objective, and physiological workload levels. Compared to direct control (DC), the ePTGC framework significantly reduced workload across all three dimensions, particularly under high-delay conditions, while maintaining or even improving task performance. Notably, ePTGC enabled even lower workload levels under low-delay conditions than the baseline condition. These findings demonstrate the potential of the ePTGC framework to enhance teleoperation stability and reduce operator burden in delay-prone and low-transparency scenarios. Full article
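The NASA-TLX score used for the subjective workload dimension combines six subscale ratings (0–100) with weights obtained from 15 pairwise comparisons between subscales. A direct implementation of that standard weighting:

```python
def nasa_tlx(ratings, weights):
    """Weighted NASA-TLX workload score. `ratings` are the six subscale
    ratings (0-100); `weights` are each subscale's tally from the 15
    pairwise comparisons (0-5 each, summing to 15)."""
    assert len(ratings) == len(weights) == 6 and sum(weights) == 15
    return sum(r * w for r, w in zip(ratings, weights)) / 15
```

The resulting 0–100 score is what the study contrasts across delay and transparency conditions alongside the objective and physiological measures.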

22 pages, 24173 KB  
Article
ScaleViM-PDD: Multi-Scale EfficientViM with Physical Decoupling and Dual-Domain Fusion for Remote Sensing Image Dehazing
by Hao Zhou, Yalun Wang, Wanting Peng, Xin Guan and Tao Tao
Remote Sens. 2025, 17(15), 2664; https://doi.org/10.3390/rs17152664 - 1 Aug 2025
Viewed by 520
Abstract
Remote sensing images are often degraded by atmospheric haze, which not only reduces image quality but also complicates information extraction, particularly in high-level visual analysis tasks such as object detection and scene classification. State-space models (SSMs) have recently emerged as a powerful paradigm for vision tasks, showing great promise due to their computational efficiency and robust capacity to model global dependencies. However, most existing learning-based dehazing methods lack physical interpretability, leading to weak generalization. Furthermore, they typically rely on spatial features while neglecting crucial frequency domain information, resulting in incomplete feature representation. To address these challenges, we propose ScaleViM-PDD, a novel network that enhances an SSM backbone with two key innovations: a Multi-scale EfficientViM with Physical Decoupling (ScaleViM-P) module and a Dual-Domain Fusion (DD Fusion) module. The ScaleViM-P module synergistically integrates a Physical Decoupling block within a Multi-scale EfficientViM architecture. This design enables the network to mitigate haze interference in a physically grounded manner at each representational scale while simultaneously capturing global contextual information to adaptively handle complex haze distributions. To further address detail loss, the DD Fusion module replaces conventional skip connections by incorporating a novel Frequency Domain Module (FDM) alongside channel and position attention. This allows for a more effective fusion of spatial and frequency features, significantly improving the recovery of fine-grained details, including color and texture information. Extensive experiments on nine publicly available remote sensing datasets demonstrate that ScaleViM-PDD consistently surpasses state-of-the-art baselines in both qualitative and quantitative evaluations, highlighting its strong generalization ability. Full article

24 pages, 9664 KB  
Article
Frequency-Domain Collaborative Lightweight Super-Resolution for Fine Texture Enhancement in Rice Imagery
by Zexiao Zhang, Jie Zhang, Jinyang Du, Xiangdong Chen, Wenjing Zhang and Changmeng Peng
Agronomy 2025, 15(7), 1729; https://doi.org/10.3390/agronomy15071729 - 18 Jul 2025
Viewed by 670
Abstract
In rice detection tasks, accurate identification of leaf streaks, pest and disease distribution, and spikelet hierarchies relies on high-quality images to distinguish texture from hierarchy. However, existing images often suffer from texture blurring and contour shifting due to equipment and environmental limitations, which degrades detection performance. Since pests and diseases affect the image globally while fine details are mostly local, we propose a rice image reconstruction method based on an adaptive two-branch heterogeneous structure. The method consists of a low-frequency branch (LFB) that recovers global structure, using orientation-aware extended receptive fields to capture streak-like global features such as pests and diseases, and a high-frequency branch (HFB) that enhances detail edges through an adaptive enhancement mechanism to boost the clarity of local detail regions. By introducing a dynamic weight fusion mechanism (CSDW) and a lightweight gating network (LFFN), the method resolves the unbalanced fusion of frequency information that affects traditional approaches on rice images. Experiments on the 4× downsampled rice test set demonstrate that the proposed method achieves a 62% reduction in parameters compared to EDSR, a 41% lower computational cost (30 G) than MambaIR-light, and an average PSNR improvement of 0.68% over the other methods in the study, while balancing memory usage (227 M) and inference speed. In downstream task validation, rice panicle maturity detection achieves a 61.5% increase in mAP50 (0.480 → 0.775) compared with interpolation methods, and leaf pest detection shows a 2.7% improvement in average mAP50 (0.949 → 0.975). This research provides an effective solution for lightweight rice image enhancement; its dual-branch collaborative mechanism and dynamic fusion strategy establish a new paradigm for rice image processing in agriculture.
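The low-/high-frequency division described above can be sketched with an ideal low-pass split in the Fourier domain: the low band carries global structure (the LFB's input), and the residual carries edges and fine texture (the HFB's input). The cutoff value, the ideal circular mask, and the fixed fusion weight are assumptions for this sketch; the paper's CSDW fusion learns its weights dynamically.

```python
import numpy as np

def split_frequency_bands(img, cutoff=0.25):
    """Split a grayscale image into low- and high-frequency components
    via an ideal low-pass mask. Illustrative stand-in for the LFB/HFB split."""
    h, w = img.shape
    F = np.fft.fftshift(np.fft.fft2(img))
    # Normalised frequency radius in cycles/sample, centred at DC
    fy = np.fft.fftshift(np.fft.fftfreq(h))
    fx = np.fft.fftshift(np.fft.fftfreq(w))
    radius = np.sqrt(fy[:, None] ** 2 + fx[None, :] ** 2)
    low = np.real(np.fft.ifft2(np.fft.ifftshift(F * (radius <= cutoff))))
    high = img - low  # residual carries edges and fine texture
    return low, high

def fuse(low, high, w_low=0.6):
    """Fixed-weight fusion standing in for the paper's dynamic CSDW fusion."""
    return w_low * low + (1 - w_low) * high
```

Because the high band is defined as the residual, the two bands sum back to the original image exactly, which is a convenient sanity check for the split.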
(This article belongs to the Collection AI, Sensors and Robotics for Smart Agriculture)

31 pages, 3231 KB  
Article
Capturing User Preferences via Multi-Perspective Hypergraphs with Contrastive Learning for Next-Location Prediction
by Fengyu Liu, Kexin Zhang, Chao Lian and Yunong Tian
Appl. Sci. 2025, 15(14), 7672; https://doi.org/10.3390/app15147672 - 9 Jul 2025
Viewed by 619
Abstract
With the widespread adoption of mobile devices and the increasing availability of user trajectory data, accurately predicting the next location a user will visit has become a pivotal task in location-based services. Despite recent progress, existing methods often fail to effectively disentangle the diverse and entangled behavioral signals, such as collaborative user preferences, global transition mobility patterns, and geographical influences, embedded in user trajectories. To address these challenges, we propose a novel framework named Multi-Perspective Hypergraphs with Contrastive Learning (MPHCL), which explicitly captures and disentangles user preferences from three complementary perspectives. Specifically, MPHCL constructs a global transition flow graph and two specialized hypergraphs: a collective preference hypergraph to model collaborative check-in behavior and a geospatial-context hypergraph to reflect geographical proximity relationships. A unified hypergraph representation learning network is developed to preserve semantic independence across views through a dual propagation mechanism. Furthermore, we introduce a cross-view contrastive learning strategy that aligns multi-perspective embeddings by maximizing agreement between corresponding user and location representations across views while enhancing discriminability through negative sampling. Extensive experiments conducted on two real-world datasets demonstrate that MPHCL consistently outperforms state-of-the-art baselines. These results validate the effectiveness of our multi-perspective learning paradigm for next-location prediction.
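The core hypergraph operation, aggregating node features into hyperedges and scattering them back, can be shown in a few lines of NumPy. The incidence-matrix formulation, uniform averaging, and single propagation step are simplifying assumptions for this sketch; MPHCL's dual propagation mechanism and contrastive alignment are not reproduced here.

```python
import numpy as np

def hypergraph_propagate(X, H):
    """One node -> hyperedge -> node propagation step on a hypergraph.
    X: (n_nodes, d) node features; H: (n_nodes, n_edges) binary incidence
    matrix, H[i, e] = 1 if node i belongs to hyperedge e."""
    edge_deg = H.sum(axis=0, keepdims=True)            # nodes per hyperedge
    node_deg = H.sum(axis=1, keepdims=True)            # hyperedges per node
    edge_feat = (H.T @ X) / np.maximum(edge_deg.T, 1)  # average members into edges
    return (H @ edge_feat) / np.maximum(node_deg, 1)   # average edges back to nodes
```

In a check-in setting, a hyperedge might group all locations visited by one user, so one propagation step smooths each location's embedding toward the mean of its co-visited locations.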
