MDPI - Publisher of Open Access Journals

36 pages, 23663 KB

Open Access Article

Neuro-Prismatic Video Models for Causality-Aware Action Recognition in Neural Rehabilitation Systems

by Hend Alshaya

Mathematics 2026, 14(8), 1341; https://doi.org/10.3390/math14081341 - 16 Apr 2026

Video-based action recognition for neural rehabilitation—spanning stroke recovery, Parkinsonian gait assessment, and cerebral palsy monitoring—faces critical challenges, including temporal ambiguity, non-causal motion correlations, and the absence of causally grounded dynamics modeling. While transformer-based architectures achieve strong performance, they often exploit spurious temporal and environmental cues, limiting reliability in safety-critical clinical settings. We propose NeuroPrisma, a neuro-prismatic video framework that integrates frequency-domain spectral decomposition with causal intervention under Structural Causal Models (SCMs) via the backdoor criterion. NeuroPrisma introduces (i) a Prismatic Spectral Attention (PSA) module, which applies discrete Fourier transforms to decompose temporal features into multi-scale frequency bands, disentangling slow postural dynamics from rapid corrective movements, and (ii) a Causal Intervention Layer (CIL), which performs do-calculus-based backdoor adjustment to remove confounding influences and produce causally invariant representations. PSA preconditions representations prior to intervention, improving confounder estimation and causal robustness. Extensive evaluation against seven state-of-the-art models (I3D, SlowFast, TimeSformer, ViViT, Video Swin Transformer, UniFormerV2, and VideoMAE) demonstrates that NeuroPrisma achieves 98.7% Top-1 accuracy on UCF101, 82.4% on HMDB51, 71.2% on Something-Something V2, and 91.5%/95.8% on NTU RGB+D (Cross-Subject/Cross-View), consistently outperforming prior methods. It further reduces the Causal Confusion Score (CCS) by 42.3%, indicating substantially lower reliance on spurious correlations, while maintaining real-time performance with 23.4 ms latency per 16-frame clip on an NVIDIA A100 GPU. All improvements are statistically significant (p < 0.001, Cohen’s d = 0.72–1.24). 
Evaluation was conducted exclusively on benchmark datasets (UCF101, HMDB51, Something-Something V2, and NTU RGB+D) under controlled conditions, without direct clinical validation on neurological patient cohorts. Overfitting was mitigated using three random seeds (42, 123, 456), RandAugment, Mixup (α = 0.8), weight decay (0.05), and early stopping. Cross-dataset generalization from UCF101 to HMDB51 without fine-tuning achieved 76.2% Top-1 accuracy. Future work will focus on prospective clinical validation across stroke, Parkinson’s disease, and cerebral palsy populations, including correlation with standardized clinical assessment scales such as Fugl–Meyer, UPDRS, and GMFCS. These results establish NeuroPrisma as a causally grounded and computationally efficient framework for reliable, real-time movement assessment in clinical rehabilitation systems.
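
The two ideas named in this abstract, frequency-band decomposition of temporal features and backdoor adjustment, can be illustrated with a minimal NumPy sketch. This is not the authors' PSA or CIL implementation: the function names, the band edges, and the toy probabilities are assumptions chosen for illustration only.

```python
import numpy as np

def split_into_bands(features, band_edges):
    """Split a (T, C) temporal feature sequence into frequency bands.

    band_edges are rFFT bin indices (hypothetical choice), e.g. [0, 3, 9]
    gives two bands. Returns (num_bands, T, C); the bands sum to the input
    when the edges cover every rFFT bin.
    """
    T = features.shape[0]
    spectrum = np.fft.rfft(features, axis=0)           # (T // 2 + 1, C)
    bands = []
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        masked = np.zeros_like(spectrum)
        masked[lo:hi] = spectrum[lo:hi]                # keep one band only
        bands.append(np.fft.irfft(masked, n=T, axis=0))
    return np.stack(bands)

def backdoor_adjust(p_y_given_x_z, p_z):
    """Backdoor adjustment for one fixed x: P(y | do(x)) = sum_z P(y | x, z) P(z).

    p_y_given_x_z has shape (Z, Y); p_z has shape (Z,).
    """
    return p_z @ p_y_given_x_z

# Toy 16-frame clip: a slow postural drift plus a fast corrective oscillation.
t = np.arange(16)
slow = np.sin(2 * np.pi * 1 * t / 16)
fast = 0.3 * np.sin(2 * np.pi * 6 * t / 16)
x = (slow + fast)[:, None]

bands = split_into_bands(x, band_edges=[0, 3, 9])      # 16 frames -> 9 rFFT bins
print(np.allclose(bands[0][:, 0], slow), np.allclose(bands[1][:, 0], fast))
```

In the toy clip the low band recovers the slow component and the high band recovers the fast one, which is the kind of separation the abstract attributes to spectral decomposition.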

(This article belongs to the Special Issue Advances in Computer Vision and Image Processing with Applications to Bioinformatics)

26 pages, 5204 KB

Open Access Article

A Spatial-Frequency Joint Decoupling Network for Dense Small-Object Detection

by Zhexiang Zhao, Jintong Li and Peng Liu

Remote Sens. 2026, 18(8), 1203; https://doi.org/10.3390/rs18081203 - 16 Apr 2026

Abstract

Small-object detection in remote sensing imagery faces two specific challenges that existing lightweight detectors fail to address jointly: the irreversible loss of high-frequency boundary cues during repeated downsampling, and feature smearing between neighboring instances caused by uniform multi-scale fusion. This paper presents SFD-Net, a spatial–frequency adaptive network designed to explicitly address these two limitations for aerial imagery. The proposed model consists of a backbone network and a spatial–frequency adaptive neck. Wavelet-based downsampling is applied in the backbone to reduce aliasing while preserving high-frequency information. Direction-sensitive aggregation is incorporated to better capture oriented structural patterns. In the neck, asymmetric and scale-dependent feature routing is introduced to enhance shallow boundary cues, improve instance separation in crowded regions, and limit interference from deep semantic features. Experiments on the VisDrone-DET2019, UAVDT, SIMD, and NWPU VHR-10 datasets demonstrate that SFD-Net achieves a favorable balance between detection accuracy and computational cost. In particular, on the SIMD dataset, SFD-Net achieves 82.2% mAP@0.5 and 66.7% mAP@0.5:0.95 with only 3.4 M parameters and 8.3 GFLOPs. These results indicate that the proposed method is an effective and parameter-efficient solution for remote sensing small-object detection, especially in resource-constrained deployment scenarios.
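
To make the idea of wavelet-based downsampling concrete, here is a minimal NumPy sketch of a one-level 2D Haar transform. The abstract does not say which wavelet SFD-Net uses, so the Haar choice and the function names are assumptions. The sketch shows that halving the resolution this way keeps the detail sub-bands, whereas plain strided subsampling can drop a thin structure entirely.

```python
import numpy as np

def haar_downsample(x):
    """One-level 2D Haar transform of an (H, W) array with even H and W.

    Returns four (H/2, W/2) sub-bands: one low-pass average band and three
    detail bands. No information is discarded.
    """
    a, b = x[0::2, 0::2], x[0::2, 1::2]
    c, d = x[1::2, 0::2], x[1::2, 1::2]
    low = (a + b + c + d) / 2
    col_detail = (a - b + c - d) / 2   # differences between adjacent columns
    row_detail = (a + b - c - d) / 2   # differences between adjacent rows
    diag_detail = (a - b - c + d) / 2
    return low, col_detail, row_detail, diag_detail

def haar_upsample(low, col_detail, row_detail, diag_detail):
    """Exact inverse of haar_downsample."""
    h, w = low.shape
    x = np.empty((2 * h, 2 * w))
    x[0::2, 0::2] = (low + col_detail + row_detail + diag_detail) / 2
    x[0::2, 1::2] = (low - col_detail + row_detail - diag_detail) / 2
    x[1::2, 0::2] = (low + col_detail - row_detail - diag_detail) / 2
    x[1::2, 1::2] = (low - col_detail - row_detail + diag_detail) / 2
    return x

# A one-pixel-wide vertical line at column 1 of a 4x4 image.
img = np.zeros((4, 4))
img[:, 1] = 1.0
strided = img[::2, ::2]                       # naive stride-2 subsampling
low, col_detail, row_detail, diag_detail = haar_downsample(img)
print(strided.any(), col_detail.any())        # the line survives only in the wavelet bands
```

A detector backbone would typically concatenate the four sub-bands along the channel axis and pass them to the next convolution, though how SFD-Net combines them is not specified here.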

13 pages, 1200 KB

Open Access Article

Spatial Release from Masking with Simulated Electric–Acoustic and Cochlear Implant Speech

by Nirmal Srinivasan, Bailey Borkowski, Morgan Barkhouse and Chhayakanta Patro

J. Otorhinolaryngol. Hear. Balance Med. 2026, 7(1), 15; https://doi.org/10.3390/ohbm7010015 - 16 Apr 2026

Viewed by 8

Abstract

Background/Objectives: Spatial release from masking (SRM) refers to the improvement in speech understanding that occurs when a target talker is spatially separated from competing speech. Although normal-hearing (NH) listeners benefit substantially from spatially separating the maskers from the target, cochlear implant (CI) users experience markedly reduced advantages due to degraded spectral and binaural cue transmission. Electric–acoustic stimulation (EAS), which preserves low-frequency acoustic hearing in combination with electric stimulation, may partially restore these cues, but its benefits at small, conversationally relevant spatial separations remain poorly understood. Methods: Speech identification thresholds were measured in NH listeners using Coordinate Response Measure (CRM) sentences and a one-up/one-down adaptive procedure, for natural, simulated EAS, and simulated CI speech across five spatial configurations (0°, ±5°, ±10°, ±15°, ±30°). CI simulation used an eight-channel noise-band vocoder, whereas EAS simulation replaced the two lowest-frequency vocoder channels with low-pass speech (≤500 Hz). All stimuli were spatialized using head-related impulse responses generated from a validated virtual-acoustics model. Results: All stimulus types showed improved thresholds with increasing spatial separation; however, the magnitude of SRM varied systematically. Natural speech produced the lowest thresholds and largest SRM, EAS speech yielded intermediate benefits, and simulated CI speech produced the smallest improvements. Notably, EAS and CI simulations were comparable at small separations, but EAS provided significantly greater SRM at ±15° and ±30°.
Conclusions: These findings demonstrate that even partial low-frequency acoustic preservation enhances SRM at moderate spatial separations, highlighting the importance of EAS configurations for improving spatial hearing in CI-related listening environments.
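
The one-up/one-down adaptive procedure mentioned in the Methods can be sketched in a few lines of Python. The starting level, step size, number of reversals, and the deterministic toy listener are all assumptions for illustration; the abstract does not report the study's actual settings.

```python
def one_up_one_down(respond, start_level, step, n_reversals=6):
    """Run a one-up/one-down staircase.

    respond(level) returns True for a correct response. The level is lowered
    after a correct response and raised after an incorrect one, so the track
    hovers around the 50%-correct point. Returns the mean level at the
    reversals (points where the track changed direction).
    """
    level = start_level
    last_direction = 0
    reversals = []
    while len(reversals) < n_reversals:
        direction = -1 if respond(level) else +1
        if last_direction and direction != last_direction:
            reversals.append(level)
        last_direction = direction
        level += direction * step
    return sum(reversals) / len(reversals)

# Hypothetical listener who is correct whenever the target-to-masker
# ratio is at least -7 dB.
threshold = one_up_one_down(lambda level: level >= -7.0, start_level=10.0, step=2.0)
print(threshold)  # -7.0
```

With a real listener the responses are probabilistic, so the estimate varies from run to run and is usually averaged over several tracks.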

(This article belongs to the Section Otology and Neurotology)

24 pages, 1651 KB

Open Access Article

FALB: A Frequency-Aware Lightweight Bottleneck with Learnable Wavelet Fusion and Contextual Attention for Enhanced Ship Classification in Remote Sensing

by Liang Huang, Yiping Song, Qiao Sun, He Yang, Lin Chen and Xianfeng Zhang

Remote Sens. 2026, 18(8), 1186; https://doi.org/10.3390/rs18081186 - 15 Apr 2026

Viewed by 204

Abstract

Ship classification in optical remote sensing requires balancing discriminative representation and model efficiency. Standard convolutional neural network (CNN) bottlenecks rely on local spatial kernels and may emphasize high-frequency texture cues, while stronger backbones increase parameter cost. We propose a frequency-aware lightweight bottleneck (FALB) that couples enhanced wavelet convolution (WTsConv) and contextual anchor attention (CAA) in a cascaded design. WTsConv adopts Sym4 wavelets and a learnable symmetric fusion weight between spatial and wavelet-reconstructed features to improve frequency-aware feature mixing. CAA is then applied to the refined features for contextual aggregation. Integrated into ResNet-50 bottlenecks, FALB is evaluated on FGSCM-52 and achieves 97.88% top-1 accuracy with 17.78 M parameters, compared with 96.92% and 25.56 M for the ResNet-50 baseline: a gain of 0.96 percentage points with 30.4% fewer parameters. FALB also outperforms the general-purpose baselines it is compared against. Under this experimental setting, FALB improves the observed accuracy–parameter trade-off for remote sensing ship classification.
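
The reported gains follow directly from the four figures given in the abstract. A short check, using only those numbers (the variable names are illustrative):

```python
# Figures quoted in the abstract.
baseline_acc, falb_acc = 96.92, 97.88          # top-1 accuracy, %
baseline_params, falb_params = 25.56, 17.78    # parameters, millions

gain_pp = falb_acc - baseline_acc              # absolute gain in percentage points
param_reduction = 100 * (baseline_params - falb_params) / baseline_params

print(f"{gain_pp:.2f} percentage points, {param_reduction:.1f}% fewer parameters")
# 0.96 percentage points, 30.4% fewer parameters
```

The accuracy difference is an absolute difference between two percentages, while the parameter figure is a relative reduction.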

(This article belongs to the Special Issue Ship Imaging, Detection and Recognition for High-Resolution SAR)

21 pages, 11025 KB

Open Access Article

A Multi-Step RUL Prediction Method for Lithium-Ion Batteries Based on Multi-Scale Temporal Features and Frequency-Domain Spectral Interaction

by Ye Tu, Shixiong Xu, Jie Wang and Mengting Jin

Batteries 2026, 12(4), 137; https://doi.org/10.3390/batteries12040137 - 14 Apr 2026

Viewed by 213

Abstract

With the rapid development of new energy vehicles and energy storage systems, accurate prediction of the remaining useful life (RUL) of lithium-ion batteries is of great importance for predictive maintenance and operational safety. However, battery degradation during cycling usually exhibits multi-scale characteristics, including long-term degradation trends, stage-wise drifts, and stochastic disturbances, so existing methods still face significant challenges in multi-step forecasting and cross-domain generalization. To address this issue, this paper proposes a time–frequency fusion model for multi-step RUL prediction, termed TF-RULNet (Time-Frequency RUL Network). The model takes cycle-level feature sequences as input and consists of three components: a multi-scale temporal convolution encoder (MSTC) for parallel extraction of degradation cues at different temporal scales; a multi-head spectral interaction module (MHSI), which performs 1D-FFT along the temporal dimension for each head and further applies adaptive band-wise mask refinement to capture local spectral structures and hierarchical band patterns with a computational complexity of O(L log L); and a cross-gated fusion module (CGF), which generates gating signals from the summary of one domain to modulate the features of the other domain, thereby enabling dynamic balancing and complementary enhancement of time–frequency information. Experiments are conducted on the NASA dataset (B005/B007) for in-domain evaluation, and further cross-dataset tests from NASA to the Maryland dataset (CS-35/CS-37) are carried out to verify the robustness of the proposed model under distribution shifts. The results show that, compared with the strongest baseline PatchTST, TF-RULNet reduces RMSE and MAE by more than 38.23% and 50.51%, respectively, in cross-dataset generalization, while achieving an additional RMSE reduction of about 24% in in-domain prediction. In summary, TF-RULNet can effectively characterize the multi-scale time–frequency degradation patterns of batteries and improve cross-domain generalization, providing a high-accuracy and scalable modeling solution for practical battery health management and life prognostics.
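
A rough NumPy sketch of the two mechanisms described above, a band-masked FFT branch and cross-gated fusion, may help fix the ideas. It is not the TF-RULNet code: the mean-pooled summaries, the sigmoid gates, the weight shapes, and every function name are assumptions made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def spectral_branch(x, band_mask):
    """FFT along time, rescale each frequency bin, transform back.

    x: (L, C) sequence; band_mask: (L // 2 + 1,) per-bin weights.
    The FFT and its inverse cost O(L log L) per channel.
    """
    spec = np.fft.rfft(x, axis=0)
    return np.fft.irfft(spec * band_mask[:, None], n=x.shape[0], axis=0)

def cross_gated_fusion(time_feat, freq_feat, w_time, w_freq):
    """Gate each domain with a summary of the other domain.

    time_feat, freq_feat: (L, C); w_time, w_freq: (C, C) projections.
    """
    gate_for_time = sigmoid(freq_feat.mean(axis=0) @ w_freq)   # (C,), from frequency summary
    gate_for_freq = sigmoid(time_feat.mean(axis=0) @ w_time)   # (C,), from time summary
    return gate_for_time * time_feat + gate_for_freq * freq_feat

rng = np.random.default_rng(0)
x = rng.normal(size=(32, 4))                       # 32 cycles, 4 features
mask = np.ones(17)
mask[8:] = 0.0                                     # hypothetical low-pass band mask
freq_feat = spectral_branch(x, mask)
fused = cross_gated_fusion(x, freq_feat, rng.normal(size=(4, 4)), rng.normal(size=(4, 4)))
print(fused.shape)  # (32, 4)
```

In a trained model the mask and the projection weights would be learned; here they are fixed or random so the sketch stays self-contained.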

20 pages, 1516 KB

Open Access Article

Unlikely Storyteller: Leveraging Narrative-Based Communication in LLM-Generated Medical Advice

by Fan Wang, Ningshen Wang, Weiming Xu and Peng Zhang

Healthcare 2026, 14(8), 1015; https://doi.org/10.3390/healthcare14081015 - 13 Apr 2026

Viewed by 253

Abstract

Background/Objectives: Time-constrained consultations in high-volume settings can crowd out patient-centered communication, while AI-generated advice may face algorithm aversion when it lacks a humanistic dimension. This study examined whether a brief narrative-based prompt could improve coded patient-facing communication features in an LLM relative to both clinicians and an unprompted model on authentic patient queries. Methods: We conducted a three-condition comparative evaluation using a stratified sample of 1000 de-identified MedDialog-CN consultations (2016–2020). For each consultation, the same patient query was used to generate (i) a zero-shot GPT-o3-mini response and (ii) a narrative-prompted GPT-o3-mini response; the original physician reply served as the human baseline. Responses were annotated with a pre-specified schema operationalizing four communication dimensions—Storytelling, Empathy, Personalization, and Clarity—with expert adjudication. Frequency-based indicators were summarized as mean events per consultation, and binary indicators as proportions; secondary checks captured unwarranted certainty and risk-relevant language. Results: Narrative prompting shifted coded patient-facing communication from sparse and selectively deployed (clinicians and zero-shot AI) to more routine and standardized. Across the reported communication measures, the prompted model showed the most favorable overall pattern, with higher narrative-device use, empathic support, contextual tailoring, and terminology explanation, alongside more frequent consideration of patient preferences and markedly higher rates of emotion–symptom linkage and the presence of a patient-centered narrative framework. Conclusions: Narrative prompting may offer a lightweight and potentially scalable strategy for improving patient-facing communication in Chinese asynchronous, text-based online consultations. 
An important next step is calibration: humanistic cues should be delivered selectively and safely so that responses remain credible, locally feasible, and cognitively manageable.

(This article belongs to the Special Issue Artificial Intelligence in Healthcare: Opportunities and Challenges)

17 pages, 2885 KB

Open Access Article

End-to-End 3-D Sound Source Localization from the Raw Waveform Based on Stereo Microphone Array

by Lipeng Xu and Chao Yang

Sensors 2026, 26(8), 2372; https://doi.org/10.3390/s26082372 - 12 Apr 2026

Viewed by 342

Abstract

The problem of performance degradation in current sound source localization algorithms under reverberant and noisy environments remains a critical challenge. Consequently, this paper introduces a novel approach to estimate the 3-D position of sound sources directly from raw audio signals using an artificial neural network (ANN), which improves the performance of sound source localization algorithms under reverberant and noisy environments. Instead of relying on handcrafted features, raw audio signals recorded by a tetrahedral stereo microphone array are fed directly into the ANN. This design eliminates spatial symmetry issues found in 2-D microphone arrays and enhances 3-D localization accuracy. Inspired by the human auditory system, a convolutional layer is added after the input layer to simulate frequency analysis and search for localization cues in different frequency bands. Furthermore, the proposed algorithm incorporates residual connections (RC) and squeeze-and-excitation (SE, an attention mechanism). Residual connections introduce raw features into deeper network layers to prevent localized information loss caused by excessive network depth, while also improving training stability. The attention mechanism dynamically adjusts weights across and within channels, suppressing interference while enhancing localization-critical cues, thereby playing a pivotal role in boosting the algorithm’s reverberation and noise resistance. Experimental results demonstrate significant improvements: in semi-anechoic chambers, the method reduces localization errors by 0.2 m and increases accuracy by 10%; in conference rooms, errors decrease by 0.26 m with a 21% accuracy gain. These outcomes conclusively validate the effectiveness of the proposed approach in enhancing robustness against reverberation and noise in sound source localization systems.
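
For readers unfamiliar with squeeze-and-excitation, the following NumPy sketch shows the standard channel-wise form combined with a residual connection. It is a generic illustration, not the paper's network: the layer sizes, the reduction ratio, and the `transform` placeholder are assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def squeeze_excite(x, w1, w2):
    """Channel attention on a (C, T) feature map.

    w1: (C // r, C) and w2: (C, C // r) are the two fully connected layers,
    with r a reduction ratio (hypothetical value chosen by the caller).
    """
    squeezed = x.mean(axis=1)                  # squeeze: one value per channel
    hidden = np.maximum(w1 @ squeezed, 0.0)    # bottleneck with ReLU
    scale = sigmoid(w2 @ hidden)               # excitation weights in (0, 1)
    return x * scale[:, None]                  # reweight each channel

def residual_se_block(x, transform, w1, w2):
    """Residual connection: the raw input is added back to the attended features."""
    return x + squeeze_excite(transform(x), w1, w2)

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 100))                  # 8 channels, 100 time steps
w1 = rng.normal(size=(2, 8))                   # reduction ratio r = 4
w2 = rng.normal(size=(8, 2))
out = residual_se_block(x, np.tanh, w1, w2)    # np.tanh stands in for conv layers
print(out.shape)  # (8, 100)
```

Because the excitation weights lie strictly between 0 and 1, the attention step can only attenuate channels; the residual path is what keeps the unmodified input available to deeper layers.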

(This article belongs to the Special Issue AI and Smart Sensors for Intelligent Transportation Systems)

21 pages, 5426 KB

Open Access Article

Deep Learning-Based Recognition and Classification of Jin Cang Embroidery Stitches

by Ke-Ke Sun, Lu-Fei Yang, Zi-Ning Lan and Lu Gao

Mathematics 2026, 14(8), 1259; https://doi.org/10.3390/math14081259 - 10 Apr 2026

Viewed by 227

Abstract

Jin Cang embroidery, characterized by elaborate metallic threadwork and intricate textural patterns, is an important form of intangible cultural heritage. The digital preservation of Jin Cang embroidery is hindered by the scarcity of specialized datasets and the lack of object detection models that balance high performance with computational efficiency for edge deployment. To address these challenges, a dedicated dataset comprising 3050 images across eight core stitch categories is introduced as the first dataset of its kind for Jin Cang embroidery. Building upon this foundation, Lite-YOLOv11s, a domain-specific lightweight detection framework, is proposed with MobileNetV4 as its backbone to improve the extraction of high-frequency texture cues associated with metallic threadwork. Experimental results show that Lite-YOLOv11s achieves an mAP@0.5 of 0.951, outperforming the YOLOv11s baseline (0.927) while reducing model parameters by 40% and FLOPs by 46%. EigenCAM visualizations further show that the model can localize discriminative stitch-level features even under complex backgrounds. This work provides an efficient and deployable solution for intelligent embroidery recognition and offers a useful reference for the digital preservation of other fine-grained cultural heritage crafts.

(This article belongs to the Special Issue Advances in Artificial Intelligence, Machine Learning and Optimization, 2nd Edition)

28 pages, 2314 KB

Open Access Article

EF-YOLO: Detecting Small Targets in Early-Stage Agricultural Fires via UAV-Based Remote Sensing

by Jun Tao, Zhihan Wang, Jianqiu Wu, Yunqin Li, Tomohiro Fukuda and Jiaxin Zhang

Remote Sens. 2026, 18(8), 1119; https://doi.org/10.3390/rs18081119 - 9 Apr 2026

Viewed by 277

Abstract

Early detection of agricultural fires with Unmanned Aerial Vehicles (UAVs) is important for environmental safety, yet it remains difficult because ignition cues are extremely small, smoke patterns vary widely, and farmland scenes often contain strong background interference such as specular reflections. Model development is further constrained by the scarcity of data from the early ignition stage. To address these challenges, we propose a joint data and model optimization framework. We first build a hybrid dataset through an ROI-guided synthesis pipeline, in which latent diffusion models are used to insert high-fidelity, carefully screened fire samples into real farmland backgrounds. We then introduce EF-YOLO, a detector designed for high sensitivity to small targets. The network uses SPD-Conv to reduce feature loss during spatial downsampling and includes a high-resolution P2 head to improve the detection of minute objects. To reduce background clutter, a Dual-Path Frequency–Spatial Enhancement (DP-FSE) module serves as a lightweight statistical surrogate that extracts global contextual cues and local salient features in parallel, thereby suppressing high-frequency noise. Experimental results show that EF-YOLO achieves an AP_s of 40.2% on sub-pixel targets, exceeding the YOLOv8s baseline by 15.4 percentage points. With a recall of 88.7% and a real-time inference speed of 78 FPS, the proposed framework offers a strong balance between detection performance and efficiency, making it well suited for edge-deployed agricultural fire early-warning systems.
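
The reason SPD-Conv loses less detail than a strided convolution is its space-to-depth step, which moves spatial positions into channels instead of skipping them. A minimal NumPy sketch of that rearrangement follows; the function name and the channel ordering are assumptions, and the non-strided convolution that normally follows is omitted.

```python
import numpy as np

def space_to_depth(x, scale=2):
    """Rearrange (C, H, W) into (C * scale**2, H / scale, W / scale).

    Every input pixel is kept; each scale x scale neighbourhood is spread
    across scale**2 output channels.
    """
    c, h, w = x.shape
    x = x.reshape(c, h // scale, scale, w // scale, scale)
    x = x.transpose(0, 2, 4, 1, 3)             # (C, dy, dx, H/scale, W/scale)
    return x.reshape(c * scale * scale, h // scale, w // scale)

feature_map = np.arange(16).reshape(1, 4, 4)
out = space_to_depth(feature_map)
print(out.shape)        # (4, 2, 2)
print(out[0])           # the pixels at even rows and even columns
```

A stride-2 convolution or pooling layer would see only part of this information per output location, which matters when a target occupies just a few pixels.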

18 pages, 3641 KB

Open Access Article

A Wavelet-Enhanced Detector for Tiny Objects in Remote-Sensing Images

by Weifan Xu and Yong Hu

Remote Sens. 2026, 18(8), 1109; https://doi.org/10.3390/rs18081109 - 8 Apr 2026

Viewed by 334

Abstract

Accurate and efficient detection is pivotal for tiny objects in remote sensing. However, achieving a favorable accuracy-efficiency trade-off remains challenging due to the few informative pixels of small targets, frequent occlusions, cluttered backgrounds, and detail degradation introduced by downsampling and multi-scale fusion. To address these challenges, we propose WEYOLO, a wavelet-enhanced detector that explicitly models frequency components and adaptively strengthens high-frequency cues to improve tiny-object robustness while maintaining competitive efficiency in inference speed and model size for remote-sensing deployment. To preserve edges and textures when spatial resolution is reduced, we design a Frequency-Aware Lifting Haar (FaLH) backbone that decomposes features into directional sub-bands and retains them during downsampling, preventing the loss of high-frequency information. Next, to address the blurring and detail loss caused by conventional pooling during multi-scale fusion, we introduce a Frequency-Domain Pyramid-Pooling (FDPP) module that performs wavelet-based multi-resolution analysis for frequency-aware feature-pyramid fusion. Additionally, we propose a stable size-aware quality focal regression loss that unifies Focaler-CIoU and size-aware DFL into a single objective, improving robustness and overall accuracy for small objects. Comprehensive experiments show that WEYOLO improves precision and recall over the baseline by 3.2%/4.2% on VisDrone and 2.6%/9.7% on TT100K; on AI-TOD, it achieves 47.5% mAP@0.5 and 21.3% mAP@0.5:0.95. Meanwhile, it reduces the parameter count by 60%, achieving a strong accuracy-efficiency balance for practical aerial sensing deployment.
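
The lifting form of the Haar transform, which the FaLH backbone's name refers to, can be written as a split step, a predict step, and an update step. The 1D sketch below uses an unnormalized variant and hypothetical function names; how WEYOLO extends this to 2D feature maps and makes it frequency-aware is not described in the abstract.

```python
import numpy as np

def haar_lift_forward(x):
    """One lifting step on a 1D signal of even length.

    Split into even and odd samples, predict each odd sample from its even
    neighbour (the residual is the detail), then update the even samples so
    they become pairwise means (the approximation).
    """
    even = x[0::2].astype(float)
    odd = x[1::2].astype(float)
    detail = odd - even              # predict
    approx = even + detail / 2       # update
    return approx, detail

def haar_lift_inverse(approx, detail):
    """Undo the lifting steps in reverse order; reconstruction is exact."""
    even = approx - detail / 2
    odd = detail + even
    x = np.empty(even.size + odd.size)
    x[0::2], x[1::2] = even, odd
    return x

signal = np.array([2.0, 4.0, 6.0, 6.0, 1.0, 9.0])
approx, detail = haar_lift_forward(signal)
print(approx)   # [3. 6. 5.]
print(detail)   # [2. 0. 8.]
```

The half-length approximation is what a downsampling layer would pass on, and keeping the detail alongside it is what prevents the high-frequency content from being lost.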

(This article belongs to the Section AI Remote Sensing)

26 pages, 6011 KB

Open Access Article

CFADet: A Contextual and Frequency-Aware Detector for Citrus Buds in Complex Orchards Enabling Early Yield Estimation

by Qizong Lu, Lina Yang, Haoyan Yang, Yujian Yuan, Qinghua Lai and Jisen Zhang

Horticulturae 2026, 12(4), 459; https://doi.org/10.3390/horticulturae12040459 - 8 Apr 2026

Viewed by 255

Abstract

Citrus trees exhibit severe alternate bearing, resulting in significant annual yield fluctuations and posing substantial challenges to orchard management planning. Accurate citrus bud counting provides an effective solution by supplying essential data for tree-level and orchard-level yield prediction. However, citrus buds are extremely small (5–10 mm in diameter) and are frequently occluded by leaves during the flowering stage, which makes precise detection highly challenging in complex orchard environments. To address these challenges, this paper proposes a Contextual and Frequency-Aware Detector (CFADet) for robust citrus bud detection. Specifically, an Enhanced Feature Fusion (EFF) module is introduced in the neck to refine multi-scale feature aggregation and strengthen information flow for small targets. A Contextual Boundary Enhancement Module (CBEM) is designed to capture surrounding contextual cues and enhance boundary representation through dimensional interaction and max-pooling operations. To suppress background interference, a Frequency-Aware Module (FAM) is developed to adaptively recalibrate frequency components in the amplitude spectrum, thereby enhancing target features while reducing background noise. In addition, Spatial-to-Depth Convolution (SPDConv) is employed to reconstruct the backbone to preserve fine-grained bud features while reducing model parameters. Experimental results show that CFADet achieves 81.1% precision, 80.9% recall, 81.0% F1-score, and 87.8% mAP, with stable real-time performance on mobile devices in practical orchard scenarios. This study presents a preliminary investigation into robust citrus bud detection in real-world orchard environments and provides a promising technical foundation for intelligent orchard monitoring and early yield estimation, while further validation on larger and more diverse datasets is still required.
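
Recalibrating the amplitude spectrum means rescaling the magnitude of each 2D frequency component while leaving its phase alone. The NumPy sketch below illustrates that operation and also checks that the reported F1-score matches the reported precision and recall. The function name and the fixed weights are assumptions; in the FAM the weights would be learned and input-dependent.

```python
import numpy as np

def recalibrate_amplitude(feature_map, weights):
    """Rescale the amplitude spectrum of an (H, W) map, keeping the phase.

    weights: (H, W) non-negative per-frequency gains (hypothetical here).
    The real part is returned because asymmetric weights can leave a small
    imaginary component.
    """
    spectrum = np.fft.fft2(feature_map)
    amplitude, phase = np.abs(spectrum), np.angle(spectrum)
    recalibrated = amplitude * weights * np.exp(1j * phase)
    return np.fft.ifft2(recalibrated).real

rng = np.random.default_rng(0)
feat = rng.normal(size=(8, 8))
dc_only = np.zeros((8, 8))
dc_only[0, 0] = 1.0                               # keep only the zero-frequency term
smoothed = recalibrate_amplitude(feat, dc_only)
print(np.allclose(smoothed, feat.mean()))         # True: only the mean survives

# Consistency check on the reported metrics: F1 = 2PR / (P + R).
precision, recall = 81.1, 80.9
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 1))  # 81.0
```

Unit weights return the input unchanged, so a learned module of this kind can start from an identity mapping and depart from it only where suppression helps.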

(This article belongs to the Section Fruit Production Systems)

31 pages, 6459 KB

Open Access Article

Cooperative Hybrid Domain Network for Salient Object Detection in Optical Remote Sensing Images

by Yi Gu, Jianhang Zhou and Lelei Yan

Remote Sens. 2026, 18(7), 1087; https://doi.org/10.3390/rs18071087 - 4 Apr 2026

Viewed by 292

Abstract

Salient Object Detection (SOD) in Optical Remote Sensing Images (ORSIs) aims to localize and segment visually prominent objects amidst complex backgrounds and extreme scale variations. However, we observe that current frequency-aware methods typically rely on a naive feature aggregation paradigm, merging frequency and spatial features via simple concatenation, addition, or direct combination. This shallow interaction overlooks the inherent semantic misalignment between the two domains, resulting in feature redundancy and poor boundary delineation. To address this limitation, we propose the Cooperative Hybrid Domain Network (CHDNet), a framework designed to facilitate synergistic cooperation between heterogeneous domains. Specifically, we propose the Cross-Domain Multi-Head Self-Attention (CD-MHSA) mechanism as a semantic bridge following the encoder. It employs a dimension expansion strategy to construct a Unified Interaction Manifold and utilizes a Frequency Anchor Interaction mechanism to achieve precise modulation of spatial textures using global spectral cues. Furthermore, to address the dual challenges of lacking explicit interpretation mechanisms for semantic co-occurrence and the susceptibility of topological structures to fracture in complex scenes during the decoding phase, we design a Multi-Branch Cooperative Decoder (MBCD) comprising three parallel paths: edge semantics, global relations, and reverse correction. This module dynamically integrates these heterogeneous clues through a Cooperative Fusion Strategy, combining explicit global dependency modeling with dual-domain reverse mining. Extensive experiments on multiple benchmark datasets demonstrate that the proposed CHDNet achieves performance superior to state-of-the-art (SOTA) methods.

19 pages, 357 KB

Open Access Data Descriptor

Scrabbling Syllables into Words: Wordlikeness Norms for European Portuguese Auditory Pseudowords

by Ana Paula Soares, Alberto Lema, Diana R. Pereira, Ana Cláudia Rodrigues, Vinicius Canonici and Helena M. Oliveira

Data 2026, 11(4), 76; https://doi.org/10.3390/data11040076 - 3 Apr 2026

Viewed by 303

Abstract

Auditory pseudowords are widely used in psycholinguistics and cognitive neuroscience, but their construction requires control of sublexical familiarity and careful characterization of how acoustic cue manipulations may shift perceived lexical plausibility. Here we introduce the Minho Pseudoword Wordlikeness Ratings (MPWR), the first normative dataset of wordlikeness judgments for European Portuguese (EP) auditory trisyllabic CV pseudowords, and evaluate whether adding a localized F0-based prominence cue modulates wordlikeness beyond distributional familiarity. One hundred and twenty pseudowords were assembled from naturally produced syllables drawn from the Minho Spoken Syllable Pool (MSSP) and recorded under uniform conditions. Each item was implemented in three token types with constant segmental content: a flat baseline and two F0-enhanced versions (+15%) targeting either the penultimate or final syllable. Native EP listeners (N = 101) provided wordlikeness ratings on a 7-point scale. MSSP-derived indices quantified pseudoword syllable familiarity (SWI_All, SWI_N3) and stress-position propensity for the targeted syllable (SPP_marked). Ratings were intentionally low overall yet showed substantial item-to-item variability. F0 enhancement produced a small but reliable decrease in wordlikeness relative to flat tokens, with no reliable difference between penultimate and final targeting positions. SWI_All robustly predicted ratings, whereas SPP_marked added little explanatory value. MPWR provides a practical EP resource for selecting and matching auditory pseudowords using normative wordlikeness ratings and transparent corpus-based descriptors.

(This article belongs to the Section Featured Reviews of Data Science Research)

26 pages, 55794 KB

Open Access Article

Distortion-Aware Routing and Parameter-Shared MoE for Multispectral Remote Sensing Super-Resolution

by Shuo Yang, Shi Chen, Yuxuan Liu and Tianhui Zhang

Sensors 2026, 26(7), 2186; https://doi.org/10.3390/s26072186 - 1 Apr 2026

Viewed by 577

Abstract

Multispectral remote sensing image super-resolution (RSISR) aims to reconstruct high-frequency details while preserving cross-band structural consistency under strict computational budgets. However, real-world satellite imagery exhibits heterogeneous distortions, ranging from band-dependent noise to spatially varying texture degradation, rendering uniform restoration strategies suboptimal. To address these challenges, we propose a unified framework that integrates cue extraction, expert specialization, and efficiency-aware restoration. Specifically, a Distortion-Aware Feature Extractor (DAFE) explicitly encodes distortion cues by synthesizing fixed frequency bases, learnable residual components, lightweight spatial edge representations, and noise proxies. Subsequently, a Distortion-Aware Expert Choice (DAEC) router utilizes these cues to establish distortion-conditioned affinities and performs capacity-constrained, load-balanced expert assignment. Finally, a parameter-shared Mixture-of-Experts (PS-MoE) architecture employs shared expert parameters across spectral bands, augmented by band-wise low-rank adapters, to enable coarse-to-fine restoration with minimal computational overhead. Extensive experiments on the SEN2VENμS and OLI2MSI datasets demonstrate that the proposed method achieves a PSNR of 49.38 dB on SEN2VENμS 2×, 45.91 dB on SEN2VENμS 4×, and 45.94 dB on OLI2MSI 3×. Compared to the strongest baseline for each task, our method yields PSNR improvements of 0.12 dB, 0.10 dB, and 0.09 dB, respectively, while simultaneously reducing FLOPs and parameter counts. These results confirm that explicit distortion modeling and parameter-shared expert specialization provide an effective and computationally efficient solution for multispectral remote sensing image super-resolution.

(This article belongs to the Section Remote Sensors)

25 pages, 626 KB

Open Access Article

Impacting Brand Awareness and Emotions in Retail Consumer Decision-Making Within a Digital Context

by Hiba Jbara, Sam El Nemar, Wael Bakhit, Demetris Vrontis and Alkis Thrassou

Analytics 2026, 5(2), 16; https://doi.org/10.3390/analytics5020016 - 30 Mar 2026

Viewed by 413

Abstract

This study explores the intricate behavioral consumer psychology dynamics of how certain elements—color, price, gender differences, and the concept of the frequency illusion—affect emotions, brand awareness, and consumer decision-making in a digital environment. Going beyond conventional analyses, this study also explores the intersection of sustainable business practices, elucidating the potential for ethical, environmentally conscious, and business-sustainable decision-making. Utilizing a quantitative method and survey data from 207 respondents, this research contributes to a more profound level of understanding of consumer decision-making in the Lebanese retail sector, offering strategic insights for organizations seeking to enhance brand recognition, while aligning with responsible and sustainable practices in today’s dynamic and competitive environment. The study found that psychological cues—color, price, gender differences, and frequency illusion—significantly influence emotions, brand awareness, and consumer decision-making in retail. Future research should examine the tensions in consumer decision-making, where brand awareness and emotional cues can simultaneously facilitate and bias choices, with effects contingent on exposure, demographic characteristics, digital fluency, and cultural context.
