Search Results (643)

Search Parameters:
Keywords = perceptual quality

17 pages, 52010 KB  
Article
VSJE: A Variational-Based Spatial–Spectral Joint Enhancement Method for Underwater Image
by Bing Long, Shuhan Chen, Jingchun Zhou, Dehuan Zhang and Deming Zhang
Oceans 2026, 7(1), 11; https://doi.org/10.3390/oceans7010011 - 30 Jan 2026
Abstract
Underwater imaging suffers from significant degradation due to scattering by suspended particles, selective absorption by the medium, and depth-dependent noise, leading to issues such as contrast reduction, color distortion, and blurring. Existing enhancement methods typically address only one aspect of these problems, relying on unrealistic assumptions of uniform noise, and fail to jointly handle the spatially heterogeneous noise and spectral channel attenuation. To address these challenges, we propose the variational-based spatial–spectral joint enhancement method (VSJE). This method is based on the physical principles of underwater optical imaging and constructs a depth-aware noise heterogeneity model to accurately capture the differences in noise intensity between near and far regions. Additionally, we propose a channel-sensitive adaptive regularization mechanism based on multidimensional statistics to accommodate the spectral attenuation characteristics of the red, green, and blue channels. A unified variational energy function is then formulated to integrate noise suppression, data fidelity, and color consistency constraints within a collaborative optimization framework, where the depth-aware noise model and channel-sensitive regularization serve as the core adaptive components of the variational formulation. This design enables the joint restoration of multidimensional degradation in underwater images by leveraging the variational framework’s capability to balance multiple enhancement objectives in a mathematically rigorous manner. Experimental results using the UIEBD-VAL dataset demonstrate that VSJE achieves a URanker score of 2.4651 and a UICM score of 9.0740, representing a 30.9% improvement over the state-of-the-art method GDCP in the URanker metric—a key indicator for evaluating the overall visual quality of underwater images. VSJE exhibits superior performance in metrics related to color uniformity (UICM), perceptual quality (CNNIQA, PAQ2PIQ), and overall visual ranking (URanker). Full article
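The abstract describes a single energy balancing noise suppression, data fidelity, and color consistency. The paper's exact VSJE formulation is not reproduced here; as a rough illustration of how such a variational enhancement energy can be minimized, the NumPy sketch below runs gradient descent on a toy objective with a data-fidelity term, a depth-weighted smoothness term (standing in for depth-aware noise suppression), and a gray-world color term. The weight map, per-channel fidelity weights, and color target are all assumptions.

```python
# Minimal sketch (not the paper's VSJE energy): gradient descent on a toy
# variational objective with data fidelity, depth-weighted smoothness, and
# a gray-world color-consistency term. All weights/terms are assumptions.
import numpy as np

def grad(u):
    """Forward-difference gradients (dy, dx) with replicated border."""
    dy = np.diff(u, axis=0, append=u[-1:, :])
    dx = np.diff(u, axis=1, append=u[:, -1:])
    return dy, dx

def div(py, px):
    """Discrete divergence, adjoint of the forward differences above."""
    dpy = np.diff(py, axis=0, prepend=py[:1, :])
    dpx = np.diff(px, axis=1, prepend=px[:, :1])
    return dpy + dpx

def enhance(img, depth, lam=(1.0, 0.8, 0.6), gamma=0.05, steps=200, lr=0.2):
    """img: HxWx3 float in [0,1]; depth: HxW proxy (larger = farther).
    lam: per-channel fidelity weights (red attenuates fastest underwater)."""
    w = 0.1 + depth / (depth.max() + 1e-8)        # depth-aware smoothing weight
    target = img.mean()                           # gray-world color target
    u = img.copy()
    for _ in range(steps):
        for c in range(3):
            dy, dx = grad(u[..., c])
            smooth = -div(w * dy, w * dx)         # gradient of 0.5 * sum(w * |grad u|^2)
            fidelity = lam[c] * (u[..., c] - img[..., c])
            color = gamma * (u[..., c].mean() - target)
            u[..., c] -= lr * (fidelity + smooth + color)
    return np.clip(u, 0.0, 1.0)

# Usage with synthetic data:
rng = np.random.default_rng(0)
noisy = np.clip(rng.normal(0.4, 0.1, (64, 64, 3)), 0, 1)
depth = np.tile(np.linspace(0, 1, 64), (64, 1))   # farther toward the right
out = enhance(noisy, depth)
```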

37 pages, 12169 KB  
Article
Perceptual Evaluation of Acoustic Level of Detail in Virtual Acoustic Environments
by Stefan Fichna, Steven van de Par, Bernhard U. Seeber and Stephan D. Ewert
Acoustics 2026, 8(1), 9; https://doi.org/10.3390/acoustics8010009 - 30 Jan 2026
Abstract
Virtual acoustics enables the creation and simulation of realistic and ecologically valid indoor environments vital for hearing research and audiology. For real-time applications, room acoustics simulation requires simplifications. However, the acoustic level of detail (ALOD) necessary to capture all perceptually relevant effects remains unclear. This study examines the impact of varying ALOD in simulations of three real environments: a living room with a coupled kitchen, a pub, and an underground station. ALOD was varied by generating different numbers of image sources for early reflections, or by excluding geometrical room details specific for each environment. Simulations were perceptually evaluated using headphones in comparison to measured, real binaural room impulse responses, or by using loudspeakers. The perceived overall difference, spatial audio quality differences, plausibility, speech intelligibility, and externalization were assessed. A transient pulse, an electric bass, and a speech token were used as stimuli. The results demonstrate that considerable reductions in acoustic level of detail are perceptually acceptable for communication-oriented scenarios. Speech intelligibility was robust across ALOD levels, whereas broadband transient stimuli revealed increased sensitivity to simplifications. High-ALOD simulations yielded plausibility and externalization ratings comparable to real-room recordings under both headphone and loudspeaker reproduction. Full article
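In this study, acoustic level of detail is varied mainly through the number of image sources used for early reflections. The sketch below is a bare-bones shoebox image-source model in which the maximum reflection order acts as the ALOD knob; the room geometry, absorption value, and sample rate are illustrative assumptions, and the geometric-detail reductions used in the study are not modeled.

```python
# Minimal shoebox image-source sketch: the maximum reflection order plays the
# role of the "acoustic level of detail" knob. Geometry, absorption, and fs
# are illustrative assumptions, not the paper's measured rooms.
import itertools
import numpy as np

def image_source_ir(room, src, rcv, order, alpha=0.2, fs=48000, c=343.0, length=0.5):
    Lx, Ly, Lz = room
    ir = np.zeros(int(length * fs))

    def mirror(p, L, n):
        # Image-source coordinate along one axis for mirror index n
        return n * L + (p if n % 2 == 0 else L - p)

    for nx, ny, nz in itertools.product(range(-order, order + 1), repeat=3):
        if abs(nx) + abs(ny) + abs(nz) > order:
            continue
        img = np.array([mirror(src[0], Lx, nx),
                        mirror(src[1], Ly, ny),
                        mirror(src[2], Lz, nz)])
        d = np.linalg.norm(img - np.asarray(rcv))
        delay = int(round(d / c * fs))
        if delay >= ir.size:
            continue
        bounces = abs(nx) + abs(ny) + abs(nz)          # wall reflections
        ir[delay] += (1.0 - alpha) ** bounces / max(d, 1e-3)
    return ir

# Low vs. high acoustic level of detail for the same room:
room, src, rcv = (6.0, 4.0, 2.5), (1.0, 1.0, 1.2), (4.0, 2.5, 1.2)
ir_low = image_source_ir(room, src, rcv, order=1)    # direct sound + 1st-order only
ir_high = image_source_ir(room, src, rcv, order=4)   # many more early reflections
```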

27 pages, 5263 KB  
Article
MDEB-YOLO: A Lightweight Multi-Scale Attention Network for Micro-Defect Detection on Printed Circuit Boards
by Xun Zuo, Ning Zhao, Ke Wang and Jianmin Hu
Micromachines 2026, 17(2), 192; https://doi.org/10.3390/mi17020192 - 30 Jan 2026
Abstract
Defect detection on Printed Circuit Boards (PCBs) constitutes a pivotal component of the quality control system in electronics manufacturing. However, owing to the intricate circuitry structures on PCB surfaces and the characteristics of defects—specifically their minute scale, irregular morphology, and susceptibility to background texture interference—existing generic deep learning models frequently fail to achieve an optimal equilibrium between detection accuracy and inference speed. To address these challenges, this study proposes MDEB-YOLO, a lightweight real-time detection network tailored for PCB micro-defects. First, to enhance the model’s perceptual capability regarding subtle geometric variations along conductive line edges, we designed the Efficient Multi-scale Deformable Attention (EMDA) module within the backbone network. By integrating parallel cross-spatial channel learning with deformable offset networks, this module achieves adaptive extraction of irregular concave–convex defect features while effectively suppressing background noise. Second, to mitigate feature loss of micro-defects during multi-scale transformations, a Bidirectional Residual Multi-scale Feature Pyramid Network (BRM-FPN) is proposed. Utilizing bidirectional weighted paths and residual attention mechanisms, this network facilitates the efficient fusion of multi-view features, significantly enhancing the representation of small targets. Finally, the detection head is reconstructed based on grouped convolution strategies to design the Lightweight Grouped Convolution Head (LGC-Head), which substantially reduces parameter volume and computational complexity while maintaining feature discriminability. The validation results on the PKU-Market-PCB dataset demonstrate that MDEB-YOLO achieves a mean Average Precision (mAP) of 95.9%, an inference speed of 80.6 FPS, and a parameter count of merely 7.11 M. Compared to baseline models, the mAP is improved by 1.5%, while inference speed and parameter efficiency are optimized by 26.5% and 24.5%, respectively; notably, detection accuracy for challenging mouse bite and spur defects increased by 3.7% and 4.0%, respectively. The experimental results confirm that the proposed method outperforms state-of-the-art approaches in both detection accuracy and real-time performance, possessing significant value for industrial applications. Full article
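The abstract attributes much of the parameter saving to a detection head rebuilt from grouped convolutions (LGC-Head), but does not give the layer layout. The PyTorch module below is only a generic grouped-convolution head sketch, with channel counts, group count, and the class/box split as assumptions; it illustrates why grouping cuts the 3x3 convolution's parameters by roughly the group factor.

```python
# Generic grouped-convolution detection head sketch (layer sizes, group count,
# and class/box split are assumptions; not the paper's exact LGC-Head).
import torch
import torch.nn as nn

class GroupedConvHead(nn.Module):
    def __init__(self, in_ch=256, num_classes=6, num_anchors=1, groups=4):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(in_ch, in_ch, 3, padding=1, groups=groups, bias=False),
            nn.BatchNorm2d(in_ch),
            nn.SiLU(),
            nn.Conv2d(in_ch, in_ch, 1, bias=False),   # 1x1 conv re-mixes the groups
            nn.BatchNorm2d(in_ch),
            nn.SiLU(),
        )
        self.cls = nn.Conv2d(in_ch, num_anchors * num_classes, 1)
        self.box = nn.Conv2d(in_ch, num_anchors * 4, 1)

    def forward(self, x):
        x = self.stem(x)
        return self.cls(x), self.box(x)

head = GroupedConvHead()
feat = torch.randn(1, 256, 40, 40)                 # one FPN level
cls_out, box_out = head(feat)
# A grouped 3x3 conv holds in_ch*in_ch*9/groups weights versus in_ch*in_ch*9
# for the dense version, i.e. roughly a 1/groups parameter reduction.
print(cls_out.shape, box_out.shape)                # (1, 6, 40, 40) (1, 4, 40, 40)
```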

37 pages, 24380 KB  
Article
Denoising of CT and MRI Images Using Decomposition-Based Curvelet Thresholding and Classical Filtering Techniques
by Mahmoud Nasr, Krzysztof Brzostowski, Rafał Obuchowicz and Adam Piórkowski
Appl. Sci. 2026, 16(3), 1335; https://doi.org/10.3390/app16031335 - 28 Jan 2026
Viewed by 83
Abstract
Medical image denoising is crucial for enhancing the diagnostic accuracy of CT and MRI images. This paper presents a modular hybrid framework that combines multiscale decomposition techniques (Empirical Mode Decomposition, Variational Mode Decomposition, Bidimensional EMD, and Multivariate EMD) with curvelet transform thresholding and traditional spatial filters. The methodology was assessed using a phantom dataset containing regulated Rician noise, clinical CT images rebuilt with sharp (B50f) and medium (B46f) kernels, and MRI scans obtained at various GRAPPA acceleration factors. In phantom trials, MEMD–Curvelet attained the highest SSIM (0.964) and PSNR (28.35 dB), while preserving commendable perceptual scores (NIQE approximately 7.55, BRISQUE around 38.8). In CT images, VMD–Curvelet and MEMD–Curvelet consistently outperformed classical filters, achieving SSIM values over 0.95 and PSNR values above 28 dB, even with sharp-kernel reconstructions. In MRI datasets, MEMD–Curvelet and BEMD–Curvelet reduced perceptual distortion, decreasing NIQE by up to 15% and BRISQUE by 20% compared to Gaussian and median filtering. Deep learning baselines validated the framework’s competitiveness: BM3D attained high fidelity but necessitated 6.65 s per slice, while DnCNN delivered equivalent SSIM (0.958) with a diminished runtime of 2.33 s. The results indicate that the proposed framework excels at noise reduction and structure preservation across various imaging settings, surpassing independent filtering and transform-only methods. Its versatility and efficiency underscore its potential for therapeutic integration in situations necessitating high-quality denoising under limited acquisition conditions. Full article
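The core operation shared by the hybrid pipelines above is thresholding of transform-domain coefficients. A curvelet toolbox is not shown here; the sketch below uses a 2-D wavelet (PyWavelets) as a stand-in for the directional transform, with a standard MAD noise estimate and universal soft threshold. The threshold rule, wavelet, and level count are common defaults, not the paper's settings.

```python
# Transform-domain soft-thresholding sketch. A 2-D wavelet (PyWavelets) stands
# in for the curvelet transform used in the paper; noise estimate and universal
# threshold are common defaults, not the paper's configuration.
import numpy as np
import pywt

def transform_threshold_denoise(img, wavelet="db4", levels=3):
    coeffs = pywt.wavedec2(img, wavelet, level=levels)
    # Robust noise estimate from the finest diagonal detail band (MAD / 0.6745)
    sigma = np.median(np.abs(coeffs[-1][-1])) / 0.6745
    thr = sigma * np.sqrt(2.0 * np.log(img.size))      # universal threshold
    new_coeffs = [coeffs[0]]                            # keep approximation band
    for details in coeffs[1:]:
        new_coeffs.append(tuple(pywt.threshold(d, thr, mode="soft")
                                for d in details))
    return pywt.waverec2(new_coeffs, wavelet)[:img.shape[0], :img.shape[1]]

rng = np.random.default_rng(0)
clean = np.zeros((128, 128)); clean[32:96, 32:96] = 1.0
noisy = clean + rng.normal(0, 0.1, clean.shape)
denoised = transform_threshold_denoise(noisy)
```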
12 pages, 2668 KB  
Article
Spatial-Frequency Fusion Tiny-Transformer for Efficient Image Super-Resolution
by Qiaoyue Man
Appl. Sci. 2026, 16(3), 1284; https://doi.org/10.3390/app16031284 - 27 Jan 2026
Viewed by 79
Abstract
In image super-resolution tasks, methods based on Generative Adversarial Networks (GANs), Transformer models, and diffusion models demonstrate robust global modeling capabilities and outstanding performance. However, their computational costs remain prohibitively high, limiting deployment on resource-constrained devices. Meanwhile, frequency-domain approaches based on convolutional neural networks (CNNs) capture complementary structural information but lack long-range dependencies, resulting in suboptimal perceptual image quality. To overcome these limitations, we propose a micro-Transformer-based architecture. This framework enriches high-frequency image information through wavelet transform-based frequency-domain features, integrates spatio-temporal and frequency-domain cross-feature fusion, and incorporates a discriminator constraint to achieve image super-resolution. Extensive experiments demonstrate that this approach achieves competitive PSNR/SSIM performance while maintaining reasonable computational complexity. Its visual quality and efficiency outperform most existing SR methods. Full article
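The abstract's key idea is enriching high-frequency information with wavelet-domain features and fusing them with spatial features. As a minimal sketch (not the paper's architecture), the block below implements a fixed single-level Haar DWT as stride-2 convolutions and fuses the subbands with a plain spatial branch by concatenation; all channel sizes and the fusion choice are assumptions.

```python
# Sketch of a spatial + wavelet-frequency fusion block. A fixed Haar DWT is
# implemented as stride-2 convolutions; concatenation fusion and the channel
# sizes are assumptions, not the paper's exact design.
import torch
import torch.nn as nn
import torch.nn.functional as F

class HaarDWT(nn.Module):
    """Single-level Haar DWT returning LL, LH, HL, HH stacked on channels."""
    def __init__(self):
        super().__init__()
        ll = torch.tensor([[0.5, 0.5], [0.5, 0.5]])
        lh = torch.tensor([[0.5, 0.5], [-0.5, -0.5]])
        hl = torch.tensor([[0.5, -0.5], [0.5, -0.5]])
        hh = torch.tensor([[0.5, -0.5], [-0.5, 0.5]])
        self.register_buffer("kernels", torch.stack([ll, lh, hl, hh])[:, None])

    def forward(self, x):                          # x: (B, C, H, W)
        b, c, h, w = x.shape
        k = self.kernels.repeat(c, 1, 1, 1)        # one filter bank per channel
        return F.conv2d(x, k, stride=2, groups=c)  # (B, 4*C, H/2, W/2)

class SpatialFreqFusion(nn.Module):
    def __init__(self, ch=32):
        super().__init__()
        self.dwt = HaarDWT()
        self.spatial = nn.Conv2d(3, ch, 3, stride=2, padding=1)
        self.freq = nn.Conv2d(12, ch, 1)           # 4 subbands x 3 input channels
        self.fuse = nn.Conv2d(2 * ch, ch, 3, padding=1)

    def forward(self, x):
        s = self.spatial(x)
        f = self.freq(self.dwt(x))
        return self.fuse(torch.cat([s, f], dim=1))

block = SpatialFreqFusion()
y = block(torch.randn(1, 3, 64, 64))               # -> (1, 32, 32, 32)
```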

20 pages, 4627 KB  
Article
Entropy Subtraction-Supported Residual-Diffusion Framework for Image Super-Resolution
by Honghe Huang, Changbin Shao, Chunlong Hu, Xin Shu and Hualong Yu
Symmetry 2026, 18(1), 193; https://doi.org/10.3390/sym18010193 - 20 Jan 2026
Viewed by 117
Abstract
Diffusion probabilistic models have demonstrated remarkable superiority in SISR. Yet, their multi-step denoising mechanism incurs prohibitive computational overhead, which severely limits real-world deployment. To address this issue, we propose an Entropy Subtraction-Supported Diffusion Denoising framework for image Reconstruction (ESRDF). The core idea is to shift part of the SR burden from the diffusion model to an image Decoder, with a key focus on recovering the symmetric structural correspondence between LR and HR images that is often degraded during downsampling. Specifically, ESRDF’s main branch employs a CNN that performs one-step feature reconstruction, supervised by a novel entropy-matching loss in addition to the conventional reconstruction loss. This loss adopts a patch-wise entropy matching strategy that enforces regional consistency between the True and the predicted images. Building on L1’s focus on pixel-level details and perceptual loss’s grasp of global semantics, region-wise entropy measurement further completes the global alignment of intra-region information structures. Under this framework, the main branch delivers coarse low-frequency content, drastically reducing the workload of the diffusion branch, which now only needs to sparsely refine high-frequency details. Experimental results on multiple benchmark datasets demonstrate that ESRDF achieves shorter model convergence times and higher generation quality with fewer denoising steps, outperforming previous diffusion-based image reconstruction methods. Full article
(This article belongs to the Section Computer)
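The distinctive ingredient described above is the patch-wise entropy-matching loss that enforces regional consistency between predicted and true images. The sketch below only illustrates the quantity being matched: per-patch Shannon entropy from hard histograms and an L1 distance between the two entropy maps. Hard binning is not differentiable, so this is not a drop-in training loss, and the patch and bin sizes are assumptions.

```python
# Sketch of a patch-wise entropy match: Shannon entropy per patch, compared
# with L1. Hard histogram binning is not differentiable, so this illustrates
# the matched quantity rather than a trainable loss; sizes are assumptions.
import numpy as np

def patch_entropy_map(img, patch=16, bins=32):
    """img: HxW float in [0,1]; returns an (H//patch, W//patch) entropy map."""
    h, w = img.shape
    ph, pw = h // patch, w // patch
    ent = np.zeros((ph, pw))
    for i in range(ph):
        for j in range(pw):
            block = img[i * patch:(i + 1) * patch, j * patch:(j + 1) * patch]
            hist, _ = np.histogram(block, bins=bins, range=(0.0, 1.0))
            p = hist / hist.sum()
            p = p[p > 0]
            ent[i, j] = -(p * np.log2(p)).sum()
    return ent

def entropy_matching_distance(pred, target, patch=16, bins=32):
    return np.abs(patch_entropy_map(pred, patch, bins)
                  - patch_entropy_map(target, patch, bins)).mean()

rng = np.random.default_rng(0)
hr = rng.random((128, 128))
sr = np.clip(hr + rng.normal(0, 0.05, hr.shape), 0, 1)
print(entropy_matching_distance(sr, hr))
```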

11 pages, 4436 KB  
Proceeding Paper
SRGAN-Based Deep Learning Framework for Wind Turbine Damage Detection from Sentinel-2 Imagery
by Kübra Çakır, Onur Elma and Murat Kuzlu
Eng. Proc. 2026, 122(1), 19; https://doi.org/10.3390/engproc2026122019 - 19 Jan 2026
Viewed by 130
Abstract
The operational reliability of wind turbines is critical for sustainable energy production in smart grids. This study proposes a remote monitoring approach using perceptually enhanced satellite imagery. Sentinel-2 multispectral data (10 m resolution) has been processed with a Super-Resolution Generative Adversarial Network (SRGAN) to improve visual quality to a perceptual resolution of 30 cm. Although true spatial refinement is not achieved, the sharper structural details enhance classification accuracy. The data set comprises 15,000 images—10,000 SRGAN-enhanced and 5000 augmented through rotation, zoom in, increasing brightness, noise addition, and blurring. A custom Convolutional Neural Network (CNN) has been trained to classify turbines as damaged or intact, achieving 95% accuracy, a 0.99 ROC-AUC, and a 0.95 F1 score. These results demonstrate that perceptually sharpened satellite data can effectively support automated wind turbine damage detection and predictive maintenance. The proposed framework also lays the groundwork for broader real-time and multimodal monitoring and cost-efficient applications in renewable energy systems. Full article
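The dataset construction above augments the SRGAN-enhanced imagery with rotation, zoom, brightness increase, noise addition, and blurring. A torchvision pipeline covering those operations might look like the sketch below; the parameter ranges and the Gaussian-noise lambda are assumptions, not the authors' settings.

```python
# Augmentation pipeline sketch matching the operations listed in the abstract
# (rotation, zoom, brightness, blur, noise). Parameter ranges are assumptions.
import torch
from torchvision import transforms

add_noise = transforms.Lambda(lambda t: (t + 0.02 * torch.randn_like(t)).clamp(0, 1))

augment = transforms.Compose([
    transforms.RandomRotation(degrees=15),
    transforms.RandomResizedCrop(size=224, scale=(0.8, 1.0)),   # zoom-in crop
    transforms.ColorJitter(brightness=(1.0, 1.3)),              # brighten only
    transforms.GaussianBlur(kernel_size=5, sigma=(0.1, 1.5)),
    transforms.ToTensor(),
    add_noise,
])

# Applied to a PIL image `img` loaded from the SRGAN-enhanced set:
# sample = augment(img)   # -> torch.FloatTensor of shape (3, 224, 224)
```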

18 pages, 1024 KB  
Systematic Review
Anxiety-Related Functional Dizziness: A Systematic Review of the Recent Evidence on Vestibular, Cognitive Behavioral, and Integrative Therapies
by Rosario Ferlito, Francesco Cannistrà, Salvatore Giunta, Manuela Pennisi, Carmen Concerto, Maria S. Signorelli, Rita Bella, Maria P. Mogavero, Raffaele Ferri and Giuseppe Lanza
Life 2026, 16(1), 159; https://doi.org/10.3390/life16010159 - 18 Jan 2026
Viewed by 275
Abstract
Background: Functional dizziness and persistent postural-perceptual dizziness (PPPD) involve mutually reinforcing vestibular symptoms and anxiety. Non-pharmacological interventions, such as vestibular rehabilitation therapy (VRT) and cognitive behavioral therapy (CBT), aim to address both mechanisms, yet their overall effectiveness remains unclear. Methods: We systematically examined randomized controlled trials (RCTs) published between 2000 and 2025 that evaluated VRT, CBT, or multimodal approaches for adults with functional or chronic dizziness (including PPPD and related functional dizziness constructs) accompanied by significant anxiety. Twelve RCTs (513 participants) met the criteria, involving individuals with PPPD, chronic subjective dizziness, chronic vestibular disorders with prominent anxiety, and residual dizziness after benign paroxysmal positional vertigo. Results: Conventional VRT delivered in clinic or as structured home-based programs produced small-to-moderate improvements in dizziness-related disability versus usual care. Combining VRT with CBT or psychologically informed components yielded larger and more consistent reductions in disability and maladaptive dizziness-related beliefs. CBT-based interventions reduced anxiety and dizziness-related distress compared with supportive controls. Emerging modalities, including virtual-reality-based VRT, non-invasive neuromodulation, and heart-rate-variability biofeedback, showed potential, although they were limited by small samples and methodological issues. Most trials had some risk-of-bias concerns and evidence certainty ranged from very low to moderate. Conclusions: Integrated multimodal rehabilitation shows promise, although larger, high-quality RCTs using standardized procedures and outcome measures are required. Full article

21 pages, 4290 KB  
Article
Information Modeling of Asymmetric Aesthetics Using DCGAN: A Data-Driven Approach to the Generation of Marbling Art
by Muhammed Fahri Unlersen and Hatice Unlersen
Information 2026, 17(1), 94; https://doi.org/10.3390/info17010094 - 15 Jan 2026
Viewed by 363
Abstract
Traditional Turkish marbling (Ebru) art is an intangible cultural heritage characterized by highly asymmetric, fluid, and non-reproducible patterns, making its long-term preservation and large-scale dissemination challenging. It is highly sensitive to environmental conditions, making it enormously difficult to mass produce while maintaining its original aesthetic qualities. A data-driven generative model is therefore required to create unlimited, high-fidelity digital surrogates that safeguard this UNESCO heritage against physical loss and enable large-scale cultural applications. This study introduces a deep generative modeling framework for the digital reconstruction of traditional Turkish marbling (Ebru) art using a Deep Convolutional Generative Adversarial Network (DCGAN). A dataset of 20,400 image patches, systematically derived from 17 original marbling works, was used to train the proposed model. The framework aims to mathematically capture the asymmetric, fluid, and stochastic nature of Ebru patterns, enabling the reproduction of their aesthetic structure in a digital medium. The generated images were evaluated using multiple quantitative and perceptual metrics, including Fréchet Inception Distance (FID), Kernel Inception Distance (KID), Learned Perceptual Image Patch Similarity (LPIPS), and PRDC-based indicators (Precision, Recall, Density, Coverage). For experimental validation, the proposed DCGAN framework is additionally compared against a Vanilla GAN baseline trained under identical conditions, highlighting the advantages of convolutional architectures for modeling marbling textures. The results show that the DCGAN model achieved a high level of realism and diversity without mode collapse or overfitting, producing images that were perceptually close to authentic marbling works. In addition to the quantitative evaluation, expert qualitative assessment by a traditional Ebru artist confirmed that the model reproduced the organic textures, color dynamics, and compositional asymmetrical characteristic of real marbling art. The proposed approach demonstrates the potential of deep generative models for the digital preservation, dissemination, and reinterpretation of intangible cultural heritage recognized by UNESCO. Full article
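The generator follows the standard DCGAN recipe (transposed convolutions, batch normalization, ReLU, tanh output). The sketch below is that standard recipe, not the paper's exact configuration: the latent size, channel widths, and 64x64 patch resolution are assumptions.

```python
# Standard DCGAN generator sketch (latent size, channel widths, and the 64x64
# patch resolution are assumptions; the paper's configuration may differ).
import torch
import torch.nn as nn

class Generator(nn.Module):
    def __init__(self, z_dim=100, base=64, out_ch=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(z_dim, base * 8, 4, 1, 0, bias=False),      # 4x4
            nn.BatchNorm2d(base * 8), nn.ReLU(True),
            nn.ConvTranspose2d(base * 8, base * 4, 4, 2, 1, bias=False),   # 8x8
            nn.BatchNorm2d(base * 4), nn.ReLU(True),
            nn.ConvTranspose2d(base * 4, base * 2, 4, 2, 1, bias=False),   # 16x16
            nn.BatchNorm2d(base * 2), nn.ReLU(True),
            nn.ConvTranspose2d(base * 2, base, 4, 2, 1, bias=False),       # 32x32
            nn.BatchNorm2d(base), nn.ReLU(True),
            nn.ConvTranspose2d(base, out_ch, 4, 2, 1, bias=False),         # 64x64
            nn.Tanh(),
        )

    def forward(self, z):                        # z: (B, z_dim, 1, 1)
        return self.net(z)

g = Generator()
fake_patches = g(torch.randn(16, 100, 1, 1))     # (16, 3, 64, 64) synthetic patches
```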

13 pages, 246 KB  
Article
Effectiveness of Group Voice Therapy in Teachers with Hyperfunctional Voice Disorder
by Nataša Prebil, Rozalija Kušar, Maja Šereg Bahar and Irena Hočevar Boltežar
Clin. Pract. 2026, 16(1), 16; https://doi.org/10.3390/clinpract16010016 - 14 Jan 2026
Viewed by 172
Abstract
Background/Objectives: The aim of this study was to assess the short-term and long-term effectiveness of group voice therapy in changing vocal behaviour and improving voice quality (VQ) among teachers with hyperfunctional voice disorders (HFVD), using both subjective and objective measures. Methods: Thirty-one teachers participated in a structured group voice therapy programme. Participants underwent videoendostroboscopic evaluation of laryngeal morphology and function, perceptual assessment of voice, acoustic analysis of voice samples, and aerodynamic measurements of phonation. Patients’ self-assessment of VQ and its impact on quality of life were measured using a Visual Analogue Scale (VAS) and the Voice Handicap Index-30 (VHI-30). Evaluations were conducted at four time points: pre-therapy (T0), immediately post-therapy (T1), and at 3-month (T3) and 12-month (T12) follow-up visits. Results: Significant improvement was observed between T0 and T1 in perceptual voice evaluations: grade, roughness, asthenia, strain, loudness, fast speaking rate, as well as in neck muscle tension, shimmer, patients’ most harmful vocal behaviours, VHI-30 scores, patients VQ evaluation, and its impact on quality of life (all p < 0.05). Almost all parameters of subjective and objective voice assessment improved over the 12-month observation period, with the greatest improvement between T0 and T12 (all p < 0.05), indicating lasting reduced laryngeal tension and improved phonatory efficiency. Conclusions: Group voice therapy has been shown to be an effective treatment for teachers with HFVD, leading to significant and long-lasting improvements in perceptual, acoustic, and self-assessment outcomes. Therapy also promoted healthier vocal and lifestyle behaviours, supporting its role as a successful and cost-effective rehabilitation and prevention method for occupational voice disorders. Full article
15 pages, 2396 KB  
Article
A Study on Perception Differences in Sustainable Non-Motorized Transportation Assessment Based on Female Perspectives and Machine Scoring: A Case Study of Changsha
by Ziyun Ye, Jiawei Zhu, Yaming Ren and Jiachuan Wang
Sustainability 2026, 18(2), 810; https://doi.org/10.3390/su18020810 - 13 Jan 2026
Viewed by 287
Abstract
Against the backdrop of rising global carbon emissions, promoting active transportation modes such as walking and cycling has become a key strategy for countries worldwide to meet carbon reduction targets and advance the goals of sustainable development. In China, the concept of low-carbon mobility has gained rapid traction, leading to a significant increase in public demand for non-motorized travel options like walking and cycling. From the perspective of inclusive urban development, gender imbalances in sample representation during design and evaluation processes have contributed to homogenization and a lack of diversity in urban slow-traffic environments. To address this issue, this study adopts a problem-oriented approach. First, we collect street scene images of slow-traffic environments through self-conducted field surveys. Concurrently, we gather satisfaction survey responses from 511 urban residents regarding existing slow-traffic streets, identifying three key environmental evaluation indicators: safety, liveliness, and beauty. Second, an experimental analysis is conducted to compare machine-generated assessments based on self-collected street view data with manual evaluations performed by 27 female participants. The findings reveal significant perceptual differences between genders in the assessment of slow-moving environments, particularly regarding attention to environmental elements, challenges in utilizing non-motorized lanes, and overall environmental satisfaction. Moreover, notable discrepancies are observed between machine scores and manual assessments performed by women. Based on these findings, this study investigates the underlying causes of such perceptual disparities and the mechanisms influencing them. Finally, it proposes female-inclusive strategies aimed at enhancing the quality of slow-traffic environments, thereby addressing the current absence of gender considerations in their design. This research seeks to provide a robust female perspective and empirical evidence to support improvements in the quality of slow-moving environments and to inform strategic advancements in their design. The findings of this study can provide a theoretical and empirical basis for the optimization of gender-inclusive non-motorized transportation environment design, policy formulation, and subsequent interdisciplinary research. Full article
(This article belongs to the Section Environmental Sustainability and Applications)

22 pages, 9200 KB  
Article
Subjectively Preferred Surface Scattering Coefficients in Performance Venues for Traditional Inner Mongolian Instruments
by Shuonan Ni, Xiaoyun Yue, Zifan Xu, Zhongzheng Qu, Da Yang and Xiangdong Zhu
Buildings 2026, 16(2), 324; https://doi.org/10.3390/buildings16020324 - 12 Jan 2026
Viewed by 174
Abstract
At performance venues, a well-recognized factor shaping sound quality is surface scattering. However, how scattering coefficients relate to auditory perception remains underexplored. This study mapped surface scattering coefficients to listening preferences under numerous conditions. Specifically, it used traditional Mongolian instruments in two simulated environments: a theater-type space and a rectangular performance space. Impulse responses were generated under four scattering coefficients (0.1, 0.3, 0.6, and 0.9) and convolved with dry recordings to produce experimental audio samples. Forty-eight participants of varying musical expertise completed paired-comparison listening tests to identify preferred coefficients. The results showed that a scattering coefficient of 0.6 consistently yielded the highest preference across spatial, surface, listener, and tempo variations. Side-wall scattering had a stronger perceptual impact than ceiling scattering, and listener expertise significantly influenced preference. Non-professionals favored lower scattering values, while instrumental specialists preferred moderate-to-high diffusion. This study provides empirical evidence and design guidance for optimizing acoustic diffusion in theaters and auditoriums. Full article
(This article belongs to the Section Building Energy, Physics, Environment, and Systems)
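The stimulus-generation step described above, convolving a dry recording with the impulse response simulated for each scattering coefficient, is sketched below with SciPy. The signals, file handling, and peak normalization are placeholders, not the study's actual data.

```python
# Sketch of producing listening-test stimuli: convolve a dry recording with the
# impulse response simulated for each scattering coefficient. Signals and peak
# normalization are placeholders, not the study's data handling.
import numpy as np
from scipy.signal import fftconvolve

def render_stimulus(dry, ir):
    wet = fftconvolve(dry, ir, mode="full")
    return wet / (np.max(np.abs(wet)) + 1e-12)      # peak-normalize for playback

# Placeholder signals standing in for a dry instrument recording and the
# impulse responses simulated at scattering coefficients 0.1/0.3/0.6/0.9:
fs = 48000
dry = np.random.default_rng(0).normal(size=2 * fs)
irs = {s: np.exp(-np.arange(fs) / (0.3 * fs)) * np.random.default_rng(1).normal(size=fs)
       for s in (0.1, 0.3, 0.6, 0.9)}
stimuli = {s: render_stimulus(dry, ir) for s, ir in irs.items()}
```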

32 pages, 1010 KB  
Article
A Quantum OFDM Framework for Next-Generation Video Transmission over Noisy Channels
by Udara Jayasinghe and Anil Fernando
Electronics 2026, 15(2), 284; https://doi.org/10.3390/electronics15020284 - 8 Jan 2026
Viewed by 170
Abstract
Quantum communication presents new opportunities for overcoming the limitations of classical wireless systems, particularly those associated with noise, fading, and interference. Building upon the principles of classical orthogonal frequency division multiplexing (OFDM), this work proposes a quantum OFDM architecture tailored for video transmission. In the proposed system, video sequences are first compressed using the versatile video coding (VVC) standard with different group of pictures (GOP) sizes. Each GOP size is processed through a channel encoder and mapped to multi-qubit states with various qubit configurations. The quantum-encoded data is converted from serial-to-parallel form and passed through the quantum Fourier transform (QFT) to generate mutually orthogonal quantum subcarriers. Following reserialization, a cyclic prefix is appended to mitigate inter-symbol interference within the quantum channel. At the receiver, the cyclic prefix is removed, and the signal is restored to parallel before the inverse QFT (IQFT) recovers the original quantum subcarriers. Quantum decoding, classical channel decoding, and VVC reconstruction are then employed to recover the videos. Experimental evaluations across different GOP sizes and channel conditions demonstrate that quantum OFDM provides superior resilience to channel noise and improved perceptual quality compared to classical OFDM, achieving peak signal-to-noise ratio (PSNR) up to 47.60 dB, structural similarity index measure (SSIM) up to 0.9987, and video multi-method assessment fusion (VMAF) up to 96.40. Notably, the eight-qubit encoding scheme consistently achieves the highest SNR gains across all channels, underscoring the potential of quantum OFDM as a foundation for future high-quality video transmission. Full article
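The quantum encoding and decoding stages cannot be reproduced in a short classical sketch; the NumPy code below only mirrors the OFDM skeleton described above (serial-to-parallel, transform onto orthogonal subcarriers, cyclic prefix, and the inverse at the receiver), with the classical DFT pair standing in for the QFT/IQFT. The modulation, noise level, and subcarrier count are arbitrary assumptions.

```python
# Classical-analogue OFDM skeleton mirroring the abstract's pipeline: S/P,
# transform to orthogonal subcarriers, cyclic prefix, and the inverse at the
# receiver. The DFT pair stands in for the QFT/IQFT; nothing quantum is
# simulated, and the AWGN level is an arbitrary assumption.
import numpy as np

def ofdm_roundtrip(symbols, n_sub=64, cp_len=16, snr_db=20, rng=None):
    if rng is None:
        rng = np.random.default_rng(0)
    pad = (-len(symbols)) % n_sub
    grid = np.concatenate([symbols, np.zeros(pad)]).reshape(-1, n_sub)  # S/P
    tx = np.fft.ifft(grid, axis=1)                    # map onto orthogonal subcarriers
    tx = np.hstack([tx[:, -cp_len:], tx])             # append cyclic prefix
    serial = tx.ravel()
    noise_p = np.mean(np.abs(serial) ** 2) / 10 ** (snr_db / 10)
    rx = serial + np.sqrt(noise_p / 2) * (rng.normal(size=serial.size)
                                          + 1j * rng.normal(size=serial.size))
    rx = rx.reshape(-1, n_sub + cp_len)[:, cp_len:]   # drop CP, back to parallel
    return np.fft.fft(rx, axis=1).ravel()[:len(symbols)]

bits = np.random.default_rng(1).integers(0, 2, 1024)
qpsk = (2 * bits[0::2] - 1) + 1j * (2 * bits[1::2] - 1)   # QPSK mapping
recovered = ofdm_roundtrip(qpsk)
```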

24 pages, 3204 KB  
Article
AMUSE++: A Mamba-Enhanced Speech Enhancement Framework with Bi-Directional and Advanced Front-End Modeling
by Tsung-Jung Li, Berlin Chen and Jeih-Weih Hung
Electronics 2026, 15(2), 282; https://doi.org/10.3390/electronics15020282 - 8 Jan 2026
Viewed by 343
Abstract
This study presents AMUSE++, an advanced speech enhancement framework that extends the MUSE++ model by redesigning its core Mamba module with two major improvements. First, the originally unidirectional one-dimensional (1D) Mamba is transformed into a bi-directional architecture to capture temporal dependencies more effectively. Second, this module is extended to a two-dimensional (2D) structure that jointly models both time and frequency dimensions, capturing richer speech features essential for enhancement tasks. In addition to these structural changes, we propose a Preliminary Denoising Module (PDM) as an advanced front-end, which is composed of multiple cascaded 2D bi-directional Mamba Blocks designed to preprocess and denoise input speech features before the main enhancement stage. Extensive experiments on the VoiceBank+DEMAND dataset demonstrate that AMUSE++ significantly outperforms the backbone MUSE++ across a variety of objective speech enhancement metrics, including improvements in perceptual quality and intelligibility. These results confirm that the combination of bi-directionality, two-dimensional modeling, and an enhanced denoising frontend provides a powerful approach for tackling challenging noisy speech scenarios. AMUSE++ thus represents a notable advancement in neural speech enhancement architectures, paving the way for more effective and robust speech enhancement systems in real-world applications. Full article
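The first change above turns a unidirectional (causal) sequence module into a bi-directional one. A Mamba implementation is not reproduced here; the PyTorch sketch below wraps an arbitrary causal module (a GRU as a stand-in) so that it processes the sequence and its time reversal and merges the two passes with a linear projection. All layer sizes are assumptions.

```python
# Bi-directional wrapper sketch: run a causal sequence module forward and on
# the time-reversed input, then merge the passes. A GRU stands in for the
# Mamba block; layer sizes are assumptions.
import torch
import torch.nn as nn

class BiDirectionalWrapper(nn.Module):
    def __init__(self, dim=128):
        super().__init__()
        self.fwd = nn.GRU(dim, dim, batch_first=True)   # stand-in for a Mamba block
        self.bwd = nn.GRU(dim, dim, batch_first=True)
        self.merge = nn.Linear(2 * dim, dim)

    def forward(self, x):                               # x: (B, T, dim)
        f, _ = self.fwd(x)
        b, _ = self.bwd(torch.flip(x, dims=[1]))        # process reversed sequence
        b = torch.flip(b, dims=[1])                     # re-align to forward time
        return self.merge(torch.cat([f, b], dim=-1))

frames = torch.randn(2, 200, 128)                       # e.g. 200 STFT frames
out = BiDirectionalWrapper()(frames)                    # (2, 200, 128)
```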

22 pages, 1784 KB  
Article
Automated Severity and Breathiness Assessment of Disordered Speech Using a Speech Foundation Model
by Vahid Ashkanichenarlogh, Arman Hassanpour and Vijay Parsa
Information 2026, 17(1), 32; https://doi.org/10.3390/info17010032 - 3 Jan 2026
Viewed by 251
Abstract
In this study, we propose a novel automated model for speech quality estimation that objectively evaluates perceptual dysphonia severity and breathiness in audio samples, demonstrating strong correlation with expert ratings. The proposed model integrates Whisper encoder embeddings with Mel spectrograms augmented by second-order delta features, combining them through a sequential-attention fusion network feature mapping path. This hybrid approach enhances the model’s sensitivity to phonetic content, high-level feature representations, and spectral variations, enabling more accurate predictions of perceptual speech quality. A sequential-attention fusion network feature mapping module captures long-range dependencies through the multi-head attention network, while LSTM layers refine the learned representations by modeling temporal dynamics. Comparative analysis against state-of-the-art methods for dysphonia assessment demonstrates our model’s stronger correlation with clinicians’ judgments across test samples. Our findings underscore the effectiveness of ASR-derived embeddings alongside the deep feature mapping structure in disordered speech quality assessment, offering a promising pathway for advancing automated evaluation systems. Full article
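The fusion path described above (attention over ASR-encoder and spectral features, LSTM refinement, then a scalar quality rating) can be sketched as the PyTorch head below. Feature extraction is not shown: the random tensors stand in for Whisper encoder frames and Mel+delta features, and the dimensions and mean pooling are assumptions.

```python
# Fusion-head sketch: frame-level ASR-encoder embeddings and Mel(+delta) features
# are fused with multi-head attention, refined by an LSTM, and pooled to one
# quality rating. Random tensors, dimensions, and pooling are assumptions.
import torch
import torch.nn as nn

class QualityFusionHead(nn.Module):
    def __init__(self, asr_dim=512, spec_dim=240, d_model=256, heads=4):
        super().__init__()
        self.asr_proj = nn.Linear(asr_dim, d_model)
        self.spec_proj = nn.Linear(spec_dim, d_model)
        self.attn = nn.MultiheadAttention(d_model, heads, batch_first=True)
        self.lstm = nn.LSTM(d_model, d_model, batch_first=True)
        self.score = nn.Linear(d_model, 1)

    def forward(self, asr_feats, spec_feats):      # (B, T, asr_dim), (B, T, spec_dim)
        q = self.asr_proj(asr_feats)
        kv = self.spec_proj(spec_feats)
        fused, _ = self.attn(q, kv, kv)            # cross-attend ASR -> spectral
        refined, _ = self.lstm(fused)
        return self.score(refined.mean(dim=1)).squeeze(-1)  # one rating per utterance

model = QualityFusionHead()
asr = torch.randn(2, 300, 512)       # stand-in for Whisper encoder frames
spec = torch.randn(2, 300, 240)      # stand-in for 80 Mel bins + delta + delta-delta
print(model(asr, spec).shape)        # torch.Size([2])
```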