Search Results (230)

Search Parameters:
Keywords = color space transformation

29 pages, 8082 KB  
Article
CMYD-SurfaceNet: Scale-Aware Cascaded Multimodal MRI Segmentation via Representation-Level Structural Decoupling and Boundary-Constrained Learning
by Chaymae El Mechal, Mostefa Mesbah, Loubna Mazgouti, Fatima Zahra Ammor and Najiba El Amrani El Idrissi
Digital 2026, 6(2), 49; https://doi.org/10.3390/digital6020049 - 16 Jun 2026
Viewed by 209
Abstract
Reliable delineation of brain tumor boundaries in multimodal magnetic resonance imaging (MRI) remains challenging despite substantial advances in deep learning–based segmentation. Although modern encoder–decoder architectures achieve strong volumetric overlap, precise geometric alignment of tumor contours remains inconsistent, particularly for small lesions and heterogeneous clinical cases. In neuro-oncology, even minor boundary deviations may influence surgical planning, radiotherapy targeting, and longitudinal treatment assessment. These limitations suggest that segmentation performance is not determined solely by network depth or loss design, but also by how multimodal information is structured prior to learning. We introduce CMYD-SurfaceNet, a scale-aware cascaded framework that restructures multimodal MRI inputs at the representation level to enhance boundary-sensitive segmentation. Rather than treating modalities as independently concatenated channels, selected sequences are first organized into a task-guided pseudo-RGB projection. This intermediate representation is subsequently transformed into the CMYK color space to disentangle shared luminance structure from modality-specific contrast dominance. To further encode geometric priors, a gradient-derived boundary density channel is incorporated to explicitly emphasize spatial discontinuities corresponding to tumor margins. The resulting CMYD representation is integrated within a two-stage nnU-Net cascade, where global tumor localization is followed by high-resolution region-of-interest refinement with auxiliary contour supervision. This scale-aware design improves sensitivity to small tumor components while stabilizing contour delineation. Extensive evaluation on the BraTS benchmark demonstrates consistent improvements in boundary-sensitive metrics. 
Compared with baseline nnU-Net, the proposed framework reduces HD95 from 3.6 mm to 2.4 mm and increases Surface Dice at 1 mm tolerance from 0.82 to 0.89, while maintaining competitive Dice performance. These findings suggest that representation-level structural decoupling, when combined with scale-aware refinement, may provide clinically relevant boundary-aware multimodal MRI segmentation support without increasing architectural complexity.
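
The abstract above describes mapping a pseudo-RGB image into CMYK and adding a gradient-derived boundary channel. The following is a minimal sketch of those two ingredients only, assuming the common naive RGB-to-CMYK formula and a plain central-difference gradient; the function names are illustrative, and the paper's actual projection, channel layout, and boundary density definition may differ.

```python
def rgb_to_cmyk(r, g, b):
    """Naive RGB -> CMYK conversion for channel values in [0, 1]."""
    k = 1.0 - max(r, g, b)
    if k >= 1.0:
        # Pure black: the chromatic channels are undefined, so use 0.
        return 0.0, 0.0, 0.0, 1.0
    scale = 1.0 - k
    c = (1.0 - r - k) / scale
    m = (1.0 - g - k) / scale
    y = (1.0 - b - k) / scale
    return c, m, y, k


def gradient_magnitude(img):
    """Central-difference gradient magnitude of a 2D list (borders clamped)."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx = (img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]) / 2.0
            gy = (img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]) / 2.0
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out


if __name__ == "__main__":
    print(rgb_to_cmyk(1.0, 0.0, 0.0))
    # A vertical step edge produces non-zero gradient only around the step.
    print(gradient_magnitude([[0, 0, 1, 1], [0, 0, 1, 1]]))
```

In this sketch the K channel carries the shared darkness of all three inputs, while C, M and Y carry what remains per channel, which is one way to read the "shared luminance versus modality-specific contrast" idea in the abstract.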

21 pages, 7368 KB  
Article
IA4CACAO: Deep Learning-Based Classification of Fermented Cocoa Beans (Cut Test Images) in Colombia
by Ariolfo Camacho Velasco, Ramiro S. Avila Chacón, Diego A. Zárate, Lucero G. Rodriguez Silva, German A. Estrada-Bonilla and Cesar A. Vargas
AgriEngineering 2026, 8(6), 206; https://doi.org/10.3390/agriengineering8060206 - 27 May 2026
Viewed by 408
Abstract
Automated and objective grading of cocoa (Theobroma cacao L.) fermentation remains a major challenge because the conventional cut test relies on subjective visual inspection and is difficult to scale. In this study, we develop and evaluate a deep learning pipeline for classifying cocoa bean fermentation levels from expert-annotated cut-test images acquired under controlled conditions, enabling the systematic evaluation and comparison of multiple convolutional and transformer-based architectures under consistent preprocessing, training, and evaluation protocols. The dataset comprises 4347 segmented cocoa bean images distributed across four severely imbalanced classes, namely fermented, under-fermented, slaty, and violet. Representative architectures, including EfficientNet-B0, MobileNetV3-Large, ConvNeXt-XLarge, ViT-Base, and ViT-Large, are benchmarked to analyze the effects of class imbalance, RGB versus HSV color representation, training duration, and label-space formulation. The results show that severe class imbalance strongly degrades performance in direct four-class classification. A hierarchical binary-to-multiclass strategy significantly improves balanced recognition by separating fermented from unfermented beans prior to subclass discrimination, increasing macro-F1 scores from approximately 80–83% to 89–91%. Among the evaluated models, ViT-Base emerges as the most stable architecture across experimental settings and offers the best balance between classification performance, training stability, and computational cost. Although larger models achieve slightly higher peak performance under balanced conditions, ViT-Base provides more consistent results under realistic constraints. The proposed framework enables near-real-time inference on segmented single-bean images and supports objective, reproducible, and scalable fermentation assessment. 
These findings demonstrate that performance in cocoa fermentation grading is determined not only by model capacity, but also by imbalance-aware label-space design and evaluation protocols aligned with real-world cut-test conditions.
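
The abstract reports results as macro-F1 on severely imbalanced classes. As a rough illustration of why that metric is sensitive to minority classes, here is a small stdlib-only sketch of macro-F1; the function name and the toy labels are assumptions for illustration, not the authors' evaluation code.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores over all observed classes."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal length")
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Toy data: 9 majority samples, 1 minority sample, everything predicted
    # as the majority class. Accuracy is 0.9, but macro-F1 is much lower.
    truth = ["fermented"] * 9 + ["slaty"]
    preds = ["fermented"] * 10
    print(macro_f1(truth, preds))
```

Because every class contributes equally to the average, a classifier that ignores a rare class is penalized heavily, which is consistent with the abstract's emphasis on imbalance-aware label-space design.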

20 pages, 13558 KB  
Article
Deep Hybrid Synesthesia Model for Audio-Image Transfer
by Zhaojie Luo, Jiayong Jiang and Ladóczki Bence
Electronics 2026, 15(10), 2218; https://doi.org/10.3390/electronics15102218 - 21 May 2026
Viewed by 328
Abstract
Most artistic expressions are conveyed through images (e.g., painting) and audio (e.g., music), and deep learning has been successfully applied to neural style transfer within each of these modalities. However, there is still a lack of deep models that explicitly learn to transfer style between images and audio. Motivated by synesthesia, which reflects intrinsic connections between vision and hearing in the human brain, we propose a deep hybrid synesthesia model for audio–image style transfer. Our framework consists of two main components: (1) a component conversion module that learns cross-modal mappings between audio rhythm/spectrum and image color/shape in a continuous valence–arousal (VA) emotion space; and (2) a style conversion module that transfers high-level artistic styles between Eastern (ink-wash, shui-mo) and Western painting and their corresponding musical counterparts. We first learn emotion-aware feature networks that align low-level audio and visual components based on shared affective representations, and then model long-term stylistic structures for cross-modal style transfer. Experiments include “seeing the sound” (audio-to-image generation with controllable components) and full audio–image style transformations. Both objective analyses and subjective evaluations suggest that our model can produce cross-modal artworks whose perceived style and emotional content are consistent with human synesthetic impressions.

17 pages, 3484 KB  
Article
Environmental Preference as a Mediator of Streetscape Vitality: A Chain Mediation Model for Landscape Design
by Tiean Zou, Yutong Zhang, Wenbo Duan, Yuhao Liu, Xin Meng, Yuexin Zhang and Xingyuan Fu
Land 2026, 15(5), 846; https://doi.org/10.3390/land15050846 - 14 May 2026
Viewed by 262
Abstract
Environmental perception is an intrinsic driver of spatial vitality and can be expressed in many ways. Given the lack of in-depth research on these perceptions, this study combines theoretical analysis with empirical methods to clarify the role of preference as the common foundation of the different expressions of environmental perception. It also explores how these preference expressions interact along the “quality-to-vitality” pathway, and which environmental characteristics are significant for each, so that landscape design can be translated into urban vitality. Key findings indicate that: (1) three environmental preference expressions (emotion, satisfaction, and behavioral preference) collectively lend credence to a significant chain mediation pathway (“emotion → satisfaction → behavioral preference”) in the quality-to-vitality process; (2) pedestrian safety infrastructure (e.g., traffic barricades, well-maintained pavements) could support perceived security and walking activities; (3) cultural/recreational facilities work together with complementary legibility-enhancing elements (appropriate spatial enclosure, pleasant color schemes, architectural coherence) to evoke positive affect; (4) streetscape diversity and visual interest might mitigate the monotony induced by excessive block length, serving to some degree as catalysts of vitality.

20 pages, 19314 KB  
Article
Haptic and Thermal Rendering of Astronomical Data: A Multimodal Approach to Inclusive Science Communication
by Beatriz García, Johanna Casado and Alexis Mancilla
Multimodal Technol. Interact. 2026, 10(5), 54; https://doi.org/10.3390/mti10050054 - 12 May 2026
Viewed by 415
Abstract
Universal Accessibility in Astronomy requires a paradigm shift from visual-centric communication to multisensory data interaction. Because astronomy communication relies inherently on high-resolution imagery and visual metaphors, it creates significant accessibility barriers for blind and low-vision (BLV) audiences. To address this, multimodal encoding offers a feasible and meaningful solution by redistributing information across alternative sensory channels, ensuring that the absence of sight does not preclude the comprehension of spatial data. This article explores the development and evaluation of a low-cost, multimodal tool designed to represent complex astronomical concepts, specifically stellar magnitude and color, through tactile and auditory stimuli. Unlike traditional methods, our approach focuses on the haptic-cognitive link, allowing users to “feel” data through physical relief models. We present a structured impact study involving a heterogeneous group of blind, low-vision, and sighted participants. The methodology followed a mixed-methods approach, including a participatory workshop with 20 individuals and a detailed usability assessment with a core group (n = 6) of blind and low-vision participants. Preliminary results from this pilot phase demonstrate that multimodal integration effectively reduces the perceived mental effort for complex spatial data comprehension. Quantitative and qualitative feedback suggests that tactile-auditory sensory substitution not only improves accessibility but also enhances engagement and information retention across all user groups. These findings highlight the potential of multimodal models in transforming public scientific environments, such as museums and observatories, into inclusive, interactive spaces.

22 pages, 19098 KB  
Article
Symmetry Analysis of Aesthetic Features for Computational Support in Assessment of Art Learning Outcomes
by Yan Ruan and Xiaofei Li
Symmetry 2026, 18(5), 811; https://doi.org/10.3390/sym18050811 - 9 May 2026
Viewed by 259
Abstract
The assessment of art learning outcomes has long relied on teachers’ subjective judgment, facing challenges such as inconsistent evaluation criteria and difficulty in multi-dimensional quantitative analysis. To address these issues, this study proposes a framework for the automatic assessment of art learning outcomes based on symmetry analysis of multi-dimensional aesthetic features. The model quantifies the symmetry between student works and instructional exemplars across three aesthetic dimensions: color distribution features (HSV color space histograms and dominant color composition), compositional features (visual center distribution and structural symmetry), and art movement style features (multi-layer Gram matrices from VGG-19 with PCA dimensionality reduction). Using publicly available artwork datasets, this study constructed Temporal Evolution Pairs (early and late works by the same artist) and Stylistic Inheritance Pairs (works by different artists within the same movement) to validate the model’s effectiveness. The experimental results demonstrate that the proposed multi-dimensional feature fusion strategy achieves 87.6% accuracy in artist style evolution trajectory recognition and 82.3% accuracy in art movement style inheritance quantification, significantly outperforming baseline methods including SSIM (52.3%), VGG-fc features (68.9%), and single style loss (76.4%). Two in-depth case studies further validate the model’s quantitative capability: in analyzing Picasso’s stylistic evolution, the Mastery Index and the Creativity Divergence Index successfully captured the stylistic continuity of adjacent periods (Blue Period to Rose Period: the Mastery Index = 73.6) and the breakthrough innovation of cross-period transformations (Rose Period to Cubism: the Creativity Divergence Index = 82.7). 
t-SNE visualization of the feature space further revealed that deep style features can clearly distinguish different art movements and individual artists, with spatial distances between artists closely corresponding to stylistic affinities. This research provides a computational framework, with new perspectives and tools, that has potential for use in art education assessment practice. It is important to emphasize that the reported performance demonstrates the model’s ability to quantify stylistic relationships between artworks but does not yet demonstrate its validity for assessing student learning outcomes in real classroom settings. As noted, the current validation is based on art-historical consensus, and direct application to educational contexts will require further validation with actual student works and expert evaluation, which we plan to address in future work.
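
The style features in this abstract are described as multi-layer Gram matrices from VGG-19. As a hedged illustration of what a Gram matrix of a feature map is, the sketch below computes one from a plain list of channel activations; the normalization by the number of spatial positions is an assumption, and the feature values are toy numbers rather than real VGG-19 outputs.

```python
def gram_matrix(features):
    """Gram matrix of a feature map.

    features: list of C channels, each a flat list of N activations
    (a C x H x W map flattened to C x N). Entry (i, j) is the inner
    product of channels i and j, divided by N (assumed normalization).
    """
    if not features or not features[0]:
        raise ValueError("features must contain at least one non-empty channel")
    n = len(features[0])
    if any(len(channel) != n for channel in features):
        raise ValueError("all channels must have the same length")
    return [
        [sum(a * b for a, b in zip(ci, cj)) / n for cj in features]
        for ci in features
    ]


if __name__ == "__main__":
    toy_features = [[1.0, 2.0], [3.0, 4.0]]  # 2 channels, 2 positions
    for row in gram_matrix(toy_features):
        print(row)
```

The matrix discards spatial layout and keeps only channel co-activation statistics, which is why it is commonly used as a style descriptor; comparing two works then reduces to comparing their Gram matrices.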

25 pages, 3616 KB  
Article
Simultaneous Decompositions of Two Sets of Five Quaternion Tensors and Applications in Color Videos Processing
by Zhuo-Heng He, Yu-Fei Jiang, Mei-Ling Deng and Shao-Wen Yu
Mathematics 2026, 14(9), 1558; https://doi.org/10.3390/math14091558 - 5 May 2026
Viewed by 325
Abstract
This paper extends the theory of equivalence canonical forms from quaternion matrices to quaternion tensors under the Einstein product. Motivated by recent results on the simultaneous decomposition of two specific configurations of five quaternion matrices, we establish a comprehensive framework for the corresponding configurations of five quaternion tensors. The core approach leverages bijective transformation maps that establish isomorphisms between quaternion tensor spaces and matrix spaces, allowing us to systematically construct invertible transformation tensors that simultaneously reduce the given tensor quintuples to canonical forms consisting solely of binary entries (0 and 1). A detailed structural analysis of the resulting canonical tensor forms is provided, including explicit dimension formulas for all identity blocks derived from precise rank conditions. To demonstrate practical utility, we integrate the proposed tensor decomposition with the discrete wavelet transform to construct a color video encryption and decryption system. Experimental results confirm perfect reconstruction (PSNR exceeding 300 dB, SSIM equal to 1) and strong security performance: NPCR of 49.8%, UACI of 49.6%, information entropy of 0.9986 bits per pixel, adjacent pixel correlation below 0.03 in absolute value, and a key space exceeding 2^512. The developed theory significantly extends the existing literature on quaternion tensor decompositions and provides powerful tools for multidimensional signal processing.
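
The security figures quoted above include NPCR (number of pixels change rate) and UACI (unified average changing intensity). The sketch below shows the usual definitions of both for two equally sized single-channel images; it assumes 8-bit pixel values, and the function name and sample data are illustrative rather than taken from the paper.

```python
def npcr_uaci(img_a, img_b, max_value=255):
    """Return (NPCR, UACI) in percent for two equally sized 2D images.

    NPCR: percentage of pixel positions whose values differ.
    UACI: mean absolute difference, relative to max_value, in percent.
    """
    flat_a = [p for row in img_a for p in row]
    flat_b = [p for row in img_b for p in row]
    if len(flat_a) != len(flat_b) or not flat_a:
        raise ValueError("images must be non-empty and the same size")
    n = len(flat_a)
    changed = sum(1 for a, b in zip(flat_a, flat_b) if a != b)
    total_diff = sum(abs(a - b) for a, b in zip(flat_a, flat_b))
    npcr = 100.0 * changed / n
    uaci = 100.0 * total_diff / (max_value * n)
    return npcr, uaci


if __name__ == "__main__":
    cipher_1 = [[0, 255], [10, 10]]
    cipher_2 = [[255, 255], [10, 20]]
    print(npcr_uaci(cipher_1, cipher_2))
```

In an encryption study the two inputs would typically be ciphertexts of plain images that differ in a single pixel; here they are tiny hand-written arrays so that the arithmetic can be checked by hand.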

24 pages, 598 KB  
Article
Color Transformations Resulting in Loss of Performance in Modern Video Compression Software Systems
by Marek Domański, Adam Grzelka and Olgierd Stankiewicz
Information 2026, 17(4), 366; https://doi.org/10.3390/info17040366 - 13 Apr 2026
Viewed by 339
Abstract
Modern video compression is implemented in complex software systems that reuse software modules from various sources. This is particularly evident in experimental software systems designed for researching and standardizing new compression technologies. These systems often incorporate software modules operating in different color spaces. For example, AI-based techniques are often used in video coding experiments. The corresponding software modules often operate on RGB representations, while other modules operate on YCbCr components. In this study, we demonstrate that the quality loss resulting from color transformations is comparable to the respective quantization noise. Consecutive cycles of color transformations do not result in significant additional degradation. However, for image compression, very different results are obtained in different color representations. This aspect must be carefully considered in compression research. This paper supports these considerations with extensive experimental results in the context of ITU Recommendations BT.709 and BT.2020, as well as AVC and HEVC compression.
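
To make the notion of loss from color transformations concrete, the sketch below converts 8-bit RGB to 8-bit limited-range YCbCr with the BT.709 luma coefficients and back, then measures the round-trip error. It is an illustrative assumption about one simple conversion path (no chroma subsampling, no codec), not the experimental setup of the paper, and the helper names are made up for this example.

```python
KR, KB = 0.2126, 0.0722  # BT.709 luma coefficients for R and B
KG = 1.0 - KR - KB


def _to_8bit(value):
    """Clamp to [0, 255] and round to the nearest integer."""
    return int(min(255.0, max(0.0, value)) + 0.5)


def rgb_to_ycbcr709(r8, g8, b8):
    """8-bit full-range RGB -> 8-bit limited-range YCbCr (BT.709)."""
    r, g, b = r8 / 255.0, g8 / 255.0, b8 / 255.0
    y = KR * r + KG * g + KB * b
    cb = (b - y) / (2.0 * (1.0 - KB))
    cr = (r - y) / (2.0 * (1.0 - KR))
    return _to_8bit(16 + 219 * y), _to_8bit(128 + 224 * cb), _to_8bit(128 + 224 * cr)


def ycbcr709_to_rgb(y8, cb8, cr8):
    """Inverse of rgb_to_ycbcr709, up to rounding and clipping."""
    y = (y8 - 16) / 219.0
    cb = (cb8 - 128) / 224.0
    cr = (cr8 - 128) / 224.0
    r = y + 2.0 * (1.0 - KR) * cr
    b = y + 2.0 * (1.0 - KB) * cb
    g = (y - KR * r - KB * b) / KG
    return _to_8bit(255 * r), _to_8bit(255 * g), _to_8bit(255 * b)


def round_trip_error(rgb):
    """Largest per-channel difference after RGB -> YCbCr -> RGB."""
    back = ycbcr709_to_rgb(*rgb_to_ycbcr709(*rgb))
    return max(abs(a - b) for a, b in zip(rgb, back))


if __name__ == "__main__":
    for colour in [(128, 128, 128), (255, 0, 0), (12, 200, 77)]:
        print(colour, rgb_to_ycbcr709(*colour), round_trip_error(colour))
```

The error comes from rounding to integers and from squeezing values into the narrower limited range, so even a mathematically invertible transform is not lossless once both representations are stored at 8 bits.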
(This article belongs to the Special Issue Signal Processing and Machine Learning, 2nd Edition)

17 pages, 2594 KB  
Article
Dunhuang Mural Style Transfer Using Vision Mamba: In-Context Prompting and Physically Motivated HSV Modulation
by Peijun Qin, Long Liu, Hongjuan Wang, Siyuan Ma, Cui Chen, Zixuan Han and Mingzhi Cheng
Electronics 2026, 15(8), 1578; https://doi.org/10.3390/electronics15081578 - 9 Apr 2026
Viewed by 435
Abstract
Digital stylization of Dunhuang murals can support cultural heritage revitalization by transferring their distinctive aesthetics to modern images, but existing methods face practical limitations. Transformer-based models can yield high visual quality, but often at a prohibitive computational cost. In contrast, standard state space models (SSMs) are more efficient but tend to incur issues such as semantic loss, inconsistent stylization, and an undesired coupling between color and structure when processing the complex textures of historical murals. To address these issues, we propose Dh-Mamba, a hierarchical visual Mamba framework tailored for high-fidelity Dunhuang mural style transfer. Dh-Mamba introduces a CrossMamba in-context style injection mechanism. This mechanism prefixes the style token sequence to the content sequence, which enables globally consistent style propagation as a persistent memory and retains linear-time efficiency. We also designed two additional components: a Modulated Style Perception Module (Δt) and an Orthogonal Decoupled HSV Modulator. The former adaptively regulates texture injection based on style complexity. The latter models mineral pigment palettes and mitigates oxidation-related artifacts by disentangling hue, saturation, and value. Experiments on a custom Dunhuang dataset show that Dh-Mamba improves content preservation and produces more natural mural textures than recent state-of-the-art methods; multiple quantitative metrics corroborate these gains. With 20.04 million parameters, Dh-Mamba provides a resource-efficient solution suitable for deployment in resource-constrained terminal applications for cultural heritage preservation.

28 pages, 105542 KB  
Article
Underwater Image Enhancement via HSV-CS Representation and Perception-Driven Adaptive Fusion
by Fengxu Guan, Tong Guo and Yuzhu Zhang
Remote Sens. 2026, 18(7), 986; https://doi.org/10.3390/rs18070986 - 25 Mar 2026
Viewed by 725
Abstract
Underwater images often suffer from color distortion and low contrast, severely limiting the reliability of visual perception systems. Existing methods struggle to balance enhancement quality and computational efficiency. To address this issue, we propose PCF-Net (Perception-driven Color Fusion Network), a lightweight dual-branch network for underwater image enhancement based on a stable HSV-CS (Hue-Saturation-Value with sine–cosine transformation) color-space representation. Specifically, a sine–cosine transformation is introduced to construct a stable HSV-CS color space, effectively avoiding hue discontinuities at boundary regions in conventional HSV representations. To compensate for underwater degradation, a Color-Bias-Aware module and a Value-Confidence module are designed to adaptively correct color distortion and luminance degradation. Furthermore, a lightweight Channel-Spatial Adaptive Gated Fusion module dynamically aggregates features from the RGB and HSV-CS branches in a perception-driven manner. The overall architecture incorporates multi-branch re-parameterizable convolutions, significantly reducing computational cost while preserving strong representational capacity. Extensive experiments on underwater image enhancement benchmarks, including UIEB and RUIE, demonstrate that PCF-Net achieves state-of-the-art performance in terms of PSNR, SSIM, and UIQM, along with visually superior color correction and contrast enhancement. With only 0.17 M parameters, the proposed model runs at 118.6 FPS on an RTX 3090 and 35.3 FPS on a Jetson Orin Nano at a resolution of 512 × 512, making it well suited for resource-constrained real-time underwater vision applications.
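
The hue discontinuity mentioned in the abstract arises because hue is an angle: values just below 1.0 and just above 0.0 describe nearly the same red but are numerically far apart. A minimal sketch of a sine–cosine hue encoding follows, assuming the hue is simply mapped onto the unit circle; the exact HSV-CS definition used by PCF-Net may differ, and `rgb_to_hsv_cs` is a hypothetical name.

```python
import colorsys
import math


def rgb_to_hsv_cs(r, g, b):
    """Map RGB in [0, 1] to (cos(hue), sin(hue), saturation, value).

    Hue is treated as an angle on the unit circle, so hues near 0 and
    near 1 (both reddish) end up close together.
    """
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    angle = 2.0 * math.pi * h
    return math.cos(angle), math.sin(angle), s, v


if __name__ == "__main__":
    # Two almost identical reds on either side of the hue wrap-around.
    red_a = colorsys.hsv_to_rgb(0.99, 1.0, 1.0)
    red_b = colorsys.hsv_to_rgb(0.01, 1.0, 1.0)

    hue_gap = abs(colorsys.rgb_to_hsv(*red_a)[0] - colorsys.rgb_to_hsv(*red_b)[0])
    cs_a, cs_b = rgb_to_hsv_cs(*red_a), rgb_to_hsv_cs(*red_b)
    cs_gap = math.dist(cs_a[:2], cs_b[:2])

    print("raw hue gap:", hue_gap)
    print("sine-cosine gap:", cs_gap)
```

The raw hue values differ by almost the full range, while the encoded pair differs only slightly, which is the continuity property the abstract attributes to the sine–cosine transformation.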
(This article belongs to the Special Issue Deep Learning for Remote Sensing Image Enhancement)

23 pages, 7102 KB  
Article
Detection of Uniform Corrosion in Steel Pipes Using a Mobile Artificial Vision System
by Rafael Antonio Rodríguez Ospino, Cristhian Manuel Durán Acevedo and Jeniffer Katerine Carrillo Gómez
Corros. Mater. Degrad. 2026, 7(1), 21; https://doi.org/10.3390/cmd7010021 - 20 Mar 2026
Viewed by 1163
Abstract
Corrosion in steel pipelines can cause critical failures in industrial systems, while conventional inspection methods such as radiography and ultrasonic testing are costly and require specialized personnel. This study presents a mobile computer vision system for automated corrosion detection inside steel pipes using deep learning-based visual analysis. The proposed system consists of a Raspberry Pi 4-based mobile robot equipped with a high-resolution camera for internal inspection. Acquired images were processed using color-space transformations (RGB–HSV), filtering, and segmentation. Convolutional neural networks and segmentation models, including YOLOv8-seg (instance segmentation) and DeepLabV3 (semantic segmentation), were trained on a custom corrosion image dataset to identify corroded regions. Real-time visualization was implemented via Flask-based video streaming. Experimental results demonstrated high detection accuracy for uniform corrosion, achieving a mean Intersection over Union (mIoU) above 0.98 and a precision of 0.99 with the YOLOv8-seg model. These results indicate that the proposed system enables reliable and automated corrosion inspection, with the potential to reduce inspection costs and improve operational efficiency. Future work will focus on enhancing real-time performance through hardware optimization.
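
Two pieces of this pipeline lend themselves to a short illustration: an RGB-to-HSV threshold that flags rust-like pixels, and the Intersection over Union used to score a predicted mask. The sketch below is a simplified stand-in under stated assumptions; the thresholds are hypothetical values chosen for the example, and the study itself relies on trained segmentation networks rather than fixed thresholds.

```python
import colorsys

# Hypothetical thresholds for rust-like colours; not taken from the paper.
HUE_MAX, SAT_MIN, VAL_MIN = 0.11, 0.40, 0.20


def rust_mask(image):
    """image: rows of (R, G, B) tuples in 0..255 -> rows of booleans."""
    mask = []
    for row in image:
        out = []
        for r, g, b in row:
            h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            out.append(h <= HUE_MAX and s >= SAT_MIN and v >= VAL_MIN)
        mask.append(out)
    return mask


def iou(mask_a, mask_b):
    """Intersection over Union of two equally sized boolean masks."""
    inter = union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            inter += bool(a and b)
            union += bool(a or b)
    # Two empty masks are treated as a perfect match.
    return inter / union if union else 1.0


if __name__ == "__main__":
    # One orange-brown pixel and one neutral gray pixel.
    image = [[(180, 80, 30), (128, 128, 128)]]
    predicted = rust_mask(image)
    reference = [[True, True]]
    print(predicted, iou(predicted, reference))
```

Working in HSV lets the threshold separate colour (hue, saturation) from brightness, which is the usual motivation for converting from RGB before a colour-based segmentation step.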

23 pages, 12225 KB  
Article
Stain-Standardized Deep Learning Framework for Robust Leukocyte Segmentation Across Heterogeneous Cytological Datasets
by Leila Ryma Lazouni, Mourtada Benazzouz, Fethallah Hadjila, Mohammed El Amine Lazouni and Mostafa El Habib Daho
Information 2026, 17(3), 262; https://doi.org/10.3390/info17030262 - 5 Mar 2026
Viewed by 684
Abstract
Accurate leukocyte segmentation remains challenging in automated hematological analysis due to staining variability, heterogeneous imaging conditions, and morphological diversity across cytological datasets, severely limiting deep learning model generalization. This work proposes a dual-module framework designed to achieve stain-invariant and robust leukocyte segmentation. The first module performs explicit stain standardization by combining a VGG-based encoder, a transformer bottleneck, and a convolutional decoder to harmonize diverse inputs toward a Wright–Giemsa reference appearance. The second module introduces a multi-encoder segmentation architecture integrating complementary spatial, leukocyte-specific, and nucleus-focused representations extracted from multiple color spaces. The framework is evaluated on six public and clinical datasets covering multiple staining protocols, magnifications, and imaging scenarios. Experimental results demonstrate consistent high performance, with Dice coefficients exceeding 96% on most datasets and systematic improvements over state-of-the-art methods. Extensive ablation studies confirm the synergistic contributions of stain-standardization and multi-encoder fusion to model robustness and cross-dataset generalization. This framework overcomes stain variability and domain shift, offering a practical tool for automated leukocyte analysis in clinical settings.

26 pages, 30049 KB  
Article
HVIFormer: A Dual-Stage Low-Light Image Enhancement Method Based on HVI Representation
by Yimei Li, Liuhong Luo and Hongjun Li
Appl. Sci. 2026, 16(5), 2450; https://doi.org/10.3390/app16052450 - 3 Mar 2026
Viewed by 778
Abstract
Low-light image enhancement improves the quality of video surveillance and image analysis and, as a result, has long been a hot topic in image processing. However, current research on this topic faces a difficult challenge: effectively suppressing noise while improving brightness and maintaining color consistency, especially in extremely dark scenes, where dark noise amplification, uneven exposure, and color shifts often interact, leading to detail loss and color distortion. To address this issue, we propose a dual-stage low-light enhancement framework based on the HVI (Horizontal/Vertical-Intensity) color space. The low-light image is first mapped to the HVI space, obtaining the intensity component I and the HVI-based feature map, with I being explicitly extracted as an intensity prior. A Transformer-based pre-recovery module is introduced for global dependency modeling, guided by the intensity prior I through an Intensity-Conditioned Block (ICB) for conditional feature interaction. Subsequently, a dual-branch enhancement network utilizes lightweight Complementary Cross-Attention (CCA) blocks for brightness refinement and color denoising. Finally, the enhanced image is remapped to the sRGB color space. The proposed framework decouples global brightness recovery and feature preprocessing from detail enhancement and color refinement, improving stability in extremely dark and high-noise scenarios. Through 18 quantitative and qualitative experiments, we demonstrate that our proposed method achieves superior performance in dark noise suppression and color restoration across multiple low-light datasets.

16 pages, 4072 KB  
Article
SCGViT: A Pseudo-Multimodal Low-Latency Framework for Real-Time Skin Lesion Diagnosis
by Zirui Luo, Chengyu Hou and Haishi Wang
Electronics 2026, 15(4), 845; https://doi.org/10.3390/electronics15040845 - 16 Feb 2026
Viewed by 502
Abstract
To address insufficient medical image feature extraction, limited classification accuracy, and high computational complexity in the automatic diagnosis of skin lesions in edge computing environments, this paper proposes SCGViT, a real-time, pseudo-multimodal, low-latency diagnosis framework based on a vision transformer. The framework is built around three functional objectives: mitigating data imbalance through generative modeling, capturing diverse representations via multi-dimensional perception, and optimizing feature fusion through adaptive refinement. First, Class-Conditioned Generative Adversarial Networks (CGANs) simulate the manifolds of minority-class samples in latent space, achieving a preliminary balance of the data distribution. Second, a branched feature-extraction path is constructed that simulates inversion (INV) and infrared (IR) modes from the original RGB images to achieve multi-dimensional perception. Finally, a cross-attention mechanism performs cross-branch feature aggregation, and a channel-attention mechanism (squeeze and excitation) is embedded for secondary refinement of the mixed global-local features, enhancing the representation of key pathological regions by integrating complementary structural and contrast information. Experimental results on the HAM10000 dataset show an F1 score of 0.973, an inference speed of 304.439 FPS, a parameter count of only 0.524 M, and a computational complexity of only 0.866 GFLOPs, achieving a balance between high accuracy and light weight.

21 pages, 3373 KB  
Article
A Lightweight Fire Detection Framework for Edge Visual Sensors Using Small-Sample Domain Adaptation
by Jie Hu, Ruitong Yao, Qingyuan Yang, Yuning Ding, Long Zhang and Juan Liu
Sensors 2026, 26(4), 1121; https://doi.org/10.3390/s26041121 - 9 Feb 2026
Viewed by 660
Abstract
Addressing the challenges in vision-based sensor networks, this study proposes a novel fire detection framework combining Multi-Feature Fusion and Adaptive Support Vector Machine (A-SVM). First, a high-dimensional feature vector is constructed by fusing HSI color space statistics, Local Binary Pattern (LBP) dynamic textures, and Wavelet Transform shape features. A baseline SVM classifier is then trained on source domain data. Second, to overcome the difficulty of acquiring labeled samples in target domains (e.g., strong daytime interference or low nighttime illumination), a small-sample domain adaptation mechanism is introduced. This mechanism fine-tunes the source model parameters using only a few labeled samples from the target domain via regularization constraints. Experimental results demonstrate that, compared with traditional color thresholding methods and unadapted baseline SVMs, the proposed method increases the F1-score by 19% and 30% in typical daytime and nighttime cross-domain scenarios, respectively. This study effectively achieves low-cost, high-precision, and robust cross-scenario fire detection, making it highly suitable for deployment on resource-constrained edge computing nodes within smart sensor networks.
(This article belongs to the Section Internet of Things)
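
The abstract above mentions HSI color space statistics as one of the fused features but gives no formulas. As a rough illustration only, the sketch below applies the textbook RGB-to-HSI conversion to individual pixels and summarizes a patch by mean hue, saturation, and intensity. The function names `rgb_to_hsi` and `hsi_statistics` and the choice of statistics are assumptions made for this example; they are not taken from the article and should not be read as the authors' feature extractor.

```python
import math

def rgb_to_hsi(r, g, b):
    """Convert one RGB pixel (floats in [0, 1]) to (hue in radians, saturation, intensity).

    Textbook HSI conversion; hypothetical helper, not the article's code.
    """
    intensity = (r + g + b) / 3.0
    if intensity == 0.0:
        return 0.0, 0.0, 0.0  # black: hue and saturation are undefined, use 0
    saturation = 1.0 - min(r, g, b) / intensity
    num = 0.5 * ((r - g) + (r - b))
    # Guard against tiny negative values caused by floating-point rounding.
    den = math.sqrt(max(0.0, (r - g) ** 2 + (r - b) * (g - b)))
    if den == 0.0:
        hue = 0.0  # gray pixel: hue is undefined, use 0 by convention
    else:
        theta = math.acos(max(-1.0, min(1.0, num / den)))
        hue = 2.0 * math.pi - theta if b > g else theta
    return hue, saturation, intensity

def hsi_statistics(pixels):
    """Return [mean hue, mean saturation, mean intensity] for a list of RGB pixels.

    Hue is an angle, so it is averaged with a circular mean.
    """
    hsi = [rgb_to_hsi(*p) for p in pixels]
    n = len(hsi)
    sin_sum = sum(math.sin(h) for h, _, _ in hsi)
    cos_sum = sum(math.cos(h) for h, _, _ in hsi)
    mean_h = math.atan2(sin_sum, cos_sum) % (2.0 * math.pi)
    mean_s = sum(s for _, s, _ in hsi) / n
    mean_i = sum(i for _, _, i in hsi) / n
    return [mean_h, mean_s, mean_i]
```

In a pipeline like the one described, a vector of this kind would presumably be concatenated with texture and shape features before classification; how the article actually combines them is not stated in the abstract.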