Search Results (1,213)

Search Parameters:
Keywords = contrastive fusion

13 pages, 3780 KB  
Article
CT-Based Analysis of Rod Trace Length Changes During Posterior Spinal Correction in Adult Spinal Deformity
by Takumi Takeuchi, Takafumi Iwasaki, Kaito Jinnai, Yosuke Kawano, Kazumasa Konishi, Masahito Takahashi, Hitoshi Kono and Naobumi Hosogane
J. Clin. Med. 2026, 15(2), 778; https://doi.org/10.3390/jcm15020778 (registering DOI) - 18 Jan 2026
Abstract
Background: In adult spinal deformity (ASD) surgery, appropriate rod length determination is crucial, as excessive cranial rod length can lead to skin problems, especially in thin elderly patients if proximal junctional kyphosis (PJK) develops. In adolescent idiopathic scoliosis (AIS), correction is primarily performed in the coronal plane, and rod length changes are relatively predictable. Moreover, PJK is uncommon in AIS, making excess rod length rarely a clinical concern. In contrast, ASD correction involves more complex three-dimensional realignment, including restoration of lumbar lordosis (LL), which makes it challenging to predict postoperative changes in rod trace length (RTL). Furthermore, because PJK occurs more frequently in ASD surgery, appropriate rod length selection becomes clinically important. This study aimed to quantitatively evaluate changes in RTL before and after posterior correction. Method: Thirty patients with ASD who underwent staged lateral lumbar interbody fusion (LLIF) followed by posterior corrective fusion from T9 to the pelvis were retrospectively analyzed. RTL before posterior correction (Pre-RTL) was estimated from the planned screw insertional point on axial CT after LLIF, and postoperative RTL (Post-RTL) was measured from screw head centers on post-operative CT. LL and Cobb angle were assessed before and after posterior correction. Correlations between RTL change and alignment change were evaluated. Results: Postoperative RTL was shortened in all patients, with an average reduction of approximately 16–17 mm. RTL shortening demonstrated significant correlations with LL correction (R = 0.51, p = 0.003) and Cobb angle correction (R = 0.70, p = 0.00001). Greater shortening of RTL was observed on the convex side in patients with preoperative Cobb angle ≥ 10° (p = 0.04). Conclusions: Greater coronal deformity, particularly on the convex side, was associated with increased RTL shortening. These findings suggest that routine preparation of excessively long rods may be unnecessary. Consideration of anticipated RTL shortening may help avoid excessive cranial rod length and potentially reduce the risk of skin complications associated with PJK, particularly in thin elderly patients. Full article
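The reported relationships between rod trace length shortening and alignment correction are plain Pearson correlations. A minimal sketch of that computation is below, assuming paired per-patient measurements; the variable names and values are illustrative, not the study's data.

```python
# Minimal sketch (not the authors' code): Pearson correlation between rod trace
# length (RTL) shortening and Cobb angle correction, as reported in the abstract.
# The arrays are hypothetical example values.
import numpy as np
from scipy import stats

rtl_shortening_mm = np.array([12.0, 18.5, 21.0, 9.5, 16.0])    # Pre-RTL minus Post-RTL
cobb_correction_deg = np.array([8.0, 15.0, 22.0, 5.0, 12.0])   # preoperative minus postoperative Cobb angle

r, p = stats.pearsonr(cobb_correction_deg, rtl_shortening_mm)
print(f"R = {r:.2f}, p = {p:.5f}")
```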

26 pages, 6864 KB  
Article
OCDBMamba: A Robust and Efficient Road Pothole Detection Framework with Omnidirectional Context and Consensus-Based Boundary Modeling
by Feng Ling, Yunfeng Lin, Weijie Mao and Lixing Tang
Sensors 2026, 26(2), 632; https://doi.org/10.3390/s26020632 (registering DOI) - 17 Jan 2026
Abstract
Reliable road pothole detection remains challenging in complex environments, where low contrast, shadows, water films, and strong background textures cause frequent false alarms, missed detections, and boundary instability. Thin rims and adjacent objects further complicate localization, and model robustness often deteriorates across regions and sensor domains. To address these issues, we propose OCDBMamba, a unified and efficient framework that integrates omnidirectional context modeling with consensus-driven boundary selection. Specifically, we introduce the following: (1) an Omnidirectional Channel-Selective Scanning (OCS) mechanism that aggregates long-range structural cues by performing multidirectional scans and channel similarity fusion with cross-directional consistency, capturing comprehensive spatial dependencies at near-linear complexity and (2) a Dual-Branch Consensus Thresholding (DBCT) module that enforces branch-level agreement with sparsity-regulated adaptive thresholds and boundary consistency constraints, effectively preserving true rims while suppressing reflections and redundant responses. Extensive experiments on normal, shadowed, wet, low-contrast, and texture-rich subsets yield 90.7% mAP50, 67.8% mAP50:95, a precision of 0.905, and a recall of 0.812 with 13.1 GFLOPs, outperforming YOLOv11n by 5.4% and 5.6%, respectively. The results demonstrate more stable localization and enhanced robustness under diverse conditions, validating the synergy of OCS and DBCT for practical road inspection and on-vehicle perception scenarios. Full article
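For readers unfamiliar with selective-scan models, the sketch below shows only the generic idea behind multidirectional scanning: flattening a feature map along four traversal orders. The paper's OCS mechanism additionally performs channel-similarity fusion and cross-directional consistency, which is not reproduced here.

```python
# Illustrative sketch only: flatten a feature map along four scan directions,
# the generic pattern behind "omnidirectional" selective scanning.
import torch

def four_direction_scans(x: torch.Tensor) -> torch.Tensor:
    """x: (B, C, H, W) -> (B, 4, C, H*W) sequences scanned in four directions."""
    fwd = x.flatten(2)                    # row-major, left-to-right
    bwd = fwd.flip(-1)                    # reversed row-major
    col = x.transpose(2, 3).flatten(2)    # column-major, top-to-bottom
    col_bwd = col.flip(-1)                # reversed column-major
    return torch.stack([fwd, bwd, col, col_bwd], dim=1)

seqs = four_direction_scans(torch.randn(1, 8, 16, 16))
print(seqs.shape)  # torch.Size([1, 4, 8, 256])
```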
(This article belongs to the Section Intelligent Sensors)

21 pages, 4628 KB  
Article
Effect of Inclined Angles and Contouring Parameters on Upskin Surface Characteristics of Parts Made by Laser Powder-Bed Fusion
by Nismath Valiyakath Vadakkan Habeeb and Kevin Chou
Coatings 2026, 16(1), 119; https://doi.org/10.3390/coatings16010119 - 16 Jan 2026
Viewed by 161
Abstract
Surface finish plays a critical role in the tribological performance of additively manufactured engineering components. In exploring part characteristics in laser powder-bed fusion (L-PBF), this study investigates the effect of contouring strategies on the upskin surface of inclined specimens (30°, 45°, and 60°) made with L-PBF, using post- and pre-contouring strategies with various levels of process parameters. The surface data of fabricated inclined specimens were acquired by white-light interferometry, followed by a quantitative analysis using surface images. The results show that post-contouring leads to better surface finishes, with the lowest Sa of 8.68 µm attained at the highest laser power (195 W) and the slowest scan speed (500 mm/s) on 30°-inclined specimens, likely due to increased remelting and less step-edges. In contrast, pre-contouring produces distinct surface textures on the upskin of L-PBF specimens, resulting in a rougher surface morphology, with a maximum Sa of 33.39 µm also from 30°-inclined specimens at the lowest power (100 W) and the highest speed (2000 mm/s), suggesting an insufficient remelting of surface defects. In comparative analysis, in general, post-contouring yields smoother upskin surfaces, with a 17%–30% reduction in Sa, than those from equivalent pre-contouring conditions, highlighting the potential of scan sequences for optimizing L-PBF to improve the surface finish of inclined structures. Full article
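The Sa values quoted above are areal arithmetic-mean-height roughness. A minimal sketch of that calculation is given below, assuming the height map has already been leveled so the mean plane reduces to the mean height; the data are synthetic, not the interferometry measurements from the study.

```python
# Sketch of areal arithmetic mean height Sa (ISO 25178-style), under the
# assumption of a pre-leveled height map; synthetic data, not study measurements.
import numpy as np

def areal_sa(height_map_um: np.ndarray) -> float:
    """Arithmetic mean of absolute height deviations from the mean (same units as input)."""
    deviations = height_map_um - height_map_um.mean()
    return float(np.abs(deviations).mean())

z = np.random.normal(0.0, 10.0, size=(512, 512))  # synthetic heights in micrometers
print(f"Sa = {areal_sa(z):.2f} um")
```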

24 pages, 5801 KB  
Article
MEANet: A Novel Multiscale Edge-Aware Network for Building Change Detection in High-Resolution Remote Sensing Images
by Tao Chen, Linjin Huang, Wenyi Zhao, Shengjie Yu, Yue Yang and Antonio Plaza
Remote Sens. 2026, 18(2), 261; https://doi.org/10.3390/rs18020261 - 14 Jan 2026
Viewed by 172
Abstract
Remote sensing building change detection (RSBCD) is critical for land surface monitoring and understanding interactions between human activities and the ecological environment. However, existing deep learning-based RSBCD methods often result in mis-detected pixels concentrated around object boundaries, mainly due to ambiguous object shapes and complex spatial distributions. To address this problem, we propose a new Multiscale Edge-Aware change detection Network (MEANet) that accurately locates edge pixels of changed objects and enhances the separability between changed and unchanged pixels. Specifically, a high-resolution feature fusion network is adopted to preserve spatial details while integrating deep semantic information, and a multi-scale supervised contrastive loss (MSCL) is designed to jointly optimize pixel-level discrimination and embedding space separability. To further improve the handling of difficult samples, hard negative sampling is adopted in the contrastive learning process. We conduct comparative experiments on three benchmark datasets. Both visual and quantitative results demonstrate that our new MEANet significantly reduces misclassified pixels at object boundaries and achieves superior detection accuracy compared to existing methods. Especially on the GZ-CD dataset, MEANet improves F1-Score and mIoU by more than 2% compared with ChangeFormer, demonstrating strong robustness in complex scenarios. It is worth noting that the performance of MEANet may still be affected by extremely complex edge textures or highly blurred boundaries. Future work will focus on further improving robustness under such challenges and extending the method to broader RSBCD scenarios. Full article
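As a point of reference for the MSCL described above, the sketch below implements a plain single-scale supervised contrastive loss over pixel embeddings; the multi-scale weighting and the hard negative sampling that the paper adds are omitted, and the exact formulation is an assumption.

```python
# Sketch of a single-scale supervised contrastive loss; not the paper's MSCL.
import torch
import torch.nn.functional as F

def supcon_loss(features: torch.Tensor, labels: torch.Tensor, tau: float = 0.1) -> torch.Tensor:
    """features: (N, D) pixel embeddings; labels: (N,) changed/unchanged class ids."""
    z = F.normalize(features, dim=1)
    sim = z @ z.t() / tau                                       # (N, N) similarity logits
    self_mask = torch.eye(len(z), dtype=torch.bool, device=z.device)
    pos_mask = (labels[:, None] == labels[None, :]) & ~self_mask

    sim = sim.masked_fill(self_mask, float("-inf"))             # exclude self-pairs
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)  # log p(j | i) over all other samples
    pos_log_prob = log_prob.masked_fill(~pos_mask, 0.0)         # keep positives only
    pos_counts = pos_mask.sum(1)
    valid = pos_counts > 0                                      # anchors with at least one positive
    loss = -pos_log_prob.sum(1)[valid] / pos_counts[valid]
    return loss.mean()

loss = supcon_loss(torch.randn(64, 128), torch.randint(0, 2, (64,)))
```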

22 pages, 5744 KB  
Article
MCHB-DETR: An Efficient and Lightweight Inspection Framework for Ink Jet Printing Defects in Semiconductor Packaging
by Yibin Chen, Jiayi He, Zhuohao Shi, Yisong Pan and Weicheng Ou
Micromachines 2026, 17(1), 109; https://doi.org/10.3390/mi17010109 - 14 Jan 2026
Viewed by 147
Abstract
In semiconductor packaging and microelectronic manufacturing, inkjet printing technology is widely employed in critical processes such as conductive line fabrication and encapsulant dot deposition. However, dynamic printing defects, such as missing droplets and splashing can severely compromise circuit continuity and device reliability. Traditional inspection methods struggle to detect such subtle and low-contrast defects. To address this challenge, we propose MCHB-DETR, a novel lightweight defect detection framework based on RT-DETR, aimed at improving product yield in inkjet printing for semiconductor packaging. MCHB-DETR features a lightweight backbone with enhanced multi-level feature extraction capabilities and a hybrid encoder designed to improve cross-scale and multi-frequency feature fusion. Experimental results on our inkjet dataset show a 29.1% reduction in parameters and a 36.7% reduction in FLOPs, along with improvements of 3.1% in mAP@50 and 3.5% in mAP@50:95. These results demonstrate its superior detection performance while maintaining efficient inference, highlighting its strong potential for enhancing yield in semiconductor packaging. Full article
(This article belongs to the Special Issue Emerging Technologies and Applications for Semiconductor Industry)

15 pages, 1527 KB  
Article
Learning Complementary Representations for Targeted Multimodal Sentiment Analysis
by Binfen Ding, Jieyu An and Yumeng Lei
Computers 2026, 15(1), 52; https://doi.org/10.3390/computers15010052 - 13 Jan 2026
Viewed by 108
Abstract
Targeted multimodal sentiment classification is frequently impeded by the semantic sparsity of social media content, where text is brief and context is implicit. Traditional methods that rely on direct concatenation of textual and visual features often fail to resolve the ambiguity of specific targets due to a lack of alignment between modalities. In this paper, we propose the Complementary Description Network (CDNet) to bridge this informational gap. CDNet incorporates automatically generated image descriptions as an additional semantic bridge, in contrast to methods that handle text and images as distinct streams. The framework enhances the input representation by directly translating visual content into text, allowing for more accurate interactions between the opinion target and the visual narrative. We further introduce a complementary reconstruction module that functions as a regularizer, forcing the model to retain deep semantic cues during fusion. Empirical results on the Twitter-2015 and Twitter-2017 benchmarks confirm that CDNet outperforms existing baselines. The findings suggest that visual-to-text augmentation is an effective strategy for compensating for the limited context inherent in short texts. Full article
(This article belongs to the Section AI-Driven Innovations)

24 pages, 3126 KB  
Article
Calibrated Transformer Fusion for Dual-View Low-Energy CESM Classification
by Ahmed A. H. Alkurdi and Amira Bibo Sallow
J. Imaging 2026, 12(1), 41; https://doi.org/10.3390/jimaging12010041 - 13 Jan 2026
Viewed by 159
Abstract
Contrast-enhanced spectral mammography (CESM) provides low-energy images acquired in standard craniocaudal (CC) and mediolateral oblique (MLO) views, and clinical interpretation relies on integrating both views. This study proposes a dual-view classification framework that combines deep CNN feature extraction with transformer-based fusion for breast-side classification using low-energy (DM) images from CESM acquisitions (Normal vs. Tumorous; benign and malignant merged). The evaluation was conducted using 5-fold stratified group cross-validation with patient-level grouping to prevent leakage across folds. The final configuration (Model E) integrates dual-backbone feature extraction, transformer fusion, MC-dropout inference for uncertainty estimation, and post hoc logistic calibration. Across the five held-out test folds, Model E achieved a mean accuracy of 96.88% ± 2.39% and a mean F1-score of 97.68% ± 1.66%. The mean ROC-AUC and PR-AUC were 0.9915 ± 0.0098 and 0.9968 ± 0.0029, respectively. Probability quality was supported by a mean Brier score of 0.0236 ± 0.0145 and a mean expected calibration error (ECE) of 0.0334 ± 0.0171. An ablation study (Models A–E) was also reported to quantify the incremental contribution of dual-view input, transformer fusion, and uncertainty calibration. Within the limits of this retrospective single-center setting, these results suggest that dual-view transformer fusion can provide strong discrimination while also producing calibrated probabilities and uncertainty outputs that are relevant for decision support. Full article
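The abstract names two generic ingredients, MC-dropout inference and post hoc logistic calibration. The sketch below illustrates both in isolation; the model, data shapes, and number of stochastic passes are assumptions, and this is not the authors' pipeline.

```python
# Sketch of MC-dropout inference plus Platt-style post hoc calibration; the model
# and held-out data are placeholders, not the paper's Model E.
import numpy as np
import torch
from sklearn.linear_model import LogisticRegression

@torch.no_grad()
def mc_dropout_probs(model: torch.nn.Module, x: torch.Tensor, passes: int = 20) -> torch.Tensor:
    """Mean sigmoid output over repeated stochastic forward passes (predictive mean)."""
    model.eval()
    for m in model.modules():
        if isinstance(m, torch.nn.Dropout):
            m.train()                      # keep only dropout stochastic; norm layers stay in eval mode
    probs = torch.stack([torch.sigmoid(model(x)) for _ in range(passes)])
    return probs.mean(0)                   # probs.std(0) would give a per-sample uncertainty estimate

def fit_platt_calibrator(val_probs: np.ndarray, val_labels: np.ndarray) -> LogisticRegression:
    """Post hoc calibration: 1-D logistic regression fitted on held-out probabilities."""
    calibrator = LogisticRegression()
    calibrator.fit(val_probs.reshape(-1, 1), val_labels)
    return calibrator
```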

30 pages, 42468 KB  
Article
From “Data Silos” to “Collaborative Symbiosis”: How Digital Technologies Empower Rural Built Environment and Landscapes to Bridge Socio-Ecological Divides: Based on a Comparative Study of the Yuanyang Hani Terraces and Yu Village in Anji
by Weiping Zhang and Yian Zhao
Buildings 2026, 16(2), 296; https://doi.org/10.3390/buildings16020296 - 10 Jan 2026
Viewed by 205
Abstract
Rural areas are currently facing a deepening “social-ecological divide,” where the fragmentation of natural, economic, and cultural data—often trapped in “data silos”—hinders effective systemic governance. To bridge this gap, in this study, the Rural Landscape Information Model (RLIM), an integrative framework designed to reconfigure rural connections through data fusion, process coordination, and performance feedback, is proposed. We validate the framework’s effectiveness through a comparative analysis of two distinct rural archetypes in China: the innovation-driven Yu Village and the heritage-conservation-oriented Hani Terraces. Our results reveal that digital technologies drive distinct empowerment pathways moderated by regional contexts: (1) In the data domain, heterogeneous resources were successfully integrated into the framework in both cases (achieving a Monitoring Coverage > 80%), yet served divergent strategic ends—comprehensive territorial management in Yu Village versus precision heritage monitoring in the Hani Terraces. (2) In the process domain, digital platforms restructured social interactions differently. Yu Village achieved high individual participation (Participation Rate ≈ 0.85) via mobile governance apps, whereas the Hani Terraces relied on cooperative-mediated engagement to bridge the digital divide for elderly farmers. (3) In the performance domain, the interventions yielded contrasting but positive economic-ecological outcomes. Yu Village realized a 25% growth in tourism revenue through “industrial transformation” (Ecology+), while the Hani Terraces achieved a 12% value enhancement by stabilizing traditional agricultural ecosystems (Culture+). This study contributes a verifiable theoretical model and a set of operational tools, demonstrating that digital technologies are not merely instrumental add-ons but catalysts for fostering resilient, collaborative, and context-specific rural socio-ecological systems, ultimately offering scalable governance strategies for sustainable rural revitalization in the digital era. Full article
(This article belongs to the Special Issue Digital Technologies in Construction and Built Environment)

44 pages, 9272 KB  
Systematic Review
Toward a Unified Smart Point Cloud Framework: A Systematic Review of Definitions, Methods, and a Modular Knowledge-Integrated Pipeline
by Mohamed H. Salaheldin, Ahmed Shaker and Songnian Li
Buildings 2026, 16(2), 293; https://doi.org/10.3390/buildings16020293 - 10 Jan 2026
Viewed by 277
Abstract
Reality-capture has made point clouds a primary spatial data source, yet processing and integration limits hinder their potential. Prior reviews focus on isolated phases; by contrast, Smart Point Clouds (SPCs)—augmenting points with semantics, relations, and query interfaces to enable reasoning—received limited attention. This systematic review synthesizes the state-of-the-art SPC terminology and methods to propose a modular pipeline. Following PRISMA, we searched Scopus, Web of Science, and Google Scholar up to June 2025. We included English-language studies in geomatics and engineering presenting novel SPC methods. Fifty-eight publications met eligibility criteria: Direct (n = 22), Indirect (n = 22), and New Use (n = 14). We formalize an operative SPC definition—queryable, ontology-linked, provenance-aware—and map contributions across traditional point cloud processing stages (from acquisition to modeling). Evidence shows practical value in cultural heritage, urban planning, and AEC/FM via semantic queries, rule checks, and auditable updates. Comparative qualitative analysis reveals cross-study trends: higher and more uniform density stabilizes features but increases computation, and hybrid neuro-symbolic classification improves long-tail consistency; however, methodological heterogeneity precluded quantitative synthesis. We distill a configurable eight-module pipeline and identify open challenges in data at scale, domain transfer, temporal (4D) updates, surface exports, query usability, and sensor fusion. Finally, we recommend lightweight reporting standards to improve discoverability and reuse. Full article
(This article belongs to the Section Construction Management, and Computers & Digitization)

32 pages, 12128 KB  
Article
YOLO-SMD: A Symmetrical Multi-Scale Feature Modulation Framework for Pediatric Pneumonia Detection
by Linping Du, Xiaoli Zhu, Zhongbin Luo and Yanping Xu
Symmetry 2026, 18(1), 139; https://doi.org/10.3390/sym18010139 - 10 Jan 2026
Viewed by 164
Abstract
Pediatric pneumonia detection faces the challenge of pathological asymmetry, where immature lung tissues present blurred boundaries and lesions exhibit extreme scale variations (e.g., small viral nodules vs. large bacterial consolidations). Conventional detectors often fail to address these imbalances. In this study, we propose YOLO-SMD, a detection framework built upon a symmetrical design philosophy to enforce balanced feature representation. We introduce three architectural innovations: (1) DySample (Content-Aware Upsampling): To address the blurred boundaries of pediatric lesions, this module replaces static interpolation with dynamic point sampling, effectively sharpening edge details that are typically smoothed out by standard upsamplers; (2) SAC2f (Cross-Dimensional Attention): To counteract background interference, this module enforces a symmetrical interaction between spatial and channel dimensions, allowing the model to suppress structural noise (e.g., rib overlaps) in low-contrast X-rays; (3) SDFM (Adaptive Gated Fusion): To resolve the extreme scale disparity, this unit employs a gated mechanism that symmetrically balances deep semantic features (crucial for large bacterial shapes) and shallow textural features (crucial for viral textures). Extensive experiments on a curated subset of 2611 images derived from the Chest X-ray Pneumonia Dataset demonstrate that YOLO-SMD achieves competitive performance with a focus on high sensitivity, attaining a Recall of 86.1% and an mAP@0.5 of 84.3%, thereby outperforming the state-of-the-art YOLOv12n by 2.4% in Recall under identical experimental conditions. The results validate that incorporating symmetry principles into feature modulation significantly enhances detection robustness in primary healthcare settings. Full article
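The gated mechanism attributed to SDFM follows a common pattern: predict per-position weights and blend two feature maps convexly. The sketch below shows that generic pattern only; the module name, channel sizes, and upsampling choice are hypothetical, not the paper's implementation.

```python
# Illustrative gated fusion of a deep (semantic) and a shallow (texture) feature map.
import torch
import torch.nn as nn

class GatedFusion(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.gate = nn.Sequential(nn.Conv2d(2 * channels, channels, 1), nn.Sigmoid())

    def forward(self, deep: torch.Tensor, shallow: torch.Tensor) -> torch.Tensor:
        # upsample the deep map to the shallow map's spatial size before gating
        deep = nn.functional.interpolate(deep, size=shallow.shape[-2:], mode="nearest")
        g = self.gate(torch.cat([deep, shallow], dim=1))   # per-pixel, per-channel weights in [0, 1]
        return g * deep + (1.0 - g) * shallow              # convex blend of the two branches

fused = GatedFusion(64)(torch.randn(1, 64, 20, 20), torch.randn(1, 64, 40, 40))
print(fused.shape)  # torch.Size([1, 64, 40, 40])
```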
(This article belongs to the Special Issue Symmetry/Asymmetry in Image Processing and Computer Vision)

21 pages, 58532 KB  
Article
Joint Inference of Image Enhancement and Object Detection via Cross-Domain Fusion Transformer
by Bingxun Zhao and Yuan Chen
Computers 2026, 15(1), 43; https://doi.org/10.3390/computers15010043 - 10 Jan 2026
Viewed by 113
Abstract
Underwater vision is fundamental to ocean exploration, yet it is frequently impaired by underwater degradation including low contrast, color distortion and blur, thereby presenting significant challenges for underwater object detection (UOD). Most existing methods employ underwater image enhancement as a preprocessing step to improve visual quality prior to detection. However, image enhancement and object detection are optimized for fundamentally different objectives, and directly cascading them leads to feature distribution mismatch. Moreover, prevailing dual-branch architectures process enhancement and detection independently, overlooking multi-scale interactions across domains and thus constraining the learning of cross-domain feature representation. To overcome these limitations, we propose an underwater cross-domain fusion Transformer detector (UCF-DETR). UCF-DETR jointly leverages image enhancement and object detection by exploiting the complementary information from the enhanced and original image domains. Specifically, an underwater image enhancement module is employed to improve visibility. We then design a cross-domain feature pyramid to integrate fine-grained structural details from the enhanced domain with semantic representations from the original domain. A cross-domain query interaction mechanism is introduced to model inter-domain query relationships, leading to accurate object localization and boundary delineation. Extensive experiments on the challenging DUO and UDD benchmarks demonstrate that UCF-DETR consistently outperforms state-of-the-art methods for UOD. Full article
(This article belongs to the Special Issue Advanced Image Processing and Computer Vision (2nd Edition))

29 pages, 79553 KB  
Article
A2Former: An Airborne Hyperspectral Crop Classification Framework Based on a Fully Attention-Based Mechanism
by Anqi Kang, Hua Li, Guanghao Luo, Jingyu Li and Zhangcai Yin
Remote Sens. 2026, 18(2), 220; https://doi.org/10.3390/rs18020220 - 9 Jan 2026
Viewed by 130
Abstract
Crop classification of farmland is of great significance for crop monitoring and yield estimation. Airborne hyperspectral systems can provide large-format hyperspectral farmland images. However, traditional machine learning-based classification methods rely heavily on handcrafted feature design, resulting in limited representation capability and poor computational efficiency when processing large-format data. Meanwhile, mainstream deep-learning-based hyperspectral image (HSI) classification methods primarily rely on patch-based input methods, where a label is assigned to each patch, limiting the full utilization of hyperspectral datasets in agricultural applications. In contrast, this paper focuses on the semantic segmentation task in the field of computer vision and proposes a novel HSI crop classification framework named All-Attention Transformer (A2Former), which combines CNN and Transformer based on a fully attention-based mechanism. First, a CNN-based encoder consisting of two blocks, the overlap-downsample and the spectral–spatial attention weights block (SSWB) is constructed to extract multi-scale spectral–spatial features effectively. Second, we propose a lightweight C-VIT block to enhance high-dimensional features while reducing parameter count and computational cost. Third, a Transformer-based decoder block with gated-style weighted fusion and interaction attention (WIAB), along with a fused segmentation head (FH), is developed to precisely model global and local features and align semantic information across multi-scale features, thereby enabling accurate segmentation. Finally, a checkerboard-style sampling strategy is proposed to avoid information leakage and ensure the objectivity and accuracy of model performance evaluation. Experimental results on two public HSI datasets demonstrate the accuracy and efficiency of the proposed A2Former framework, outperforming several well-known patch-free and patch-based methods on two public HSI datasets. Full article
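The checkerboard-style sampling strategy mentioned above can be pictured as assigning alternating spatial blocks of the scene to training and testing so neighboring pixels never leak across splits. A minimal sketch under that assumption follows; the block size and assignment rule are illustrative, not taken from the paper.

```python
# Sketch of a checkerboard-style train/test split over a labeled HSI scene.
import numpy as np

def checkerboard_split(height: int, width: int, block: int = 64):
    """Return boolean masks (train, test) that alternate like a checkerboard."""
    rows = (np.arange(height) // block)[:, None]
    cols = (np.arange(width) // block)[None, :]
    train = (rows + cols) % 2 == 0
    return train, ~train

train_mask, test_mask = checkerboard_split(512, 512)
print(train_mask.mean())  # roughly half the pixels fall in training blocks
```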
(This article belongs to the Section Remote Sensing in Agriculture and Vegetation)

31 pages, 17740 KB  
Article
HR-UMamba++: A High-Resolution Multi-Directional Mamba Framework for Coronary Artery Segmentation in X-Ray Coronary Angiography
by Xiuhan Zhang, Peng Lu, Zongsheng Zheng and Wenhui Li
Fractal Fract. 2026, 10(1), 43; https://doi.org/10.3390/fractalfract10010043 - 9 Jan 2026
Viewed by 247
Abstract
Coronary artery disease (CAD) remains a leading cause of mortality worldwide, and accurate coronary artery segmentation in X-ray coronary angiography (XCA) is challenged by low contrast, structural ambiguity, and anisotropic vessel trajectories, which hinder quantitative coronary angiography. We propose HR-UMamba++, a U-Mamba-based framework centered on a rotation-aligned multi-directional state-space scan for modeling long-range vessel continuity across multiple orientations. To preserve thin distal branches, the framework is equipped with (i) a persistent high-resolution bypass that injects undownsampled structural details and (ii) a UNet++-style dense decoder topology for cross-scale topological fusion. On an in-house dataset of 739 XCA images from 374 patients, HR-UMamba++ is evaluated using eight segmentation metrics, fractal-geometry descriptors, and multi-view expert scoring. Compared with U-Net, Attention U-Net, HRNet, U-Mamba, DeepLabv3+, and YOLO11-seg, HR-UMamba++ achieves the best performance (Dice 0.8706, IoU 0.7794, HD95 16.99), yielding a relative Dice improvement of 6.0% over U-Mamba and reducing the deviation in fractal dimension by up to 57% relative to U-Net. Expert evaluation across eight angiographic views yields a mean score of 4.24 ± 0.49/5 with high inter-rater agreement. These results indicate that HR-UMamba++ produces anatomically faithful coronary trees and clinically useful segmentations that can serve as robust structural priors for downstream quantitative coronary analysis. Full article
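For context on the overlap metrics quoted above (Dice 0.8706, IoU 0.7794), the sketch below computes both from binary vessel masks; the smoothing constant is an implementation convenience, not taken from the paper.

```python
# Sketch of Dice and IoU for binary segmentation masks.
import numpy as np

def dice_and_iou(pred: np.ndarray, gt: np.ndarray, eps: float = 1e-7):
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    dice = (2 * inter + eps) / (pred.sum() + gt.sum() + eps)
    iou = (inter + eps) / (np.logical_or(pred, gt).sum() + eps)
    return float(dice), float(iou)

d, j = dice_and_iou(np.random.rand(512, 512) > 0.5, np.random.rand(512, 512) > 0.5)
print(f"Dice = {d:.4f}, IoU = {j:.4f}")
```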

15 pages, 1386 KB  
Article
Symmetry and Asymmetry Principles in Deep Speaker Verification Systems: Balancing Robustness and Discrimination Through Hybrid Neural Architectures
by Sundareswari Thiyagarajan and Deok-Hwan Kim
Symmetry 2026, 18(1), 121; https://doi.org/10.3390/sym18010121 - 8 Jan 2026
Viewed by 136
Abstract
Symmetry and asymmetry are foundational design principles in artificial intelligence, defining the balance between invariance and adaptability in multimodal learning systems. In audio-visual speaker verification, where speech and lip-motion features are jointly modeled to determine whether two utterances belong to the same individual, these principles govern both fairness and discriminative power. In this work, we analyze how symmetry and asymmetry emerge within a gated-fusion architecture that integrates Time-Delay Neural Networks and Bidirectional Long Short-Term Memory encoders for speech, ResNet-based visual lip encoders, and a shared Conformer-based temporal backbone. Structural symmetry is preserved through weight-sharing across paired utterances and symmetric cosine-based scoring, ensuring verification consistency regardless of input order. In contrast, asymmetry is intentionally introduced through modality-dependent temporal encoding, multi-head attention pooling, and a learnable gating mechanism that dynamically re-weights the contribution of audio and visual streams at each timestep. This controlled asymmetry allows the model to rely on visual cues when speech is noisy, and conversely on speech when lip visibility is degraded, yielding adaptive robustness under cross-modal degradation. Experimental results demonstrate that combining symmetric embedding space design with adaptive asymmetric fusion significantly improves generalization, reducing Equal Error Rate (EER) to 3.419% on VoxCeleb-2 test dataset without sacrificing interpretability. The findings show that symmetry ensures stable and fair decision-making, while learnable asymmetry enables modality awareness together forming a principled foundation for next-generation audio-visual speaker verification systems. Full article
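The structural symmetry described above, weight sharing across paired utterances plus a symmetric cosine score, can be sketched as follows. The encoder here is a stand-in for the paper's TDNN/BiLSTM and Conformer stack, and the pooling and feature dimensions are assumptions.

```python
# Sketch of order-invariant verification scoring with a shared (weight-tied) encoder.
import torch
import torch.nn.functional as F

encoder = torch.nn.Sequential(torch.nn.Linear(80, 192), torch.nn.ReLU(), torch.nn.Linear(192, 192))

def verify(feats_a: torch.Tensor, feats_b: torch.Tensor) -> float:
    """feats_*: (T, 80) frame features; returns a similarity score symmetric in its inputs."""
    emb_a = encoder(feats_a).mean(0)   # weight sharing: the same encoder embeds both utterances
    emb_b = encoder(feats_b).mean(0)   # mean pooling stands in for attention pooling
    return F.cosine_similarity(emb_a, emb_b, dim=0).item()  # verify(a, b) == verify(b, a)

print(verify(torch.randn(200, 80), torch.randn(180, 80)))
```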

17 pages, 20645 KB  
Data Descriptor
Multimodal MRI–HSI Synthetic Brain Tissue Dataset Based on Agar Phantoms
by Manuel Villa, Jaime Sancho, Gonzalo Rosa-Olmeda, Aure Enkaoua, Sara Moccia and Eduardo Juarez
Data 2026, 11(1), 12; https://doi.org/10.3390/data11010012 - 8 Jan 2026
Viewed by 213
Abstract
Magnetic resonance imaging (MRI) and hyperspectral imaging (HSI) provide complementary information for image-guided neurosurgery, combining high-resolution anatomical detail with tissue-specific optical characterization. This work presents a novel multimodal phantom dataset specifically designed for MRI–HSI integration. The phantoms reproduce a three-layer tissue structure comprising white matter, gray matter, tumor, and superficial blood vessels, using agar-based compositions that mimic MRI contrasts of the rat brain while providing consistent hyperspectral signatures. The dataset includes two designs of phantoms with MRI, HSI, RGB-D, and tracking acquisitions, along with pixel-wise labels and corresponding 3D models, comprising 13 phantoms in total. The dataset facilitates the evaluation of registration, segmentation, and classification algorithms, as well as depth estimation, multimodal fusion, and tracking-to-camera calibration procedures. By providing reproducible, labeled multimodal data, these phantoms reduce the need for animal experiments in preclinical imaging research and serve as a versatile benchmark for MRI–HSI integration and other multimodal imaging studies. Full article
