Search Results (567)

Search Parameters:
Keywords = voxelized model

23 pages, 4154 KB  
Article
Feasibility Domain Construction and Characterization Method for Intelligent Underground Mining Equipment Integrating ORB-SLAM3 and Depth Vision
by Siya Sun, Xiaotong Han, Hongwei Ma, Haining Yuan, Sirui Mao, Chuanwei Wang, Kexiang Ma, Yifeng Guo and Hao Su
Sensors 2026, 26(3), 966; https://doi.org/10.3390/s26030966 - 2 Feb 2026
Abstract
To address the limited environmental perception capability and the difficulty of achieving consistent and efficient representation of the workspace feasible domain caused by high dust concentration, uneven illumination, and enclosed spaces in underground coal mines, this paper proposes a digital spatial construction and representation method for underground environments by integrating RGB-D depth vision with ORB-SLAM3. First, a ChArUco calibration board with embedded ArUco markers is adopted to perform high-precision calibration of the RGB-D camera, improving the reliability of geometric parameters under weak-texture and non-uniform lighting conditions. On this basis, a “dense–sparse cooperative” OAK-DenseMapper Pro module is further developed; the module improves point-cloud generation using a mathematical projection model, and combines enhanced stereo matching with multi-stage depth filtering to achieve high-quality dense point-cloud reconstruction from RGB-D observations. The dense point cloud is then converted into a probabilistic octree occupancy map, where voxel-wise incremental updates are performed for observed space while unknown regions are retained, enabling a memory-efficient and scalable 3D feasible-space representation. Experiments are conducted in multiple representative coal-mine tunnel scenarios; compared with the original ORB-SLAM3, the number of points in dense mapping increases by approximately 38% on average; in trajectory evaluation on the TUM dataset, the root mean square error, mean error, and median error of the absolute pose error are reduced by 7.7%, 7.1%, and 10%, respectively; after converting the dense point cloud to an octree, the map memory footprint is only about 0.5% of the original point cloud, with a single conversion time of approximately 0.75 s. 
The experimental results demonstrate that, while ensuring accuracy, the proposed method achieves real-time, efficient, and consistent representation of the 3D feasible domain in complex underground environments, providing a reliable digital spatial foundation for path planning, safe obstacle avoidance, and autonomous operation. Full article
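
The point-cloud-to-octree conversion described above can be illustrated in miniature. The sketch below is a flat stand-in for the probabilistic octree in the paper (which uses hierarchical nodes and voxel-wise log-odds updates, not a Python set); the function name `occupancy_voxels` is ours. It shows why quantizing a dense cloud into occupied cells collapses memory: many points map to the same voxel.

```python
import numpy as np

def occupancy_voxels(points, voxel_size):
    """Quantize 3D points to integer voxel indices and keep each occupied
    cell once. Illustrative only: a real octree adds hierarchy and
    per-voxel occupancy probabilities."""
    idx = np.floor(np.asarray(points) / voxel_size).astype(np.int64)
    return {tuple(i) for i in idx}

# Dense cloud: 10,000 points inside a 1 m cube.
rng = np.random.default_rng(0)
cloud = rng.uniform(0.0, 1.0, size=(10_000, 3))

occ = occupancy_voxels(cloud, voxel_size=0.1)  # 10 cm cells -> at most 1000
print(len(occ), "occupied voxels from", len(cloud), "points")
```

Ten thousand points reduce to at most 1000 cells here; the ~0.5% memory footprint reported in the abstract comes from the same many-points-per-voxel effect, plus the octree's compact encoding of free and unknown space.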
20 pages, 1142 KB  
Article
A Cross-Domain Benchmark of Intrinsic and Post Hoc Explainability for 3D Deep Learning Models
by Asmita Chakraborty, Gizem Karagoz and Nirvana Meratnia
J. Imaging 2026, 12(2), 63; https://doi.org/10.3390/jimaging12020063 - 30 Jan 2026
Abstract
Deep learning models for three-dimensional (3D) data are increasingly used in domains such as medical imaging, object recognition, and robotics. At the same time, the use of AI in these domains is increasing, while, due to their black-box nature, the need for explainability has grown significantly. However, the lack of standardized and quantitative benchmarks for explainable artificial intelligence (XAI) in 3D data limits the reliable comparison of explanation quality. In this paper, we present a unified benchmarking framework to evaluate both intrinsic and post hoc XAI methods across three representative 3D datasets: volumetric CT scans (MosMed), voxelized CAD models (ModelNet40), and real-world point clouds (ScanObjectNN). The evaluated methods include Grad-CAM, Integrated Gradients, Saliency, Occlusion, and the intrinsic ResAttNet-3D model. We quantitatively assess explanations using the Correctness (AOPC), Completeness (AUPC), and Compactness metrics, consistently applied across all datasets. Our results show that explanation quality significantly varies across methods and domains, demonstrating that Grad-CAM and intrinsic attention performed best on medical CT scans, while gradient-based methods excelled on voxelized and point-based data. Statistical tests (Kruskal–Wallis and Mann–Whitney U) confirmed significant performance differences between methods. No single approach achieved superior results across all domains, highlighting the importance of multi-metric evaluation. This work provides a reproducible framework for standardized assessment of 3D explainability and comparative insights to guide future XAI method selection. Full article
(This article belongs to the Special Issue Explainable AI in Computer Vision)

16 pages, 6737 KB  
Article
Simulation-Driven Annotation-Free Deep Learning for Automated Detection and Segmentation of Airway Mucus Plugs on Non-Contrast CT Images
by Lucy Pu, Naciye Sinem Gezer, Tong Yu, Zehavit Kirshenboim, Emrah Duman, Rajeev Dhupar and Xin Meng
Bioengineering 2026, 13(2), 153; https://doi.org/10.3390/bioengineering13020153 - 28 Jan 2026
Abstract
Mucus plugs are airway-obstructing accumulations of inspissated secretions frequently observed in obstructive lung diseases (OLDs), including chronic obstructive pulmonary disease (COPD), severe asthma, and cystic fibrosis. Their presence on chest CT is strongly associated with airflow limitation, reduced lung function, and increased mortality, making them emerging imaging biomarkers of disease burden and treatment response. However, manual delineation of mucus plugs is labor-intensive, subjective, and impractical for large cohorts, leading most prior studies to rely on coarse segment-level scoring systems that overlook lesion-level characteristics such as size, extent, and location. Automated plug-level quantification remains challenging due to substantial heterogeneity in plug morphology, overlap in attenuation with adjacent vessels and airway walls on non-contrast CT, and pronounced size imbalance in clinical datasets, which are typically dominated by small distal plugs. To address these challenges, we developed and validated a simulation-driven, annotation-free deep learning framework for automated detection and segmentation of airway mucus plugs on non-contrast chest CT. A total of 200 COPD CT scans were analyzed (98 plug-positive, 83 plug-negative, and 19 uncertain). Synthetic mucus plugs were generated within segmented airways by transferring voxel-intensity statistics from adjacent intrapulmonary vessels, preserving realistic morphology and texture while enabling controlled sampling of plug phenotypes. An nnU-Net trained exclusively on synthetic data (S-Model) was evaluated on an independent, expert-annotated test set and compared with an nnU-Net trained on manual annotations using 10-fold cross-validation (M-Model). The S-Model achieved significantly higher detection performance than the M-Model (sensitivity 0.837 [95% CI: 0.818–0.854] vs. 0.757 [95% CI: 0.737–0.776]; 1.91 false positives per scan vs. 
3.68; p < 0.001), with performance gains most pronounced for medium-to-large plugs (≥6 mm). This simulation-driven approach enables accurate, scalable quantification of mucus plugs without voxel-wise manual annotation in thin-slice (<1.5 mm) non-contrast chest CT scans. While the framework could, in principle, be extended to other annotation-limited medical imaging tasks, its generalizability beyond this COPD cohort and imaging protocol has not yet been established, and future work is required to validate performance across diverse populations and scanning conditions. Full article
(This article belongs to the Special Issue Artificial Intelligence-Based Medical Imaging Processing)

23 pages, 14742 KB  
Article
Grapevine Canopy Volume Estimation from UAV Photogrammetric Point Clouds at Different Flight Heights
by Leilson Ferreira, Pedro Marques, Emanuel Peres, Raul Morais, Joaquim J. Sousa and Luís Pádua
Remote Sens. 2026, 18(3), 409; https://doi.org/10.3390/rs18030409 - 26 Jan 2026
Abstract
Vegetation volume is a useful indicator for assessing canopy structure and supporting vineyard management tasks such as foliar applications and canopy management. The photogrammetric processing of imagery acquired using unmanned aerial vehicles (UAVs) enables the generation of dense point clouds suitable for estimating canopy volume, although point cloud quality depends on spatial resolution, which is influenced by flight height. This study evaluates the effect of three flight heights (30 m, 60 m, and 100 m) on grapevine canopy volume estimation using convex hull, alpha shape, and voxel-based models. UAV-based RGB imagery and field measurements were collected during three periods at different phenological stages in an experimental vineyard. The strongest agreement with field-measured volume occurred at 30 m, where point density was highest. Envelope-based methods showed reduced performance at higher flight heights, while voxel-based grids remained more stable when voxel size was adapted to point density. Estimator behavior also varied with canopy architecture and development. The results indicate appropriate parameter choices for different flight heights and confirm that UAV-based RGB imagery can provide reliable grapevine canopy volume estimates. Full article
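
Of the three estimators compared in this abstract, the voxel-based one is the simplest to state: count distinct occupied voxels and multiply by the cell volume. The sketch below (our own illustrative code, not the authors' pipeline) applies it to a synthetic box-shaped "canopy" whose true volume is known.

```python
import numpy as np

def voxel_volume(points, voxel_size):
    """Estimate occupied volume as
    (number of distinct occupied voxels) * voxel_size**3."""
    idx = np.floor(np.asarray(points) / voxel_size).astype(np.int64)
    occupied = np.unique(idx, axis=0)
    return occupied.shape[0] * voxel_size ** 3

# Synthetic "canopy": dense points filling a 2 m x 1 m x 1.5 m box.
rng = np.random.default_rng(1)
pts = rng.uniform([0, 0, 0], [2.0, 1.0, 1.5], size=(50_000, 3))

vol = voxel_volume(pts, voxel_size=0.1)  # true volume is 3.0 m^3
print(f"estimated volume: {vol:.2f} m^3")
```

The estimate is sensitive to the ratio of voxel size to point density, which is exactly the abstract's finding: at higher flight heights (sparser clouds) a fixed small voxel leaves gaps and underestimates volume, so voxel size must be adapted to point density.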

22 pages, 24291 KB  
Article
AirwaySeekNet: Fine-Grained Segmentation and Completion of Peripheral Pulmonary Airways with Dynamic Reliability-Aware Supervision
by Peng Chen, Jianjun Zhu, Xiaodong Wang, Junchen Xiong, Chichi Li, Tao Han and Du Zhang
AI 2026, 7(2), 40; https://doi.org/10.3390/ai7020040 - 26 Jan 2026
Abstract
Accurate segmentation of the airway tree is crucial for the diagnosis and intervention of pulmonary disease; however, delineating small peripheral airways remains challenging. The small size and complex branching of distal airways, combined with the limitations of CT imaging (partial volume effects, noise), often lead to missed bronchial segments. To address these challenges, we propose AirwaySeekNet, a dual-decoder neural network. The model introduces a Voxel-Selective Supervision (VSS) mechanism, a dynamic reliability-aware strategy that focuses training on uncertain voxels, mitigating annotation bias, and enhancing fine-branch detection. We further incorporate a Signed Distance Field (SDF) loss to enforce tubular shape constraints, improving the boundary delineation and connectivity of the airway tree. In experiments on a pig CT dataset, AirwaySeekNet outperformed state-of-the-art models, achieving higher topological completeness and finer branch detection, and the TD metric increased by 5.55% and the BD metric increased by 8.14%. It maintained high overall segmentation accuracy (Dice), with only a minor increase in false positives from the exploration of the smallest bronchi. Overall, AirwaySeekNet markedly improves airway segmentation accuracy and topology preservation, providing a more complete and reliable mapping of the bronchial tree for clinical applications. Full article

16 pages, 1697 KB  
Article
MSHI-Mamba: A Multi-Stage Hierarchical Interaction Model for 3D Point Clouds Based on Mamba
by Zhiguo Zhou, Qian Wang and Xuehua Zhou
Appl. Sci. 2026, 16(3), 1189; https://doi.org/10.3390/app16031189 - 23 Jan 2026
Abstract
Mamba, based on the state space model (SSM), offers an efficient alternative to the quadratic complexity of attention, showing promise for long-sequence data processing and global modeling in 3D object detection. However, applying it to this domain presents specific challenges: traditional serialization methods can compromise the spatial structure of 3D data, and the standard single-layer SSM design may limit cross-layer feature extraction. To address these issues, this paper proposes MSHI-Mamba, a Mamba-based multi-stage hierarchical interaction architecture for 3D backbone networks. We introduce a cross-layer complementary cross-attention module (C3AM) to mitigate feature redundancy in cross-layer encoding, as well as a bi-shift scanning strategy (BSS) that uses hybrid space-filling curves with shift scanning to better preserve spatial continuity and expand the receptive field during serialization. We also develop a voxel densifying downsampling module (VD-DS) to enhance local spatial information and foreground feature density. Experimental results obtained on the KITTI and nuScenes datasets demonstrate that our approach achieves competitive performance, with a 4.2% improvement in the mAP on KITTI, validating the effectiveness of the proposed components. Full article
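
The serialization problem this abstract refers to is turning sparse 3D voxels into a 1D sequence for an SSM while keeping spatial neighbors close together. The standard building block is a space-filling curve such as the Z-order (Morton) curve; the sketch below shows plain Morton ordering only, as an assumed baseline — the paper's hybrid-curve, bi-shift BSS strategy is more elaborate, and `morton3d` is our name.

```python
def morton3d(x, y, z, bits=10):
    """Interleave the bits of integer voxel coordinates into a Z-order
    (Morton) key; sorting by this key linearizes a 3D grid so that nearby
    cells tend to land near each other in the sequence."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (3 * i)
        code |= ((y >> i) & 1) << (3 * i + 1)
        code |= ((z >> i) & 1) << (3 * i + 2)
    return code

cells = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 1), (2, 0, 0)]
order = sorted(cells, key=lambda c: morton3d(*c))
print(order)
```

Any single fixed curve still splits some spatial neighbors far apart at tile boundaries, which is the weakness that shifted or hybrid scans (as in the BSS of this paper) are designed to mitigate.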

17 pages, 1423 KB  
Article
Residual Motion Correction in Low-Dose Myocardial CT Perfusion Using CNN-Based Deformable Registration
by Mahmud Hasan, Aaron So and Mahmoud R. El-Sakka
Electronics 2026, 15(2), 450; https://doi.org/10.3390/electronics15020450 - 20 Jan 2026
Abstract
Dynamic myocardial CT perfusion imaging enables functional assessment of coronary artery stenosis and myocardial microvascular disease. However, it is susceptible to residual motion artifacts arising from cardiac and respiratory activity. These artifacts introduce temporal misalignments, distorting Time-Enhancement Curves (TECs) and leading to inaccurate myocardial perfusion measurements. Traditional nonrigid registration methods can address such motion but are often computationally expensive and less effective when applied to low-dose images, which are prone to increased noise and structural degradation. In this work, we present a CNN-based motion-correction framework specifically trained for low-dose cardiac CT perfusion imaging. The model leverages spatiotemporal patterns to estimate and correct residual motion between time frames, aligning anatomical structures while preserving dynamic contrast behaviour. Unlike conventional methods, our approach avoids iterative optimization and manually defined similarity metrics, enabling faster, more robust corrections. Quantitative evaluation demonstrates significant improvements in temporal alignment, with reduced Target Registration Error (TRE) and increased correlation between voxel-wise TECs and reference curves. These enhancements enable more accurate myocardial perfusion measurements. Noise from low-dose scans affects registration performance, but this remains an open challenge. This work emphasizes the potential of learning-based methods to perform effective residual motion correction under challenging acquisition conditions, thereby improving the reliability of myocardial perfusion assessment. Full article

35 pages, 4376 KB  
Review
Clinical Image-Based Dosimetry of Actinium-225 in Targeted Alpha Therapy
by Kamo Ramonaheng, Kaluzi Banda, Milani Qebetu, Pryaska Goorhoo, Khomotso Legodi, Tshegofatso Masogo, Yashna Seebarruth, Sipho Mdanda, Sandile Sibiya, Yonwaba Mzizi, Cindy Davis, Liani Smith, Honest Ndlovu, Joseph Kabunda, Alex Maes, Christophe Van de Wiele, Akram Al-Ibraheem and Mike Sathekge
Cancers 2026, 18(2), 321; https://doi.org/10.3390/cancers18020321 - 20 Jan 2026
Abstract
Actinium-225 (225Ac) has emerged as a pivotal alpha-emitter in modern radiopharmaceutical therapy, offering potent cytotoxicity with the potential for precise tumour targeting. Accurate, patient-specific image-based dosimetry for 225Ac is essential to optimize therapeutic efficacy while minimizing radiation-induced toxicity. Establishing a robust dosimetry workflow is particularly challenging due to the complex decay chain, low administered activity, limited count statistics, and the indirect measurement of daughter gamma emissions. Clinical single-photon emission computed tomography/computed tomography protocols with harmonized acquisition parameters, combined with robust volume-of-interest segmentation, artificial intelligence (AI)-driven image processing, and voxel-level analysis, enable reliable time-activity curve generation and absorbed-dose calculation, while reduced mixed-model approaches improve workflow efficiency, reproducibility, and patient-centred implementation. Cadmium zinc telluride-based gamma cameras further enhance quantitative accuracy, enabling rapid whole-body imaging and precise activity measurement, supporting patient-friendly dosimetry. Complementing these advances, the cerium-134/lanthanum-134 positron emission tomography in vivo generator provides a unique theranostic platform to noninvasively monitor 225Ac progeny redistribution, evaluate alpha-decay recoil, and study tracer internalization, particularly for internalizing vectors. Together, these technological and methodological innovations establish a mechanistically informed framework for individualized 225Ac dosimetry in targeted alpha therapy, supporting optimized treatment planning and precise response assessment. Continued standardization and validation of imaging, reconstruction, and dosimetry workflows will be critical to translate these approaches into reproducible, patient-specific clinical care. Full article
(This article belongs to the Section Cancer Therapy)

22 pages, 5297 KB  
Article
A Space-Domain Gravity Forward Modeling Method Based on Voxel Discretization and Multiple Observation Surfaces
by Rui Zhang, Guiju Wu, Jiapei Wang, Yufei Xi, Fan Wang and Qinhong Long
Symmetry 2026, 18(1), 180; https://doi.org/10.3390/sym18010180 - 19 Jan 2026
Abstract
Geophysical forward modeling serves as a fundamental theoretical approach for characterizing subsurface structures and material properties, essentially involving the computation of gravity responses at surface or spatial observation points based on a predefined density distribution. With the rapid development of data-driven techniques such as deep learning in geophysical inversion, forward algorithms are facing increasing demands in terms of computational scale, observable types, and efficiency. To address these challenges, this study develops an efficient forward modeling method based on voxel discretization, enabling rapid calculation of gravity anomalies and radial gravity gradients on multiple observation surfaces. Leveraging the parallel computing capabilities of graphics processing units (GPU), together with tensor acceleration, Compute Unified Device Architecture (CUDA) execution, and just-in-time (JIT) compilation strategies, the method achieves high efficiency and automation in the forward computation process. Numerical experiments conducted on several typical theoretical models demonstrate the convergence and stability of the calculated results, indicating that the proposed method significantly reduces computation time while maintaining accuracy, thus being well-suited for large-scale 3D modeling and fast batch simulation tasks. This research can efficiently generate forward datasets with multi-view and multi-metric characteristics, providing solid data support and a scalable computational platform for deep-learning-based geophysical inversion studies. Full article
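
The core computation in voxel-discretized gravity forward modeling is a sum of per-voxel contributions at each observation point. The sketch below uses the crudest kernel — each voxel as a point mass, gz = G·m·Δz/r³ — rather than the analytic prism formulas a production code would likely use, and omits the GPU/CUDA/JIT machinery entirely; geometry, names, and densities are made up for illustration.

```python
import numpy as np

G = 6.674e-11  # gravitational constant, m^3 kg^-1 s^-2

def gz_point_masses(obs, centers, rho, dv):
    """Vertical gravity anomaly at observation points `obs` (N,3) from
    voxels approximated as point masses at `centers` (M,3) with density
    contrasts `rho` (M,) and cell volume `dv`. The z axis is depth
    (positive down), so gz > 0 above excess mass."""
    d = centers[None, :, :] - obs[:, None, :]  # (N, M, 3) separation vectors
    r = np.linalg.norm(d, axis=-1)             # (N, M) distances
    return G * dv * np.sum(rho * d[..., 2] / r**3, axis=1)

# A 100 m cube with +500 kg/m^3 contrast centered 250 m deep,
# discretized into 10x10x10 voxels of 10 m; observation profile at z = 0.
xs = np.linspace(-45.0, 45.0, 10)
cx, cy, cz = np.meshgrid(xs, xs, xs + 250.0, indexing="ij")
centers = np.column_stack([cx.ravel(), cy.ravel(), cz.ravel()])
rho = np.full(centers.shape[0], 500.0)
obs = np.column_stack([np.linspace(-500.0, 500.0, 21),
                       np.zeros(21), np.zeros(21)])

gz = gz_point_masses(obs, centers, rho, dv=10.0**3)
print("peak anomaly (m/s^2):", gz.max())
```

The anomaly peaks directly over the body and decays along the profile, as expected. Because the kernel is a dense (observations × voxels) broadcast, it maps naturally onto the GPU tensor operations the abstract describes.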

20 pages, 1826 KB  
Article
Tension-Dominant Orthodontic Loading and Buccal Periodontal Phenotype Preservation: An Integrative Mechanobiological Model Supported by FEM and a Proof-of-Concept CBCT
by Anna Ewa Kuc, Jacek Kotuła, Kamil Sybilski, Szymon Saternus, Jerzy Małachowski, Natalia Kuc, Grzegorz Hajduk, Joanna Lis, Beata Kawala, Michał Sarul and Magdalena Sulewska
J. Funct. Biomater. 2026, 17(1), 47; https://doi.org/10.3390/jfb17010047 - 16 Jan 2026
Abstract
Background: Adult patients with a thin buccal cortical plate and fragile periodontal phenotype are at high risk of dehiscence, fenestration and recession during transverse orthodontic expansion. Conventional mechanics often create a cervical compression-dominant environment that exceeds the adaptive capacity of the periodontal ligament (PDL)–bone complex. Objectives: This study proposes an integrative mechanobiological model in which a skeletal-anchorage-assisted loading protocol (Bone Protection System, BPS) transforms expansion into a tension-dominant regime that favours buccal phenotype preservation. Methods: Patient-specific finite element models were used to compare conventional expansion with a BPS-modified force system. Regional PDL stress patterns and crown/apex displacement vectors were analysed to distinguish tipping-dominant from translation-dominated mechanics. A pilot CBCT proof-of-concept (n = 1 thin-phenotype adult) with voxel-based registration quantified changes in maxillary and mandibular alveolar ridge width and buccal cortical plate thickness before and after BPS-assisted expansion. The mechanical findings were integrated with current evidence on compression- versus tension-driven inflammatory and osteogenic pathways in the PDL and cortical bone. Results: FEM demonstrated that conventional expansion concentrates high cervical compressive stress along the buccal PDL and cortical surface, accompanied by bending-like crown–root divergence. In contrast, the BPS protocol redirected forces to create a buccal tensile-favourable region and a more parallel crown–apex displacement pattern, indicative of translation-dominated movement. In the proof-of-concept (n = 1) CBCT case, BPS-assisted expansion was associated with preservation or increase of buccal ridge dimensions without radiographic signs of cortical breakdown. 
Conclusions: A tension-dominant orthodontic loading environment generated by a skeletal-anchorage-assisted force system may support buccal cortical preservation and vestibular phenotype reinforcement in thin-phenotype patients. The proposed mechanobiological model links these imaging and FEM findings to known molecular pathways of inflammation, angiogenesis and osteogenesis. It suggests a functional biomaterial-based strategy for widening the biological envelope of safe tooth movement. Full article
(This article belongs to the Special Issue Functional Dental Materials for Orthodontics and Implants)

19 pages, 9525 KB  
Article
Evaluating UAV and Handheld LiDAR Point Clouds for Radiative Transfer Modeling Using a Voxel-Based Point Density Proxy
by Takumi Fujiwara, Naoko Miura, Hiroki Naito and Fumiki Hosoi
Sensors 2026, 26(2), 590; https://doi.org/10.3390/s26020590 - 15 Jan 2026
Abstract
The potential of UAV-based LiDAR (UAV-LiDAR) and handheld LiDAR scanners (HLSs) for forest radiative transfer models (RTMs) was evaluated using a Voxel-Based Point Density Proxy (VPDP) as a diagnostic tool in a Larix kaempferi forest. Structural analysis-computed coverage gap ratio (CGR) revealed distinct behaviors. UAV-LiDARs effectively captured canopy structures (10–45% CGR), whereas HLS provided superior understory coverage, but exhibited a high upper-canopy CGR (>40%). Integrating datasets reduced the CGR to below 10%, demonstrating strong complementarity. Radiative transfer simulations correlated well with Sentinel-2 NIR reflectance, with the UAV-LiDAR (r = 0.73–0.75) outperforming the HLS (r = 0.64–0.69). These results highlight the critical importance of upper-canopy modeling for nadir-viewing sensors. Although integrating HLS data did not improve correlation due to the dominance of upper-canopy signals, structural analysis confirmed that fusion is essential for achieving volumetric completeness. A voxel size range of 50–100 cm was identified as effective for balancing structural detail and radiative stability. These findings provide practical guidelines for selecting and integrating LiDAR platforms in forest monitoring, emphasizing that while aerial sensors suffice for top-of-canopy reflectance, multi-platform fusion is requisite for full 3D structural characterization. Full article
(This article belongs to the Special Issue Progress in LiDAR Technologies and Applications)
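
A coverage gap ratio of the kind used in this abstract can be computed by voxelizing the scanned region and measuring the fraction of cells that received no returns. The sketch below is our own minimal reading of that idea (the function name and the synthetic half-covered scene are assumptions, not the paper's definition or data).

```python
import numpy as np

def coverage_gap_ratio(points, bounds_min, bounds_max, voxel_size):
    """Fraction of voxels inside an axis-aligned region that contain no
    points -- a simple stand-in for a coverage gap ratio (CGR) used as a
    point-density proxy."""
    bounds_min = np.asarray(bounds_min, float)
    bounds_max = np.asarray(bounds_max, float)
    shape = np.ceil((bounds_max - bounds_min) / voxel_size).astype(int)
    idx = np.floor((points - bounds_min) / voxel_size).astype(int)
    inside = np.all((idx >= 0) & (idx < shape), axis=1)
    occ = np.zeros(shape, dtype=bool)
    occ[tuple(idx[inside].T)] = True   # mark cells that contain any point
    return 1.0 - occ.mean()

# Points covering only the lower half of a 10 x 10 x 10 m region,
# mimicking a ground-based scanner that misses the upper canopy.
rng = np.random.default_rng(2)
pts = rng.uniform([0, 0, 0], [10, 10, 5], size=(20_000, 3))
cgr = coverage_gap_ratio(pts, [0, 0, 0], [10, 10, 10], voxel_size=1.0)
print(f"CGR = {cgr:.2f}")  # upper half unobserved -> about 0.5
```

Computing this per height stratum rather than over the whole region is what exposes the complementary gap patterns of aerial and handheld platforms reported above.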

14 pages, 2106 KB  
Article
A Hierarchical Multi-Modal Fusion Framework for Alzheimer’s Disease Classification Using 3D MRI and Clinical Biomarkers
by Ting-An Chang, Chun-Cheng Yu, Yin-Hua Wang, Zi-Ping Lei and Chia-Hung Chang
Electronics 2026, 15(2), 367; https://doi.org/10.3390/electronics15020367 - 14 Jan 2026
Abstract
Accurate and interpretable staging of Alzheimer’s disease (AD) remains challenging due to the heterogeneous progression of neurodegeneration and the complementary nature of imaging and clinical biomarkers. This study implements and evaluates an optimized Hierarchical Multi-Modal Fusion Framework (HMFF) that systematically integrates 3D structural MRI with clinical assessment scales for robust three-class classification of cognitively normal (CN), mild cognitive impairment (MCI), and AD subjects. A standardized preprocessing pipeline, including N4 bias field correction, nonlinear registration to MNI space, ANTsNet-based skull stripping, voxel normalization, and spatial resampling, was employed to ensure anatomically consistent and high-quality MRI inputs. Within the proposed framework, volumetric imaging features were extracted using a 3D DenseNet-121 architecture, while structured clinical information was modeled via an XGBoost classifier to capture nonlinear clinical priors. These heterogeneous representations were hierarchically fused through a lightweight multilayer perceptron, enabling effective cross-modal interaction. To further enhance discriminative capability and model efficiency, a hierarchical feature selection strategy was incorporated to progressively refine high-dimensional imaging features. Experimental results demonstrated that performance consistently improved with feature refinement and reached an optimal balance at approximately 90 selected features. Under this configuration, the proposed HMFF achieved an accuracy of 0.94 (95% Confidence Interval: [0.918, 0.951]), a recall of 0.91, a precision of 0.94, and an F1-score of 0.92, outperforming unimodal and conventional multimodal baselines under comparable settings. Moreover, Grad-CAM visualization confirmed that the model focused on clinically relevant neuroanatomical regions, including the hippocampus and medial temporal lobe, enhancing interpretability and clinical plausibility. 
These findings indicate that hierarchical multimodal fusion with interpretable feature refinement offers a promising and extensible solution for reliable and explainable automated AD staging. Full article
(This article belongs to the Special Issue AI-Driven Medical Image/Video Processing)

16 pages, 1970 KB  
Article
LSON-IP: Lightweight Sparse Occupancy Network for Instance Perception
by Xinwang Zheng, Yuhang Cai, Lu Yang, Chengyu Lu and Guangsong Yang
World Electr. Veh. J. 2026, 17(1), 31; https://doi.org/10.3390/wevj17010031 - 7 Jan 2026
Abstract
The high computational demand of dense voxel representations severely limits current vision-centric 3D semantic occupancy prediction methods, despite their capacity for granular scene understanding. This challenge is particularly acute in safety-critical applications like autonomous driving, where accurately perceiving dynamic instances often takes precedence over capturing the static background. This paper challenges the paradigm of dense prediction for such instance-focused tasks. We introduce the LSON-IP, a framework that strategically avoids the computational expense of dense 3D grids. LSON-IP operates on a sparse set of 3D instance queries, which are initialized directly from multi-view 2D images. These queries are then refined by our novel Sparse Instance Aggregator (SIA), an attention-based module. The SIA incorporates rich multi-view features while simultaneously modeling inter-query relationships to construct coherent object representations. Furthermore, to obviate the need for costly 3D annotations, we pioneer a Differentiable Sparse Rendering (DSR) technique. DSR innovatively defines a continuous field from the sparse voxel output, establishing a differentiable bridge between our sparse 3D representation and 2D supervision signals through volume rendering. Extensive experiments on major autonomous driving benchmarks, including SemanticKITTI and nuScenes, validate our approach. LSON-IP achieves strong performance on key dynamic instance categories and competitive overall semantic completion, all while reducing computational overhead by over 60% compared to dense baselines. Our work thus paves the way for efficient, high-fidelity instance-aware 3D perception. Full article

20 pages, 3202 KB  
Article
Voxel Normalization in LDCT Imaging: Its Significance in Texture Feature Selection for Pulmonary Nodule Malignancy Classification: Insights from Two Centers
by Chen-Hao Peng, Jhu-Fong Wu, Chu-Jen Kuo and Da-Chuan Cheng
Diagnostics 2026, 16(2), 186; https://doi.org/10.3390/diagnostics16020186 - 7 Jan 2026
Abstract
Background: Lung cancer is the leading cause of cancer-related mortality globally. Early detection via low-dose computed tomography (LDCT) can reduce mortality, but its implementation is challenged by the absence of objective diagnostic criteria and the necessity for extensive manual interpretation. Public datasets like the Lung Image Database Consortium often lack pathology-confirmed diagnoses, which can lead to inaccuracies in ground truth labels. Variability in voxel sizes across these datasets also complicates feature extraction, undermining model reliability. Many existing methods for integrating nodule boundary annotations use deep learning models such as generative adversarial networks, which often lack interpretability. Methods: This study assesses the effect of voxel normalization on pulmonary nodule classification and introduces a Fast Fourier Transform-based contour fusion method as a more interpretable alternative. Utilizing pathology-confirmed LDCT data from 415 patients across two medical centers, both machine learning and deep learning models were developed using voxel-normalized images and attention mechanisms, including transformers. Results: The results demonstrated that voxel normalization significantly improved the overlap of features between datasets from two different centers by 64%, resulting in enhanced selection stability. In the ROI-based radiomics analysis, the top-performing machine-learning model achieved an accuracy of 92.6%, whereas the patch-based deep-learning models reached 98.5%. Notably, the FFT-based method provided a clinically interpretable integration of expert annotations, effectively addressing a major limitation of generative adversarial networks. Conclusions: Voxel normalization enhances reliability in pulmonary nodule classification while the FFT-based method offers a viable path toward interpretability in deep learning applications. Future research should explore its implications further in multi-center contexts. 
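Voxel normalization here means resampling every scan to a common physical voxel size before features are extracted, so that texture statistics from the two centers are computed over comparable volumes. A minimal nearest-neighbour sketch (the abstract does not specify the study's interpolation scheme; `normalize_voxels` and the spacings are illustrative):

```python
import numpy as np

def normalize_voxels(volume, spacing, target=(1.0, 1.0, 1.0)):
    # Resample a CT volume so each voxel spans `target` mm per axis.
    # Nearest-neighbour lookup keeps the sketch dependency-free; real
    # pipelines usually use trilinear or B-spline interpolation.
    out_shape = [int(round(n * s / t))
                 for n, s, t in zip(volume.shape, spacing, target)]
    idx = [np.minimum((np.arange(m) * t / s).astype(int), n - 1)
           for m, n, s, t in zip(out_shape, volume.shape, spacing, target)]
    return volume[np.ix_(*idx)]

vol = np.zeros((40, 64, 64))              # e.g. 2.5 mm slices, 0.7 mm in-plane
iso = normalize_voxels(vol, (2.5, 0.7, 0.7))
print(iso.shape)  # (100, 45, 45): isotropic 1 mm voxels
```

After this step, radiomic texture features (which are sensitive to voxel dimensions) are computed on a common grid, which is what drives the reported improvement in feature overlap between centers.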
(This article belongs to the Special Issue A New Era in Diagnosis: From Biomarkers to Artificial Intelligence)

19 pages, 2298 KB  
Article
HFSA-Net: A 3D Object Detection Network with Structural Encoding and Attention Enhancement for LiDAR Point Clouds
by Xuehao Yin, Zhen Xiao, Jinju Shao, Zhimin Qiu and Lei Wang
Sensors 2026, 26(1), 338; https://doi.org/10.3390/s26010338 - 5 Jan 2026
Abstract
The inherent sparsity of LiDAR point cloud data presents a fundamental challenge for 3D object detection. During the feature encoding stage, especially in voxelization, existing methods find it difficult to effectively retain the critical geometric structural information contained in these sparse point clouds, resulting in decreased detection performance. To address this problem, this paper proposes an enhanced 3D object detection framework. It first designs a Structured Voxel Feature Encoder that significantly enhances the initial feature representation through intra-voxel feature refinement and multi-scale neighborhood context aggregation. Second, it constructs a Hybrid-Domain Attention-Guided Sparse Backbone, which introduces a decoupled hybrid attention mechanism and a hierarchical integration strategy to realize dynamic weighting and focusing on key semantic and geometric features. Finally, a Scale-Aggregation Head is proposed to improve the model’s perception and localization capabilities for different-sized objects via multi-level feature pyramid fusion and cross-layer information interaction. Experimental results on the KITTI dataset show that the proposed algorithm increases the mean Average Precision (mAP) by 3.34% compared to the baseline model. Moreover, experiments on a vehicle platform with a lower-resolution LiDAR verify the effectiveness of the proposed method in improving 3D detection accuracy and its generalization ability.
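As a hedged illustration of the voxelization step such encoders build on (mean pooling only; HFSA-Net's actual encoder adds intra-voxel refinement and neighborhood aggregation on top, and `voxelize_mean` is an illustrative name, not the paper's code):

```python
import numpy as np

def voxelize_mean(points, voxel_size=0.2):
    # Hash each point to an integer voxel coordinate, then average the
    # points that share a voxel -- the simplest per-voxel feature, and
    # the baseline that learned voxel feature encoders improve upon.
    coords = np.floor(points / voxel_size).astype(np.int64)
    keys, inv = np.unique(coords, axis=0, return_inverse=True)
    counts = np.bincount(inv).astype(float)
    feats = np.stack([np.bincount(inv, weights=points[:, d]) / counts
                      for d in range(points.shape[1])], axis=1)
    return keys, feats   # occupied voxel coords, one mean feature each

pts = np.array([[0.05, 0.05, 0.00],
                [0.15, 0.10, 0.05],
                [1.00, 1.00, 1.00]])
keys, feats = voxelize_mean(pts)
print(len(keys))  # 2 occupied voxels
```

The sparsity problem the abstract describes is visible even here: only occupied voxels get a feature, and averaging discards the intra-voxel geometry that structured encoders try to preserve.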
(This article belongs to the Special Issue Recent Advances in LiDAR Sensing Technology for Autonomous Vehicles)
