Search Results (104)

Search Parameters:
Keywords = visual attention scans

20 pages, 1853 KB  
Article
Enhanced U-Net for Spleen Segmentation in CT Scans: Integrating Multi-Slice Context and Grad-CAM Interpretability
by Sowad Rahman, Md Azad Hossain Raju, Abdullah Evna Jafar, Muslima Akter, Israt Jahan Suma and Jia Uddin
BioMedInformatics 2025, 5(4), 56; https://doi.org/10.3390/biomedinformatics5040056 - 8 Oct 2025
Viewed by 260
Abstract
Accurate spleen segmentation in abdominal CT scans remains a critical challenge in medical image analysis due to variable morphology, low tissue contrast, and proximity to similar anatomical structures. This paper presents an enhanced U-Net architecture that addresses these challenges through multi-slice contextual integration and interpretable deep learning. Our approach incorporates three-channel inputs from adjacent CT slices, implements a hybrid loss function combining Dice and binary cross-entropy terms, and integrates Grad-CAM visualization for enhanced model interpretability. Comprehensive evaluation on the Medical Decathlon dataset demonstrates superior performance, with a Dice similarity coefficient of 0.923 ± 0.04, outperforming standard 2D approaches by 3.2%. The model exhibits robust performance across varying slice thicknesses, contrast phases, and pathological conditions. Grad-CAM analysis reveals focused attention on spleen–tissue interfaces and internal vascular structures, providing clinical insight into model decision-making. The system demonstrates practical applicability for automated splenic volumetry, trauma assessment, and surgical planning, with processing times suitable for clinical workflow integration. Full article
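
The hybrid loss named in this abstract, combining Dice and binary cross-entropy terms, is a standard construction in segmentation work; a minimal PyTorch sketch, assuming sigmoid logits and an equal weighting of the two terms (the paper's exact weighting is not given in the abstract):

    import torch
    import torch.nn.functional as F

    def hybrid_dice_bce_loss(logits, targets, smooth=1.0, bce_weight=0.5):
        """Dice + binary cross-entropy loss for binary segmentation.

        logits:  raw network outputs, shape (N, 1, H, W)
        targets: float masks in {0, 1}, same shape
        bce_weight = 0.5 is an assumed equal split, not the paper's value.
        """
        probs = torch.sigmoid(logits)
        dims = (1, 2, 3)                    # reduce over channel and spatial dims
        intersection = (probs * targets).sum(dims)
        dice = (2.0 * intersection + smooth) / (probs.sum(dims) + targets.sum(dims) + smooth)
        dice_loss = 1.0 - dice.mean()       # soft Dice, averaged over the batch
        bce_loss = F.binary_cross_entropy_with_logits(logits, targets)
        return bce_weight * bce_loss + (1.0 - bce_weight) * dice_loss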

13 pages, 6372 KB  
Article
Oral Supplementation of Nicotinamide Mononucleotide (NMN) Improves Hair Quality and Subjective Perception of Hair Appearance in Middle-Aged Women
by Shuichi Fukumoto, Maiko Ito, Hiroyo Kunitomo, Takeshi Hataoka, Takuya Chiba, Osamu Nureki and Takahiro Fujimoto
Cosmetics 2025, 12(5), 204; https://doi.org/10.3390/cosmetics12050204 - 16 Sep 2025
Viewed by 1922
Abstract
Background: Nicotinamide mononucleotide (NMN) has gained attention as an anti-aging compound due to its ability to replenish NAD+ levels, which typically decline with age and stress. While improvements in skin conditions have been reported, clinical studies on human hair remain lacking. In this study, we evaluated the effects of NMN supplementation on hair conditions in middle-aged women and explored its association with quality-of-life (QOL) factors such as fatigue. Methods: Torula yeast-fermented NMN was evaluated in this clinical trial. A single-arm, pre-post intervention study was conducted involving 15 healthy Japanese women aged between 40 and 50 years who orally consumed NMN for 12 weeks. Hair growth cycles and hair shaft diameters were assessed using TrichoScan (TrichoGrabV3B) analysis and scanning electron microscopy (SEM). Hair metabolites and hormone levels were also measured. Subjective indices, including fatigue and hair texture, were evaluated using a visual analog scale (VAS) questionnaire. Results: Following NMN supplementation, anagen hair elongation density (hairs/cm2) significantly increased from 55.9 to 87.7 (p = 0.03). Hair diameter (µm) also significantly increased from 75.3 to 78.8 (p < 0.01), with improvements in hair cuticle condition. Metabolomic analyses revealed significant changes in amino acids and energy metabolism-related compounds. No marked changes were observed in hair hormone concentrations. The VAS questionnaire indicated improvements in subjective hair characteristics such as elasticity, gloss, and volume, as well as reductions in fatigue and perceived hair loss, suggesting enhanced QOL. Conclusions: Oral supplementation with NMN may be a beneficial strategy for promoting hair growth and improvement in hair cuticle condition in middle-aged women, thus potentially enhancing overall hair care and quality of life. Full article

24 pages, 3314 KB  
Article
Entropy as a Lens: Exploring Visual Behavior Patterns in Architects
by Renate Delucchi Danhier, Barbara Mertins, Holger Mertins and Gerold Schneider
J. Eye Mov. Res. 2025, 18(5), 43; https://doi.org/10.3390/jemr18050043 - 16 Sep 2025
Viewed by 359
Abstract
This study examines how architectural expertise shapes visual perception, extending the “Seeing for Speaking” hypothesis into a non-linguistic domain. Specifically, it investigates whether architectural training influences unconscious visual processing of architectural content. Using eye-tracking, 48 architects and 48 laypeople freely viewed 15 still images of built, mixed, and natural environments. Visual behavior was analyzed using Shannon’s entropy scores based on dwell times within 16 × 16 grids during the first six seconds of viewing. Results revealed distinct visual attention patterns between groups. Architects showed lower entropy, indicating more focused and systematic gaze behavior, and their attention was consistently drawn to built structures. In contrast, laypeople exhibited more variable and less organized scanning patterns, with greater individual differences. Moreover, architects demonstrated higher intra-group similarity in their gaze behavior, suggesting a shared attentional schema shaped by professional training. These findings highlight that domain-specific expertise deeply influences perceptual processing, resulting in systematic and efficient attention allocation. Entropy-based metrics proved effective in capturing these differences, offering a robust tool for quantifying expert vs. non-expert visual strategies in architectural cognition. The visual patterns exhibited by architects are interpreted to reflect a “Grammar of Space”, i.e., a structured way of visually parsing spatial elements. Full article
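
The entropy metric described here, Shannon entropy over dwell times binned into a 16 × 16 grid, reduces to a few lines of NumPy; a hedged sketch in which the base-2 logarithm and the dwell-time normalization are assumptions:

    import numpy as np

    def gaze_entropy(x, y, dwell, grid=16):
        """Shannon entropy of dwell time over a grid x grid partition of the image.

        x, y:  fixation coordinates normalized to [0, 1)
        dwell: dwell time per fixation (e.g., in milliseconds)
        Lower entropy indicates more focused, systematic gaze behavior.
        """
        heat, _, _ = np.histogram2d(x, y, bins=grid, range=[[0, 1], [0, 1]], weights=dwell)
        p = heat.ravel() / heat.sum()       # dwell-time distribution over cells
        p = p[p > 0]                        # empty cells contribute 0 * log 0 = 0
        return -np.sum(p * np.log2(p))      # entropy in bits (base 2 assumed)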

21 pages, 4721 KB  
Article
Automated Brain Tumor MRI Segmentation Using ARU-Net with Residual-Attention Modules
by Erdal Özbay and Feyza Altunbey Özbay
Diagnostics 2025, 15(18), 2326; https://doi.org/10.3390/diagnostics15182326 - 13 Sep 2025
Viewed by 616
Abstract
Background/Objectives: Accurate segmentation of brain tumors in Magnetic Resonance Imaging (MRI) scans is critical for diagnosis and treatment planning due to their life-threatening nature. This study aims to develop a robust and automated method capable of precisely delineating heterogeneous tumor regions while improving segmentation accuracy and generalization. Methods: We propose Attention Res-UNet (ARU-Net), a novel Deep Learning (DL) architecture integrating residual connections, Adaptive Channel Attention (ACA), and Dimensional-space Triplet Attention (DTA) modules. The encoding module efficiently extracts and refines relevant feature information by applying ACA to the lower layers of convolutional and residual blocks. The DTA is fixed to the upper layers of the decoding module, decoupling channel weights to better extract and fuse multi-scale features, enhancing both performance and efficiency. Input MRI images are pre-processed using Contrast Limited Adaptive Histogram Equalization (CLAHE) for contrast enhancement, denoising filters, and Linear Kuwahara filtering to preserve edges while smoothing homogeneous regions. The network is trained using categorical cross-entropy loss with the Adam optimizer on the BTMRII dataset, and comparative experiments are conducted against baseline U-Net, DenseNet121, and Xception models. Performance is evaluated using accuracy, precision, recall, F1-score, Dice Similarity Coefficient (DSC), and Intersection over Union (IoU) metrics. Results: Baseline U-Net showed significant performance gains after adding residual connections and ACA modules, with DSC improving by approximately 3.3%, accuracy by 3.2%, IoU by 7.7%, and F1-score by 3.3%. ARU-Net further enhanced segmentation performance, achieving 98.3% accuracy, 98.1% DSC, 96.3% IoU, and a superior F1-score, representing additional improvements of 1.1–2.0% over the U-Net + Residual + ACA variant. Visualizations confirmed smoother boundaries and more precise tumor contours across all six tumor classes, highlighting ARU-Net’s ability to capture heterogeneous tumor structures and fine structural details more effectively than both baseline U-Net and other conventional DL models. Conclusions: ARU-Net, combined with an effective pre-processing strategy, provides a highly reliable and precise solution for automated brain tumor segmentation. Its improvements across multiple evaluation metrics over U-Net and other conventional models highlight its potential for clinical application and contribute novel insights to medical image analysis research. Full article
(This article belongs to the Special Issue Advances in Functional and Structural MR Image Analysis)
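
The CLAHE contrast-enhancement step named in the pre-processing pipeline can be reproduced with OpenCV's built-in implementation; a minimal sketch, with the clip limit and tile size as assumed defaults rather than the paper's settings (the Kuwahara filter has no built-in OpenCV call, so a median blur stands in for the denoising stage):

    import cv2

    def preprocess_slice(img_u8, clip_limit=2.0, tiles=(8, 8)):
        """CLAHE contrast enhancement plus light denoising for an 8-bit slice.

        clip_limit and tiles are assumed defaults, not the paper's settings.
        """
        clahe = cv2.createCLAHE(clipLimit=clip_limit, tileGridSize=tiles)
        enhanced = clahe.apply(img_u8)      # local, contrast-limited equalization
        return cv2.medianBlur(enhanced, 3)  # simple denoising stand-in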

16 pages, 1430 KB  
Article
Assessing Smooth Pursuit Eye Movements Using Eye-Tracking Technology in Patients with Schizophrenia Under Treatment: A Pilot Study
by Luis Benigno Contreras-Chávez, Valdemar Emigdio Arce-Guevara, Luis Fernando Guerrero, Alfonso Alba, Miguel G. Ramírez-Elías, Edgar Roman Arce-Santana, Victor Hugo Mendez-Garcia, Jorge Jimenez-Cruz, Anna Maria Maddalena Bianchi and Martin O. Mendez
Sensors 2025, 25(16), 5212; https://doi.org/10.3390/s25165212 - 21 Aug 2025
Viewed by 1475
Abstract
Schizophrenia is a complex disorder that affects mental organization and cognitive functions, including concentration and memory. One notable manifestation of cognitive changes in schizophrenia is a diminished ability to scan and perform tasks related to visual inspection. Of the three evaluable aspects of ocular movement (saccades, smooth pursuit, and fixation), smooth pursuit eye movement (SPEM) in particular involves the tracking of slow-moving objects and is closely related to attention, visual memory, and processing speed. However, evaluating smooth pursuit in clinical settings is challenging due to the technical complexities of detecting these movements, resulting in limited research and clinical application. This pilot study investigates whether quantitative metrics derived from eye-tracking data can distinguish between patients with schizophrenia under treatment and healthy controls. The study included nine healthy participants and nine individuals receiving treatment for schizophrenia. Gaze trajectories were recorded using an eye tracker during a controlled visual tracking task performed during a clinical visit. Spatiotemporal analysis of gaze trajectories was performed by evaluating three different features: polygonal area, colocalities, and direction difference. Subsequently, a support vector machine (SVM) was used to assess the separability between healthy individuals and those with schizophrenia based on the identified gaze trajectory features. The results show statistically significant differences between controls and subjects with schizophrenia for all the computed indexes (p < 0.05) and high separability, with around 90% accuracy, sensitivity, and specificity. The results suggest the potential development of a valuable clinical tool for the evaluation of SPEM, offering utility in clinics to assess the efficacy of therapeutic interventions in individuals with schizophrenia. Full article
(This article belongs to the Special Issue Biomedical Imaging, Sensing and Signal Processing)
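
The classification stage described in this abstract, an SVM separating patients from controls on three gaze-trajectory features, follows a standard scikit-learn pattern; a hedged sketch with leave-one-out evaluation, which is an assumption, since the paper's validation scheme is not stated here:

    from sklearn.model_selection import LeaveOneOut, cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    def evaluate_spem_classifier(X, y):
        """X: one row per participant with the three features from the abstract
        (polygonal area, colocalities, direction difference); y: 0 = control,
        1 = patient. Both are placeholders filled from recorded trajectories."""
        clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
        scores = cross_val_score(clf, X, y, cv=LeaveOneOut())  # 18 folds for n = 18
        return scores.mean()                                   # mean accuracy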

23 pages, 7524 KB  
Article
Analyzing Visual Attention in Virtual Crime Scene Investigations Using Eye-Tracking and VR: Insights for Cognitive Modeling
by Wen-Chao Yang, Chih-Hung Shih, Jiajun Jiang, Sergio Pallas Enguita and Chung-Hao Chen
Electronics 2025, 14(16), 3265; https://doi.org/10.3390/electronics14163265 - 17 Aug 2025
Viewed by 603
Abstract
Understanding human perceptual strategies in high-stakes environments, such as crime scene investigations, is essential for developing cognitive models that reflect expert decision-making. This study presents an immersive experimental framework that utilizes virtual reality (VR) and eye-tracking technologies to capture and analyze visual attention during simulated forensic tasks. A 360° panoramic crime scene, constructed using the Nikon KeyMission 360 camera, was integrated into a VR system with HTC Vive and Tobii Pro eye-tracking components. A total of 46 undergraduate students aged 19 to 24 (23 from the National University of Singapore and 23 from the Central Police University in Taiwan) participated in the study, generating over 2.6 million gaze samples (IRB No. 23-095-B). The collected eye-tracking data were analyzed using statistical summarization, temporal alignment techniques (Earth Mover’s Distance and Needleman-Wunsch algorithms), and machine learning models, including K-means clustering, random forest regression, and support vector machines (SVMs). Clustering achieved a classification accuracy of 78.26%, revealing distinct visual behavior patterns across participant groups. Proficiency prediction models reached optimal performance with a random forest regression (R² = 0.7034), highlighting scan-path variability and fixation regularity as key predictive features. These findings demonstrate that eye-tracking metrics, particularly sequence-alignment-based features, can effectively capture differences linked to both experiential training and cultural context. Beyond its immediate forensic relevance, the study contributes a structured methodology for encoding visual attention strategies into analyzable formats, offering valuable insights for cognitive modeling, training systems, and human-centered design in future perceptual intelligence applications. Furthermore, our work advances the development of autonomous vehicles by modeling how humans visually interpret complex and potentially hazardous environments. By examining expert and novice gaze patterns during simulated forensic investigations, we provide insights that can inform the design of autonomous systems required to make rapid, safety-critical decisions in similarly unstructured settings. The extraction of human-like visual attention strategies not only enhances scene understanding, anomaly detection, and risk assessment in autonomous driving scenarios, but also supports accelerated learning of response patterns for rare, dangerous, or otherwise exceptional conditions, enabling autonomous driving systems to better anticipate and manage unexpected real-world challenges. Full article
(This article belongs to the Special Issue Autonomous and Connected Vehicles)
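
Of the temporal-alignment techniques named in this abstract, Needleman-Wunsch is a classic dynamic-programming global alignment; a minimal sketch scoring the similarity of two scan-paths encoded as strings of region-of-interest labels, with illustrative scoring parameters:

    import numpy as np

    def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
        """Global alignment score between two scan-paths.

        a, b: sequences of region-of-interest labels, e.g. "ABBCD", where each
        symbol is one fixated region. Scoring parameters are illustrative.
        """
        n, m = len(a), len(b)
        dp = np.zeros((n + 1, m + 1))
        dp[:, 0] = gap * np.arange(n + 1)        # leading gaps in b
        dp[0, :] = gap * np.arange(m + 1)        # leading gaps in a
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                s = match if a[i - 1] == b[j - 1] else mismatch
                dp[i, j] = max(dp[i - 1, j - 1] + s,   # substitution/match
                               dp[i - 1, j] + gap,     # gap in b
                               dp[i, j - 1] + gap)     # gap in a
        return dp[n, m]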

42 pages, 6539 KB  
Article
Multimodal Sparse Reconstruction and Deep Generative Networks: A Paradigm Shift in MR-PET Neuroimaging
by Krzysztof Malczewski
Appl. Sci. 2025, 15(15), 8744; https://doi.org/10.3390/app15158744 - 7 Aug 2025
Viewed by 1055
Abstract
A novel multimodal super-resolution framework is introduced, combining GAN-based synthesis, perceptual constraints, and joint low-rank sparsity regularization to noticeably enhance MR-PET image quality. The architecture integrates modality-specific ResNet encoders, a transformer-based attention fusion block, and a multi-scale PatchGAN discriminator. Training is guided by a hybrid loss function incorporating adversarial, pixel-wise, perceptual (VGG19), and structured Hankel constraints. The proposed method outperforms all baselines in PSNR, SSIM, LPIPS, and diagnostic confidence metrics. Clinical PET metrics, such as SUV recovery and lesion detectability, show substantial improvement. A thorough analysis of computational complexity, dataset composition, training reproducibility, and motion compensation is provided. These findings are visually supported by processed scan panels and benchmark tables. This framework advances reproducible and interpretable hybrid neuroimaging with strong clinical and technical validation. Full article
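
The perceptual (VGG19) term in the hybrid loss described above is commonly implemented as a distance between frozen VGG feature maps; a minimal PyTorch sketch, with the feature-layer cut and the L1 distance as assumptions, since the abstract does not specify the configuration:

    import torch.nn as nn
    from torchvision.models import vgg19

    class VGG19PerceptualLoss(nn.Module):
        """L1 distance between frozen VGG19 feature maps of two images."""

        def __init__(self, cut=16):                 # feature-layer cut is assumed
            super().__init__()
            self.features = vgg19(weights="DEFAULT").features[:cut].eval()
            for p in self.features.parameters():
                p.requires_grad_(False)             # frozen feature extractor

        def forward(self, pred, target):
            # pred/target: (N, 3, H, W) ImageNet-normalized RGB tensors
            return nn.functional.l1_loss(self.features(pred), self.features(target))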

36 pages, 25361 KB  
Article
Remote Sensing Image Compression via Wavelet-Guided Local Structure Decoupling and Channel–Spatial State Modeling
by Jiahui Liu, Lili Zhang and Xianjun Wang
Remote Sens. 2025, 17(14), 2419; https://doi.org/10.3390/rs17142419 - 12 Jul 2025
Viewed by 950
Abstract
As the resolution and data volume of remote sensing imagery continue to grow, achieving efficient compression without sacrificing reconstruction quality remains a major challenge, given that traditional handcrafted codecs often fail to balance rate-distortion performance and computational complexity, while deep learning-based approaches offer superior representational capacity. However, challenges remain in achieving a balance between fine-detail adaptation and computational efficiency. Mamba, a state–space model (SSM)-based architecture, offers linear-time complexity and excels at capturing long-range dependencies in sequences. It has been adopted in remote sensing compression tasks to model long-distance dependencies between pixels. However, despite its effectiveness in global context aggregation, Mamba’s uniform bidirectional scanning is insufficient for capturing high-frequency structures such as edges and textures. Moreover, existing visual state–space (VSS) models built upon Mamba typically treat all channels equally and lack mechanisms to dynamically focus on semantically salient spatial regions. To address these issues, we present an innovative architecture for remote sensing image compression, called the Multi-scale Channel Global Mamba Network (MGMNet). MGMNet integrates a spatial–channel dynamic weighting mechanism into the Mamba architecture, enhancing global semantic modeling while selectively emphasizing informative features. It comprises two key modules. The Wavelet Transform-guided Local Structure Decoupling (WTLS) module applies multi-scale wavelet decomposition to disentangle and separately encode low- and high-frequency components, enabling efficient parallel modeling of global contours and local textures. The Channel–Global Information Modeling (CGIM) module enhances conventional VSS by introducing a dual-path attention strategy that reweights spatial and channel information, improving the modeling of long-range dependencies and edge structures. We conducted extensive evaluations on three distinct remote sensing datasets to assess MGMNet. The results revealed that MGMNet outperforms the current SOTA models across various performance metrics. Full article
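
The low-/high-frequency split performed by the WTLS module can be illustrated with PyWavelets; a sketch assuming a single-level 2-D Haar transform, since the paper's wavelet family and decomposition depth are not given in the abstract:

    import pywt

    def wavelet_decouple(img):
        """Single-level 2-D DWT split of one image band (Haar basis assumed).

        Returns the low-frequency approximation (global contours) and the three
        high-frequency detail subbands (horizontal/vertical/diagonal textures).
        """
        low, (lh, hl, hh) = pywt.dwt2(img, "haar")
        return low, (lh, hl, hh)

    # The split is invertible: pywt.idwt2((low, (lh, hl, hh)), "haar") reconstructs img.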

22 pages, 6194 KB  
Article
KidneyNeXt: A Lightweight Convolutional Neural Network for Multi-Class Renal Tumor Classification in Computed Tomography Imaging
by Gulay Maçin, Fatih Genç, Burak Taşcı, Sengul Dogan and Turker Tuncer
J. Clin. Med. 2025, 14(14), 4929; https://doi.org/10.3390/jcm14144929 - 11 Jul 2025
Cited by 1 | Viewed by 1134
Abstract
Background: Renal tumors, encompassing benign, malignant, and normal variants, represent a significant diagnostic challenge in radiology due to their overlapping visual characteristics on computed tomography (CT) scans. Manual interpretation is time-consuming and susceptible to inter-observer variability, emphasizing the need for automated, reliable classification systems to support early and accurate diagnosis. Method and Materials: We propose KidneyNeXt, a custom convolutional neural network (CNN) architecture designed for the multi-class classification of renal tumors using CT imaging. The model integrates multi-branch convolutional pathways, grouped convolutions, and hierarchical feature extraction blocks to enhance representational capacity. Transfer learning with ImageNet 1K pretraining and fine-tuning was employed to improve generalization across diverse datasets. Performance was evaluated on three CT datasets: a clinically curated retrospective dataset (3199 images), the Kaggle CT KIDNEY dataset (12,446 images), and the KAUH: Jordan dataset (7770 images). All images were preprocessed to 224 × 224 resolution without data augmentation and split into training, validation, and test subsets. Results: Across all datasets, KidneyNeXt demonstrated outstanding classification performance. On the clinical dataset, the model achieved 99.76% accuracy and a macro-averaged F1 score of 99.71%. On the Kaggle CT KIDNEY dataset, it reached 99.96% accuracy and a 99.94% F1 score. Finally, evaluation on the KAUH dataset yielded 99.74% accuracy and a 99.72% F1 score. The model showed strong robustness against class imbalance and inter-class similarity, with minimal misclassification rates and stable learning dynamics throughout training. Conclusions: The KidneyNeXt architecture offers a lightweight yet highly effective solution for the classification of renal tumors from CT images. Its consistently high performance across multiple datasets highlights its potential for real-world clinical deployment as a reliable decision support tool. Future work may explore the integration of clinical metadata and multimodal imaging to further enhance diagnostic precision and interpretability. Additionally, interpretability was addressed using Grad-CAM visualizations, which provided class-specific attention maps to highlight the regions contributing to the model’s predictions. Full article
(This article belongs to the Special Issue Artificial Intelligence and Deep Learning in Medical Imaging)
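
The Grad-CAM visualizations mentioned at the end of the abstract follow a standard recipe: gradient-weighted pooling of a convolutional feature map. A minimal PyTorch sketch using hooks, with the choice of target layer left as a model-specific assumption:

    import torch.nn.functional as F

    def grad_cam(model, x, target_layer, class_idx):
        """Class activation map for one image x of shape (1, C, H, W).

        target_layer: a conv layer inside model (assumed: the last conv block).
        """
        acts, grads = {}, {}
        h1 = target_layer.register_forward_hook(lambda m, i, o: acts.update(a=o))
        h2 = target_layer.register_full_backward_hook(lambda m, gi, go: grads.update(g=go[0]))
        score = model(x)[0, class_idx]              # logit of the class of interest
        model.zero_grad()
        score.backward()
        h1.remove(); h2.remove()
        w = grads["g"].mean(dim=(2, 3), keepdim=True)   # pooled gradients per channel
        cam = F.relu((w * acts["a"]).sum(dim=1))        # weighted sum of feature maps
        return cam / (cam.max() + 1e-8)                 # normalize; upsample to overlay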

22 pages, 2988 KB  
Review
Impact of Optical Coherence Tomography (OCT) for Periodontitis Diagnostics: Current Overview and Advances
by Pietro Rigotti, Alessandro Polizzi, Anna Elisa Verzì, Francesco Lacarrubba, Giuseppe Micali and Gaetano Isola
Dent. J. 2025, 13(7), 305; https://doi.org/10.3390/dj13070305 - 4 Jul 2025
Viewed by 1217
Abstract
Optical coherence tomography (OCT) is a non-invasive imaging technique that provides high-resolution, real-time visualization of soft and hard periodontal tissues. It offers micrometer-level resolution (typically ~10–15 μm) and a scan depth ranging from approximately 0.5 to 2 mm, depending on tissue type and system configuration. The field of view generally spans a few millimeters, which is sufficient for imaging gingiva, sulcus, and superficial bone contours. Over the past two decades, its application in periodontology has gained increasing attention due to its ability to detect structural changes in gingival and alveolar tissues without the need for ionizing radiation. Various OCT modalities, including time-domain, Fourier-domain, and swept-source OCT, have been explored for periodontal assessment, offering valuable insights into tissue morphology, disease progression, and treatment outcomes. Recent innovations include the development of three-dimensional (3D) OCT imaging and OCT angiography (OCTA), enabling the volumetric visualization of periodontal structures and microvascular patterns in vivo. Compared to conventional imaging techniques, such as radiography and cone beam computed tomography (CBCT), OCT offers superior soft tissue contrast and the potential for dynamic in vivo monitoring of periodontal conditions. Recent advancements, including the integration of artificial intelligence (AI) and the development of portable OCT systems, have further expanded its diagnostic capabilities. However, challenges, such as limited penetration depth, high costs, and the need for standardized clinical protocols, must be addressed before widespread clinical implementation. This narrative review provides an updated overview of the principles, applications, and technological advancements of OCT in periodontology. The current limitations and future perspectives of this technology are also discussed, with a focus on its potential role in improving periodontal diagnostics and personalized treatment approaches. Full article
(This article belongs to the Special Issue Optical Coherence Tomography (OCT) in Dentistry)

28 pages, 114336 KB  
Article
Mamba-STFM: A Mamba-Based Spatiotemporal Fusion Method for Remote Sensing Images
by Qiyuan Zhang, Xiaodan Zhang, Chen Quan, Tong Zhao, Wei Huo and Yuanchen Huang
Remote Sens. 2025, 17(13), 2135; https://doi.org/10.3390/rs17132135 - 21 Jun 2025
Viewed by 1177
Abstract
Spatiotemporal fusion techniques can generate remote sensing imagery with high spatial and temporal resolutions, thereby facilitating Earth observation. However, traditional methods are constrained by linear assumptions; generative adversarial networks suffer from mode collapse; convolutional neural networks struggle to capture global context; and Transformers are hard to scale due to quadratic computational complexity and high memory consumption. To address these challenges, this study introduces an end-to-end remote sensing image spatiotemporal fusion approach based on the Mamba architecture (Mamba-spatiotemporal fusion model, Mamba-STFM), marking the first application of Mamba in this domain and presenting a novel paradigm for spatiotemporal fusion model design. Mamba-STFM consists of a feature extraction encoder and a feature fusion decoder. At the core of the encoder is the visual state space-FuseCore-AttNet block (VSS-FCAN block), which deeply integrates linear complexity cross-scan global perception with a channel attention mechanism, significantly reducing quadratic-level computation and memory overhead while improving inference throughput through parallel scanning and kernel fusion techniques. The decoder’s core is the spatiotemporal mixture-of-experts fusion module (STF-MoE block), composed of our novel spatial expert and temporal expert modules. The spatial expert adaptively adjusts channel weights to optimize spatial feature representation, enabling precise alignment and fusion of multi-resolution images, while the temporal expert incorporates a temporal squeeze-and-excitation mechanism and selective state space model (SSM) techniques to efficiently capture short-range temporal dependencies, maintain linear sequence modeling complexity, and further enhance overall spatiotemporal fusion throughput. Extensive experiments on public datasets demonstrate that Mamba-STFM outperforms existing methods in fusion quality; ablation studies validate the effectiveness of each core module; and efficiency analyses and application comparisons further confirm the model’s superior performance. Full article
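
The temporal squeeze-and-excitation mechanism credited to the temporal expert follows the standard SE recipe of pool, bottleneck, and gate; a minimal channel-wise PyTorch sketch, with the reduction ratio and 2-D placement as assumptions:

    import torch.nn as nn

    class SqueezeExcite(nn.Module):
        """Channel reweighting: squeeze (global pool) then excite (gated bottleneck)."""

        def __init__(self, channels, reduction=16):    # reduction ratio is assumed
            super().__init__()
            self.gate = nn.Sequential(
                nn.AdaptiveAvgPool2d(1),               # squeeze to (N, C, 1, 1)
                nn.Conv2d(channels, channels // reduction, 1),
                nn.ReLU(inplace=True),
                nn.Conv2d(channels // reduction, channels, 1),
                nn.Sigmoid(),                          # per-channel weights in (0, 1)
            )

        def forward(self, x):
            return x * self.gate(x)                    # reweight input channels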

16 pages, 3367 KB  
Article
Sound Localization Training and Induced Brain Plasticity: An fMRI Investigation
by Ranjita Kumari, Sukhan Lee, Pradeep Kumar Anand and Jitae Shin
Diagnostics 2025, 15(12), 1558; https://doi.org/10.3390/diagnostics15121558 - 18 Jun 2025
Viewed by 892
Abstract
Background/Objectives: Neuroimaging techniques have been increasingly utilized to explore neuroplasticity induced by various training regimens. Magnetic resonance imaging (MRI) enables these changes to be studied non-invasively. While visual and motor training have been widely studied, less is known about how auditory training affects brain activity. Our objective was to investigate the effects of sound localization training on brain activity and identify brain regions exhibiting significant changes in activation pre- and post-training to understand how sound localization training induces plasticity in the brain. Method: Six blindfolded participants each underwent 30-minute sound localization training sessions twice a week for three weeks. All participants completed functional MRI (fMRI) testing before and after the training. Results: fMRI scans revealed that sound localization training led to increased activation in several cortical areas, including the superior frontal gyrus, superior temporal gyrus, middle temporal gyrus, parietal lobule, precentral gyrus, and postcentral gyrus. These regions are associated with cognitive processes such as auditory processing, spatial working memory, planning, decision-making, error detection, and motor control. Conversely, a decrease in activation was observed in the left middle temporal gyrus, a region linked to language comprehension and semantic memory. Conclusions: These findings suggest that sound localization training enhances neural activity in areas involved in higher-order cognitive functions, spatial attention, and motor execution, while potentially reducing reliance on regions involved in basic sensory processing. This study provides evidence of training-induced neuroplasticity, highlighting the brain’s capacity to adapt through targeted auditory training intervention. Full article
(This article belongs to the Special Issue Brain MRI: Current Development and Applications)

22 pages, 3059 KB  
Review
Rapid Eye Movements in Sleep Furnish a Unique Probe into the Ontogenetic and Phylogenetic Development of the Visual Brain: Implications for Autism Research
by Charles Chong-Hwa Hong
Brain Sci. 2025, 15(6), 574; https://doi.org/10.3390/brainsci15060574 - 26 May 2025
Viewed by 1372
Abstract
With positron emission tomography followed by functional magnetic resonance imaging (fMRI), we demonstrated that rapid eye movements (REMs) in sleep are saccades that scan dream imagery. The brain “sees” essentially the same way while awake and while dreaming in REM sleep. As expected, an event-related fMRI study (events = REMs) showed activation time-locked to REMs in sleep (“REM-locked” activation) in the oculomotor circuit that controls saccadic eye movements and visual attention. More crucially, the fMRI study provided a series of unexpected findings, including REM-locked multisensory integration. REMs in sleep index the processing of endogenous visual information and the hierarchical generation of dream imagery through multisensory integration. The neural processes concurrent with REMs overlap extensively with those reported to be atypical in autism spectrum disorder (ASD). Studies on ASD have shown atypical visual processing and multisensory integration, emerging early in infancy and subsequently developing into autistic symptoms. MRI studies of infants at high risk for ASD are typically conducted during natural sleep. Simply timing REMs may improve the accuracy of early detection and identify markers for stratification in heterogeneous ASD patients. REMs serve as a task-free probe useful for studying both infants and animals, who cannot comply with conventional visual activation tasks. Note that REM-probe studies would be easier to implement in early infancy because REM sleep, which is markedly preponderant in the last trimester of pregnancy, is still pronounced in early infancy. The brain may practice seeing the world during REM sleep in utero before birth. The REM-probe controls the level of attention across both the lifespan and typical-atypical neurodevelopment. Longitudinal REM-probe studies may elucidate how the brain develops the ability to “see” and how this goes awry in autism. REMs in sleep may allow a straightforward comparison of animal and human data. REM-probe studies of animal models of autism have great potential. This narrative review puts forth every reason to believe that employing REMs as a probe into the development of the visual brain will have far-reaching implications. Full article
(This article belongs to the Special Issue Multimodal Imaging in Brain Development)

22 pages, 8310 KB  
Review
Pore-Scale Gas–Water Two-Phase Flow Mechanisms for Underground Hydrogen Storage: A Mini Review of Theory, Experiment, and Simulation
by Xiao He, Yao Wang, Yuanshu Zheng, Wenjie Zhang, Yonglin Dai and Hao Zou
Appl. Sci. 2025, 15(10), 5657; https://doi.org/10.3390/app15105657 - 19 May 2025
Viewed by 1447
Abstract
In recent years, underground hydrogen storage (UHS) has become a hot topic in the field of deep energy storage. Green hydrogen, produced using surplus electricity during peak production, can be injected and stored in underground reservoirs and extracted during periods of high demand. A profound understanding of the mechanisms of gas–water two-phase flow at the pore scale is of great significance for evaluating the sealing integrity of UHS reservoirs and optimizing injection and the use of storage space. The pore structure of rocks, as the storage space and flow channels for fluids, has a significant impact on fluid injection, production, and storage processes. This paper systematically summarizes the methods for characterizing the micro-pore structure of reservoir rocks. The applicability of different techniques was evaluated and compared. A detailed comparative analysis was made of the advantages and disadvantages of various numerical simulation methods in tracking two-phase flow interfaces, along with an assessment of their suitability. Subsequently, microscopic visualization seepage experimental techniques, including microfluidic, NMR-based, and CT scanning-based methods, were reviewed and discussed in terms of the microscopic dynamic mechanisms of complex fluid transport behaviors. Owing to its high resolution, its non-contact and non-destructive nature, and its scalability to in situ high-temperature, high-pressure experimental conditions, CT scanning-based visualization technology has received increasing attention. The research presented in this paper can provide theoretical guidance for further understanding the characterization of the micro-pore structure of reservoir rocks and the mechanisms of two-phase flow at the pore scale. Full article
(This article belongs to the Topic Exploitation and Underground Storage of Oil and Gas)

34 pages, 13580 KB  
Article
A Novel MaxViT Model for Accelerated and Precise Soybean Leaf and Seed Disease Identification
by Al Shahriar Uddin Khondakar Pranta, Hasib Fardin, Jesika Debnath, Amira Hossain, Anamul Haque Sakib, Md. Redwan Ahmed, Rezaul Haque, Ahmed Wasif Reza and M. Ali Akber Dewan
Computers 2025, 14(5), 197; https://doi.org/10.3390/computers14050197 - 18 May 2025
Cited by 4 | Viewed by 1174
Abstract
Timely diagnosis of soybean diseases is essential to protect yields and limit global economic loss, yet current deep learning approaches suffer from small, imbalanced datasets, single-organ focus, and limited interpretability. We propose MaxViT-XSLD (MaxViT XAI-Seed–Leaf-Diagnostic), a Vision Transformer that integrates multiaxis attention with MBConv layers to jointly classify soybean leaf and seed diseases while remaining lightweight and explainable. Two benchmark datasets were upscaled through elastic deformation, Gaussian noise, brightness shifts, rotation, and flipping, enlarging ASDID from 10,722 to 16,000 images (eight classes) and the SD set from 5513 to 10,000 images (five classes). Under identical augmentation and hyperparameters, MaxViT-XSLD delivered 99.82% accuracy on ASDID and 99.46% on SD, surpassing competitive ViT, CNN, and lightweight SOTA variants. High PR-AUC and MCC values, confirmed via 10-fold stratified cross-validation and Wilcoxon tests, demonstrate robust generalization across data splits. Explainable AI (XAI) techniques further enhanced interpretability by highlighting biologically relevant features influencing predictions. Its modular design also enables future model compression for edge deployment in resource-constrained settings. Finally, we deploy the model in SoyScan, a real-time web tool that streams predictions and visual explanations to growers and agronomists. These findings establish a scalable, interpretable system for precision crop health monitoring and lay the groundwork for edge-oriented, multimodal agricultural diagnostics. Full article
(This article belongs to the Special Issue Advanced Image Processing and Computer Vision (2nd Edition))
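
The dataset upscaling described in this abstract combines five standard augmentations; a hedged Albumentations sketch covering the named transforms, with all probabilities and magnitudes as illustrative assumptions rather than the paper's settings:

    import albumentations as A

    # One pass of the five augmentations named in the abstract; all parameters
    # are illustrative, not the paper's settings.
    augment = A.Compose([
        A.ElasticTransform(p=0.5),              # elastic deformation
        A.GaussNoise(p=0.3),                    # Gaussian noise
        A.RandomBrightnessContrast(p=0.5),      # brightness shifts
        A.Rotate(limit=30, p=0.5),              # rotation up to +/- 30 degrees
        A.HorizontalFlip(p=0.5),                # flipping
    ])

    # Usage: out = augment(image=img)["image"]  # img: H x W x 3 uint8 array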
