Search Results (2,588)

Search Parameters:
Keywords = 1D CNN

23 pages, 1863 KB  
Article
A Low-Power Piglet Crushing Detection System Based on Multi-Modal Fusion
by Hao Liu, Haopu Li, Yue Cao, Riliang Cao, Guangying Hu and Zhenyu Liu
Agriculture 2026, 16(7), 753; https://doi.org/10.3390/agriculture16070753 (registering DOI) - 28 Mar 2026
Abstract
Accidental crushing by sows is the primary cause of pre-weaning piglet mortality in intensive production, often due to the spatiotemporal lag of manual inspection. While Internet of Things (IoT) solutions exist, they frequently face challenges such as vision occlusion, high hardware costs, and latency. To address these, this study developed a low-cost multi-modal edge computing system based on TinyML. Using an ESP32-S3 microcontroller, the system employs a “Motion-Gated Acoustic Detection” strategy, activating a lightweight 1D-CNN model to identify piglet screams only when an IMU detects high-risk postural transitions of the sow. Results show the quantized model (5.1 KB) achieves 95.56% accuracy and 2 ms inference latency. The total end-to-end response latency is within 179 ms, ensuring intervention within the early “golden rescue window.” The low-power design enables the battery life to cover the entire lactation period. Field tests demonstrated that the system intercepted identified crushing risks within the monitored cohort, supporting its potential for significantly improving piglet survival probability. This research overcomes the limitations of single-modal monitoring and provides a scalable, cost-effective engineering intervention for enhancing animal welfare and achieving intelligent, unattended supervision in precision livestock farming. Full article
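The "Motion-Gated Acoustic Detection" strategy described above can be sketched in a few lines: the acoustic classifier is invoked only when IMU motion energy crosses a threshold, so most of the time the device stays in a low-power state. This is a minimal illustrative sketch under stated assumptions, not the authors' firmware; the function names (`motion_energy`, `gated_detect`, `toy_classifier`) and the threshold value are hypothetical.

```python
# Hedged sketch of motion-gated acoustic detection: run the (stand-in)
# scream classifier only on windows where the IMU shows high motion.

def motion_energy(accel_samples):
    """Mean squared deviation from the window mean, a crude proxy for
    a postural transition (hypothetical metric, not the paper's)."""
    mean = sum(accel_samples) / len(accel_samples)
    return sum((a - mean) ** 2 for a in accel_samples) / len(accel_samples)

def gated_detect(accel_samples, audio_frame, classify, threshold=0.5):
    """Return (classifier_ran, alarm); the first flag lets callers
    account for inferences skipped in low-power mode."""
    if motion_energy(accel_samples) < threshold:
        return False, False  # IMU quiet: skip the acoustic model entirely
    return True, classify(audio_frame)

def toy_classifier(frame):
    """Trivial stand-in for the quantized 1D-CNN scream detector."""
    return max(frame) > 0.9
```

The gating itself is what saves power: the expensive call happens only inside the high-motion branch.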
20 pages, 5234 KB  
Article
Performance of Neural Networks in Automated Detection of Wood Features in CT Images
by Tomáš Gergeľ, Ondrej Vacek, Miloš Gejdoš, Diana Zraková, Peter Balogh and Emil Ješko
Forests 2026, 17(4), 425; https://doi.org/10.3390/f17040425 - 27 Mar 2026
Abstract
Computed tomography (CT) enables non-destructive insight into internal log structure, yet fully automated interpretation of CT images remains limited by inconsistent annotations, boundary ambiguity, and insufficient spatial context in 2D slice-based analysis. These challenges restrict the industrial deployment of deep learning for wood quality assessment. This study applies artificial intelligence (AI) and deep learning to the automated analysis of computed tomography (CT) scans of wood logs for detecting internal qualitative features and segmenting bark. Using convolutional neural networks (CNNs), trained models accurately distinguish healthy and damaged regions and segment bark, including discontinuous parts. We introduce a novel pseudo-spatial representation by merging consecutive slices into red–green–blue (RGB) format, which improves prediction accuracy and model robustness across logs. To enhance interpretability, Gradient-weighted Class Activation Mapping (Grad-CAM) highlights regions contributing most to defect detection, particularly knots. Comprehensive evaluation using Sørensen–Dice similarity coefficients and confusion matrices confirms the effectiveness of the proposed approach under industrial conditions. These findings demonstrate that AI-driven CT image analysis can address key limitations of current log-grading workflows and enable more reliable, objective, and scalable quality assessment for timber-dependent economies. Full article
(This article belongs to the Special Issue Wood Quality, Smart Timber Harvesting, and Forestry Machinery)
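The Sørensen–Dice evaluation named in the abstract reduces to a short computation on binary masks. A minimal sketch, assuming predictions and ground truth are already binarized; the function name is my own:

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-8):
    """Sørensen-Dice similarity between two binary masks:
    2 * |A intersect B| / (|A| + |B|). eps guards empty masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    return 2.0 * intersection / (pred.sum() + target.sum() + eps)
```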

24 pages, 3376 KB  
Article
EMDiC: Physics-Informed Conditional Diffusion Denoising for Frequency-Domain Electromagnetic Signals
by Zhenlin Du, Miaomiao Gao, Zhijie Qu and Xiaojuan Zhang
Appl. Sci. 2026, 16(7), 3249; https://doi.org/10.3390/app16073249 - 27 Mar 2026
Abstract
Frequency-domain electromagnetic (FDEM) measurements for shallow subsurface exploration are frequently corrupted by noise, which masks weak secondary-field responses and degrades interpretation. We propose an electromagnetic diffusion CNN (EMDiC) for 1D multi-frequency FDEM denoising, where denoising is formulated as conditional diffusion-based generation. EMDiC combines an analytic frequency–spatial encoder, a Feature-wise Linear Modulation (FiLM)-conditioned convolutional hourglass backbone, and a physics-informed composite loss built on velocity loss to improve waveform reconstruction under severe noise. A reproducible synthetic dataset is constructed through layered-earth forward modeling with concentric Transmitter–Receiver (TX–RX) geometry, multiple target categories, and mixed noise waveforms. On synthetic benchmarks covering multiple noise levels and material types, EMDiC achieves the best overall performance in Root Mean Square Error (RMSE), Signal-to-Noise Ratio (SNR), and Normalized cross-correlation (NCC) among 1D U-Net, diffusion-based variants, and representative neural baselines, with the clearest gains under medium-to-strong noise and for targets with pronounced induction responses. Ablation experiments verify the complementary contributions of electromagnetic positional encoding (EMPE), FiLM conditioning, and the composite loss. Field data validation with a self-developed GEM-3 system further shows that EMDiC improves cross-frequency coherence and suppresses oscillations while preserving the main response characteristics. Full article
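The FiLM conditioning used in the EMDiC backbone is, at its core, a per-channel scale-and-shift whose parameters come from a conditioning network. A minimal sketch of that operation alone (Feature-wise Linear Modulation, Perez et al.); the shapes and function name here are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def film(features, gamma, beta):
    """Feature-wise Linear Modulation: modulate each feature channel
    with a condition-derived scale (gamma) and shift (beta).
    features: (channels, length); gamma, beta: (channels,)."""
    features = np.asarray(features, dtype=float)
    gamma = np.asarray(gamma, dtype=float)[:, None]  # broadcast over length
    beta = np.asarray(beta, dtype=float)[:, None]
    return gamma * features + beta
```

In a conditional diffusion model, `gamma` and `beta` would typically be predicted from the diffusion timestep or other conditioning signal rather than passed in directly.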

15 pages, 1915 KB  
Article
Structural Health Diagnosis Using Advanced Spectrum Analysis and Artificial Intelligence of Ground Penetrating Radar Signals
by Wael Zatar, Hien Nghiem, Feng Xiao and Gang Chen
Buildings 2026, 16(7), 1330; https://doi.org/10.3390/buildings16071330 - 27 Mar 2026
Abstract
This paper aims to present a non-destructive, optimized variational mode decomposition (VMD)-based ground-penetrating radar (GPR) method developed for identifying void defects in reinforced concrete (RC) structures. This study also presents an enhanced framework for defect detection in RC by integrating advanced spectrum analysis with deep learning techniques. A GPR investigation was conducted on an RC bridge deck with known structural defects to generate a representative dataset reflecting both intact and void-defective conditions. In addition to conventional spectral techniques such as fast Fourier transform (FFT), spectrogram, and scalogram, an optimized variational mode decomposition (VMD) method was implemented. The VMD approach decomposes GPR signals into intrinsic mode functions, enabling refined feature extraction beyond traditional spectral methods and allowing clear differentiation between intact and defective signals. The limited availability and quality of GPR small datasets have restricted the application of a functional 1D-CNN which generally requires at least several hundred datasets. To address this challenge, a data augmentation strategy is adopted. FFT-based features were successfully utilized to train a one-dimensional convolutional neural network (1D-CNN) for automated defect identification. The results demonstrate that both the advanced spectrum-based approach and the hybrid framework combining spectral analysis with deep learning significantly improve defect detection performance. Overall, the proposed methodology provides an effective and intelligent solution to support timely, data-driven decision-making for maintenance and safety assurance of bridge infrastructure. Full article
(This article belongs to the Section Building Structures)
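The FFT-based feature step that feeds the 1D-CNN can be illustrated with a toy version: compress a 1D trace into coarse magnitude-band features. This is a generic sketch of the idea, not the paper's exact pipeline; the binning scheme and function name are assumptions:

```python
import numpy as np

def fft_magnitude_features(signal, n_bins=8):
    """Reduce a 1D trace to n_bins coarse FFT magnitude features by
    averaging the one-sided magnitude spectrum over equal-width bands."""
    spectrum = np.abs(np.fft.rfft(np.asarray(signal, dtype=float)))
    bands = np.array_split(spectrum, n_bins)  # near-equal band widths
    return np.array([b.mean() for b in bands])
```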

10 pages, 873 KB  
Proceeding Paper
Utilizing Residual Network 50 Convolutional Neural Network Architecture for Enhanced Philippine Regional Language Classification on Jetson Orin Nano
by John Paul T. Cruz, Aaron B. Abadiano, FP O. Sangilan, Emmy Grace T. Requillo and Roben C. Juanatas
Eng. Proc. 2026, 134(1), 2; https://doi.org/10.3390/engproc2026134002 - 26 Mar 2026
Abstract
Visual speech recognition systems encounter significant challenges in multilingual nations such as the Philippines, where numerous regional languages, including Cebuano and Ilocano, feature distinct phonetic-visual characteristics. Deep learning models such as the Lip Reading Network and the Lightweight Crowd Segmentation Network have demonstrated strong performance with 3D Convolutional Neural Networks (CNNs). However, their substantial computational requirements restrict deployment on portable edge devices. We introduce a more efficient alternative that integrates a 2D Residual Network 50 architecture with a Long Short-Term Memory network and Connectionist Temporal Classification for lip-reading classification of Philippine regional languages. The proposed model is deployed on the Jetson Orin Nano, a high-performance edge device optimized for real-time inference through Compute Unified Device Architecture acceleration. Using a dataset of 2000 annotated videos encompassing 10 lexicons each for Cebuano and Ilocano, the model’s effectiveness was evaluated. Results achieved a regional language classification accuracy of 90%, with lexicon-level accuracies of 74% for Cebuano and 66% for Ilocano. This work represents a step toward developing accessible and scalable communication aids for deaf communities in linguistically diverse environments, leveraging transfer learning on pretrained models. Full article

31 pages, 9451 KB  
Article
Quantitative Microstructure Characterization in Additively Manufactured Nickel Alloy 625 Using Image Segmentation and Deep Learning
by Tuğrul Özel, Sijie Ding, Amit Ramasubramanian, Franco Pieri and Doruk Eskicorapci
Machines 2026, 14(4), 366; https://doi.org/10.3390/machines14040366 - 26 Mar 2026
Abstract
Laser Powder Bed Fusion for metals (PBF-LB/M) is a complex additive manufacturing process in which metal powder is selectively melted layer-by-layer to fabricate 3D parts. Process parameters critically influence the resulting microstructure in nickel alloys, with features such as melt pool marks, grain size and orientation, porosity, and cracks serving as key process signatures. These features are typically analyzed post-process to identify suboptimal conditions. This research aims to develop automated post-process measurement and analysis techniques using image processing, pattern recognition, and statistical learning to correlate process parameters with part quality. Optical microscopy images of build surfaces are analyzed using machine learning algorithms to evaluate porosity, grain size, and relative density in fabricated test coupons. Effect plots are generated to identify trends related to increasing energy density. A novel deep learning approach based on Mask R-CNN is used to detect and segment melt pool regions in optical microscopy images. From the segmented regions, melt pool dimensions—such as width, depth, and area—are extracted using bounding geometry coordinates. Manually labeled images (Type I and Type II) are used to train the model. A comparison between ResNet-50 and ResNet-101 backbones shows that the ResNet-50-based model (Model 2) achieves superior performance, with lower training loss (0.1781 vs. 0.1907) and validation loss (8.6140 vs. 9.4228). Quantitative evaluation using the Jaccard index, precision, and recall metrics shows that the ResNet-101 backbone outperforms ResNet-50, achieving about 4% higher mean Intersection-over-Union, with values of 0.85 for Type I and 0.82 for Type II melt pools, where Type I is detected more accurately due to its more regular morphology and clearer boundaries. By extending Faster R-CNNs with a mask prediction branch, the method allows for precise melt pool measurements, providing valuable insights into process quality and dimensional accuracy, and aiding in the detection of defects in PBF-LB-fabricated parts. Full article
(This article belongs to the Special Issue Artificial Intelligence in Mechanical Engineering Applications)
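The "bounding geometry coordinates" measurement step can be illustrated on a binary segmentation mask: width and depth come from the bounding box, area from the pixel count. A simplified stand-in for the measurement described above, in pixel units; calibration to physical units is omitted:

```python
import numpy as np

def melt_pool_dimensions(mask):
    """Width, depth, and pixel area of a segmented melt pool, derived
    from the bounding box of its binary mask."""
    ys, xs = np.nonzero(np.asarray(mask))
    if ys.size == 0:
        return 0, 0, 0  # empty mask: nothing detected
    width = int(xs.max() - xs.min() + 1)
    depth = int(ys.max() - ys.min() + 1)
    area = int(ys.size)
    return width, depth, area
```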

19 pages, 4320 KB  
Article
Principal Component Analysis-Based Convolutional Neural Networks for Atmospheric Turbulence Aberration Correction and the Optimal Preprocessing Strategy Research
by Jiangpuzhen Wang, Danni Zhang, Ying Zhang, Wanhong Yin, Bing Yu, Tao Jiang, Yunlong Mo, Chengyu Fan and Jinghui Zhang
Photonics 2026, 13(4), 326; https://doi.org/10.3390/photonics13040326 - 26 Mar 2026
Abstract
This study proposes a principal component analysis-based convolutional neural network (PC-CNN) to correct atmospheric turbulence-induced aberrations. Unlike traditional Zernike polynomials (ZPs)-based methods (ZP-CNN), PC-CNN avoids mode aliasing and cross-coupling via the strict orthogonality of principal components (PCs). A coefficient magnification strategy is incorporated to further enhance efficacy, maximally preserving the intrinsic physical information within the PCs coefficients. A series of systematic experiments was conducted under conditions from weak to strong turbulence, characterized by D/r0 from 1 to 25, where D is the pupil diameter and r0 is the atmospheric coherence length. Quantitative results show PC-CNN achieves a lower mean relative error (MRE) in coefficient prediction than ZP-CNN under equivalent conditions. It also yields a higher Strehl ratio, reduced speckles, and enhanced spot clarity while requiring fewer basis terms, demonstrating high stability and robustness in strong turbulence. These findings emphasize that basis function orthogonality and physically informed preprocessing are critical design principles for deep-learning-based wavefront sensor-less adaptive optics (AO), establishing a robust foundation for real-time intelligent AO systems in astronomy and free-space optical communications. Full article
(This article belongs to the Special Issue Emerging Topics in Atmospheric Optics)
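The strictly orthogonal basis that PC-CNN relies on can be obtained from sample data via SVD on the centered matrix. A generic PCA sketch, not the authors' turbulence-specific pipeline; the function name and layout (rows = observations) are assumptions:

```python
import numpy as np

def principal_components(samples, n_components):
    """Orthonormal principal components of a sample matrix
    (rows = observations, columns = features), via SVD of the
    mean-centered data. Rows of the result are mutually orthogonal,
    which is the property the abstract emphasizes."""
    X = np.asarray(samples, dtype=float)
    X = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return vt[:n_components]  # shape (n_components, n_features)
```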

35 pages, 2376 KB  
Article
Efficient Word-Level Sign Language Recognition Using Quantized Spatiotemporal Deep Learning for Low-Power Microcontrollers
by Samuel Longwani Kimpinde and Peter O. Olukanmi
Algorithms 2026, 19(4), 248; https://doi.org/10.3390/a19040248 (registering DOI) - 25 Mar 2026
Abstract
Deploying efficient sign language recognition models on edge devices advances inclusive, affordable, and privacy-preserving human–computer interaction. Yet most state-of-the-art architectures target server-class hardware and fail under the strict memory, computation, and energy constraints of microcontrollers. This work introduces S3D-Conv1D, a separable spatiotemporal architecture for isolated word-level sign language recognition, tailored for TinyML deployment. While the idea of separating spatial and temporal processing has been explored in earlier models, the novelty here lies in a deployment pipeline designed from the outset for microcontroller-class constraints: every operator has native INT8 support in TensorFlow Lite, CMSIS-NN, and NNoM; the architecture achieves full integer-only execution with competitive accuracy; and the evaluation scale (100 and 300 classes) substantially exceeds prior TinyML sign language recognition studies. Evaluations on datasets show that S3D-Conv1D achieves 98.96% float32 accuracy on WLASL100 with stable cross-dataset generalization (82.5% on SemLex100). After INT8 quantization, accuracy remains high (98.7% on WLASL100) while compressing to 883 KB, the smallest across all evaluated architectures. An ultralight variant further reduces size to 24.7 KB while sustaining 98.5% accuracy on WLASL100 and 77.2% on WLASL300. Quantization-aware training improves stability, particularly at larger vocabulary scales. Among baselines, S3D achieves strong performance but negligible compression (30.3 MB) due to non-quantization-friendly operators. The MobileNet variant generalizes better with 99.4% on WLASL100 and 97.6% accuracy on SemLex100 but remains large at 2.71 MB in INT8 form. CNN + RNN and e-LSTM depend on unsupported recurrent or attention operators. In contrast, S3D-Conv1D meets all operator compatibility requirements, delivers full INT8 execution with a compact sub-1 MB footprint, and achieves real-time performance. These results demonstrate that competitive word-level sign language recognition is achievable under embedded constraints when architectural design prioritizes quantization stability, operator compatibility, and deployment feasibility from the outset. Full article
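The INT8 compression step central to this entry can be illustrated with symmetric per-tensor quantization: map floats to the [-127, 127] integer range with a single scale factor. A generic sketch of standard TFLite-style full-integer quantization, not this paper's exact scheme; per-channel scales and zero points are omitted for brevity:

```python
import numpy as np

def quantize_int8(weights):
    """Symmetric per-tensor INT8 quantization: one scale maps the
    float range [-max|w|, max|w|] onto [-127, 127]."""
    w = np.asarray(weights, dtype=np.float32)
    m = np.abs(w).max()
    scale = m / 127.0 if m > 0 else 1.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the INT8 tensor."""
    return q.astype(np.float32) * scale
```

The quantization error per weight is bounded by half the scale, which is why keeping the float range tight (e.g., via quantization-aware training) preserves accuracy.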

24 pages, 5780 KB  
Article
A Deep Learning-Guided Ensemble Empirical Mode Decomposition Method for Single-Channel Fetal Electrocardiogram Extraction
by Xiaojian Xu, Yifan Zhang, Yufei Rao, Yinru Xu, Yang Gao and Huating Tu
Sensors 2026, 26(7), 2037; https://doi.org/10.3390/s26072037 (registering DOI) - 25 Mar 2026
Abstract
The fetal electrocardiogram (FECG) is critical for assessing fetal cardiac electrophysiology and detecting fetal distress and arrhythmias. Single-channel abdominal electrocardiogram (AECG) enables home-based monitoring but faces challenges posed by weak fetal signals, maternal interference, and the lack of spatial information. Ensemble Empirical Mode Decomposition (EEMD) is suitable for nonstationary AECG signals but relies on accurate selection of intrinsic mode functions (IMFs). In this study, a deep learning-guided method was proposed: a one-dimensional convolutional neural network (1D CNN) scored and selected EEMD-derived IMFs, followed by maternal QRS template subtraction and secondary EEMD purification to achieve automatic FECG extraction. Leave-one-subject-out (LOSO) cross-validation was performed on 15 simulated cases and 5 ADFECGDB records, yielding a mean AUC of 0.9282 ± 0.0189 for the IMF classifier. On the independent DaISy and NIFEA arrhythmia datasets, the proposed CNN-2×EEMD method achieved correlation coefficients of 0.94–0.96, F1-scores of 0.8372–0.9565 for fetal R-peak detection, and SNR improvements of 13.39–15.88 dB. This method outperformed conventional automatic selection methods and matched the performance of manual selection. Ablation studies validated the optimal network design and IMF selection strategy, while complexity analysis (0.08 GFLOPs, 2.24 ms latency) confirmed its suitability for real-time wearable deployment. Full article
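The maternal QRS template subtraction step can be sketched directly: subtract an averaged maternal beat at each detected maternal R-peak, leaving the fetal residual. A simplified illustration with naive, fixed-width windows; the real method aligns and scales the template per beat, and the function name here is my own:

```python
import numpy as np

def subtract_template(signal, peak_indices, template):
    """Subtract a maternal-QRS template centered at each maternal
    R-peak; what remains approximates the fetal component plus noise.
    Windows falling off the signal edges are skipped."""
    out = np.asarray(signal, dtype=float).copy()
    half = len(template) // 2
    for p in peak_indices:
        lo, hi = p - half, p - half + len(template)
        if lo >= 0 and hi <= len(out):
            out[lo:hi] -= template
    return out
```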

14 pages, 851 KB  
Article
Fully Automated AI-Based Lymph Node Measurements in Chest CT: Accuracy and Reproducibility Compared with Multi-Reader Assessment
by Andra-Iza Iuga, Heike Carolus, Liliana Lourenco Caldeira, Jonathan Kottlors, Miriam Rinneburger, Mathilda Weisthoff, Philipp Fervers, Philip Rauen, Florian Fichter, Lukas Goertz, Pia Niederau, Florian Siedek, Carola Heneweer, Carsten Gietzen, Lenhard Pennig, Anja Dobrostal, Fabian Laqua, Piotr Woznicki, David Maintz, Bettina Baessler and Thorsten Persigehl
Diagnostics 2026, 16(7), 967; https://doi.org/10.3390/diagnostics16070967 - 24 Mar 2026
Abstract
Background/Objectives: Accurate and reproducible lymph node (LN) measurement is essential for oncologic staging and therapy monitoring but is subject to inter-reader variability. This study evaluated the accuracy and reproducibility of a fully automated artificial intelligence (AI)-based LN measurement workflow in contrast-enhanced chest CT, using multi-reader manual measurements as reference. Methods: Sixty thoracic LNs from seven patients were independently measured by 13 radiologists in two reading rounds. The median of all measurements served as the ground truth (GT). Automated short- and long-axis diameters were derived from fully automated 3D CNN-based segmentations. Agreement between AI and manual measurements was assessed using Friedman testing, intraclass correlation coefficients (ICCs), and concordance correlation coefficients (CCCs). Measurement stability was evaluated across repeated runs on different hardware systems. Results: A total of 2280 manual measurements were analyzed. Manual assessment showed significant inter-reader variability (p < 0.01), while intra-reader agreement was high. No significant differences were observed between AI-based measurements and the GT (all p > 0.01). Agreement was good, with CCC values of 0.86 (SAD) and 0.79 (LAD). AI-based measurements were numerically stable across hardware configurations. Conclusions: Fully automated AI-based LN measurements in chest CT scans provide strong agreement with multi-reader consensus and high numerical stability. Automated measurement may support more standardized and reproducible oncologic imaging assessment. Full article
(This article belongs to the Special Issue AI for Medical Diagnosis: From Algorithms to Clinical Integration)
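The concordance correlation coefficient (CCC) used for AI-versus-reader agreement has a closed form. A minimal sketch of Lin's CCC, using population variances as in Lin's definition; the function name is my own:

```python
import numpy as np

def concordance_ccc(x, y):
    """Lin's concordance correlation coefficient:
    2*cov(x, y) / (var(x) + var(y) + (mean_x - mean_y)**2).
    Equals 1 only for perfect agreement (identity line), unlike
    Pearson's r, which ignores scale and location shifts."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()  # population variances
    cov = ((x - mx) * (y - my)).mean()
    return 2 * cov / (vx + vy + (mx - my) ** 2)
```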

20 pages, 4497 KB  
Article
Remote Sensing Identification of Benggang Using a Two-Stream Network with Multimodal Feature Enhancement and Sparse Attention
by Xuli Rao, Qihao Chen, Kexin Zhu, Zhide Chen, Jinshi Lin and Yanhe Huang
Electronics 2026, 15(6), 1331; https://doi.org/10.3390/electronics15061331 - 23 Mar 2026
Abstract
Benggang, a typical erosional landform and geohazard in the red-soil hilly regions of southern China, is characterized by a fragmented texture, irregular boundaries, and high similarity to background objects such as bare soil and roads, which poses a dual challenge of “multiscale variability + strong noise” for automated identification at regional scales. To address insufficient information from a single modality and the limited representation of cross-scale features, this study proposes a dual-stream feature-fusion network (DF-Net) for multisource data consisting of a digital orthophoto map (DOM) and a digital elevation model (DEM). The method adopts ResNeSt50d as the backbone of the two branches: on the DOM side, a Canny-edge channel is stacked to enhance high-frequency boundary information; on the DEM side, derived terrain factors, including slope, aspect, curvature, and hillshade, are introduced to provide morphological constraints. In the cross-modal fusion stage, a multiscale sparse attention fusion module is designed, which acquires contextual information via multiwindow average pooling and suppresses noise interference through top-K sparsification. In the decision stage, a multibranch ensemble is employed to improve classification stability. Taking Anxi County, Fujian Province, as the study area, a coregistered dataset of GF-2 (1 m) DOM and ALOS (12.5 m) DEMs is constructed, and a zonal partitioning strategy is adopted to evaluate the model’s generalization ability. The experimental results show that DF-Net achieves 97.44% accuracy, 85.71% recall, and an 82.98% F1 score in the independent test zone, outperforming multiple mainstream CNN/transformer classification models. This study indicates that the strategy of “multimodal feature enhancement + sparse attention fusion” tailored to Benggang erosional landforms can significantly improve recognition performance under complex backgrounds, providing technical support for rapid Benggang surveys and governance-effectiveness assessments. Full article
(This article belongs to the Section Artificial Intelligence)
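The top-K sparsification in the fusion module can be illustrated in isolation: keep only the k largest attention scores per query row, mask the rest, and renormalize. A generic sketch of the mechanism named above, not the paper's module; shapes and the function name are assumptions:

```python
import numpy as np

def topk_sparse_softmax(scores, k):
    """Softmax restricted to the k largest scores in each row; the
    remaining positions are masked to -inf and receive zero weight,
    which suppresses low-scoring (noisy) background attention."""
    s = np.asarray(scores, dtype=float)
    masked = np.full_like(s, -np.inf)
    idx = np.argsort(s, axis=-1)[..., -k:]  # top-k indices per row
    np.put_along_axis(masked, idx, np.take_along_axis(s, idx, axis=-1), axis=-1)
    e = np.exp(masked - masked.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)
```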

17 pages, 3640 KB  
Article
A 3D Global-Patch Transformer for Brain Age Prediction Using T1-Weighted MRI with Gray and White Matter Maps
by Seung-Jun Lee, Myungeun Lee, Yoo Ri Kim and Hyung-Jeong Yang
Appl. Sci. 2026, 16(6), 3004; https://doi.org/10.3390/app16063004 - 20 Mar 2026
Abstract
With the increasing prevalence of neurodegenerative diseases driven by population aging, imaging-based biomarkers are needed to quantify brain aging at an early stage. Brain age, which estimates structural brain aging relative to chronological age, has emerged as a useful indicator. Prior work has mainly used T1-weighted MRI with deep learning models such as convolutional neural networks (CNNs) or transformers; however, many approaches insufficiently capture three-dimensional structural continuity and localized anatomical patterns, and tissue-specific aging in gray matter (GM) and white matter (WM) is often treated as auxiliary. To address these limitations, we propose a 3D Global–Patch Transformer framework for brain age prediction that directly processes volumetric data while jointly learning global brain structure and local anatomical features. Our model runs global and patch pathways in parallel and explicitly incorporates GM and WM structural maps alongside T1-weighted MRI to encode tissue-specific aging signals. Experiments on multiple public datasets, including IXI and OASIS, show that the proposed method reduces mean absolute error (MAE) by approximately 10–15% compared with CNN-based and single-input transformer baselines, with notably improved performance in older populations, highlighting the value of tissue-level structural information for brain age estimation. Full article
(This article belongs to the Special Issue MR-Based Neuroimaging, 2nd Edition)

18 pages, 3377 KB  
Article
Can 3D T1 Post-Contrast MRI in A Radiomics-Machine Learning Model Distinguish Infective from Neoplastic Ring-Enhancing Brain Lesions? An Exploratory Study
by Edwin Chong Yu Sng, Minh Bao Kha, Min Jia Wong, Nicholas Kuan Hsien Lee, Jonathan Cheng Yao Goh, So Jeong Park, Darren Cheng Han Teo, Wei Ming Chua, May Yi Shan Lim, Septian Hartono, Lester Chee Hoe Lee, Candice Yuen Yue Chan, Hwee Kuan Lee and Ling Ling Chan
Diagnostics 2026, 16(6), 926; https://doi.org/10.3390/diagnostics16060926 - 20 Mar 2026
Abstract
Background/Objectives: Rapid and accurate classification of ring-enhancing brain lesions (REBLs) into infection or neoplasm is key to clinical triaging for expedited diagnostics in the former to enhance treatment outcomes, especially in immunocompromised patients. High-resolution three-dimensional (3D) T1 post-contrast (T1+C) MRI provides high-dimensional volumetric data for radiomics analysis. While radiomics is useful in brain neoplasm characterization, its utility in central nervous system infection remains under-explored. In this exploratory study, we aim to determine if a radiomics-machine learning model, based solely on a 3D T1+C MRI dataset, can distinguish infective from neoplastic REBLs. Methods: 92 patients (infection, n = 26; neoplasm, n = 66) with 402 REBLs, who fulfilled criteria for “definite” or “probable” infective or neoplastic REBLs, were identified from scans performed at our hospital over four years and formed the training/validation dataset. All REBLs were manually annotated on T1+C MRI images under radiological supervision. In total, 1197 radiomics features were extracted, feature selection was performed using mutual information, and nine machine learning classifiers were applied to assess patient-level infection vs. neoplasm classification performance. End-to-end 2D CNN baselines and hybrid radiomics–CNN configurations were additionally evaluated under the same protocol for comparative benchmarking. Model performance was tested on an external holdout dataset of 57 patients (infection, n = 25; neoplasm, n = 32) with 454 REBLs from another hospital. Results: The Multi-layer Perceptron (MLP) model using the Original + LoG + Wavelet feature group demonstrated superior performance. In the cross-validation cohort, it achieved a mean AUC of 0.80 ± 0.02, sensitivity of 0.83 ± 0.09, specificity of 0.77 ± 0.08, and balanced accuracy of 0.80 ± 0.02. On external holdout data, the same configuration showed stable and sustainable performance with an AUC of 0.84, sensitivity of 0.84, specificity of 0.75, and balanced accuracy of 0.80. Conclusions: Our radiomics-machine learning model, based solely on a high-resolution 3D T1+C dataset, shows potential for distinguishing infective REBLs from neoplastic REBLs. Further study, with additional MR sequences and clinical data in a multimodal MRI radiomics-machine learning model, is warranted. Full article
(This article belongs to the Special Issue Neurological Diseases: Biomarkers, Diagnosis and Prognosis)
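The pipeline this abstract describes (mutual-information feature selection over ~1200 radiomics features, followed by an MLP classifier evaluated with cross-validated AUC) can be sketched as follows. This is a schematic stand-in, not the study's implementation: the synthetic data, the number of selected features `k`, the MLP architecture, and the omission of the lesion-to-patient aggregation step are all assumptions.

```python
# Sketch of a radiomics-ML pipeline: mutual-information feature selection
# followed by an MLP classifier, scored with cross-validated ROC AUC.
# Data and hyperparameters are illustrative, not the study's settings.
import numpy as np
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(92, 1197))    # 92 patients x 1197 radiomics features (synthetic)
y = rng.integers(0, 2, size=92)    # 0 = infection, 1 = neoplasm (synthetic labels)

clf = make_pipeline(
    StandardScaler(),
    SelectKBest(mutual_info_classif, k=50),   # keep the 50 most informative features
    MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0),
)
# Selection is fitted inside each fold, so no feature-selection leakage occurs.
scores = cross_val_score(clf, X, y, cv=5, scoring="roc_auc")
print(scores.mean())
```

Note that wrapping `SelectKBest` inside the pipeline keeps the mutual-information ranking fold-local, which matters for honest cross-validated AUC estimates.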
30 pages, 9811 KB  
Article
Audio-Based Screening of Respiratory Diseases Using Machine Learning: A Methodological Framework Evaluated on a Clinically Validated COVID-19 Cough Dataset
by Arley Magnolia Aquino-García, Humberto Pérez-Espinosa, Javier Andreu-Perez and Ansel Y. Rodríguez González
Mach. Learn. Knowl. Extr. 2026, 8(3), 80; https://doi.org/10.3390/make8030080 - 20 Mar 2026
Abstract
The development of AI-driven computational methods has enabled rapid and non-invasive analysis of respiratory sounds using acoustic data, particularly cough recordings. Although the COVID-19 pandemic accelerated research on cough-based acoustic analysis, many early studies were limited by insufficient data quality, lack of standardized protocols, and limited reproducibility due to data scarcity. In this study, we propose an audio analysis framework for cough-based respiratory disease screening research, using COVID-19 as a clinically validated case study. All analyses were conducted on a single clinically acquired multicentric dataset collected under standardized conditions in certified laboratories in Mexico and Spain, comprising cough recordings from 1105 individuals. Model training and testing were performed exclusively within this dataset. The framework incorporates signal preprocessing and a comparative evaluation of segmentation strategies, showing that segmented cough analysis significantly outperforms full-signal analysis. Class imbalance was addressed using the Synthetic Minority Over-sampling Technique (SMOTE) for CNN2D models and the supervised Resample filter implemented in WEKA for classical machine learning models, both applied exclusively to the training subset to generate balanced training sets and prevent data leakage. Feature extraction and classification were carried out using Random Forest, Support Vector Machine (SVM), XGBoost, and a 2D Convolutional Neural Network (CNN2D), with hyperparameter optimization via AutoML. The proposed framework achieved its best balanced screening performance of 85.58% sensitivity and 86.65% specificity (Random Forest with GeMAPSvB01), while the highest-specificity configuration reached 93.90% specificity with 18.14% sensitivity (CNN2D with SMOTE and AutoML). These results demonstrate the methodological feasibility of the proposed framework under the evaluated conditions. Full article
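The leakage-prevention point in this abstract (balancing classes on the training subset only) can be illustrated with a minimal sketch. The study used SMOTE and WEKA's supervised Resample filter; as a simple stand-in, the code below oversamples the minority class with `sklearn.utils.resample` after splitting, leaving the test split at its natural class distribution. All data, feature counts, and the Random Forest settings are assumptions for illustration.

```python
# Minimal sketch of training-only class balancing to avoid data leakage:
# simple minority oversampling (standing in for SMOTE / WEKA Resample)
# is applied to the training split; the test split is left untouched.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.utils import resample
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(42)
X = rng.normal(size=(1105, 88))            # 1105 recordings x 88 acoustic features (synthetic)
y = (rng.random(1105) < 0.2).astype(int)   # imbalanced labels (synthetic)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Oversample the minority class within the training subset only.
minority = X_tr[y_tr == 1]
extra = resample(minority,
                 n_samples=int((y_tr == 0).sum() - len(minority)),
                 replace=True, random_state=0)
X_bal = np.vstack([X_tr, extra])
y_bal = np.concatenate([y_tr, np.ones(len(extra), dtype=int)])

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_bal, y_bal)
print(clf.score(X_te, y_te))   # evaluated on the untouched, imbalanced test split
```

Balancing after the split, rather than before, is exactly what prevents synthetic or duplicated minority samples from leaking into the evaluation set.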
27 pages, 28242 KB  
Article
Physics-Informed Side-Scan Sonar Perception: Tackling Weak Targets and Sparse Debris via Geometric and Frequency Decoupling
by Bojian Yu, Rongsheng Lin, Hanxiang Zhou, Jianxiong Zhang and Xinwei Zhang
Sensors 2026, 26(6), 1938; https://doi.org/10.3390/s26061938 - 19 Mar 2026
Abstract
Side-scan sonar (SSS) serves as the primary perceptual instrument for Autonomous Underwater Vehicles (AUVs) in large-scale marine search and rescue (SAR) operations. However, the detection of critical targets is frequently hindered by severe hydro-acoustic noise, the spatial discontinuity of wreckage, and the weak visual signatures of small targets. To surmount these challenges, this paper presents WPG-DetNet. First, we introduce a Wavelet-Embedded Residual Backbone (WERB) to reconstruct the conventional downsampling paradigm. By substituting standard pooling with the Discrete Wavelet Transform (DWT), this architecture explicitly disentangles high-frequency noise from structural information in the frequency domain, thereby achieving the adaptive preservation of edge fidelity for large human-made targets while filtering out speckle interference. Then, addressing the distinct challenge of discontinuous aircraft wreckage, the framework further incorporates a Debris Graph Reasoning Module (D-GRM). This module models scattered fragments as nodes in a topological graph to capture long-range semantic dependencies, transforming isolated instance recognition into context-aware scene understanding. Finally, to bridge the gap between AI and underwater physics, we design a Shadow-Aided Decoupling Head (SADH) equipped with a physics-informed geometric loss. By enforcing mathematical consistency between target height and acoustic shadow length, this mechanism establishes a rigorous discriminative criterion capable of distinguishing weak-echo human bodies from seabed rocks based on shadow geometry. Experiments on the SCTD dataset demonstrate that WPG-DetNet achieves a mean Average Precision (mAP50) of 97.5% and a Recall of 96.9%. Quantitative analysis reveals that our framework outperforms the classic Faster R-CNN by a margin of 12.8% in mAP50 and surpasses the Transformer-based RT-DETR-R18 by 5.6% in high-precision localization metrics (mAP50:95). 
Simultaneously, WPG-DetNet maintains superior efficiency with an inference speed of 62.5 FPS and a lightweight parameter count of 16.8 M, striking an optimal balance between robust perception and the real-time constraints of AUV operations. Full article
(This article belongs to the Section Physical Sensors)
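The shadow-height consistency behind the Shadow-Aided Decoupling Head can be made concrete with the standard flat-seabed geometry: by similar triangles, a target of height t at horizontal range r, imaged from sonar altitude h, casts a shadow of length l with t = h · l / (r + l). A sketch of a loss enforcing this relation follows; the variable names, the L1 form, and the flat-seabed assumption are illustrative choices, not the paper's exact formulation.

```python
# Illustrative flat-seabed side-scan sonar shadow geometry: similar triangles
# give target height t = h * l / (r + l) for sonar altitude h, horizontal
# range r, and shadow length l. The loss penalizes predicted heights that
# are inconsistent with the observed shadow length.
import numpy as np

def height_from_shadow(altitude, target_range, shadow_len):
    """Target height implied by the observed shadow (flat-seabed model)."""
    return altitude * shadow_len / (target_range + shadow_len)

def shadow_consistency_loss(pred_height, altitude, target_range, shadow_len):
    """L1 mismatch between predicted height and the shadow-implied height."""
    implied = height_from_shadow(altitude, target_range, shadow_len)
    return np.abs(pred_height - implied).mean()

# Inverting the relation gives l = r * t / (h - t): a 2 m target at 60 m
# range under 20 m altitude casts a shadow of 60*2/18 ≈ 6.67 m, and the
# forward model recovers the 2 m height from that shadow.
l = 60 * 2 / (20 - 2)
print(height_from_shadow(20.0, 60.0, l))
print(shadow_consistency_loss(2.0, 20.0, 60.0, l))
```

In a detector, such a term would let weak-echo targets (e.g., human bodies) be separated from seabed rocks whose shadows imply an inconsistent height.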