Search Results (3,449)

Search Parameters:
Keywords = resnet

28 pages, 6624 KiB  
Article
YoloMal-XAI: Interpretable Android Malware Classification Using RGB Images and YOLO11
by Chaymae El Youssofi and Khalid Chougdali
J. Cybersecur. Priv. 2025, 5(3), 52; https://doi.org/10.3390/jcp5030052 - 1 Aug 2025
Abstract
As Android malware grows increasingly sophisticated, traditional detection methods struggle to keep pace, creating an urgent need for robust, interpretable, and real-time solutions to safeguard mobile ecosystems. This study introduces YoloMal-XAI, a novel deep learning framework that transforms Android application files into RGB image representations by mapping DEX (Dalvik Executable), Manifest.xml, and Resources.arsc files to distinct color channels. Evaluated on the CICMalDroid2020 dataset using YOLO11 pretrained classification models, YoloMal-XAI achieves 99.87% accuracy in binary classification and 99.56% in multi-class classification (Adware, Banking, Riskware, SMS, and Benign). Compared to ResNet-50, GoogLeNet, and MobileNetV2, YOLO11 offers competitive accuracy with at least 7× faster training over 100 epochs. Against YOLOv8, YOLO11 achieves comparable or superior accuracy while reducing training time by up to 3.5×. Cross-corpus validation using Drebin and CICAndMal2017 further confirms the model’s generalization capability on previously unseen malware. An ablation study highlights the value of integrating DEX, Manifest, and Resources components, with the full RGB configuration consistently delivering the best performance. Explainable AI (XAI) techniques—Grad-CAM, Grad-CAM++, Eigen-CAM, and HiRes-CAM—are employed to interpret model decisions, revealing the DEX segment as the most influential component. These results establish YoloMal-XAI as a scalable, efficient, and interpretable framework for Android malware detection, with strong potential for future deployment on resource-constrained mobile devices. Full article
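The channel-mapping idea described above can be sketched in a few lines: each APK component becomes one color plane of a fixed-size image. This is an illustrative reconstruction, not the paper's code; the byte streams, image side, and zero-padding scheme are assumptions.

```python
# Sketch of the byte-to-RGB idea: three APK components are mapped to the
# R, G, and B channels of a square image. The byte streams here are synthetic;
# a real pipeline would read classes.dex, AndroidManifest.xml, and Resources.arsc.

def bytes_to_channel(data: bytes, side: int) -> list[list[int]]:
    """Pad/truncate a byte stream to side*side values and reshape it into a 2-D grid."""
    padded = list(data[: side * side]) + [0] * max(0, side * side - len(data))
    return [padded[r * side:(r + 1) * side] for r in range(side)]

def to_rgb_image(dex: bytes, manifest: bytes, resources: bytes, side: int = 4):
    """Stack the three components as R, G, B channels: image[row][col] = (r, g, b)."""
    r, g, b = (bytes_to_channel(x, side) for x in (dex, manifest, resources))
    return [[(r[i][j], g[i][j], b[i][j]) for j in range(side)] for i in range(side)]

img = to_rgb_image(b"\x10" * 20, b"\x20" * 5, b"\x30" * 16, side=4)
print(img[0][0])  # (16, 32, 48)
print(img[1][1])  # manifest channel is zero-padded beyond its 5 bytes
```

A real implementation would pick the image side from the file sizes and feed the result to the YOLO11 classifier; the fixed `side=4` here only keeps the example small.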

21 pages, 28885 KiB  
Article
Assessment of Yellow Rust (Puccinia striiformis) Infestations in Wheat Using UAV-Based RGB Imaging and Deep Learning
by Atanas Z. Atanasov, Boris I. Evstatiev, Asparuh I. Atanasov and Plamena D. Nikolova
Appl. Sci. 2025, 15(15), 8512; https://doi.org/10.3390/app15158512 - 31 Jul 2025
Abstract
Yellow rust (Puccinia striiformis) is a common wheat disease that significantly reduces yields, particularly in seasons with cooler temperatures and frequent rainfall. Early detection is essential for effective control, especially in key wheat-producing regions such as Southern Dobrudja, Bulgaria. This study presents a UAV-based approach for detecting yellow rust using only RGB imagery and deep learning for pixel-based classification. The methodology involves data acquisition, preprocessing through histogram equalization, model training, and evaluation. Among the tested models, a UnetClassifier with ResNet34 backbone achieved the highest accuracy and reliability, enabling clear differentiation between healthy and infected wheat zones. Field experiments confirmed the approach’s potential for identifying infection patterns suitable for precision fungicide application. The model also showed signs of detecting early-stage infections, although further validation is needed due to limited ground-truth data. The proposed solution offers a low-cost, accessible tool for small and medium-sized farms, reducing pesticide use while improving disease monitoring. Future work will aim to refine detection accuracy in low-infection areas and extend the model’s application to other cereal diseases. Full article
(This article belongs to the Special Issue Advanced Computational Techniques for Plant Disease Detection)
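The histogram-equalization preprocessing mentioned above can be sketched for a toy 8-bit image; the paper applies it to UAV RGB imagery, while here a flat grayscale list stands in.

```python
# Minimal grayscale histogram equalization: stretch a low-contrast image so its
# cumulative distribution becomes roughly uniform. Assumes >1 distinct gray level.

def equalize(pixels: list[int], levels: int = 256) -> list[int]:
    n = len(pixels)
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    # Cumulative distribution, then map back to the [0, levels-1] range.
    cdf, running = [0] * levels, 0
    for v in range(levels):
        running += hist[v]
        cdf[v] = running
    cdf_min = next(c for c in cdf if c > 0)
    return [round((cdf[p] - cdf_min) / (n - cdf_min) * (levels - 1)) for p in pixels]

# A low-contrast image (values clustered in 100..103) spreads across the full range.
out = equalize([100, 100, 101, 102, 103, 103, 103, 103])
print(out)
```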

21 pages, 1928 KiB  
Article
A CNN-Transformer Hybrid Framework for Multi-Label Predator–Prey Detection in Agricultural Fields
by Yifan Lyu, Feiyu Lu, Xuaner Wang, Yakui Wang, Zihuan Wang, Yawen Zhu, Zhewei Wang and Min Dong
Sensors 2025, 25(15), 4719; https://doi.org/10.3390/s25154719 - 31 Jul 2025
Abstract
Accurate identification of predator–pest relationships is essential for implementing effective and sustainable biological control in agriculture. However, existing image-based methods struggle to recognize insect co-occurrence under complex field conditions, limiting their ecological applicability. To address this challenge, we propose a hybrid deep learning framework that integrates convolutional neural networks (CNNs) and Transformer architectures for multi-label recognition of predator–pest combinations. The model leverages a novel co-occurrence attention mechanism to capture semantic relationships between insect categories and employs a pairwise label matching loss to enhance ecological pairing accuracy. Evaluated on a field-constructed dataset of 5,037 images across eight categories, the model achieved an F1-score of 86.5%, mAP50 of 85.1%, and demonstrated strong generalization to unseen predator–pest pairs with an average F1-score of 79.6%. These results outperform several strong baselines, including ResNet-50, YOLOv8, and Vision Transformer. This work contributes a robust, interpretable approach for multi-object ecological detection and offers practical potential for deployment in smart farming systems, UAV-based monitoring, and precision pest management. Full article
(This article belongs to the Special Issue Sensor and AI Technologies in Intelligent Agriculture: 2nd Edition)
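The headline F1-score above can be reproduced in miniature with a micro-averaged F1 over multi-label predictions; the binary indicator layout over categories is an assumption, not the paper's data format.

```python
# Micro-averaged F1 for multi-label classification: pool true/false positives
# and false negatives over all labels, then compute a single F1.

def micro_f1(y_true: list[list[int]], y_pred: list[list[int]]) -> float:
    tp = fp = fn = 0
    for row_t, row_p in zip(y_true, y_pred):
        for t, p in zip(row_t, row_p):
            tp += t == 1 and p == 1
            fp += t == 0 and p == 1
            fn += t == 1 and p == 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Two samples, three labels each (invented): one label missed, one spurious.
score = micro_f1([[1, 0, 1], [0, 1, 0]], [[1, 0, 0], [0, 1, 1]])
print(round(score, 3))  # 0.667
```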

17 pages, 920 KiB  
Article
Enhancing Early GI Disease Detection with Spectral Visualization and Deep Learning
by Tsung-Jung Tsai, Kun-Hua Lee, Chu-Kuang Chou, Riya Karmakar, Arvind Mukundan, Tsung-Hsien Chen, Devansh Gupta, Gargi Ghosh, Tao-Yuan Liu and Hsiang-Chen Wang
Bioengineering 2025, 12(8), 828; https://doi.org/10.3390/bioengineering12080828 - 30 Jul 2025
Abstract
Timely and accurate diagnosis of gastrointestinal diseases (GIDs) remains a critical bottleneck in clinical endoscopy, particularly due to the limited contrast and sensitivity of conventional white light imaging (WLI) in detecting early-stage mucosal abnormalities. To overcome this, this research presents Spectrum Aided Vision Enhancer (SAVE), an innovative, software-driven framework that transforms standard WLI into high-fidelity hyperspectral imaging (HSI) and simulated narrow-band imaging (NBI) without any hardware modification. SAVE leverages advanced spectral reconstruction techniques, including Macbeth Color Checker-based calibration, principal component analysis (PCA), and multivariate polynomial regression, achieving a root mean square error (RMSE) of 0.056 and structural similarity index (SSIM) exceeding 90%. Trained and validated on the Kvasir v2 dataset (n = 6490), five deep learning models (ResNet-50, ResNet-101, EfficientNet-B2, EfficientNet-B5, and EfficientNetV2-B0) were used to assess diagnostic performance across six key GI conditions. Results demonstrated that SAVE-enhanced imagery consistently outperformed raw WLI across precision, recall, and F1-score metrics, with EfficientNet-B2 and EfficientNetV2-B0 achieving the highest classification accuracy. Notably, this performance gain was achieved without the need for specialized imaging hardware. These findings highlight SAVE as a transformative solution for augmenting GI diagnostics, with the potential to significantly improve early detection, streamline clinical workflows, and broaden access to advanced imaging, especially in resource-constrained settings. Full article
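The RMSE fidelity metric quoted above (0.056 in the paper) is simply the root of the mean squared difference between a ground-truth spectrum and its reconstruction; the reflectance values below are made up, not data from the paper.

```python
# Root mean square error between a reference spectrum and its reconstruction.
from math import sqrt

def rmse(truth: list[float], estimate: list[float]) -> float:
    return sqrt(sum((t - e) ** 2 for t, e in zip(truth, estimate)) / len(truth))

# Three illustrative reflectance samples; only the middle band differs by 0.1.
print(rmse([0.2, 0.4, 0.6], [0.2, 0.5, 0.6]))
```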

14 pages, 2727 KiB  
Article
A Multimodal MRI-Based Model for Colorectal Liver Metastasis Prediction: Integrating Radiomics, Deep Learning, and Clinical Features with SHAP Interpretation
by Xin Yan, Furui Duan, Lu Chen, Runhong Wang, Kexin Li, Qiao Sun and Kuang Fu
Curr. Oncol. 2025, 32(8), 431; https://doi.org/10.3390/curroncol32080431 - 30 Jul 2025
Abstract
Purpose: Predicting colorectal cancer liver metastasis (CRLM) is essential for prognostic assessment. This study aims to develop and validate an interpretable multimodal machine learning framework based on multiparametric MRI for predicting CRLM, and to enhance the clinical interpretability of the model through SHapley Additive exPlanations (SHAP) analysis and deep learning visualization. Methods: This multicenter retrospective study included 463 patients with pathologically confirmed colorectal cancer from two institutions, divided into training (n = 256), internal testing (n = 111), and external validation (n = 96) sets. Radiomics features were extracted from manually segmented regions on axial T2-weighted imaging (T2WI) and diffusion-weighted imaging (DWI). Deep learning features were obtained from a pretrained ResNet101 network using the same MRI inputs. A least absolute shrinkage and selection operator (LASSO) logistic regression classifier was developed for clinical, radiomics, deep learning, and combined models. Model performance was evaluated by AUC, sensitivity, specificity, and F1-score. SHAP was used to assess feature contributions, and Grad-CAM was applied to visualize deep feature attention. Results: The combined model integrating features across the three modalities achieved the highest performance across all datasets, with AUCs of 0.889 (training), 0.838 (internal test), and 0.822 (external validation), outperforming single-modality models. Decision curve analysis (DCA) revealed enhanced clinical net benefit from the integrated model, while calibration curves confirmed its good predictive consistency. SHAP analysis revealed that radiomic features related to T2WI texture (e.g., LargeDependenceLowGrayLevelEmphasis) and clinical biomarkers (e.g., CA19-9) were among the most predictive for CRLM. Grad-CAM visualizations confirmed that the deep learning model focused on tumor regions consistent with radiological interpretation. 
Conclusions: This study presents a robust and interpretable multiparametric MRI-based model for noninvasively predicting liver metastasis in colorectal cancer patients. By integrating handcrafted radiomics and deep learning features, and enhancing transparency through SHAP and Grad-CAM, the model provides both high predictive performance and clinically meaningful explanations. These findings highlight its potential value as a decision-support tool for individualized risk assessment and treatment planning in the management of colorectal cancer. Full article
(This article belongs to the Section Gastrointestinal Oncology)
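The AUCs reported above can be computed directly from ranks (the Mann–Whitney formulation): the fraction of positive/negative pairs in which the positive case scores higher. The labels and scores below are illustrative, not the study's predictions.

```python
# Rank-based AUC: count positive-vs-negative score pairs won; ties count half.

def auc(labels: list[int], scores: list[float]) -> float:
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

print(auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.2]))  # 0.75
```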

18 pages, 4411 KiB  
Article
Research on Enhancing Target Recognition Rate Based on Orbital Angular Momentum Spectrum with Assistance of Neural Network
by Guanxu Chen, Hongyang Wang, Hao Yun, Zhanpeng Shi, Zijing Zhang, Chengshuai Cui, Di Wu, Xinran Lyu and Yuan Zhao
Photonics 2025, 12(8), 771; https://doi.org/10.3390/photonics12080771 - 30 Jul 2025
Abstract
In this paper, a single-mode vortex beam is used to illuminate targets of different shapes, and the targets are recognized using machine learning algorithms based on the orbital angular momentum (OAM) spectral information of the echo signal. We innovatively utilize three neural networks—multilayer perceptron (MLP), convolutional neural network (CNN), and residual neural network (ResNet)—to train extensive echo OAM spectrum data. The trained models can rapidly and accurately classify the OAM spectrum data of different targets’ echo signals. The results show that the residual network (ResNet) performs best under all turbulence intensities and can achieve a high recognition rate when Cn² = 1 × 10⁻¹³ m⁻²/³. In addition, even when the target size is η = 0.3, the recognition rate of ResNet can reach 97%, while the robustness of MLP and CNN to the target size is lower; the recognition rates are 91.75% and 91%, respectively. However, although the recognition performance of CNN and MLP is slightly lower than that of ResNet, their training time is much shorter, striking a good balance between recognition performance and training-time cost. This research has a promising future in the fields of target recognition and intelligent navigation based on multi-dimensional information. Full article
(This article belongs to the Special Issue Advancements in Optics and Laser Measurement)

16 pages, 2784 KiB  
Article
Development of Stacked Neural Networks for Application with OCT Data, to Improve Diabetic Retinal Health Care Management
by Pedro Rebolo, Guilherme Barbosa, Eduardo Carvalho, Bruno Areias, Ana Guerra, Sónia Torres-Costa, Nilza Ramião, Manuel Falcão and Marco Parente
Information 2025, 16(8), 649; https://doi.org/10.3390/info16080649 - 30 Jul 2025
Abstract
Background: Retinal diseases are becoming an important public health issue, with early diagnosis and timely intervention playing a key role in preventing vision loss. Optical coherence tomography (OCT) remains the leading non-invasive imaging technique for identifying retinal conditions. However, distinguishing between diabetic macular edema (DME) and macular edema resulting from retinal vein occlusion (RVO) can be particularly challenging, especially for clinicians without specialized training in retinal disorders, as both conditions manifest through increased retinal thickness. Due to the limited research exploring the application of deep learning methods, particularly for RVO detection using OCT scans, this study proposes a novel diagnostic approach based on stacked convolutional neural networks. This architecture aims to enhance classification accuracy by integrating multiple neural network layers, enabling more robust feature extraction and improved differentiation between retinal pathologies. Methods: The VGG-16, VGG-19, and ResNet50 models were fine-tuned on the Kermany dataset to classify the OCT images and were afterwards trained on a private OCT dataset. Four stacked models were then developed from these networks: one using VGG-16 and VGG-19, one using VGG-16 and ResNet50, one using VGG-19 and ResNet50, and finally one using all three networks. The performance metrics of the models include accuracy, precision, recall, F2-score, and area under the receiver operating characteristic curve (AUROC). Results: The stacked neural network using all three models achieved the best results, with an accuracy of 90.7%, precision of 99.2%, recall of 90.7%, and an F2-score of 92.3%. Conclusions: This study presents a novel method for distinguishing retinal disease by using stacked neural networks. This research aims to provide a reliable tool for ophthalmologists to improve diagnosis accuracy and speed.
(This article belongs to the Special Issue AI-Based Biomedical Signal Processing)
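The stacking idea above, reduced to its simplest form, is soft voting: average the class probabilities of several base networks and take the argmax. The paper's actual combiner may be more elaborate; the probability vectors below are invented stand-ins for VGG-16 / VGG-19 / ResNet50 outputs.

```python
# Soft-voting ensemble: average per-model class probabilities, pick the argmax.

CLASSES = ["DME", "RVO", "normal"]

def ensemble_predict(per_model_probs: list[list[float]]) -> str:
    n = len(per_model_probs)
    avg = [sum(model[c] for model in per_model_probs) / n
           for c in range(len(CLASSES))]
    return CLASSES[avg.index(max(avg))]

pred = ensemble_predict([
    [0.6, 0.3, 0.1],   # "VGG-16" (invented output)
    [0.4, 0.5, 0.1],   # "VGG-19"
    [0.7, 0.2, 0.1],   # "ResNet50"
])
print(pred)  # DME
```

Note the second model alone would have voted RVO; averaging lets the stronger, agreeing models outvote it, which is the robustness argument behind stacking.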

18 pages, 5013 KiB  
Article
Enhancing Document Forgery Detection with Edge-Focused Deep Learning
by Yong-Yeol Bae, Dae-Jea Cho and Ki-Hyun Jung
Symmetry 2025, 17(8), 1208; https://doi.org/10.3390/sym17081208 - 30 Jul 2025
Abstract
Detecting manipulated document images is essential for verifying the authenticity of official records and preventing document forgery. However, forgery artifacts are often subtle and localized in fine-grained regions, such as text boundaries or character outlines, where visual symmetry and structural regularity are typically expected. These manipulations can disrupt the inherent symmetry of document layouts, making the detection of such inconsistencies crucial for forgery identification. Conventional CNN-based models face limitations in capturing such edge-level asymmetric features, as edge-related information tends to weaken through repeated convolution and pooling operations. To address this issue, this study proposes an edge-focused method composed of two components: the Edge Attention (EA) layer and the Edge Concatenation (EC) layer. The EA layer dynamically identifies channels that are highly responsive to edge features in the input feature map and applies learnable weights to emphasize them, enhancing the representation of boundary-related information, thereby emphasizing structurally significant boundaries. Subsequently, the EC layer extracts edge maps from the input image using the Sobel filter and concatenates them with the original feature maps along the channel dimension, allowing the model to explicitly incorporate edge information. To evaluate the effectiveness and compatibility of the proposed method, it was initially applied to a simple CNN architecture to isolate its impact. Subsequently, it was integrated into various widely used models, including DenseNet121, ResNet50, Vision Transformer (ViT), and a CAE-SVM-based document forgery detection model. Experiments were conducted on the DocTamper, Receipt, and MIDV-2020 datasets to assess classification accuracy and F1-score using both original and forged text images. 
Across all model architectures and datasets, the proposed EA–EC method consistently improved model performance, particularly by increasing sensitivity to asymmetric manipulations around text boundaries. These results demonstrate that the proposed edge-focused approach is not only effective but also highly adaptable, serving as a lightweight and modular extension that can be easily incorporated into existing deep learning-based document forgery detection frameworks. By reinforcing attention to structural inconsistencies often missed by standard convolutional networks, the proposed method provides a practical solution for enhancing the robustness and generalizability of forgery detection systems. Full article
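The EC layer above concatenates a Sobel edge map with the feature maps. This sketch computes the Sobel gradient magnitude at one interior pixel of a tiny grayscale image (no border padding, for brevity); it illustrates the filter, not the paper's layer implementation.

```python
# Sobel operator: horizontal (GX) and vertical (GY) 3x3 kernels; the edge
# response at a pixel is the magnitude of the two directional gradients.

GX = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
GY = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def sobel_at(img: list[list[int]], y: int, x: int) -> float:
    gx = sum(GX[i][j] * img[y - 1 + i][x - 1 + j] for i in range(3) for j in range(3))
    gy = sum(GY[i][j] * img[y - 1 + i][x - 1 + j] for i in range(3) for j in range(3))
    return (gx * gx + gy * gy) ** 0.5

# A sharp vertical black/white boundary produces a strong horizontal gradient.
img = [[0, 0, 255, 255]] * 4
print(sobel_at(img, 1, 1))  # 1020.0
```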

22 pages, 16421 KiB  
Article
Deep Neural Network with Anomaly Detection for Single-Cycle Battery Lifetime Prediction
by Junghwan Lee, Longda Wang, Hoseok Jung, Bukyu Lim, Dael Kim, Jiaxin Liu and Jong Lim
Batteries 2025, 11(8), 288; https://doi.org/10.3390/batteries11080288 - 30 Jul 2025
Abstract
Large-scale battery datasets often contain anomalous data due to sensor noise, communication errors, and operational inconsistencies, which degrade the accuracy of data-driven prognostics. However, many existing studies overlook the impact of such anomalies or apply filtering heuristically without rigorous benchmarking, which can potentially introduce biases into training and evaluation pipelines. This study presents a deep learning framework that integrates autoencoder-based anomaly detection with a residual neural network (ResNet) to achieve state-of-the-art prediction of remaining useful life at the cycle level using only a single-cycle input. The framework systematically filters out anomalous samples using multiple variants of convolutional and sequence-to-sequence autoencoders, thereby enhancing data integrity before optimizing and training the ResNet-based models. Benchmarking against existing deep learning approaches demonstrates a significant performance improvement, with the best model achieving a mean absolute percentage error of 2.85% and a root mean square error of 40.87 cycles, surpassing prior studies. These results indicate that autoencoder-based anomaly filtering significantly enhances prediction accuracy, reinforcing the importance of systematic anomaly detection in battery prognostics. The proposed method provides a scalable and interpretable solution for intelligent battery management in electric vehicles and energy storage systems. Full article
(This article belongs to the Special Issue Machine Learning for Advanced Battery Systems)
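The filtering step above ("systematically filters out anomalous samples") amounts to discarding cycles whose autoencoder reconstruction error exceeds a threshold. The mean-plus-k-standard-deviations rule below is a common choice and an assumption here, not necessarily the paper's criterion; the error values are synthetic.

```python
# Keep only samples whose reconstruction error is within mean + k*std of the
# population; in the paper's pipeline the errors would come from an autoencoder.

def filter_anomalies(errors: list[float], k: float = 2.0) -> list[int]:
    """Return indices of samples whose error is within mean + k*std."""
    n = len(errors)
    mean = sum(errors) / n
    std = (sum((e - mean) ** 2 for e in errors) / n) ** 0.5
    return [i for i, e in enumerate(errors) if e <= mean + k * std]

# Nine well-reconstructed cycles and one corrupted cycle with a large error.
errors = [0.01, 0.02, 0.01, 0.03, 0.02, 0.01, 0.02, 0.01, 0.02, 5.0]
kept = filter_anomalies(errors)
print(kept)  # the corrupted cycle at index 9 is dropped
```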

20 pages, 4093 KiB  
Article
CNN Input Data Configuration Method for Fault Diagnosis of Three-Phase Induction Motors Based on D-Axis Current in D-Q Synchronous Reference Frame
by Yeong-Jin Goh
Appl. Sci. 2025, 15(15), 8380; https://doi.org/10.3390/app15158380 - 28 Jul 2025
Abstract
This study proposes a novel approach to input data configuration for the fault diagnosis of three-phase induction motors. Conventional convolutional neural network (CNN)-based diagnostic methods often employ three-phase current signals and apply various image transformation techniques, such as RGB mapping, wavelet transforms, and short-time Fourier transform (STFT), to construct multi-channel input data. While such approaches outperform 1D-CNNs or grayscale-based 2D-CNNs due to their rich informational content, they require multi-channel data and involve increased computational complexity. Accordingly, this study transforms the three-phase currents into the D-Q synchronous reference frame and utilizes the D-axis current (Id) for image transformation. The Id is used to generate input data using the same image processing techniques, allowing for a direct performance comparison under identical CNN architectures. Experiments were conducted under consistent conditions using both three-phase-based and Id-based methods, each applied to RGB mapping, DWT, and STFT. The classification accuracy was evaluated using a ResNet50-based CNN. Results showed that the Id-STFT achieved the highest performance, with a validation accuracy of 99.6% and a test accuracy of 99.0%. While the RGB representation of three-phase signals has traditionally been favored for its information richness and diagnostic performance, this study demonstrates that high-performance CNN-based fault diagnosis is achievable even with grayscale representations of a single current. Full article
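The D-axis current above comes from the Park transform: projecting the three phase currents onto a frame rotating with the electrical angle θ. For a balanced, healthy machine Id is (near-)constant, which is what makes it a compact single-channel CNN input; faults show up as ripple. A sketch using the amplitude-invariant (2/3-scaled) form:

```python
# Amplitude-invariant Park transform: d-axis projection of three-phase currents.
from math import cos, pi

def park_d_axis(ia: float, ib: float, ic: float, theta: float) -> float:
    """Id in the synchronous frame; the 2/3 factor preserves the phase peak amplitude."""
    return (2 / 3) * (ia * cos(theta)
                      + ib * cos(theta - 2 * pi / 3)
                      + ic * cos(theta + 2 * pi / 3))

# A balanced 10 A three-phase set sampled at a few angles gives a constant Id.
for theta in (0.0, 0.5, 1.0):
    ia = 10 * cos(theta)
    ib = 10 * cos(theta - 2 * pi / 3)
    ic = 10 * cos(theta + 2 * pi / 3)
    print(round(park_d_axis(ia, ib, ic, theta), 6))  # 10.0 each time
```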

27 pages, 11177 KiB  
Article
Robust Segmentation of Lung Proton and Hyperpolarized Gas MRI with Vision Transformers and CNNs: A Comparative Analysis of Performance Under Artificial Noise
by Ramtin Babaeipour, Matthew S. Fox, Grace Parraga and Alexei Ouriadov
Bioengineering 2025, 12(8), 808; https://doi.org/10.3390/bioengineering12080808 - 28 Jul 2025
Abstract
Accurate segmentation in medical imaging is essential for disease diagnosis and monitoring, particularly in lung imaging using proton and hyperpolarized gas MRI. However, image degradation due to noise and artifacts—especially in hyperpolarized gas MRI, where scans are acquired during breath-holds—poses challenges for conventional segmentation algorithms. This study evaluates the robustness of deep learning segmentation models under varying Gaussian noise levels, comparing traditional convolutional neural networks (CNNs) with modern Vision Transformer (ViT)-based models. Using a dataset of proton and hyperpolarized gas MRI slices from 56 participants, we trained and tested Feature Pyramid Network (FPN) and U-Net architectures with both CNN (VGG16, VGG19, ResNet152) and ViT (MiT-B0, B3, B5) backbones. Results showed that ViT-based models, particularly those using the SegFormer backbone, consistently outperformed CNN-based counterparts across all metrics and noise levels. The performance gap was especially pronounced in high-noise conditions, where transformer models retained higher Dice scores and lower boundary errors. These findings highlight the potential of ViT-based architectures for deployment in clinically realistic, low-SNR environments such as hyperpolarized gas MRI, where segmentation reliability is critical. Full article
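The Dice scores referenced above measure mask overlap: twice the intersection over the summed sizes of predicted and ground-truth masks. The two tiny flattened binary masks below are synthetic.

```python
# Dice coefficient between two flattened binary masks.

def dice(pred: list[int], truth: list[int]) -> float:
    inter = sum(p * t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    return 2 * inter / total if total else 1.0  # two empty masks agree perfectly

print(dice([1, 1, 0, 0], [1, 0, 1, 0]))  # 0.5
```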

22 pages, 1359 KiB  
Article
Fall Detection Using Federated Lightweight CNN Models: A Comparison of Decentralized vs. Centralized Learning
by Qasim Mahdi Haref, Jun Long and Zhan Yang
Appl. Sci. 2025, 15(15), 8315; https://doi.org/10.3390/app15158315 - 25 Jul 2025
Abstract
Fall detection is a critical task in healthcare monitoring systems, especially for elderly populations, for whom timely intervention can significantly reduce morbidity and mortality. This study proposes a privacy-preserving and scalable fall-detection framework that integrates federated learning (FL) with transfer learning (TL) to train deep learning models across decentralized data sources without compromising user privacy. The pipeline begins with data acquisition, in which annotated video-based fall-detection datasets formatted in YOLO are used to extract image crops of human subjects. These images are then preprocessed, resized, normalized, and relabeled into binary classes (fall vs. non-fall). A stratified 80/10/10 split ensures balanced training, validation, and testing. To simulate real-world federated environments, the training data is partitioned across multiple clients, each performing local training using pretrained CNN models including MobileNetV2, VGG16, EfficientNetB0, and ResNet50. Two FL topologies are implemented: a centralized server-coordinated scheme and a ring-based decentralized topology. During each round, only model weights are shared, and federated averaging (FedAvg) is applied for global aggregation. The models were trained using three random seeds to ensure result robustness and stability across varying data partitions. Among all configurations, decentralized MobileNetV2 achieved the best results, with a mean test accuracy of 0.9927, F1-score of 0.9917, and average training time of 111.17 s per round. These findings highlight the model’s strong generalization, low computational burden, and suitability for edge deployment. Future work will extend evaluation to external datasets and address issues such as client drift and adversarial robustness in federated environments. Full article
(This article belongs to the Section Computing and Artificial Intelligence)
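The FedAvg aggregation mentioned above replaces the global weights with a mean of the clients' locally trained weights; weighting by each client's example count is the standard formulation and an assumption here. Flat lists stand in for full model state.

```python
# FedAvg: example-count-weighted average of client weight vectors.

def fed_avg(client_weights: list[list[float]], client_sizes: list[int]) -> list[float]:
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(dim)]

# Two clients, the first holding three times as much data as the second.
global_w = fed_avg([[1.0, 2.0], [5.0, 6.0]], [300, 100])
print(global_w)  # [2.0, 3.0]
```

Only these weight vectors cross the network each round, which is the privacy argument: raw fall/non-fall images never leave the clients.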

30 pages, 5542 KiB  
Article
SVRG-AALR: Stochastic Variance-Reduced Gradient Method with Adaptive Alternating Learning Rate for Training Deep Neural Networks
by Shiyun Zou, Hua Qin, Guolin Yang and Pengfei Wang
Electronics 2025, 14(15), 2979; https://doi.org/10.3390/electronics14152979 - 25 Jul 2025
Abstract
The stochastic variance-reduced gradient (SVRG) theory is particularly well-suited for addressing gradient variance in deep neural network (DNN) training; however, its direct application to DNN training is hindered by adaptation challenges. To tackle this issue, the present paper proposes a series of strategies focused on adaptive alternating learning rates to effectively adapt SVRG for DNN training. Firstly, within the outer loop of SVRG, both the full gradient and the learning rate specific to DNN training are computed. For two distinct formulas used for calculating the learning rate, an alternating strategy is introduced that employs them alternately across iterations. This approach allows for simultaneous provision of diverse guidance information regarding parameter change rates and gradient change rates during DNN weight updates. Additionally, a threshold method is utilized to correct the learning rate into an appropriate range, thereby accelerating convergence. Secondly, in the inner loop of SVRG, DNN weights are updated using mini-batch average gradient along with the proposed learning rate. Concurrently, mini-batch average gradients from each iteration within the inner loop are refined and aggregated into a single gradient exhibiting reduced variance through an inertia strategy. This refined gradient is then relayed back to the outer loop to recalculate the new learning rate. The efficacy of the proposed algorithm has been validated on models including LeNet, VGG11, ResNet34, and DenseNet121 while being compared against several classic and advanced optimizers. Experimental results demonstrate that the proposed algorithm exhibits remarkable training robustness across DNN models with diverse characteristics. In terms of training convergence, the proposed algorithm demonstrates competitiveness with state-of-the-art algorithms, such as Lion, developed by the Google Brain team. Full article
(This article belongs to the Special Issue Advances in Machine Learning for Image Classification)
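The outer/inner-loop structure described in this abstract can be illustrated with a minimal NumPy sketch on a least-squares problem. The two alternating learning-rate formulas below are Barzilai–Borwein-style estimates (one based on parameter change, one on gradient change); the paper's exact formulas, threshold bounds, and inertia strategy are not given in the abstract, so these are stand-in assumptions, not the authors' method.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 10
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true  # noiseless targets, so the exact minimizer is w_true

def grad_i(w, i):
    # gradient of the i-th squared-error term (1/n scaling folded into averaging)
    return 2.0 * (X[i] @ w - y[i]) * X[i]

def full_grad(w):
    return 2.0 * X.T @ (X @ w - y) / n

def svrg_alternating(epochs=50, inner=n, lr0=0.002, lo=1e-4, hi=0.004):
    w = np.zeros(d)
    w_prev, g_prev, lr = w.copy(), full_grad(w), lr0
    for s in range(epochs):
        w_snap = w.copy()
        mu = full_grad(w_snap)              # outer loop: full gradient at snapshot
        dw, dg = w_snap - w_prev, mu - g_prev
        if s > 0 and abs(dw @ dg) > 1e-12:
            # alternate two step-size formulas across outer iterations:
            # one driven by parameter change, one by gradient change
            lr = (dw @ dw) / abs(dw @ dg) if s % 2 == 0 else abs(dw @ dg) / (dg @ dg)
        lr = float(np.clip(lr, lo, hi))     # threshold correction into a safe range
        w_prev, g_prev = w_snap.copy(), mu.copy()
        for _ in range(inner):              # inner loop: variance-reduced updates
            i = rng.integers(n)
            g = grad_i(w, i) - grad_i(w_snap, i) + mu
            w -= lr * g
    return w

w_hat = svrg_alternating()
print(np.linalg.norm(w_hat - w_true))  # distance to the true minimizer
```

The variance-reduced gradient `grad_i(w, i) - grad_i(w_snap, i) + mu` is unbiased and its variance shrinks as `w` approaches the snapshot, which is what lets SVRG converge linearly where plain SGD stalls.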
14 pages, 1419 KiB  
Article
GhostBlock-Augmented Lightweight Gaze Tracking via Depthwise Separable Convolution
by Jing-Ming Guo, Yu-Sung Cheng, Yi-Chong Zeng and Zong-Yan Yang
Electronics 2025, 14(15), 2978; https://doi.org/10.3390/electronics14152978 - 25 Jul 2025
Abstract
This paper proposes a lightweight gaze-tracking architecture named GhostBlock-Augmented Look to Coordinate Space (L2CS), which integrates GhostNet-based modules and depthwise separable convolution to achieve a better trade-off between model accuracy and computational efficiency. Conventional lightweight gaze-tracking models often suffer from degraded accuracy due to aggressive parameter reduction. To address this issue, we introduce GhostBlocks, a custom-designed convolutional unit that combines intrinsic feature generation with ghost feature recomposition through depthwise operations. Our method enhances the original L2CS architecture by replacing each ResNet block with GhostBlocks, thereby significantly reducing the number of parameters and floating-point operations. The experimental results on the Gaze360 dataset demonstrate that the proposed model reduces FLOPs from 16.527 × 10⁸ to 8.610 × 10⁸ and the parameter count from 2.387 × 10⁵ to 1.224 × 10⁵ while maintaining comparable gaze estimation accuracy, with MAE increasing only slightly from 10.70° to 10.87°. This work highlights the potential of GhostNet-augmented designs for real-time gaze tracking on edge devices, providing a practical solution for deployment in resource-constrained environments. Full article
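The parameter savings behind depthwise separable and GhostNet-style convolutions come from simple counting arguments, sketched below. The kernel sizes and the ghost ratio are illustrative defaults, not the paper's configuration; the abstract does not specify GhostBlock internals.

```python
def conv_params(c_in, c_out, k):
    # standard k×k convolution (bias omitted): every output channel
    # mixes every input channel
    return c_in * c_out * k * k

def depthwise_separable_params(c_in, c_out, k):
    # depthwise k×k filter per input channel, then a 1×1 pointwise projection
    return c_in * k * k + c_in * c_out

def ghost_params(c_in, c_out, primary_k=1, ratio=2, dw_k=3):
    # GhostNet-style block: a primary conv produces c_out/ratio "intrinsic"
    # maps; cheap depthwise ops generate the remaining "ghost" maps
    intrinsic = c_out // ratio
    primary = c_in * intrinsic * primary_k * primary_k
    cheap = intrinsic * (ratio - 1) * dw_k * dw_k
    return primary + cheap

std = conv_params(64, 64, 3)                 # 36,864 weights
dws = depthwise_separable_params(64, 64, 3)  # 576 + 4,096 = 4,672 weights
gho = ghost_params(64, 64)                   # 2,048 + 288 = 2,336 weights
print(std, dws, gho)
```

For a 64→64 channel 3×3 layer, the separable variant needs roughly 8× fewer weights than the standard convolution, which is the mechanism behind the FLOP and parameter reductions the abstract reports.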
18 pages, 2885 KiB  
Article
Research on Microseismic Magnitude Prediction Method Based on Improved Residual Network and Transfer Learning
by Huaixiu Wang and Haomiao Wang
Appl. Sci. 2025, 15(15), 8246; https://doi.org/10.3390/app15158246 - 24 Jul 2025
Abstract
To achieve more precise and effective microseismic magnitude estimation, a classification model based on transfer learning with an improved deep residual network is proposed for predicting microseismic magnitudes. Initially, microseismic waveform images are preprocessed through cropping and blurring before being used as model inputs. The microseismic waveform image dataset is then divided into training, testing, and validation sets. Leveraging ResNet18 weights pretrained on ImageNet, a transfer learning strategy is implemented in which all layers are retrained rather than frozen. The Convolutional Block Attention Module (CBAM) is then introduced to optimize the model, yielding a new network architecture. Finally, this model is applied to seismic magnitude classification to enable microseismic magnitude prediction, and it is validated against other commonly used neural network models. The experiments use microseismic waveform data and images of magnitudes 0–3 from the Stanford Earthquake Dataset (STEAD) as training samples. The results indicate that the model achieves an accuracy of 87% within an error range of ±0.2 and 94.7% within an error range of ±0.3. The model demonstrates enhanced stability and reliability and effectively addresses the issue of missing data labels. These results confirm that ResNet transfer learning combined with an attention mechanism yields higher accuracy in microseismic magnitude prediction and validate the effectiveness of the CBAM. Full article
(This article belongs to the Section Computing and Artificial Intelligence)
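CBAM applies channel attention followed by spatial attention to a feature map. The toy NumPy sketch below shows both stages for a single (C, H, W) tensor; the real module uses a shared two-layer MLP with reduction ratio r for the channel stage and a 7×7 convolution for the spatial stage, which is simplified here to a two-term weighted sum. All weights are random placeholders, not trained parameters.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cbam_channel_attention(x, w1, w2):
    """x: (C, H, W); w1: (C//r, C), w2: (C, C//r) shared MLP weights."""
    avg = x.mean(axis=(1, 2))   # (C,) average-pooled channel descriptor
    mx = x.max(axis=(1, 2))     # (C,) max-pooled channel descriptor
    # shared MLP applied to both descriptors, summed, squashed to (0, 1)
    att = sigmoid(w2 @ np.maximum(w1 @ avg, 0) + w2 @ np.maximum(w1 @ mx, 0))
    return x * att[:, None, None]   # reweight channels

def cbam_spatial_attention(x, k):
    """k: two scalar weights standing in for the 7×7 conv over [avg; max]."""
    avg = x.mean(axis=0)        # (H, W) channel-averaged map
    mx = x.max(axis=0)          # (H, W) channel-max map
    att = sigmoid(k[0] * avg + k[1] * mx)
    return x * att[None, :, :]  # reweight spatial positions

rng = np.random.default_rng(1)
C, H, W, r = 8, 4, 4, 2
x = rng.normal(size=(C, H, W))
w1 = rng.normal(size=(C // r, C)) * 0.1
w2 = rng.normal(size=(C, C // r)) * 0.1
y = cbam_spatial_attention(cbam_channel_attention(x, w1, w2), rng.normal(size=2))
print(y.shape)
```

Because both attention maps lie in (0, 1), the module can only rescale activations, never amplify them, which is why it slots into a residual block without destabilizing pretrained features.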