Search Results (302)

Search Parameters:
Keywords = dilated convolutional neural networks

18 pages, 1835 KB  
Article
Towards Robust Medical Image Segmentation with Hybrid CNN–Linear Mamba
by Xiao Ma and Guangming Lu
Electronics 2025, 14(23), 4726; https://doi.org/10.3390/electronics14234726 - 30 Nov 2025
Viewed by 141
Abstract
Problem: Medical image segmentation faces critical challenges in balancing global context modeling and computational efficiency. While conventional neural networks struggle with long-range dependencies, Transformers incur quadratic complexity. Although Mamba-based architectures achieve linear complexity, they lack adaptive mechanisms for heterogeneous medical images and demonstrate insufficient local feature extraction capabilities. Method: We propose Linear Context-Aware Robust Mamba (LCAR–Mamba) to address these dual limitations through adaptive resource allocation and enhanced multi-scale extraction. LCAR–Mamba integrates two synergistic modules: the Context-Aware Linear Mamba Module (CALM) for adaptive global–local fusion, and the Multi-scale Partial Dilated Convolution Module (MSPD) for efficient multi-scale feature refinement. Core Innovations: The CALM module implements content-driven resource allocation through four-stage processing: (1) analyzing spatial complexity via gradient and activation statistics, (2) computing allocation weights to dynamically balance global and local processing branches, (3) parallel dual-path processing with linear attention and convolution, and (4) adaptive fusion guided by complexity weights. The MSPD module employs statistics-based channel selection and multi-scale partial dilated convolutions to capture features at multiple receptive scales while reducing computational cost. Key Results: On the ISIC2017 and ISIC2018 datasets, mIoU improvements of 0.81%/1.44% confirm effectiveness across 2D benchmarks. On the Synapse dataset, LCAR–Mamba achieves 85.56% DSC, outperforming the previous best Mamba baseline by 0.48% with 33% fewer parameters. Significance: LCAR–Mamba demonstrates that adaptive resource allocation and statistics-driven multi-scale extraction can address critical limitations in linear-complexity architectures, establishing a promising direction for efficient medical image segmentation.
(This article belongs to the Special Issue Target Tracking and Recognition Techniques and Their Applications)

38 pages, 79090 KB  
Article
NPPCast: A Compact CNN Integrating Satellite Data for Global Ocean Net Primary Production Forecasts
by Zeming Li, Bizhi Wu, Ziqi Yin, Ruiying Chen and Shanlin Wang
Remote Sens. 2025, 17(23), 3806; https://doi.org/10.3390/rs17233806 - 24 Nov 2025
Viewed by 311
Abstract
Skillful prediction of marine net primary production (NPP) on seasonal to multi-year timescales is essential for assessing the ocean’s role in the global carbon cycle and managing marine resources. We introduce NPPCast, a compact convolutional neural network using causal dilated convolutions, and compare its performance with four representative UNet-family models (UNet, VNet, AttUNet, R2UNet). Each model is pre-trained on 36-month output from either Community Earth System Model version 2 forced-ocean–sea-ice (CESM2-FOSI) or interannual varying forcing (CESM2-GIAF) and fine-tuned using three satellite-derived NPP products (the Standard Vertically Generalized Production Model (SVGPM), the Eppley Vertically Generalized Production Model (EVGPM), and the Carbon-based Productivity Model (CbPM)) as well as their multi-product mean (MEAN). Across most tests, NPPCast outperforms the baselines, reducing global root mean square error (RMSE) by 30–56% on MEAN/EVGPM/SVGPM and improving the anomaly correlation coefficient (ACC) by 0.32–0.49 over the best UNet-based alternative. NPPCast also achieves the highest structural similarity to observations and low bias, as seen in scatter and spatial analyses, and attains the highest or tied-highest Nash–Sutcliffe efficiency (NSE) in three of four products. Crucially, NPPCast’s performance remains stable when switching between FOSI and GIAF pre-training datasets, with RMSE changing by at most 2.17%, whereas UNet-family models vary from −41.6% to +42.5%. We show that NPPCast consistently outperforms the Earth system model, sustaining significant predictive skill in contrast to the rapid decline observed in the latter. These results demonstrate that an architecture that maintains performance across different pre-training datasets (CESM2–FOSI and CESM2–GIAF) can yield more accurate and reliable long-range global NPP forecasts than UNet-family models.
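The causal dilated convolutions at the heart of NPPCast can be illustrated with a minimal NumPy sketch (an illustration of the general technique, not the authors' code): the output at time t depends only on the current and past samples, and the dilation sets the gap between filter taps.

```python
import numpy as np

def causal_dilated_conv1d(x, kernel, dilation):
    """Causal dilated 1-D convolution: y[t] mixes x[t], x[t-d], x[t-2d], ..."""
    k = len(kernel)
    pad = (k - 1) * dilation          # left-pad so no future samples leak in
    xp = np.concatenate([np.zeros(pad), x])
    return np.array([
        sum(kernel[j] * xp[t + pad - j * dilation] for j in range(k))
        for t in range(len(x))
    ])

# A difference kernel [1, -1] with dilation d computes x[t] - x[t-d],
# with zeros assumed before the start of the stream (causal padding).
x = np.array([1.0, 2.0, 4.0, 7.0, 11.0])
print(causal_dilated_conv1d(x, [1.0, -1.0], dilation=2))  # [1. 2. 3. 5. 7.]
```

Stacking such layers with exponentially increasing dilations lets a compact network cover a long input history without pooling.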

18 pages, 2013 KB  
Article
Deep Learning-Based Human Activity Recognition Using Dilated CNN and LSTM on Video Sequences of Various Actions Dataset
by Bakht Alam Khan and Jin-Woo Jung
Appl. Sci. 2025, 15(22), 12173; https://doi.org/10.3390/app152212173 - 17 Nov 2025
Viewed by 367
Abstract
Human Activity Recognition (HAR) plays a critical role across various fields, including surveillance, healthcare, and robotics, by enabling systems to interpret and respond to human behaviors. In this research, we present an innovative method for HAR that leverages the strengths of Dilated Convolutional Neural Networks (CNNs) integrated with Long Short-Term Memory (LSTM) networks. The proposed architecture achieves an impressive accuracy of 94.9%, surpassing the conventional CNN-LSTM approach, which achieves 93.7% accuracy on the challenging UCF 50 dataset. The use of dilated CNNs significantly enhances the model’s ability to capture extensive spatial–temporal features by expanding the receptive field, thus enabling the recognition of intricate human activities. This approach effectively preserves fine-grained details without increasing computational costs. The inclusion of LSTM layers further strengthens the model’s performance by capturing temporal dependencies, allowing for a deeper understanding of action sequences over time. To validate the robustness of our model, we assessed its generalization capabilities on an unseen YouTube video, demonstrating its adaptability to real-world applications. The superior performance and flexibility of our approach suggest its potential to advance HAR applications in areas like surveillance, human–computer interaction, and healthcare monitoring.
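The receptive-field expansion that dilation buys "for free" is easy to quantify. The sketch below (a generic illustration, not tied to this paper's architecture) computes the receptive field of a stack of stride-1 dilated convolution layers: each layer adds (k − 1) · d input positions.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field of a stack of stride-1 dilated conv layers."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)

# Four 3-tap layers with dilations 1, 2, 4, 8 see 31 input steps,
# versus only 9 for the same stack without dilation -- same parameter count.
print(receptive_field(3, [1, 2, 4, 8]))   # 31
print(receptive_field(3, [1, 1, 1, 1]))   # 9
```

This is why doubling dilations per layer gives receptive-field growth that is exponential in depth at constant cost per layer.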

23 pages, 6381 KB  
Article
Temporal Convolutional and LSTM Networks for Complex Mechanical Drilling Speed Prediction
by Yang Huang, Wu Yang, Junrui Hu and Yihang Zhao
Symmetry 2025, 17(11), 1962; https://doi.org/10.3390/sym17111962 - 14 Nov 2025
Viewed by 316
Abstract
Accurate prediction of drilling speed is essential in mechanical drilling operations, as it improves operational efficiency, enhances safety, and reduces overall costs. Traditional prediction methods, however, are often constrained by delayed responsiveness, limited exploitation of real-time parameters, and inadequate capability to model complex temporal dependencies, ultimately resulting in suboptimal performance. To overcome these limitations, this study introduces a novel model termed CTLSF (CNN-TCN-LSTM with Self-Attention), which integrates multiple neural network architectures within a symmetry-aware framework. The model achieves architectural symmetry through the coordinated interplay of spatial and temporal learning modules, each contributing complementary strengths to the prediction task. Specifically, Convolutional Neural Networks (CNNs) extract localized spatial features from sequential drilling data, while Temporal Convolutional Networks (TCNs) capture long-range temporal dependencies through dilated convolutions and residual connections. In parallel, Long Short-Term Memory (LSTM) networks model unidirectional temporal dynamics, and a self-attention mechanism adaptively highlights salient temporal patterns. Furthermore, a sliding window strategy is employed to enable real-time prediction on streaming data. Comprehensive experiments conducted on the Volve oilfield dataset demonstrate that the proposed CTLSF model substantially outperforms conventional data-driven approaches, achieving a low Mean Absolute Error (MAE) of 0.8439, a Mean Absolute Percentage Error (MAPE) of 2.19%, and a high coefficient of determination (R²) of 0.9831. These results highlight the effectiveness, robustness, and symmetry-aware design of the CTLSF model in predicting mechanical drilling speed under complex real-world conditions.
(This article belongs to the Special Issue Symmetry and Asymmetry Study in Graph Theory)
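The sliding-window strategy mentioned in the abstract is a standard preprocessing step for streaming prediction; a minimal sketch (window width and step are illustrative values, not the paper's settings) looks like this:

```python
import numpy as np

def sliding_windows(stream, width, step=1):
    """Split a 1-D stream into overlapping fixed-width windows,
    each of which becomes one model input for real-time prediction."""
    return np.array([stream[i:i + width]
                     for i in range(0, len(stream) - width + 1, step)])

stream = np.arange(10.0)            # stand-in for a drilling-parameter series
w = sliding_windows(stream, width=4, step=2)
print(w.shape)                      # (4, 4): windows starting at t = 0, 2, 4, 6
```

As new samples arrive, only the newest window needs to be formed and scored, which is what makes the scheme suitable for online use.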

19 pages, 2680 KB  
Article
ESSTformer: A CNN-Transformer Hybrid with Decoupled Spatial Spectral Transformers for Hyperspectral Image Super-Resolution
by Hehuan Li, Chen Yi, Jiming Liu, Zhen Zhang and Yu Dong
Appl. Sci. 2025, 15(21), 11738; https://doi.org/10.3390/app152111738 - 4 Nov 2025
Viewed by 455
Abstract
Hyperspectral images (HSIs) are crucial for ground object classification, target detection, and related applications due to their rich spatial spectral information. However, hardware limitations in imaging systems make it challenging to directly acquire HSIs with a high spatial resolution. While deep learning-based single hyperspectral image super-resolution (SHSR) methods have made significant progress, existing approaches primarily rely on convolutional neural networks (CNNs) with fixed geometric kernels, which struggle to model global spatial spectral dependencies effectively. To address this, we propose ESSTformer, a novel SHSR framework that synergistically integrates CNNs’ local feature extraction and Transformers’ global modeling capabilities. Specifically, we design a multi-scale spectral attention module (MSAM) based on dilated convolutions to capture local multi-scale spatial spectral features. Considering the inherent differences between spatial and spectral information, we adopt a decoupled processing strategy by constructing separate spatial and Spectral Transformers. The Spatial Transformer employs window attention mechanisms and an improved convolutional multi-layer perceptron (CMLP) to model long-range spatial dependencies, while the Spectral Transformer utilizes self-attention mechanisms combined with a spectral enhancement module to focus on discriminative spectral features. Extensive experiments on three hyperspectral datasets demonstrate that the proposed ESSTformer achieves a superior performance in super-resolution reconstruction compared to state-of-the-art methods.
(This article belongs to the Special Issue Advances in Optical Imaging and Deep Learning)

21 pages, 3806 KB  
Article
An Improved YOLO-Based Algorithm for Aquaculture Object Detection
by Yunfan Fu, Wei Shi, Danwei Chen, Jianping Zhu and Chunfeng Lv
Appl. Sci. 2025, 15(21), 11724; https://doi.org/10.3390/app152111724 - 3 Nov 2025
Viewed by 669
Abstract
Object detection technology plays a vital role in monitoring the growth status of aquaculture organisms and serves as a key enabler for the automated robotic capture of target species. Existing models for underwater biological detection often suffer from low accuracy and high model complexity. To address these limitations, we propose AOD-YOLO—an enhanced model based on YOLOv11s. The improvements are fourfold: First, the SPFE (Sobel and Pooling Feature Enhancement) module incorporates Sobel operators and pooling operations to effectively extract target edge information and global structural features, thereby strengthening feature representation. Second, the RGL (RepConv and Ghost Lightweight) module reduces redundancy in intermediate feature mappings of the convolutional neural network, decreasing parameter size and computational cost while further enhancing feature extraction capability through RepConv. Third, the MDCS (Multiple Dilated Convolution Sharing) module replaces the SPPF structure by integrating parameter-shared dilated convolutions, improving multi-scale target recognition. Finally, we upgrade the C2PSA module to C2PSA-M (Cascade Pyramid Spatial Attention—Mona) by integrating the Mona mechanism. This upgraded module introduces multi-cognitive filters to enhance visual signal processing and employs a distribution adaptation layer to optimize input information distribution. Experiments conducted on the URPC2020 and RUOD datasets demonstrate that AOD-YOLO achieves an accuracy of 86.6% on URPC2020, representing a 2.6% improvement over YOLOv11s, and 88.1% on RUOD, a 2.4% increase. Moreover, the model maintains relatively low complexity with only 8.73 M parameters and 21.4 GFLOPs computational cost. Experimental results show that our model achieves high accuracy for aquaculture targets while maintaining low complexity. This demonstrates its strong potential for reliable use in intelligent aquaculture monitoring systems.
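The Sobel edge extraction used by the SPFE module is a classical fixed-kernel filter; a plain-NumPy sketch of the horizontal-gradient response (a generic illustration, not the module's actual implementation) shows how it highlights vertical edges:

```python
import numpy as np

# Standard 3x3 Sobel kernel for the horizontal gradient (responds to vertical edges).
SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)

def sobel_x(img):
    """Horizontal-gradient Sobel response over the 'valid' region of a 2-D image."""
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            out[i, j] = np.sum(img[i:i + 3, j:j + 3] * SOBEL_X)
    return out

# A vertical step edge produces a strong response in the columns straddling it.
img = np.zeros((5, 6))
img[:, 3:] = 1.0
print(sobel_x(img))
```

In a detection backbone the same response would typically be produced by a frozen convolution layer whose weights are set to the Sobel kernels, so it composes with learned features.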

23 pages, 7286 KB  
Article
Multi-Level Supervised Network with Attention Mechanism for Lung Segmentation
by Yahao Wen and Yongjie Wang
Electronics 2025, 14(21), 4249; https://doi.org/10.3390/electronics14214249 - 30 Oct 2025
Viewed by 361
Abstract
Accurate segmentation of lung contours from computed tomography (CT) scans is essential for developing reliable computer-aided diagnostic systems. Although deep learning models, especially convolutional neural networks, have advanced the automation of pulmonary region extraction, their performance is often limited by low contrast and atypical anatomical appearances in CT images. This paper presents MSDC-AM U-Net, a hierarchically supervised segmentation framework built upon the U-Net architecture, integrated with a newly designed Multi-Scale Dilated Convolution (MSDC) module and an Attention Module (AM). The MSDC component employs dilated convolutions with varying receptive fields to improve edge detection and counteract contrast-related ambiguities. Furthermore, spatial attention mechanisms applied across different dimensions guide the model to focus more effectively on lung areas, thereby increasing localization precision. Extensive evaluations on multiple public lung imaging datasets (Luna16, Montgomery County, JSRT) confirm the superiority of the proposed approach. Our MSDC-AM U-Net achieved leading performance, notably attaining a Dice Coefficient of 0.974 on the Luna16 CT dataset and 0.981 on the JSRT X-ray dataset, thereby exceeding current leading methods in both qualitative and quantitative assessments.

24 pages, 1741 KB  
Article
Remaining Useful Life Estimation of Lithium-Ion Batteries Using Alpha Evolutionary Algorithm-Optimized Deep Learning
by Fei Li, Danfeng Yang, Jinghan Li, Shuzhen Wang, Chao Wu, Mingwei Li, Chuanfeng Li, Pengcheng Han and Huafei Qian
Batteries 2025, 11(10), 385; https://doi.org/10.3390/batteries11100385 - 20 Oct 2025
Viewed by 1781
Abstract
The precise prediction of the remaining useful life (RUL) of lithium-ion batteries is of great significance for improving energy management efficiency and extending battery lifespan, and it is widely applied in the fields of new energy and electric vehicles. However, accurate RUL prediction still faces significant challenges. Although various methods based on deep learning have been proposed, the performance of their neural networks is strongly correlated with the hyperparameters. To overcome this limitation, this study proposes an innovative approach that combines the Alpha evolutionary (AE) algorithm with a deep learning model. Specifically, this hybrid deep learning architecture consists of a convolutional neural network (CNN), a temporal convolutional network (TCN), bidirectional long short-term memory (BiLSTM) and a multi-scale attention mechanism, which extract the spatial features, long-term temporal dependencies, and key degradation information of battery data, respectively. To optimize the model performance, the AE algorithm is introduced to automatically optimize the hyperparameters of the hybrid model, including the number and size of convolutional kernels in the CNN, the dilation rate in the TCN, the number of units in the BiLSTM, and the parameters of the fusion layer in the attention mechanism. Experimental results demonstrate that our method significantly enhances prediction accuracy and model robustness compared to conventional deep learning techniques. This approach not only improves the accuracy and robustness of battery RUL prediction but also provides new ideas for solving the parameter tuning problem of neural networks.

20 pages, 20080 KB  
Article
Symmetric Combined Convolution with Convolutional Long Short-Term Memory for Monaural Speech Enhancement
by Yang Xian, Yujin Fu, Peixu Xing, Hongwei Tao and Yang Sun
Symmetry 2025, 17(10), 1768; https://doi.org/10.3390/sym17101768 - 20 Oct 2025
Viewed by 390
Abstract
Deep neural network-based approaches have obtained remarkable progress in monaural speech enhancement. Nevertheless, current cutting-edge approaches remain vulnerable to complex acoustic scenarios. We propose a Symmetric Combined Convolution Network with ConvLSTM (SCCN) for monaural speech enhancement. Specifically, the Combined Convolution Block utilizes parallel convolution branches, including standard convolution and two different depthwise separable convolutions, to reinforce feature extraction along the depthwise and channelwise dimensions. Similarly, Combined Deconvolution Blocks are stacked to construct the convolutional decoder. Moreover, we introduce the exponentially increasing dilation between convolutional kernel elements in the encoder and decoder, which expands receptive fields. Meanwhile, the grouped ConvLSTM layers are exploited to extract the interdependency of spatial and temporal information. The experimental results demonstrate that the proposed SCCN method obtains on average 86.00% in STOI and 2.43 in PESQ, which outperforms the state-of-the-art baseline methods, confirming the effectiveness in enhancing speech quality.
(This article belongs to the Section Computer)
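The efficiency argument for depthwise separable convolutions (used in the Combined Convolution Block above) comes down to a parameter count: a depthwise k×k filter per channel plus a 1×1 pointwise mixing layer replaces one dense k×k convolution. A quick sketch with illustrative channel sizes (not the paper's configuration):

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard 2-D convolution (biases ignored)."""
    return c_in * c_out * k * k

def separable_params(c_in, c_out, k):
    """Depthwise k x k conv (one filter per input channel)
    followed by a 1x1 pointwise conv that mixes channels."""
    return c_in * k * k + c_in * c_out

# 64 -> 64 channels with 3x3 kernels: roughly 8x fewer weights.
print(conv_params(64, 64, 3))       # 36864
print(separable_params(64, 64, 3))  # 4672
```

The ratio approaches 1/c_out + 1/k² for large channel counts, which is why separable branches can be added in parallel without blowing up the model size.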

20 pages, 4914 KB  
Article
Dual-Channel Parallel Multimodal Feature Fusion for Bearing Fault Diagnosis
by Wanrong Li, Haichao Cai, Xiaokang Yang, Yujun Xue, Jun Ye and Xiangyi Hu
Machines 2025, 13(10), 950; https://doi.org/10.3390/machines13100950 - 15 Oct 2025
Viewed by 658
Abstract
In recent years, the powerful feature extraction capabilities of deep learning have attracted widespread attention in the field of bearing fault diagnosis. To address the limitations of single-modal and single-channel feature extraction methods, which often result in incomplete information representation and difficulty in obtaining high-quality fault features, this paper proposes a dual-channel parallel multimodal feature fusion model for bearing fault diagnosis. In this method, the one-dimensional vibration signals are first transformed into two-dimensional time-frequency representations using continuous wavelet transform (CWT). Subsequently, both the one-dimensional vibration signals and the two-dimensional time-frequency representations are fed simultaneously into the dual-branch parallel model. Within this architecture, the first branch employs a combination of a one-dimensional convolutional neural network (1DCNN) and a bidirectional gated recurrent unit (BiGRU) to extract temporal features from the one-dimensional vibration signals. The second branch utilizes a dilated convolutional network to capture spatial time–frequency information from the CWT-derived two-dimensional time–frequency representations. The features extracted by both branches are then input into the feature fusion layer. Furthermore, to leverage fault features more comprehensively, a channel attention mechanism is embedded after the feature fusion layer. This enables the network to focus more effectively on salient features across channels while suppressing interference from redundant features, thereby enhancing the performance and accuracy of the dual-branch network. Finally, the fused fault features are passed to a softmax classifier for fault classification. Experimental results demonstrate that the proposed method achieved an average accuracy of 99.50% on the Case Western Reserve University (CWRU) bearing dataset and 97.33% on the Southeast University (SEU) bearing dataset. These results confirm that the suggested model effectively improves fault diagnosis accuracy and exhibits strong generalization capability.
(This article belongs to the Section Machines Testing and Maintenance)

18 pages, 9355 KB  
Article
Two-Dimensional Image Lempel–Ziv Complexity Calculation Method and Its Application in Defect Detection
by Jiancheng Yin, Wentao Sui, Xuye Zhuang, Yunlong Sheng and Yongbo Li
Entropy 2025, 27(10), 1014; https://doi.org/10.3390/e27101014 - 27 Sep 2025
Viewed by 526
Abstract
Although Lempel–Ziv complexity (LZC) can reflect changes in object characteristics by measuring changes in independent patterns in the signal, it can only be applied to one-dimensional time series and cannot be directly applied to two-dimensional images. To address this issue, this paper proposes a two-dimensional Lempel–Ziv complexity by combining the concept of the local receptive field in convolutional neural networks. This extends the application scenario of LZC from one-dimensional time series to two-dimensional images, further broadening the scope of application of LZC. First, the pixels and size of the image were normalized. Then, the image was encoded according to the sorting of normalized values within the 4 × 4 region. Next, the encoding result of the image was rearranged into a vector by row. Finally, the Lempel–Ziv complexity of the image could be obtained based on the rearranged vector. The proposed method was further used for defect detection in conjunction with the dilation operator and Sobel operator, and validated by two practical cases. The results showed that the proposed method can effectively identify independent pattern changes in images and can be used for defect detection. The accuracy rate of defect detection can reach 100%.
(This article belongs to the Special Issue Complexity and Synchronization in Time Series)
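The final step above reduces to computing LZC on a symbol vector. A minimal sketch of an LZ78-style phrase count — a common simplified variant, not necessarily the exact LZ76 parsing used in the paper — counts how many new patterns appear while scanning the sequence left to right:

```python
def lempel_ziv_complexity(sequence):
    """Count the new phrases in an LZ78-style left-to-right parsing:
    grow the current phrase until it has not been seen before, then reset."""
    phrases = set()
    phrase = ""
    c = 0
    for symbol in sequence:
        phrase += symbol
        if phrase not in phrases:   # a new independent pattern
            phrases.add(phrase)
            c += 1
            phrase = ""
    return c

# A structured (constant) sequence yields fewer phrases than an irregular one.
print(lempel_ziv_complexity("0001101001000101"))  # 7
print(lempel_ziv_complexity("0" * 16))            # 5
```

In the paper's 2D extension, the image is first encoded block-by-block (4 × 4 rank codes), flattened row-wise into such a symbol vector, and then scored exactly like this.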

22 pages, 1250 KB  
Article
Entity Span Suffix Classification for Nested Chinese Named Entity Recognition
by Jianfeng Deng, Ruitong Zhao, Wei Ye and Suhong Zheng
Information 2025, 16(10), 822; https://doi.org/10.3390/info16100822 - 23 Sep 2025
Viewed by 467
Abstract
Named entity recognition (NER) is one of the fundamental tasks in building knowledge graphs. For some domain-specific corpora, the text descriptions exhibit limited standardization, and some entity structures have entity nesting. The existing entity recognition methods have problems such as word matching noise interference and difficulty in distinguishing different entity labels for the same character in sequence label prediction. This paper proposes a span-based feature reuse stacked bidirectional long short-term memory network (BiLSTM) nested named entity recognition (SFRSN) model, which transforms the entity recognition of sequence prediction into the problem of entity span suffix category classification. Firstly, character feature embedding is generated through bidirectional encoder representations from transformers (BERT). Secondly, a feature reuse stacked BiLSTM is proposed to obtain deep context features while alleviating the problem of deep network degradation. Thirdly, the span feature is obtained through the dilated convolution neural network (DCNN), and at the same time, a single-tail selection function is introduced to obtain the classification feature of the entity span suffix, with the aim of reducing the training parameters. Fourthly, a global feature gated attention mechanism is proposed, integrating span features and span suffix classification features to achieve span suffix classification. The experimental results on four Chinese-specific domain datasets demonstrate the effectiveness of our approach: SFRSN achieves micro-F1 scores of 83.34% on OntoNotes, 73.27% on Weibo, 96.90% on Resume, and 86.77% on the supply chain management dataset. This represents a maximum improvement of 1.55%, 4.94%, 2.48%, and 3.47% over state-of-the-art baselines, respectively. The experimental results demonstrate the effectiveness of the model in addressing nested entities and entity label ambiguity issues.
(This article belongs to the Section Artificial Intelligence)

36 pages, 8122 KB  
Article
Human Activity Recognition via Attention-Augmented TCN-BiGRU Fusion
by Ji-Long He, Jian-Hong Wang, Chih-Min Lo and Zhaodi Jiang
Sensors 2025, 25(18), 5765; https://doi.org/10.3390/s25185765 - 16 Sep 2025
Cited by 1 | Viewed by 1576
Abstract
With the widespread application of wearable sensors in health monitoring and human–computer interaction, deep learning-based human activity recognition (HAR) research faces challenges such as the effective extraction of multi-scale temporal features and the enhancement of robustness against noise in multi-source data. This study proposes the TGA-HAR (TCN-GRU-Attention-HAR) model. The TGA-HAR model integrates Temporal Convolutional Neural Networks and Recurrent Neural Networks by constructing a hierarchical feature abstraction architecture through cascading Temporal Convolutional Network (TCN) and Bidirectional Gated Recurrent Unit (BiGRU) layers for complex activity recognition. The model uses TCN layers with dilated convolution kernels to extract multi-order temporal features and BiGRU layers to capture bidirectional temporal contextual correlations. To further optimize feature representation, the TGA-HAR model introduces residual connections to enhance the stability of gradient propagation and employs an adaptive weighted attention mechanism to strengthen feature representation. The experimental results demonstrate that the model achieved test accuracies of 99.37% on the WISDM dataset, 95.36% on the USC-HAD dataset, and 96.96% on the PAMAP2 dataset. Furthermore, we conducted tests on datasets collected in real-world scenarios. This method provides a highly robust solution for complex human activity recognition tasks.
(This article belongs to the Section Biomedical Sensors)

16 pages, 4653 KB  
Article
Automated Detection and Segmentation of Ascending Aorta Dilation on a Non-ECG-Gated Chest CT Using Deep Learning
by Fargana Aghayeva, Yusuf Abdi, Ahmad Uzair and Ayaz Aghayev
Diagnostics 2025, 15(18), 2336; https://doi.org/10.3390/diagnostics15182336 - 15 Sep 2025
Viewed by 730
Abstract
Background/Objectives: Ascending aortic (AA) dilation (diameter ≥ 4.0 cm) is a significant risk factor for aortic dissection, yet it often goes unnoticed in routine chest CT scans performed for other indications. This study aimed to develop and evaluate a deep learning pipeline for automated AA segmentation using non-ECG-gated chest CT scans. Methods: We designed a two-stage pipeline integrating a convolutional neural network (CNN) for focus-slice classification and a U-Net-based segmentation model to extract the aortic region. The model was trained and validated on a dataset of 500 non-ECG-gated chest CT scans, encompassing over 50,000 individual slices. Results: On the held-out test set (10%), the model achieved a Dice similarity coefficient (DSC) score of 99.21%, an Intersection over Union (IoU) of 98.45%, and a focus-slice classification accuracy of 98.18%. Compared with traditional rule-based and prior CNN-based methods, the proposed approach achieved markedly higher overlap metrics while maintaining low computational overhead. Conclusions: A lightweight CNN+U-Net deep learning model can enhance diagnostic accuracy, reduce radiologist workload, and enable opportunistic detection of AA dilation in routine chest CT imaging.
(This article belongs to the Special Issue Artificial Intelligence Approaches for Medical Diagnostics in the USA)
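The Dice similarity coefficient reported above is the standard overlap metric for binary segmentation masks, 2|A∩B| / (|A| + |B|). A minimal NumPy sketch (generic, with toy masks rather than the study's data):

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-7):
    """Dice similarity 2|A ∩ B| / (|A| + |B|) between two binary masks;
    eps avoids division by zero when both masks are empty."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

# Two 2x3 toy masks agreeing on 2 of their 3 positive pixels each.
pred = np.array([[1, 1, 0], [0, 1, 0]])
target = np.array([[1, 0, 0], [0, 1, 1]])
print(round(dice_coefficient(pred, target), 3))  # 2*2/(3+3) = 0.667
```

IoU, the other metric quoted, relates to Dice monotonically (IoU = Dice / (2 − Dice)), so both rank segmentations the same way.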

16 pages, 1585 KB  
Proceeding Paper
Design of Pentagon-Shaped THz Photonic Crystal Fiber Biosensor for Early Detection of Crop Pathogens Using Decision Cascaded 3D Return Dilated Secretary-Bird Aligned Convolutional Transformer Network
by Sreemathy Jayaprakash, Prasath Nithiyanandam and Rajesh Kumar Dhanaraj
Eng. Proc. 2025, 106(1), 9; https://doi.org/10.3390/engproc2025106009 - 12 Sep 2025
Viewed by 347
Abstract
Crop pathogens threaten global agriculture by causing severe yield and economic losses. Conventional detection methods are often slow and inaccurate, limiting timely intervention. This study introduces a pentagon-shaped terahertz photonic crystal fiber (THz PCF) biosensor, optimized with the decision cascaded 3D return dilated secretary-bird aligned convolutional transformer network (DC3D-SBA-CTN). The biosensor is designed to detect a broad spectrum of pathogens, including fungi (e.g., Fusarium spp.) and bacteria (e.g., Xanthomonas spp.), by identifying their unique refractive index signatures. Integrating advanced neural networks and optimization algorithms, the biosensor achieves a detection accuracy of 99.87%, precision of 99.65%, sensitivity of 99.77%, and specificity of 99.83%, as validated by a 5-fold cross-validation protocol. It offers high sensitivity (up to 7340 RIU⁻¹), low signal loss, and robust performance against morphological variations, making it adaptable for diverse agricultural settings. This innovation enables rapid, precise monitoring of crop pathogens, revolutionizing plant disease management.
(This article belongs to the Proceedings of The 5th International Electronic Conference on Biosensors)
