MDPI - Publisher of Open Access Journals

15 pages, 4874 KiB

Open Access Article

A Novel 3D Convolutional Neural Network-Based Deep Learning Model for Spatiotemporal Feature Mapping for Video Analysis: Feasibility Study for Gastrointestinal Endoscopic Video Classification

by Mrinal Kanti Dhar, Mou Deb, Poonguzhali Elangovan, Keerthy Gopalakrishnan, Divyanshi Sood, Avneet Kaur, Charmy Parikh, Swetha Rapolu, Gianeshwaree Alias Rachna Panjwani, Rabiah Aslam Ansari, Naghmeh Asadimanesh, Shiva Sankari Karuppiah, Scott A. Helgeson, Venkata S. Akshintala and Shivaram P. Arunachalam

J. Imaging 2025, 11(7), 243; https://doi.org/10.3390/jimaging11070243 - 18 Jul 2025

Viewed by 450

Abstract

Accurate analysis of medical videos remains a major challenge in deep learning (DL) due to the need for effective spatiotemporal feature mapping that captures both spatial detail and temporal dynamics. Despite advances in DL, most existing models in medical AI focus on static images, overlooking critical temporal cues present in video data. To bridge this gap, a novel DL-based framework is proposed for spatiotemporal feature extraction from medical video sequences. As a feasibility use case, this study focuses on gastrointestinal (GI) endoscopic video classification. A 3D convolutional neural network (CNN) is developed to classify upper and lower GI endoscopic videos using the hyperKvasir dataset, which contains 314 lower and 60 upper GI videos. To address data imbalance, 60 matched pairs of videos are randomly selected across 20 experimental runs. Videos are resized to 224 × 224, and the 3D CNN captures spatiotemporal information. A 3D version of the parallel spatial and channel squeeze-and-excitation (P-scSE) block is implemented, and a new block called the residual with parallel attention (RPA) block is proposed by combining P-scSE3D with a residual block. To reduce computational complexity, a (2 + 1)D convolution is used in place of full 3D convolution. The model achieves an average accuracy of 0.933, precision of 0.932, recall of 0.944, F1-score of 0.935, and AUC of 0.933. Integrating P-scSE3D also increased the F1-score by 7%. This preliminary work opens avenues for exploring various GI endoscopic video-based prospective studies.
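To illustrate why a (2 + 1)D factorization can be cheaper than a full 3D convolution, the following sketch compares weight counts for one layer. The channel widths, kernel sizes, and the intermediate width `c_mid` are hypothetical values chosen for illustration; the abstract does not state the settings used in the paper.

```python
def conv3d_params(c_in, c_out, t, d):
    """Weights in a full 3D convolution with a t x d x d kernel (bias ignored)."""
    return c_in * c_out * t * d * d


def conv2plus1d_params(c_in, c_out, t, d, c_mid):
    """Weights in a (2+1)D factorization: a 1 x d x d spatial convolution into
    c_mid channels, followed by a t x 1 x 1 temporal convolution."""
    spatial = c_in * c_mid * d * d
    temporal = c_mid * c_out * t
    return spatial + temporal


# Hypothetical layer: 64 -> 64 channels, 3 x 3 x 3 kernel, c_mid = 64.
full = conv3d_params(c_in=64, c_out=64, t=3, d=3)
factored = conv2plus1d_params(c_in=64, c_out=64, t=3, d=3, c_mid=64)
print(full, factored)
```

With these assumed sizes the factored layer needs fewer than half the weights of the full 3D layer; the actual saving depends on how `c_mid` is chosen.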

(This article belongs to the Special Issue Clinical and Pathological Imaging in the Era of Artificial Intelligence: New Insights and Perspectives—2nd Edition)

18 pages, 70320 KiB

Open Access Article

RIS-UNet: A Multi-Level Hierarchical Framework for Liver Tumor Segmentation in CT Images

by Yuchai Wan, Lili Zhang and Murong Wang

Entropy 2025, 27(7), 735; https://doi.org/10.3390/e27070735 - 9 Jul 2025

Viewed by 424

Abstract

The deep learning-based analysis of liver CT images is expected to provide assistance for clinicians in the diagnostic decision-making process. However, the accuracy of existing methods still falls short of clinical requirements and needs to be further improved. Therefore, in this work, we propose a novel multi-level hierarchical framework for liver tumor segmentation. In the first level, we integrate inter-slice spatial information by a 2.5D network to resolve the accuracy–efficiency trade-off inherent in conventional 2D/3D segmentation strategies for liver tumor segmentation. Then, the second level extracts the intra-slice global and local features to enhance feature representation. We propose the Res-Inception-SE Block, which combines residual connections, multi-scale Inception modules, and squeeze-excitation attention to capture comprehensive global and local features. Furthermore, we design a hybrid loss function combining Binary Cross Entropy (BCE) and Dice loss to address the class imbalance problem and accelerate convergence. Extensive experiments on the LiTS17 dataset demonstrate the effectiveness of our method in terms of accuracy, efficiency, and visual results for liver tumor segmentation.
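A hybrid BCE and Dice loss of the kind described above can be sketched as follows. This is a minimal illustration on flat lists of per-pixel probabilities; the weighting factor `alpha` and the smoothing constant are assumptions, since the abstract does not say how the two terms are combined.

```python
import math


def bce_loss(probs, targets, eps=1e-7):
    """Mean binary cross-entropy over per-pixel foreground probabilities."""
    total = 0.0
    for p, y in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probs)


def dice_loss(probs, targets, smooth=1.0):
    """Soft Dice loss: 1 minus the overlap between prediction and mask."""
    inter = sum(p * y for p, y in zip(probs, targets))
    denom = sum(probs) + sum(targets)
    return 1.0 - (2.0 * inter + smooth) / (denom + smooth)


def hybrid_loss(probs, targets, alpha=0.5):
    """Weighted sum of BCE and Dice; alpha is a hypothetical mixing weight."""
    return alpha * bce_loss(probs, targets) + (1.0 - alpha) * dice_loss(probs, targets)


mask = [1, 0, 1, 0]
print(hybrid_loss([0.9, 0.1, 0.8, 0.2], mask))  # close prediction, small loss
print(hybrid_loss([0.1, 0.9, 0.2, 0.8], mask))  # inverted prediction, large loss
```

The Dice term depends on overlap rather than on the pixel count, which is why it is commonly paired with BCE when the foreground class is small.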

(This article belongs to the Special Issue Cutting-Edge AI in Computational Bioinformatics)

21 pages, 3406 KiB

Open Access Article

ResNet-SE-CBAM Siamese Networks for Few-Shot and Imbalanced PCB Defect Classification

by Chao-Hsiang Hsiao, Huan-Che Su, Yin-Tien Wang, Min-Jie Hsu and Chen-Chien Hsu

Sensors 2025, 25(13), 4233; https://doi.org/10.3390/s25134233 - 7 Jul 2025

Viewed by 573

Abstract

Defect detection in mass production lines often involves small and imbalanced datasets, necessitating the use of few-shot learning methods. Traditional deep learning-based approaches typically rely on large datasets, limiting their applicability in real-world scenarios. This study explores few-shot learning models for detecting product defects using limited data, enhancing model generalization and stability. Unlike previous deep learning models that require extensive datasets, our approach effectively performs defect detection with minimal data. We propose a Siamese network that integrates Residual blocks, Squeeze and Excitation blocks, and Convolutional Block Attention Modules (ResNet-SE-CBAM Siamese network) for feature extraction, optimized through triplet loss for embedding learning. The ResNet-SE-CBAM Siamese network incorporates two primary features: attention mechanisms and metric learning. The recently developed attention mechanisms enhance the convolutional neural network operations and significantly improve feature extraction performance. Meanwhile, metric learning allows for the addition or removal of feature classes without the need to retrain the model, improving its applicability in industrial production lines with limited defect samples. To further improve training efficiency with imbalanced datasets, we introduce a sample selection method based on the Structural Similarity Index Measure (SSIM). Additionally, a high defect rate training strategy is utilized to reduce the False Negative Rate (FNR) and ensure no missed defect detections. At the classification stage, a K-Nearest Neighbor (KNN) classifier is employed to mitigate overfitting risks and enhance stability in few-shot conditions. The experimental results demonstrate that with a good-to-defect ratio of 20:40, the proposed system achieves a classification accuracy of 94% and an FNR of 2%. Furthermore, when the number of defective samples increases to 80, the system achieves zero false negatives (FNR = 0%). The proposed metric learning approach outperforms traditional deep learning models, such as parametric-based YOLO series models in defect detection, achieving higher accuracy and lower miss rates, highlighting its potential for high-reliability industrial deployment.
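The two metric-learning ingredients named above, a triplet loss for training embeddings and a KNN classifier over those embeddings, can be sketched in a few lines. The 2D embeddings, the margin, and the labels below are made up for illustration and are not taken from the paper.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Zero once the negative is at least `margin` farther than the positive."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def knn_predict(query, gallery, k=3):
    """Majority label among the k gallery embeddings closest to the query."""
    nearest = sorted(gallery, key=lambda item: euclidean(query, item[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Hypothetical gallery of (embedding, label) pairs.
gallery = [
    ([0.0, 0.0], "good"), ([0.0, 1.0], "good"),
    ([5.0, 5.0], "defect"), ([5.0, 6.0], "defect"), ([6.0, 5.0], "defect"),
]
print(triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0]))
print(knn_predict([5.2, 5.1], gallery))
```

Because classification is a nearest-neighbor lookup, a new defect class can be introduced by appending labeled embeddings to `gallery`, which is one way to read the abstract's claim that classes can be added or removed without retraining.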

(This article belongs to the Special Issue Artificial Intelligence and Sensor-Enhanced Fault Diagnosis for Industrial Application)

20 pages, 1935 KiB

Open Access Article

Residual Attention Network with Atrous Spatial Pyramid Pooling for Soil Element Estimation in LUCAS Hyperspectral Data

by Yun Deng, Yuchen Cao, Shouxue Chen and Xiaohui Cheng

Appl. Sci. 2025, 15(13), 7457; https://doi.org/10.3390/app15137457 - 3 Jul 2025

Viewed by 301

Abstract

Visible and near-infrared (Vis–NIR) spectroscopy enables the rapid prediction of soil properties but faces three limitations with conventional machine learning: information loss and overfitting from high-dimensional spectral features; inadequate modeling of nonlinear soil–spectra relationships; and failure to integrate multi-scale spatial features. To address these challenges, we propose ReSE-AP Net, a multi-scale attention residual network with spatial pyramid pooling. Built on convolutional residual blocks, the model incorporates a squeeze-and-excitation channel attention mechanism to recalibrate feature weights and an atrous spatial pyramid pooling (ASPP) module to extract multi-resolution spectral features. This architecture synergistically represents weak absorption peaks (400–1000 nm) and broad spectral bands (1000–2500 nm), overcoming single-scale modeling limitations. Validation on the LUCAS2009 dataset demonstrated that ReSE-AP Net outperformed conventional machine learning by improving the R² by 2.8–36.5% and reducing the RMSE by 14.2–69.2%. Compared with existing deep learning methods, it increased the R² by 0.4–25.5% for clay, silt, sand, organic carbon, calcium carbonate, and phosphorus predictions, and decreased the RMSE by 0.7–39.0%. Our contributions include statistical analysis of LUCAS2009 spectra, identification of conventional method limitations, development of the ReSE-AP Net model, ablation studies, and comprehensive comparisons with alternative approaches.
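The idea behind atrous (dilated) convolution, the building block of ASPP, can be shown on a 1D signal such as a spectrum. The sketch below is a generic illustration, not the ReSE-AP Net implementation; the kernel and dilation rates are hypothetical.

```python
def atrous_conv1d(signal, kernel, rate):
    """Valid-mode 1D dilated convolution (cross-correlation form): kernel taps
    are applied to samples spaced `rate` positions apart."""
    span = (len(kernel) - 1) * rate
    return [
        sum(w * signal[i + j * rate] for j, w in enumerate(kernel))
        for i in range(len(signal) - span)
    ]


def receptive_field(kernel_size, rate):
    """Number of input samples covered by one output of a dilated kernel."""
    return (kernel_size - 1) * rate + 1


spectrum = [float(v) for v in range(10)]  # stand-in for reflectance values
kernel = [1.0, 0.0, -1.0]                 # simple difference filter
narrow = atrous_conv1d(spectrum, kernel, rate=1)
wide = atrous_conv1d(spectrum, kernel, rate=3)
print(receptive_field(3, 1), receptive_field(3, 3))
```

An ASPP module runs several such branches with different rates in parallel and merges their outputs, so narrow and broad spectral structures are seen by the same layer without adding kernel weights.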

(This article belongs to the Special Issue Advanced Agricultural Technologies: Monitoring, Modeling, and Machine Learning Techniques)

24 pages, 1307 KiB

Open Access Article

A Self-Supervised Specific Emitter Identification Method Based on Contrastive Asymmetric Masked Learning

by Dong Wang, Yonghui Huang, Tianshu Cui and Yan Zhu

Sensors 2025, 25(13), 4023; https://doi.org/10.3390/s25134023 - 27 Jun 2025

Viewed by 301

Abstract

Specific emitter identification (SEI) is a core technology for wireless device security that plays a crucial role in protecting wireless communication systems from various security threats. However, current deep learning-based SEI methods heavily rely on large amounts of labeled data for supervised training, facing challenges in non-cooperative communication scenarios. To address these issues, this paper proposes a novel contrastive asymmetric masked learning-based SEI (CAML-SEI) method, effectively solving the problem of SEI under scarce labeled samples. The proposed method constructs an asymmetric auto-encoder architecture, comprising an encoder network based on channel squeeze-and-excitation residual blocks to capture radio frequency fingerprint (RFF) features embedded in signals, while employing a lightweight single-layer convolutional decoder for masked signal reconstruction. This design promotes the learning of fine-grained local feature representations. To further enhance feature discriminability, a learnable non-linear mapping is introduced to compress high-dimensional encoded features into a compact low-dimensional space, accompanied by a contrastive loss function that simultaneously achieves feature aggregation of positive samples and feature separation of negative samples. Finally, the network is jointly optimized by combining signal reconstruction and feature contrast tasks. Experiments conducted on real-world ADS-B and Wi-Fi datasets demonstrate that the proposed method effectively learns generalized RFF features, and the results show superior performance compared with other SEI methods.
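A contrastive loss that pulls positive pairs together and pushes negatives apart can be sketched with an InfoNCE-style objective. The abstract does not name the exact loss used in CAML-SEI, so the formulation, the temperature, and the toy 2D features below are assumptions made for illustration only.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss: small when the anchor is closer to its positive
    than to every negative (assumed form, not confirmed by the source)."""
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


anchor = [1.0, 0.0]
aligned = info_nce(anchor, [0.9, 0.1], [[0.0, 1.0], [-1.0, 0.0]])
misaligned = info_nce(anchor, [0.0, 1.0], [[0.9, 0.1], [-1.0, 0.0]])
print(aligned, misaligned)
```

In a joint objective of the kind the abstract describes, a term like this would be added to the masked-reconstruction error; how the two are weighted is not stated in the source.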

(This article belongs to the Section Communications)

23 pages, 5745 KiB

Open Access Article

BDSER-InceptionNet: A Novel Method for Near-Infrared Spectroscopy Model Transfer Based on Deep Learning and Balanced Distribution Adaptation

by Jianghai Chen, Jie Ling, Nana Lei and Lingqiao Li

Sensors 2025, 25(13), 4008; https://doi.org/10.3390/s25134008 - 27 Jun 2025

Viewed by 364

Abstract

Near-Infrared Spectroscopy (NIRS) analysis technology faces numerous challenges in industrial applications. The generalization capability of models is significantly affected by instrumental heterogeneity, environmental interference, and sample diversity. Traditional modeling methods exhibit certain limitations in handling these factors, making it difficult to achieve effective adaptation across different scenarios. Specifically, data distribution shifts and mismatches in multi-scale features hinder the transferability of models across different crop varieties or instruments from different manufacturers. As a result, the large amount of previously accumulated NIRS and reference data cannot be effectively utilized in modeling for new instruments or new varieties, thereby limiting improvements in modeling efficiency and prediction accuracy. To address these limitations, this study proposes a novel transfer learning framework integrating multi-scale network architecture with Balanced Distribution Adaptation (BDA) to enhance cross-instrument compatibility. The key contributions include: (1) RX-Inception multi-scale structure: Combines Xception’s depthwise separable convolution with ResNet’s residual connections to strengthen global–local feature coupling. (2) Squeeze-and-Excitation (SE) attention: Dynamically recalibrates spectral band weights to enhance discriminative feature representation. (3) Systematic evaluation of six transfer strategies: Comparative analysis of their impacts on model adaptation performance. Experimental results on open corn and pharmaceutical datasets demonstrate that BDSER-InceptionNet achieves state-of-the-art performance on primary instruments. Notably, the proposed Method 6 successfully enables NIRS model sharing from primary to secondary instruments, effectively mitigating spectral discrepancies and significantly improving transfer efficacy.

(This article belongs to the Section Physical Sensors)

26 pages, 10233 KiB

Open Access Article

Time-Series Forecasting Method Based on Hierarchical Spatio-Temporal Attention Mechanism

by Zhiguo Xiao, Junli Liu, Xinyao Cao, Ke Wang, Dongni Li and Qian Liu

Sensors 2025, 25(13), 4001; https://doi.org/10.3390/s25134001 - 26 Jun 2025

Viewed by 551

Abstract

In the field of intelligent decision-making, time-series data collected by sensors serves as the core carrier for interaction between the physical and digital worlds. Accurate analysis is the cornerstone of decision-making in critical scenarios, such as industrial monitoring and intelligent transportation. However, the inherent spatio-temporal coupling characteristics and cross-period long-range dependency of sensor data cause traditional time-series prediction methods to face performance bottlenecks in feature decoupling and multi-scale modeling. This study proposes a Spatio-Temporal Attention-Enhanced Network (TSEBG). Breaking through traditional structural designs, the model employs a Squeeze-and-Excitation Network (SENet) to reconstruct the convolutional layers of the Temporal Convolutional Network (TCN), strengthening the feature expression of key time steps through dynamic channel weight allocation to address the redundancy issue of traditional causal convolutions in local pattern capture. A Bidirectional Gated Recurrent Unit (BiGRU) variant based on a global attention mechanism is designed, leveraging the collaboration between gating units and attention weights to mine cross-period long-distance dependencies and effectively alleviate the vanishing-gradient problem of Recurrent Neural Network (RNN)-like models in multi-scale time-series analysis. A hierarchical feature fusion architecture is constructed to achieve multi-dimensional alignment of local spatial and global temporal features. Through residual connections and the dynamic adjustment of attention weights, hierarchical semantic representations are output. Experiments show that TSEBG outperforms current dominant models in time-series single-step prediction tasks in terms of accuracy and performance, with a cross-dataset R² standard deviation of only 3.7%, demonstrating excellent generalization stability. It provides a novel theoretical framework for feature decoupling and multi-scale modeling of complex time-series data.

(This article belongs to the Special Issue Intelligent Sensors for Condition Monitoring, Diagnosis, and Prognostics)

23 pages, 14051 KiB

Open Access Article

A Novel Method for Water Surface Debris Detection Based on YOLOV8 with Polarization Interference Suppression

by Yi Chen, Honghui Lin, Lin Xiao, Maolin Zhang and Pingjun Zhang

Photonics 2025, 12(6), 620; https://doi.org/10.3390/photonics12060620 - 18 Jun 2025

Viewed by 323

Abstract

Aquatic floating debris detection is a key technological foundation for ecological monitoring and integrated water environment management. It holds substantial scientific and practical value in applications such as pollution source tracing, floating debris control, and maritime navigation safety. However, this field faces ongoing challenges due to water surface polarization. Reflections of polarized light produce intense glare, resulting in localized overexposure, detail loss, and geometric distortion in captured images. These optical artifacts severely impair the performance of conventional detection algorithms, increasing both false positives and missed detections. To overcome these imaging challenges in complex aquatic environments, we propose a novel YOLOv8-based detection framework with integrated polarized light suppression mechanisms. The framework consists of four key components: a fisheye distortion correction module, a polarization feature processing layer, a customized residual network with Squeeze-and-Excitation (SE) attention, and a cascaded pipeline for super-resolution reconstruction and deblurring. Additionally, we developed the PSF-IMG dataset (Polarized Surface Floats), which includes common floating debris types such as plastic bottles, bags, and foam boards. Extensive experiments demonstrate the network’s robustness in suppressing polarization artifacts and enhancing feature stability under dynamic optical conditions.

(This article belongs to the Special Issue Advancements in Optical Measurement Techniques and Applications)

23 pages, 5084 KiB

Open Access Article

A Hybrid Dropout Method for High-Precision Seafloor Topography Reconstruction and Uncertainty Quantification

by Xinye Cui, Houpu Li, Yanting Yu, Shaofeng Bian and Guojun Zhai

Appl. Sci. 2025, 15(11), 6113; https://doi.org/10.3390/app15116113 - 29 May 2025

Viewed by 339

Abstract

Seafloor topography super-resolution reconstruction is critical for marine resource exploration, geological monitoring, and navigation safety. However, sparse acoustic data frequently result in the loss of high-frequency details, and traditional deep learning models exhibit limitations in uncertainty quantification, impeding their practical application. To address these challenges, this study systematically investigates the combined effects of various regularization strategies and uncertainty quantification modules. It proposes a hybrid dropout model that jointly optimizes high-precision reconstruction and uncertainty estimation. The model integrates residual blocks, squeeze-and-excitation (SE) modules, and a multi-scale feature extraction network while employing Monte Carlo Dropout (MC-Dropout) alongside heteroscedastic noise modeling to dynamically gate the uncertainty quantification process. By adaptively modulating the regularization strength based on feature activations, the model preserves high-frequency information and accurately estimates predictive uncertainty. The experimental results demonstrate significant improvements in the Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and Peak Signal-to-Noise Ratio (PSNR). Compared to conventional dropout architectures, the proposed method achieves a PSNR increase of 46.5% to 60.5% in test regions with a marked reduction in artifacts. Overall, the synergistic effect of the employed regularization strategies and uncertainty quantification modules substantially enhances detail recovery and robustness in complex seafloor topography reconstruction, offering valuable theoretical insights and practical guidance for further optimization of deep learning models in challenging applications.
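The MC-Dropout part of this approach can be illustrated on a toy model: dropout stays active at inference, the input is passed through many times, and the spread of the outputs serves as an uncertainty estimate. The linear model, weights, and dropout rate below are hypothetical, and the heteroscedastic noise term and gating described in the abstract are not covered.

```python
import random
import statistics


def stochastic_forward(x, weights, drop_prob, rng):
    """One forward pass of a toy linear model with dropout left switched on."""
    keep = 1.0 - drop_prob
    total = 0.0
    for xi, wi in zip(x, weights):
        if rng.random() < keep:
            total += xi * wi / keep  # inverted-dropout scaling keeps the mean
    return total


def mc_dropout_predict(x, weights, drop_prob=0.2, passes=200, seed=0):
    """Mean and standard deviation over repeated stochastic forward passes."""
    rng = random.Random(seed)
    samples = [stochastic_forward(x, weights, drop_prob, rng) for _ in range(passes)]
    return statistics.mean(samples), statistics.pstdev(samples)


x, weights = [1.0, 2.0, 3.0], [0.5, -1.0, 2.0]
mean, spread = mc_dropout_predict(x, weights)
print(mean, spread)
```

The returned mean plays the role of the prediction and the standard deviation the role of the model (epistemic) uncertainty; with `drop_prob=0.0` the passes are identical and the spread collapses to zero.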

(This article belongs to the Section Marine Science and Engineering)

31 pages, 8581 KiB

Open Access Article

YOLO11-Driven Deep Learning Approach for Enhanced Detection and Visualization of Wrist Fractures in X-Ray Images

by Mubashar Tariq and Kiho Choi

Mathematics 2025, 13(9), 1419; https://doi.org/10.3390/math13091419 - 25 Apr 2025

Cited by 1 | Viewed by 2292

Abstract

Wrist fractures, especially those involving the elbow and distal radius, are the most common injuries in children, teenagers, and young adults, with the highest occurrence rates during adolescence. However, the demand for medical imaging and the shortage of radiologists make it challenging to ensure accurate diagnosis and treatment. This study explores how AI-driven approaches are used to enhance fracture detection and improve diagnostic accuracy. In this paper, we propose the latest version of YOLO (i.e., YOLO11) with an attention module, designed to refine detection accuracy. We integrated attention mechanisms, such as Global Attention Mechanism (GAM), channel attention, and spatial attention with Residual Network (ResNet), to enhance feature extraction. Moreover, we developed the ResNet_GAM model, which combines ResNet with GAM to improve feature learning and model performance. In this paper, we apply a data augmentation process to the publicly available GRAZPEDWRI-DX dataset, which is widely used for detecting radial bone fractures in X-ray images of children. Experimental findings indicate that integrating Squeeze-and-Excitation (SE_BLOCK) into YOLO11 significantly increases model efficiency. Our experimental results attain state-of-the-art performance, measured by the mean average precision (mAP50). Through extensive experiments, we found that our model achieved the highest mAP50 of 0.651. Meanwhile, YOLO11 with GAM and ResNet_GAM attained a maximum precision of 0.799 and a recall of 0.639 across all classes on the given dataset. The potential of these models to improve pediatric wrist imaging is significant, as they offer better detection accuracy while still being computationally efficient. Additionally, to help surgeons identify and diagnose fractures in patient wrist X-ray images, we provide a Fracture Detection Web-based Interface based on the result of the proposed method. This interface reduces the risk of misinterpretation and provides valuable information to assist in making surgical decisions.
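The mAP50 metric quoted above counts a predicted box as correct when its intersection over union (IoU) with a ground-truth box is at least 0.5. The sketch below shows that matching rule only; the box coordinates are invented, and the full metric additionally averages precision over recall levels and classes.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred, truth, threshold=0.5):
    """A prediction counts toward mAP50 when its IoU reaches the threshold."""
    return iou(pred, truth) >= threshold


truth = (0.0, 0.0, 10.0, 10.0)           # hypothetical fracture annotation
print(iou((2.0, 0.0, 12.0, 10.0), truth))  # shifted by 2 px
print(iou((5.0, 0.0, 15.0, 10.0), truth))  # shifted by 5 px
```

A box shifted by half its width already falls below the 0.5 threshold, which gives a sense of how tight localization must be for a detection to count.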

(This article belongs to the Special Issue Machine Learning in Bioinformatics and Biostatistics)

27 pages, 10754 KiB

Open Access Article

Efficient and Explainable Human Activity Recognition Using Deep Residual Network with Squeeze-and-Excitation Mechanism

by Sakorn Mekruksavanich and Anuchit Jitpattanakul

Appl. Syst. Innov. 2025, 8(3), 57; https://doi.org/10.3390/asi8030057 - 24 Apr 2025

Cited by 1 | Viewed by 1072

Abstract

Wearable sensors for human activity recognition (HAR) have gained significant attention across multiple domains, such as personal health monitoring and intelligent home systems. Despite notable advancements in deep learning for HAR, understanding the decision-making process of complex models remains challenging. This study introduces an advanced deep residual network integrated with a squeeze-and-excitation (SE) mechanism to improve recognition accuracy and model interpretability. The proposed model, ConvResBiGRU-SE, was tested using the UCI-HAR and WISDM datasets. It achieved remarkable accuracies of 99.18% and 98.78%, respectively, surpassing existing state-of-the-art methods. The SE mechanism enhanced the model’s ability to focus on essential features, while gradient-weighted class activation mapping (Grad-CAM) increased interpretability by highlighting essential sensory data influencing predictions. Additionally, ablation experiments validated the contribution of each component to the model’s overall performance. This research advances HAR technology by offering a more transparent and efficient recognition system. The enhanced transparency and predictive accuracy may increase user trust and facilitate smoother integration into real-world applications.
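The squeeze-and-excitation mechanism can be sketched for 1D sensor channels as three steps: average each channel (squeeze), pass the averages through a small bottleneck network to get one gate per channel (excitation), and rescale each channel by its gate. The weights below are fixed hypothetical values; in a trained network they are learned, and this sketch is not the ConvResBiGRU-SE implementation.

```python
import math


def se_block(channels, w_reduce, w_expand):
    """Squeeze-and-excitation over a list of 1D channels.

    w_reduce: bottleneck weights, one row per hidden unit.
    w_expand: output weights, one row per channel.
    """
    # Squeeze: global average of each channel.
    squeezed = [sum(ch) / len(ch) for ch in channels]
    # Excitation: fully connected + ReLU, then fully connected + sigmoid.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w_reduce]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w_expand
    ]
    # Scale: multiply every sample of a channel by that channel's gate.
    return [[g * v for v in ch] for g, ch in zip(gates, channels)], gates


channels = [[1.0, 2.0, 3.0], [0.0, 0.0, 0.0], [4.0, 4.0, 4.0], [-1.0, 1.0, 0.0]]
w_reduce = [[0.5, 0.1, -0.2, 0.3], [-0.4, 0.2, 0.6, 0.1]]  # 4 channels -> 2 hidden
w_expand = [[1.0, -1.0], [0.5, 0.5], [-0.3, 0.8], [0.2, 0.2]]  # 2 hidden -> 4 gates
recalibrated, gates = se_block(channels, w_reduce, w_expand)
print(gates)
```

Each gate lies between 0 and 1, so the block can only attenuate channels relative to one another; this per-channel weighting is what the abstract refers to as focusing on essential features.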

(This article belongs to the Special Issue Smart Sensors and Devices: Recent Advances and Applications Volume II)

15 pages, 4840 KiB

Open Access Article

Research on Method for Intelligent Recognition of Deep-Sea Biological Images Based on PSVG-YOLOv8n

by Dali Chen, Xianpeng Shi, Jichao Yang, Xiang Gao and Yugang Ren

J. Mar. Sci. Eng. 2025, 13(4), 810; https://doi.org/10.3390/jmse13040810 - 18 Apr 2025

Viewed by 426

Abstract

Deep-sea biological detection is a pivotal technology for the exploration and conservation of marine resources. Nonetheless, the inherent complexities of the deep-sea environment, the scarcity of available deep-sea organism samples, and the significant refraction and scattering effects of underwater light collectively impose formidable challenges on the current detection algorithms. To address these issues, we propose an advanced deep-sea biometric identification framework based on an enhanced YOLOv8n architecture, termed PSVG-YOLOv8n. Specifically, our model integrates a highly efficient Partial Spatial Attention module immediately preceding the SPPF layer in the backbone, thereby facilitating the refined, localized feature extraction of deep-sea organisms. In the neck network, a Slim-Neck module (GSconv + VoVGSCSP) is incorporated to reduce the parameter count and model size while simultaneously augmenting the detection performance. Moreover, the introduction of a squeeze–excitation residual module (C2f_SENetV2), which leverages a multi-branch fully connected layer, further bolsters the network’s global representational capacity. Finally, an improved detection head synergistically fuses all the modules, yielding substantial enhancements in the overall accuracy. Experiments conducted on a dataset of deep-sea images acquired by the Jiaolong manned submersible indicate that the proposed PSVG-YOLOv8n model achieved a precision of 79.9%, an mAP50 of 67.2%, and an mAP50-95 of 50.9%. These performance metrics represent improvements of 1.2%, 2.3%, and 1.1%, respectively, over the baseline YOLOv8n model. The observed enhancements underscore the effectiveness of the proposed modifications in addressing the challenges associated with deep-sea organism detection, thereby providing a robust framework for accurate deep-sea biological identification.

(This article belongs to the Section Ocean Engineering)

23 pages, 9304 KiB

Open Access Article

Predicting Urban Vitality at Regional Scales: A Deep Learning Approach to Modelling Population Density and Pedestrian Flows

by Feifeng Jiang and Jun Ma

Smart Cities 2025, 8(2), 58; https://doi.org/10.3390/smartcities8020058 - 30 Mar 2025

Cited by 2 | Viewed by 954

Abstract

Understanding and predicting urban vitality, the intensity and diversity of human activities in urban spaces, is crucial for sustainable urban development. However, existing studies often rely on discrete sampling points and single metrics, limiting their ability to capture the continuous spatial distribution of urban vibrancy. This study introduces the UVPN (urban vitality prediction network), a novel deep-learning architecture designed to generate high-resolution predictions of static and dynamic vitality at regional scales. The architecture integrates two key innovations: an SE (squeeze-and-excitation) block for adaptive feature recalibration and an RCA (residual connection with coordinate attention) bottleneck for position-aware feature learning. Applied to New York City, UVPN leverages diverse urban morphological features such as streetscape attributes and land use patterns to predict continuous vitality distributions. The model outperforms existing architectures, achieving reductions of 34.03% and 38.66% in mean squared error for population density and pedestrian flow predictions, respectively. Feature importance analysis reveals that road networks predominantly influence population density, while streetscape features strongly affect pedestrian flows, with built density and points of interest contributing to both dimensions. By advancing urban vitality prediction, UVPN provides a robust framework for evidence-based urban planning, supporting the creation of more sustainable, functional, and livable cities.

24 pages, 14522 KiB

Open Access Article

Intelligent Detection of Low–Slow–Small Targets Based on Passive Radar

by Tingwei Chu, Huaji Zhou, Zizheng Ren, Yunhao Ye, Changlong Wang and Feng Zhou

Remote Sens. 2025, 17(6), 961; https://doi.org/10.3390/rs17060961 - 9 Mar 2025

Cited by 1 | Viewed by 1354

Abstract

Due to its unique geometric configuration, passive radar offers enhanced surveillance capabilities for low-altitude targets. Traditional passive radar signal processing typically relies on energy accumulation and Constant False Alarm Rate (CFAR) detection. However, insufficient accumulation gain or mismatched statistical models in complex electromagnetic environments can compromise detection performance. To address these challenges, this paper proposes an intelligent target detection method for passive radar. Specifically, a residual network is integrated with a Squeeze-and-Excitation (SE) module, which preserves the powerful feature extraction capabilities of the residual network while improving the model’s ability to adaptively adjust channel weights. This fusion effectively enhances the target detection process. Furthermore, based on the particle swarm algorithm, a gray wolf population search strategy and a multi-target iterative search mechanism are introduced to enable the rapid extraction of time-frequency difference parameters for multiple targets. Both simulation and field experiments demonstrate that the proposed method enables intelligent detection of low–slow–small targets in passive radar, ensuring efficient time-frequency parameter extraction while maintaining a high detection success rate.
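The abstract does not specify how the SE module is configured, so the following is only a minimal sketch of a generic squeeze-and-excitation step, written in plain Python to make the "adaptively adjust channel weights" idea concrete. The function name `se_block`, the weight matrices `w1` and `w2`, and the omission of biases are assumptions for illustration, not details taken from the paper.

```python
import math

def se_block(x, w1, w2):
    """Generic squeeze-and-excitation over a feature map (illustrative only).

    x  : list of C channels, each a list of H rows with W values
    w1 : (C // r) x C weight matrix of the reduction layer (hypothetical)
    w2 : C x (C // r) weight matrix of the expansion layer (hypothetical)
    Biases are omitted to keep the sketch short.
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    z = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in x]
    # Excitation: fully connected -> ReLU -> fully connected -> sigmoid.
    h = [max(0.0, sum(w * v for w, v in zip(row, z))) for row in w1]
    s = [1.0 / (1.0 + math.exp(-sum(w * v for w, v in zip(row, h))))
         for row in w2]
    # Scale: each channel is multiplied by its learned weight in (0, 1).
    y = [[[s_c * v for v in row] for row in ch] for s_c, ch in zip(s, x)]
    return y, s
```

In a residual network, a block of this kind is typically applied to the output of the residual branch before it is added back to the skip connection; whether the paper follows that placement is not stated in the abstract.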

27 pages, 11172 KiB

Open Access Article

ResGRU: A Novel Hybrid Deep Learning Model for Compound Fault Diagnosis in Photovoltaic Arrays Considering Dust Impact

by Xi Liu, Hui Hwang Goh, Haonan Xie, Tingting He, Weng Kean Yew, Dongdong Zhang, Wei Dai and Tonni Agustiono Kurniawan

Sensors 2025, 25(4), 1035; https://doi.org/10.3390/s25041035 - 9 Feb 2025

Viewed by 1154

Abstract

With the widespread deployment of photovoltaic (PV) power stations, timely identification and rectification of module defects are crucial for extending service life and preserving efficiency. PV arrays, subjected to harsh outdoor conditions, are prone to defects exacerbated by dust accumulation, potentially leading to complex compound faults. The resemblance between individual and compound faults sometimes leads to misclassification. To address this challenge, this paper presents a novel hybrid deep learning model, ResGRU, which integrates a residual network (ResNet) with bidirectional gated recurrent units (BiGRU) to improve fault diagnostic accuracy. Additionally, a Squeeze-and-Excitation (SE) module is incorporated to enhance relevant features while suppressing irrelevant ones, thereby improving performance. To further optimize inter-class separability and intra-class compactness, a center loss function is employed as an auxiliary loss to enhance the model’s discriminative capacity. The proposed method facilitates the automated extraction of fault features from I-V curves and accurate diagnosis of individual faults, partial shading scenarios, and compound faults under varying levels of dust accumulation, thereby aiding in the formulation of efficient cleaning schedules. Experimental findings indicate that the proposed model achieves 99.94% accuracy on pristine data and 98.21% accuracy on noisy data, markedly surpassing established techniques such as artificial neural networks (ANN), ResNet, random forests (RF), multi-scale SE-ResNet, and other ResNet-based approaches. Thus, the model offers a reliable solution for accurate PV array fault diagnosis.

(This article belongs to the Special Issue Fault Diagnosis for Photovoltaic Systems Based on Sensors)
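The ResGRU abstract above mentions a center loss used as an auxiliary loss to tighten intra-class clusters. The abstract gives no formula or weighting, so the sketch below uses the commonly cited form (half the squared distance between each feature vector and its class center) and averages over the batch; the averaging, the weighting factor `lam`, and all function names are assumptions for illustration.

```python
def center_loss(features, labels, centers):
    """Half the mean squared distance between features and class centers.

    features : list of feature vectors (lists of floats)
    labels   : list of class indices, one per feature vector
    centers  : mapping from class index to its center vector (hypothetical)
    """
    total = 0.0
    for f, y in zip(features, labels):
        # Squared Euclidean distance to the center of the true class.
        total += sum((a - b) ** 2 for a, b in zip(f, centers[y]))
    return 0.5 * total / len(features)

def combined_loss(classification_loss, features, labels, centers, lam=0.01):
    """Classification loss plus a weighted center-loss term.

    lam is an assumed balancing weight, not a value from the paper.
    """
    return classification_loss + lam * center_loss(features, labels, centers)
```

In practice the class centers are themselves updated during training, for example as running means of the features of each class; that update step is left out here.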

Search Results (82)