Search Results (235)

Search Parameters:
Keywords = Multi-Res block

24 pages, 5556 KB  
Article
Efficient Wearable Sensor-Based Activity Recognition for Human–Robot Collaboration in Agricultural Environments
by Sakorn Mekruksavanich and Anuchit Jitpattanakul
Informatics 2025, 12(4), 115; https://doi.org/10.3390/informatics12040115 - 23 Oct 2025
Abstract
This study focuses on human awareness, a critical component in human–robot interaction, particularly within agricultural environments where interactions are enriched by complex contextual information. The main objective is identifying human activities occurring during collaborative harvesting tasks involving humans and robots. To achieve this, we propose a novel and lightweight deep learning model, named 1D-ResNeXt, designed explicitly for recognizing activities in agriculture-related human–robot collaboration. The model is built as an end-to-end architecture incorporating feature fusion and a multi-kernel convolutional block strategy. It utilizes residual connections and a split–transform–merge mechanism to mitigate performance degradation and reduce model complexity by limiting the number of trainable parameters. Sensor data were collected from twenty individuals with five wearable devices placed on different body parts. Each sensor was embedded with tri-axial accelerometers, gyroscopes, and magnetometers. Under real field conditions, the participants performed several sub-tasks commonly associated with agricultural labor, such as lifting and carrying loads. Before classification, the raw sensor signals were pre-processed to eliminate noise. The cleaned time-series data were then input into the proposed deep learning network for sequential pattern recognition. Experimental results showed that the chest-mounted sensor achieved the highest F1-score of 99.86%, outperforming other sensor placements and combinations. An analysis of temporal window sizes (0.5, 1.0, 1.5, and 2.0 s) demonstrated that the 0.5 s window provided the best recognition performance, indicating that key activity features in agriculture can be captured over short intervals. Moreover, a comprehensive evaluation of sensor modalities revealed that multimodal fusion of accelerometer, gyroscope, and magnetometer data yielded the best accuracy at 99.92%. The combination of accelerometer and gyroscope data offered an optimal compromise, achieving 99.49% accuracy while maintaining lower system complexity. These findings highlight the importance of strategic sensor placement and data fusion in enhancing activity recognition performance while reducing the need for extensive data and computational resources. This work contributes to developing intelligent, efficient, and adaptive collaborative systems, offering promising applications in agriculture and beyond, with improved safety, cost-efficiency, and real-time operational capability.
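To make the split–transform–merge idea concrete, here is a minimal sketch of a 1D ResNeXt-style residual block, assuming PyTorch. The channel count, cardinality, kernel size, and window length are illustrative, not the authors' exact 1D-ResNeXt configuration.

```python
import torch
import torch.nn as nn

class ResNeXtBlock1D(nn.Module):
    """Split-transform-merge via grouped convolution, plus a residual path."""
    def __init__(self, channels: int = 64, cardinality: int = 8, kernel_size: int = 5):
        super().__init__()
        self.transform = nn.Sequential(
            nn.Conv1d(channels, channels, kernel_size=1, bias=False),
            nn.BatchNorm1d(channels),
            nn.ReLU(inplace=True),
            # groups=cardinality implements the split-transform-merge idea
            nn.Conv1d(channels, channels, kernel_size, padding=kernel_size // 2,
                      groups=cardinality, bias=False),
            nn.BatchNorm1d(channels),
            nn.ReLU(inplace=True),
            nn.Conv1d(channels, channels, kernel_size=1, bias=False),
            nn.BatchNorm1d(channels),
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Residual connection mitigates degradation as depth grows
        return self.relu(x + self.transform(x))

# Example: a batch of 0.5 s windows, hypothetically 25 samples at 50 Hz,
# already projected to 64 feature channels.
x = torch.randn(8, 64, 25)
print(ResNeXtBlock1D()(x).shape)  # torch.Size([8, 64, 25])
```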

23 pages, 11949 KB  
Article
MDAS-YOLO: A Lightweight Adaptive Framework for Multi-Scale and Dense Pest Detection in Apple Orchards
by Bo Ma, Jiawei Xu, Ruofei Liu, Junlin Mu, Biye Li, Rongsen Xie, Shuangxi Liu, Xianliang Hu, Yongqiang Zheng, Hongjian Zhang and Jinxing Wang
Horticulturae 2025, 11(11), 1273; https://doi.org/10.3390/horticulturae11111273 - 22 Oct 2025
Abstract
Accurate monitoring of orchard pests is vital for green and efficient apple production. Yet images captured by intelligent pest-monitoring lamps often contain small targets, weak boundaries, and crowded scenes, which hamper detection accuracy. We present MDAS-YOLO, a lightweight detection framework tailored for smart pest monitoring in apple orchards. At the input stage, we adopt the LIME++ enhancement to mitigate low illumination and non-uniform lighting, improving image quality at the source. On the model side, we integrate three structural innovations: (1) a C3k2-MESA-DSM module in the backbone to explicitly strengthen contours and fine textures via multi-scale edge enhancement and dual-domain feature selection; (2) an AP-BiFPN in the neck to achieve adaptive cross-scale fusion through learnable weighting and differentiated pooling; and (3) a SimAM block before the detection head to perform zero-parameter, pixel-level saliency re-calibration, suppressing background redundancy without extra computation. On a self-built apple-orchard pest dataset, MDAS-YOLO attains 95.68% mAP, outperforming YOLOv11n by 6.97 percentage points while maintaining a superior trade-off among accuracy, model size, and inference speed. Overall, the proposed synergistic pipeline—input enhancement, early edge fidelity, mid-level adaptive fusion, and end-stage lightweight re-calibration—effectively addresses small-scale, weak-boundary, and densely distributed pests, providing a promising and regionally validated approach for intelligent pest monitoring and sustainable orchard management, and offering methodological insights for future multi-regional pest monitoring research.
(This article belongs to the Section Insect Pest Management)
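The zero-parameter SimAM re-calibration has a compact closed form (an energy function per neuron). Below is a minimal sketch assuming PyTorch; `e_lambda` is the usual stabilizing constant, and this is a generic illustration rather than MDAS-YOLO's exact placement.

```python
import torch

def simam(x: torch.Tensor, e_lambda: float = 1e-4) -> torch.Tensor:
    """Zero-parameter SimAM attention over a (B, C, H, W) feature map."""
    n = x.shape[2] * x.shape[3] - 1
    d = (x - x.mean(dim=(2, 3), keepdim=True)).pow(2)  # squared deviation per pixel
    v = d.sum(dim=(2, 3), keepdim=True) / n            # channel-wise variance estimate
    e_inv = d / (4 * (v + e_lambda)) + 0.5             # inverse energy = saliency
    return x * torch.sigmoid(e_inv)                    # pixel-level re-calibration
```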

19 pages, 4569 KB  
Article
NeuroNet-AD: A Multimodal Deep Learning Framework for Multiclass Alzheimer’s Disease Diagnosis
by Saeka Rahman, Md Motiur Rahman, Smriti Bhatt, Raji Sundararajan and Miad Faezipour
Bioengineering 2025, 12(10), 1107; https://doi.org/10.3390/bioengineering12101107 - 15 Oct 2025
Abstract
Alzheimer’s disease (AD) is the most prevalent form of dementia. This disease significantly impacts cognitive functions and daily activities. Early and accurate diagnosis of AD, including the preliminary stage of mild cognitive impairment (MCI), is critical for effective patient care and treatment development. Although advancements in deep learning (DL) and machine learning (ML) models improve diagnostic precision, the lack of large datasets limits further enhancements, necessitating the use of complementary data. Existing convolutional neural networks (CNNs) effectively process visual features but struggle to fuse multimodal data effectively for AD diagnosis. To address these challenges, we propose NeuroNet-AD, a novel multimodal CNN framework designed to enhance AD classification accuracy. NeuroNet-AD integrates Magnetic Resonance Imaging (MRI) images with clinical text-based metadata, including psychological test scores, demographic information, and genetic biomarkers. In NeuroNet-AD, we incorporate Convolutional Block Attention Modules (CBAMs) within the ResNet-18 backbone, enabling the model to focus on the most informative spatial and channel-wise features. We introduce an attention computation and multimodal fusion module, named Meta Guided Cross Attention (MGCA), which facilitates effective cross-modal alignment between images and meta-features through a multi-head attention mechanism. Additionally, we employ an ensemble-based feature selection strategy to identify the most discriminative features from the textual data, improving model generalization and performance. We evaluate NeuroNet-AD on the Alzheimer’s Disease Neuroimaging Initiative (ADNI1) dataset using subject-level 5-fold cross-validation and a held-out test set to ensure robustness. NeuroNet-AD achieved 98.68% accuracy in multiclass classification of normal control (NC), MCI, and AD and 99.13% accuracy in the binary setting (NC vs. AD) on the ADNI dataset, outperforming state-of-the-art models. External validation on the OASIS-3 dataset further confirmed the model’s generalization ability, achieving 94.10% accuracy in the multiclass setting and 98.67% accuracy in the binary setting, despite variations in demographics and acquisition protocols. Further extensive evaluation studies demonstrate the effectiveness of each component of NeuroNet-AD in improving performance.
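The cross-modal fusion idea (image tokens attending to clinical meta-features via multi-head attention) can be sketched as follows, assuming PyTorch. This is a generic stand-in, not the paper's exact MGCA module, and all sizes are hypothetical.

```python
import torch
import torch.nn as nn

class CrossModalFusion(nn.Module):
    """Image tokens attend to clinical meta-features (hypothetical sizes)."""
    def __init__(self, dim: int = 256, num_heads: int = 4):
        super().__init__()
        self.meta_proj = nn.Linear(1, dim)  # embed each scalar meta-feature
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, img_tokens: torch.Tensor, meta: torch.Tensor) -> torch.Tensor:
        # img_tokens: (B, N, dim) flattened CNN feature map
        # meta: (B, M) scalar clinical features (scores, demographics, biomarkers)
        kv = self.meta_proj(meta.unsqueeze(-1))   # (B, M, dim)
        fused, _ = self.attn(img_tokens, kv, kv)  # queries = image tokens
        return self.norm(img_tokens + fused)      # residual connection + norm

fusion = CrossModalFusion()
out = fusion(torch.randn(2, 49, 256), torch.randn(2, 12))
print(out.shape)  # torch.Size([2, 49, 256])
```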

22 pages, 5361 KB  
Article
LMVMamba: A Hybrid U-Shape Mamba for Remote Sensing Segmentation with Adaptation Fine-Tuning
by Fan Li, Xiao Wang, Haochen Wang, Hamed Karimian, Juan Shi and Guozhen Zha
Remote Sens. 2025, 17(19), 3367; https://doi.org/10.3390/rs17193367 - 5 Oct 2025
Abstract
High-precision semantic segmentation of remote sensing imagery is crucial in geospatial analysis. It plays an immeasurable role in fields such as urban governance, environmental monitoring, and natural resource management. However, when confronted with complex objects (such as winding roads and dispersed buildings), existing semantic segmentation methods still suffer from inadequate target recognition capabilities and multi-scale representation issues. This paper proposes a neural network model, LMVMamba (LoRA Multi-scale Vision Mamba), for semantic segmentation of remote sensing images. This model integrates the advantages of convolutional neural networks (CNNs), Transformers, and state-space models (Mamba) with a multi-scale feature fusion strategy. It simultaneously captures global contextual information and fine-grained local features. Specifically, in the encoder stage, the ResT Transformer serves as the backbone network, employing a LoRA fine-tuning strategy to effectively enhance model accuracy by training only the introduced low-rank matrix pairs. The extracted features are then passed to the decoder, where a U-shaped Mamba decoder is designed. In this stage, a Multi-Scale Post-processing Block (MPB) is introduced, consisting of depthwise separable convolutions and residual concatenation. This block effectively extracts multi-scale features and enhances local detail extraction after the VSS block. Additionally, a Local Enhancement and Fusion Attention Module (LAS) is added at the end of each decoder block. LAS integrates the SimAM attention mechanism, further enhancing the model’s multi-scale feature fusion capability and local detail segmentation capability. Through extensive comparative experiments, it was found that LMVMamba achieves superior performance on the OpenEarthMap (mIoU 52.3%, OA 69.8%, mF1 68.0%) and LoveDA (mIoU 67.9%, OA 80.3%, mF1 80.5%) datasets. Ablation experiments validated the effectiveness of each module. The final results indicate that this model is highly suitable for high-precision land-cover classification tasks in remote sensing imagery. LMVMamba provides an effective solution for precise semantic segmentation of high-resolution remote sensing imagery.
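The LoRA strategy of training only introduced low-rank matrix pairs can be sketched as a frozen linear layer plus a trainable rank-r update, assuming PyTorch; the rank and scaling below are illustrative.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wrap a frozen Linear with a trainable low-rank update: W·x + (alpha/r)·B·A·x."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # freeze the pretrained weights
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: starts as identity
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * (x @ self.A.t() @ self.B.t())

layer = LoRALinear(nn.Linear(256, 256))
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)  # 4096: only the low-rank pair A, B is trained
```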

23 pages, 4303 KB  
Article
LMCSleepNet: A Lightweight Multi-Channel Sleep Staging Model Based on Wavelet Transform and Multi-Scale Convolutions
by Jiayi Yang, Yuanyuan Chen, Tingting Yu and Ying Zhang
Sensors 2025, 25(19), 6065; https://doi.org/10.3390/s25196065 - 2 Oct 2025
Abstract
Sleep staging is a crucial indicator for assessing sleep quality, which contributes to sleep monitoring and the diagnosis of sleep disorders. Although existing sleep staging methods achieve high classification performance, two major challenges remain: (1) the ability to effectively extract salient features from multi-channel sleep data remains limited; (2) excessive model parameters hinder efficiency improvements. To address these challenges, this work proposes a lightweight multi-channel sleep staging network (LMCSleepNet). LMCSleepNet is composed of four modules. The first module enhances frequency domain features through continuous wavelet transform. The second module extracts time–frequency features using multi-scale convolutions. The third module optimizes ResNet18 with depthwise separable convolutions to reduce parameters. The fourth module improves spatial correlation using the Convolutional Block Attention Module (CBAM). On the public datasets SleepEDF-20 and SleepEDF-78, LMCSleepNet achieved classification accuracies of 88.2% (κ = 0.84, MF1 = 82.4%) and 84.1% (κ = 0.77, MF1 = 77.7%), respectively, while reducing model parameters to 1.49 M. Furthermore, experiments validated the influence of temporal sampling points in wavelet time–frequency maps on sleep classification performance (accuracy, Cohen’s kappa, and macro-average F1-score) and the influence of multi-scale dilated convolution module fusion methods on classification performance. LMCSleepNet is an efficient lightweight model for extracting and integrating multimodal features from multichannel Polysomnography (PSG) data, which facilitates its application in resource-constrained scenarios.
(This article belongs to the Section Biomedical Sensors)
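The parameter saving from swapping a standard convolution for a depthwise separable one is easy to verify: for 64-to-64 channels with a 3x3 kernel, weights drop from 64*64*9 = 36,864 to 64*9 + 64*64 = 4,672. A minimal PyTorch sketch, with illustrative channel sizes:

```python
import torch.nn as nn

def depthwise_separable(in_ch: int, out_ch: int, k: int = 3) -> nn.Sequential:
    """Depthwise conv (one filter per channel) followed by a 1x1 pointwise conv."""
    return nn.Sequential(
        nn.Conv2d(in_ch, in_ch, k, padding=k // 2, groups=in_ch, bias=False),
        nn.BatchNorm2d(in_ch),
        nn.ReLU(inplace=True),
        nn.Conv2d(in_ch, out_ch, 1, bias=False),  # pointwise channel mixing
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

block = depthwise_separable(64, 64)
print(sum(p.numel() for p in block.parameters() if p.dim() > 1))  # 4672 conv weights
```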

15 pages, 2103 KB  
Article
Patient Diagnosis Alzheimer’s Disease with Multi-Stage Features Fusion Network and Structural MRI
by Thi My Tien Nguyen and Ngoc Thang Bui
J. Dement. Alzheimer's Dis. 2025, 2(4), 35; https://doi.org/10.3390/jdad2040035 - 1 Oct 2025
Abstract
Background: Timely intervention and effective control of Alzheimer’s disease (AD) have been shown to limit memory loss and preserve cognitive function and the ability to perform simple activities in older adults. In addition, magnetic resonance imaging (MRI) scans are one of the most common and effective methods for early detection of AD. With the rapid development of deep learning (DL) algorithms, AD detection based on deep learning has wide applications. Methods: In this research, we have developed an AD detection method based on three-dimensional (3D) convolutional neural networks (CNNs) for 3D MRI images, which can achieve strong accuracy when compared with traditional 3D CNN models. The proposed model has four main blocks, and the multi-layer fusion functionality of each block was used to improve the efficiency of the proposed model. The performance of the proposed model was compared with three different pre-trained 3D CNN architectures (i.e., 3D ResNet-18, 3D InceptionResNet-v2, and 3D EfficientNet-b2) in both tasks of multi-/binary-class classification of AD. Results: Our model achieved impressive classification results of 91.4% for binary-class as well as 80.6% for multi-class classification on the Open Access Series of Imaging Studies (OASIS) database. Conclusions: Such results serve to demonstrate that multi-stage feature fusion of 3D CNN is an effective solution to improve the accuracy of diagnosis of AD with 3D MRI, thus enabling earlier and more accurate diagnosis.
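One plausible reading of per-block multi-layer fusion is to pool every stage's output and concatenate the pooled descriptors before classification. The PyTorch sketch below is a generic illustration with hypothetical channel sizes, not the authors' exact four-block model:

```python
import torch
import torch.nn as nn

class MultiStageFusion3D(nn.Module):
    """Pool the output of every 3D conv stage and concatenate before the head."""
    def __init__(self, num_classes: int = 4):
        super().__init__()
        chs = [8, 16, 32, 64]  # hypothetical channel widths for four blocks
        self.stages = nn.ModuleList()
        in_ch = 1
        for ch in chs:
            self.stages.append(nn.Sequential(
                nn.Conv3d(in_ch, ch, 3, padding=1), nn.BatchNorm3d(ch),
                nn.ReLU(inplace=True), nn.MaxPool3d(2)))
            in_ch = ch
        self.pool = nn.AdaptiveAvgPool3d(1)
        self.head = nn.Linear(sum(chs), num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (B, 1, D, H, W) volume
        pooled = []
        for stage in self.stages:
            x = stage(x)
            pooled.append(self.pool(x).flatten(1))  # fuse features from every stage
        return self.head(torch.cat(pooled, dim=1))

model = MultiStageFusion3D()
print(model(torch.randn(1, 1, 64, 64, 64)).shape)  # torch.Size([1, 4])
```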

14 pages, 2759 KB  
Article
Unmanned Airborne Target Detection Method with Multi-Branch Convolution and Attention-Improved C2F Module
by Fangyuan Qin, Weiwei Tang, Haishan Tian and Yuyu Chen
Sensors 2025, 25(19), 6023; https://doi.org/10.3390/s25196023 - 1 Oct 2025
Abstract
In this paper, a target detection network algorithm based on a multi-branch convolution and attention-improved Cross-Stage Partial-Fusion Bottleneck with Two Convolutions (C2F) module is proposed for the difficult task of detecting small targets from unmanned aerial vehicles. A C2F module that fuses partial convolutional (PConv) layers was designed to improve the speed and efficiency of feature extraction, and multi-scale feature fusion was combined with a channel–spatial attention mechanism in the neck network. An FA-Block module was designed to improve feature fusion and attention to small targets’ features; this design increases the size of the minuscule target layer, allowing richer feature information about small targets to be retained. Finally, the lightweight up-sampling operator Content-Aware ReAssembly of Features was used to replace the original up-sampling method to expand the network’s receptive field. Experimental tests were conducted on a self-compiled mountain pedestrian dataset and the public VisDrone dataset. Compared with the base algorithm, the improved algorithm improved the mAP50, mAP50-95, P-value, and R-value by 2.8%, 3.5%, 2.3%, and 0.2%, respectively, on the Mountain Pedestrian dataset and by 9.2%, 6.4%, 7.7%, and 7.6%, respectively, on the VisDrone dataset.
(This article belongs to the Section Sensing and Imaging)
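The speed gain of partial convolution (PConv) comes from convolving only a fraction of the channels and passing the rest through untouched. A minimal PyTorch sketch; the 1/4 fraction follows the common FasterNet-style default and is not necessarily this paper's setting:

```python
import torch
import torch.nn as nn

class PConv(nn.Module):
    """Convolve only the first 1/div of the channels; pass the rest through."""
    def __init__(self, channels: int, div: int = 4, k: int = 3):
        super().__init__()
        self.c_conv = channels // div
        self.conv = nn.Conv2d(self.c_conv, self.c_conv, k, padding=k // 2, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x1, x2 = x[:, :self.c_conv], x[:, self.c_conv:]  # split channels
        return torch.cat([self.conv(x1), x2], dim=1)     # merge back, identity on x2

x = torch.randn(1, 64, 32, 32)
print(PConv(64)(x).shape)  # torch.Size([1, 64, 32, 32]), only 16 channels convolved
```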

19 pages, 5891 KB  
Article
MS-YOLOv11: A Wavelet-Enhanced Multi-Scale Network for Small Object Detection in Remote Sensing Images
by Haitao Liu, Xiuqian Li, Lifen Wang, Yunxiang Zhang, Zitao Wang and Qiuyi Lu
Sensors 2025, 25(19), 6008; https://doi.org/10.3390/s25196008 - 29 Sep 2025
Abstract
In remote sensing imagery, objects smaller than 32×32 pixels suffer from three persistent challenges that existing detectors inadequately resolve: (1) their weak signal is easily submerged in background clutter, causing high miss rates; (2) the scarcity of valid pixels yields few geometric or textural cues, hindering discriminative feature extraction; and (3) successive down-sampling irreversibly discards high-frequency details, while multi-scale pyramids still fail to compensate. To counteract these issues, we propose MS-YOLOv11, an enhanced YOLOv11 variant that integrates “frequency-domain detail preservation, lightweight receptive-field expansion, and adaptive cross-scale fusion.” Specifically, a 2D Haar wavelet first decomposes the image into multiple frequency sub-bands to explicitly isolate and retain high-frequency edges and textures while suppressing noise. Each sub-band is then processed independently by small-kernel depthwise convolutions that enlarge the receptive field without over-smoothing. Finally, the Mix Structure Block (MSB) employs the MSPLCK module to perform densely sampled multi-scale atrous convolutions for rich context of diminutive objects, followed by the EPA module that adaptively fuses and re-weights features via residual connections to suppress background interference. Extensive experiments on DOTA and DIOR demonstrate that MS-YOLOv11 surpasses the baseline in mAP@50, mAP@95, parameter efficiency, and inference speed, validating its targeted efficacy for small-object detection.
(This article belongs to the Section Remote Sensors)
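The frequency-domain front end (one level of 2D Haar decomposition that isolates high-frequency edge and texture sub-bands) can be reproduced with PyWavelets. A minimal sketch; the input image is a random stand-in:

```python
import numpy as np
import pywt

img = np.random.rand(256, 256)             # stand-in for one image channel
cA, (cH, cV, cD) = pywt.dwt2(img, 'haar')  # 1-level 2D Haar decomposition
# cA: low-frequency approximation; cH/cV/cD: horizontal/vertical/diagonal
# high-frequency detail sub-bands carrying edges and textures
print(cA.shape, cH.shape)                  # (128, 128) (128, 128)
```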

19 pages, 13644 KB  
Article
Rock Surface Crack Recognition Based on Improved Mask R-CNN with CBAM and BiFPN
by Yu Hu, Naifu Deng, Fan Ye, Qinglong Zhang and Yuchen Yan
Buildings 2025, 15(19), 3516; https://doi.org/10.3390/buildings15193516 - 29 Sep 2025
Abstract
To address the challenges of multi-scale distribution, low contrast, and background interference in rock crack identification, this paper proposes an improved Mask R-CNN model (CBAM-BiFPN-Mask R-CNN) that integrates the convolutional block attention module (CBAM) and the bidirectional feature pyramid network (BiFPN). A dataset of 1028 rock surface crack images was constructed. The robustness of the model was improved by a data augmentation strategy that dynamically combines Gaussian blurring, noise overlay, and color adjustment. The model embeds the CBAM module after the residual blocks of the ResNet50 backbone, strengthening the crack-related feature response through channel attention and focusing on the spatial distribution of cracks through spatial attention; at the same time, it replaces the traditional FPN with BiFPN, realizing adaptive fusion of cross-scale features through learnable weights and optimizing multi-scale crack feature extraction. Experimental results show that the improved model significantly improves crack recognition in complex rock mass scenarios: the mAP, precision, and recall are improved by 8.36%, 9.1%, and 12.7%, respectively, compared with the baseline model. This research provides an effective solution for rock crack detection in complex geological environments, particularly for the small cracks and complex backgrounds that are prone to missed detection.
(This article belongs to the Special Issue Recent Scientific Developments in Structural Damage Identification)
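BiFPN's adaptive cross-scale fusion through learnable weights reduces to fast normalized fusion: non-negative learnable weights divided by their sum. A minimal PyTorch sketch that assumes same-shape inputs (resizing between pyramid levels is omitted):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class WeightedFusion(nn.Module):
    """Fast normalized fusion: softmax-free learnable weights over n inputs."""
    def __init__(self, n_inputs: int, eps: float = 1e-4):
        super().__init__()
        self.w = nn.Parameter(torch.ones(n_inputs))
        self.eps = eps

    def forward(self, feats):                # feats: list of same-shape tensors
        w = F.relu(self.w)                   # keep fusion weights non-negative
        w = w / (w.sum() + self.eps)         # normalize so weights sum to ~1
        return sum(wi * fi for wi, fi in zip(w, feats))

fuse = WeightedFusion(2)
out = fuse([torch.randn(1, 64, 32, 32), torch.randn(1, 64, 32, 32)])
print(out.shape)  # torch.Size([1, 64, 32, 32])
```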

27 pages, 11400 KB  
Article
MambaSegNet: A Fast and Accurate High-Resolution Remote Sensing Imagery Ship Segmentation Network
by Runke Wen, Yongjie Yuan, Xingyuan Xu, Shi Yin, Zegang Chen, Haibo Zeng and Zhipan Wang
Remote Sens. 2025, 17(19), 3328; https://doi.org/10.3390/rs17193328 - 29 Sep 2025
Abstract
High-resolution remote sensing imagery is crucial for ship extraction in ocean-related applications. Existing object detection and semantic segmentation methods for ship extraction have limitations: the former cannot precisely obtain ship shapes, while the latter struggles with small targets and complex backgrounds. This study addresses these issues by constructing two datasets, DIOR_SHIP and LEVIR_SHIP, using the SAM model and morphological operations. A novel MambaSegNet is then designed based on the advanced Mamba architecture. It is an encoder–decoder network with MambaLayer and ResMambaBlock for effective multi-scale feature processing. Experiments against seven mainstream models show that MambaSegNet achieves the best performance, with an IoU of 0.8208, accuracy of 0.9176, precision of 0.9276, recall of 0.9076, and F1-score of 0.9176. This research offers a valuable dataset and a novel model for ship extraction, with potential cross-domain application prospects.
(This article belongs to the Section Ocean Remote Sensing)

27 pages, 5776 KB  
Article
R-SWTNet: A Context-Aware U-Net-Based Framework for Segmenting Rural Roads and Alleys in China with the SQVillages Dataset
by Jianing Wu, Junqi Yang, Xiaoyu Xu, Ying Zeng, Yan Cheng, Xiaodong Liu and Hong Zhang
Land 2025, 14(10), 1930; https://doi.org/10.3390/land14101930 - 23 Sep 2025
Abstract
Rural road networks are vital for rural development, yet narrow alleys and occluded segments remain underrepresented in digital maps due to irregular morphology, spectral ambiguity, and limited model generalization. Traditional segmentation models struggle to balance local detail preservation and long-range dependency modeling, prioritizing either local features or global context alone. Hypothesizing that integrating hierarchical local features and global context will mitigate these limitations, this study aims to accurately segment such rural roads by proposing R-SWTNet, a context-aware U-Net-based framework, and constructing the SQVillages dataset. R-SWTNet integrates ResNet34 for hierarchical feature extraction, Swin Transformer for long-range dependency modeling, ASPP for multi-scale context fusion, and CAM-Residual blocks for channel-wise attention. The SQVillages dataset, built from multi-source remote sensing imagery, includes 18 diverse villages with adaptive augmentation to mitigate class imbalance. Experimental results show R-SWTNet achieves a validation IoU of 54.88% and F1-score of 70.87%, outperforming U-Net and Swin-UNet, and with less overfitting than R-Net and D-LinkNet. Its lightweight variant supports edge deployment, enabling on-site road management. This work provides a data-driven tool for infrastructure planning under China’s Rural Revitalization Strategy, with potential scalability to global unstructured rural road scenes.
(This article belongs to the Section Land Innovations – Data and Machine Learning)
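The ASPP module for multi-scale context fusion runs parallel atrous convolutions at several dilation rates and concatenates the results. A simplified PyTorch sketch; the global-pooling branch of full ASPP is omitted, and the rates are common DeepLab defaults rather than R-SWTNet's exact values:

```python
import torch
import torch.nn as nn

class ASPP(nn.Module):
    """Parallel atrous convolutions at several rates, concatenated and projected."""
    def __init__(self, in_ch: int, out_ch: int, rates=(1, 6, 12, 18)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 3, padding=r, dilation=r, bias=False),
                nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True))
            for r in rates])
        self.project = nn.Conv2d(out_ch * len(rates), out_ch, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Each branch sees the same input at a different effective receptive field
        return self.project(torch.cat([b(x) for b in self.branches], dim=1))

aspp = ASPP(256, 64)
print(aspp(torch.randn(1, 256, 32, 32)).shape)  # torch.Size([1, 64, 32, 32])
```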

25 pages, 6670 KB  
Article
WT-CNN-BiLSTM: A Precise Rice Yield Prediction Method for Small-Scale Greenhouse Planting on the Yunnan Plateau
by Jihong Sun, Peng Tian, Xinrui Wang, Jiawei Zhao, Xianwei Niu, Haokai Zhang and Ye Qian
Agronomy 2025, 15(10), 2256; https://doi.org/10.3390/agronomy15102256 - 23 Sep 2025
Abstract
Multispectral technology and deep learning are widely used in field crop yield prediction. Existing studies mainly focus on large-scale estimation in plain regions, while integrated applications for small-scale plateau plots are rarely reported. To solve this problem, this study proposes a WT-CNN-BiLSTM hybrid model that integrates UAV-borne multispectral imagery and deep learning for rice yield prediction in small-scale greenhouses on the Yunnan Plateau. Initially, a rice dataset covering five drip irrigation levels was constructed, including vegetation index images of rice throughout its entire growth cycle and yield data from 500 sub-plots. After data augmentation (image rotation, flipping, and yield augmentation with Gaussian noise), the dataset was expanded to 2000 sub-plots. Then, with CNN-LSTM as the baseline, four vegetation indices (NDVI, NDRE, OSAVI, and RECI) were compared, and RECI-Yield was determined as the optimal input dataset. Finally, the convolutional layers in the first residual block of ResNet50 were replaced with WTConv to enhance multi-frequency feature extraction; the extracted features were then input into BiLSTM to capture the long-term growth trends of rice, resulting in the development of the WT-CNN-BiLSTM model. Experimental results showed that in small-scale greenhouses on the Yunnan Plateau, the model achieved the best prediction performance under the 50% drip irrigation level (R2 = 0.91). Moreover, the prediction performance based on the merged dataset of all irrigation levels was even better (RMSE = 9.68 g, MAPE = 11.41%, R2 = 0.92), which was significantly superior to comparative models such as CNN-LSTM, CNN-BiLSTM, and CNN-GRU, as well as the prediction results under single irrigation levels. Cross-validation based on the RECI-Yield-VT dataset (RMSE = 8.07 g, MAPE = 9.22%, R2 = 0.94) further confirmed its generalization ability, enabling its effective application to rice yield prediction in small-scale greenhouse scenarios on the Yunnan Plateau.
(This article belongs to the Section Precision and Digital Agriculture)
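The overall CNN-then-BiLSTM pattern (per-timestep image features across the growth cycle, then a recurrent pass over the sequence) can be sketched as below, assuming PyTorch. The convolutional stem is a tiny stand-in for the paper's WTConv-modified ResNet50, and all sizes are hypothetical:

```python
import torch
import torch.nn as nn

class CNNBiLSTM(nn.Module):
    """Per-timestep CNN features over an image sequence, then a BiLSTM regressor."""
    def __init__(self, feat_dim: int = 128, hidden: int = 64):
        super().__init__()
        self.cnn = nn.Sequential(  # stand-in for a WTConv/ResNet feature extractor
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(), nn.Linear(16 * 16, feat_dim))
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, 1)  # regress per-plot yield

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, T, 3, H, W) vegetation-index images across the growth cycle
        b, t = x.shape[:2]
        f = self.cnn(x.flatten(0, 1)).view(b, t, -1)  # per-timestep features
        out, _ = self.lstm(f)                         # long-term growth trend
        return self.head(out[:, -1])                  # estimate from last step

model = CNNBiLSTM()
print(model(torch.randn(2, 6, 3, 64, 64)).shape)  # torch.Size([2, 1])
```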

28 pages, 8918 KB  
Article
A Multi-Channel Multi-Scale Spatiotemporal Convolutional Cross-Attention Fusion Network for Bearing Fault Diagnosis
by Ruixue Li, Guohai Zhang, Yi Niu, Kai Rong, Wei Liu and Haoxuan Hong
Sensors 2025, 25(18), 5923; https://doi.org/10.3390/s25185923 - 22 Sep 2025
Abstract
Bearings, as commonly used elements in mechanical apparatus, are essential in transmission systems. Fault diagnosis is of significant importance for the normal and safe functioning of mechanical systems. Conventional fault diagnosis methods depend on one or more vibration sensors, and their diagnostic results are often unsatisfactory under strong noise interference. To tackle this problem, this research develops a bearing fault diagnosis technique utilizing a multi-channel, multi-scale spatiotemporal convolutional cross-attention fusion network. First, continuous wavelet transform (CWT) is applied to convert the raw 1D acoustic and vibration signals of the dataset into 2D time–frequency images. These acoustic and vibration time–frequency images are then simultaneously fed into two parallel branches. After coarse feature extraction using ResNet, deep feature extraction is performed using the Multi-Scale Temporal Convolutional Module (MTCM) and the Multi-Feature Extraction Block (MFE). Next, these features are fed into a dual cross-attention module (DCA), where fusion is achieved through attention interaction. Experiments and comparisons on two bearing datasets validate the efficacy of the proposed method, which outperforms five existing advanced multi-sensor fusion diagnostic methods (1DCNN-VAF, MFAN-VAF, 2MNET, MRSDF, and FAC-CNN).
(This article belongs to the Section Fault Diagnosis & Sensors)
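The CWT step that turns raw 1D signals into 2D time–frequency images can be reproduced with PyWavelets; the wavelet, scale range, and sampling rate below are illustrative:

```python
import numpy as np
import pywt

fs = 12_000                               # hypothetical sampling rate (Hz)
t = np.arange(0, 0.1, 1 / fs)
sig = np.sin(2 * np.pi * 500 * t)         # stand-in for a vibration segment
scales = np.arange(1, 129)
coefs, freqs = pywt.cwt(sig, scales, 'morl', sampling_period=1 / fs)
scalogram = np.abs(coefs)                 # 2D time-frequency image for the CNN
print(scalogram.shape)                    # (128, 1200): scales x time samples
```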

26 pages, 1825 KB  
Article
Deep Brain Tumor Lesion Classification Network: A Hybrid Method Optimizing ResNet50 and EfficientNetB0 for Enhanced Feature Extraction
by Jing Lin, Longhua Huang, Liming Ding and Shen Yan
Fractal Fract. 2025, 9(9), 614; https://doi.org/10.3390/fractalfract9090614 - 22 Sep 2025
Abstract
Brain tumors usually appear as masses formed by localized abnormal cell proliferation. Although complete removal of tumors is an ideal treatment goal, this process faces many challenges due to the aggressive nature of malignant tumors and the need to protect normal brain tissue. Therefore, early diagnosis is crucial to mitigate the harm posed by brain tumors. In this study, classification accuracy is improved by improving the ResNet50 model. Specifically, the images are first preprocessed, enhanced, and denoised via fractional calculus; then transfer learning is adopted, the ECA attention mechanism is introduced, the convolutional layers in the residual blocks are optimized, and multi-scale convolutional layers are fused. These optimization measures not only enhance the model’s ability to grasp overall details but also improve its ability to recognize both micro and macro features, allowing the model to understand data features more comprehensively and process image details more efficiently, thereby improving accuracy. In addition, the improved ResNet50 model is combined with EfficientNetB0 to further optimize performance, utilizing EfficientNetB0’s efficient feature extraction capabilities through feature fusion. In this study, we used a brain tumor image dataset containing 5712 training images and 1311 validation images. The optimized ResNet50 model achieves a validation accuracy of 98.78%, which is 3.51% higher than the original model, and the Kappa value is also increased by 4.7%. At the same time, the lightweight design of EfficientNetB0 improves performance while reducing runtime. These improvements can help diagnose brain tumors earlier and more accurately, thereby improving patient outcomes and survival rates.
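The ECA attention mechanism is light enough to show in full: a global average pool followed by a small 1D convolution across channels. A minimal PyTorch sketch with the common kernel size k = 3:

```python
import torch
import torch.nn as nn

class ECA(nn.Module):
    """Efficient Channel Attention: a 1D conv over the pooled channel descriptor."""
    def __init__(self, k: int = 3):
        super().__init__()
        self.conv = nn.Conv1d(1, 1, k, padding=k // 2, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (B, C, H, W)
        y = x.mean(dim=(2, 3))                        # global average pool -> (B, C)
        y = self.conv(y.unsqueeze(1)).squeeze(1)      # local cross-channel interaction
        return x * torch.sigmoid(y)[:, :, None, None]  # re-weight each channel

eca = ECA()
print(eca(torch.randn(1, 64, 32, 32)).shape)  # torch.Size([1, 64, 32, 32])
```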

21 pages, 18206 KB  
Article
An Automatic Detection Method of Slow-Moving Landslides Using an Improved Faster R-CNN Model Based on InSAR Deformation Rates
by Chenglong Zhang, Jingxiang Luo and Zhenhong Li
Remote Sens. 2025, 17(18), 3243; https://doi.org/10.3390/rs17183243 - 19 Sep 2025
Abstract
Landslides constitute major geohazards that threaten human life, property, and ecological environments; it is imperative to acquire their location information accurately and in a timely manner. Interferometric Synthetic Aperture Radar (InSAR) has been demonstrated to be capable of acquiring subtle surface deformation with high precision and is widely applied to wide-area landslide detection. However, after obtaining InSAR deformation rates, visual interpretation is conventionally employed in landslide detection, which is characterized by significant temporal consumption and labor-intensive demands. Despite advancements that have been made through cluster analysis, hotspot analysis, and deep learning, persistent challenges such as low intelligence levels and weak generalization capabilities remain unresolved. In this study, we propose an improved Faster R-CNN model to achieve automatic detection of slow-moving landslides based on InSAR Line of Sight (LOS) annual rates in the upper and middle reaches of the Jinsha River Basin. The model incorporates a ResNet-34 backbone network, Feature Pyramid Network (FPN), and Convolutional Block Attention Module (CBAM) to effectively extract multi-scale features and enhance focus on subtle surface deformation regions. This model achieved test set performance metrics of 93.56% precision, 97.15% recall, and 93.6% F1-score. The proposed model demonstrates robust detection performance for slow-moving landslides, and through comparative analysis with the detection results of hotspot analysis and K-means clustering, it is verified that this method has strong generalization ability in the representative landslide-prone areas of the Qinghai–Tibet Plateau. This approach can support dynamic updates of regional slow-moving landslide inventories, providing crucial technical support for the detection of landslides.
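A ResNet-34 + FPN Faster R-CNN of the kind described can be assembled from torchvision building blocks, as sketched below. This assumes a recent torchvision; the CBAM insertion is omitted, and num_classes = 2 assumes a single landslide foreground class plus background:

```python
import torch
from torchvision.models.detection import FasterRCNN
from torchvision.models.detection.backbone_utils import resnet_fpn_backbone

# ResNet-34 backbone with a Feature Pyramid Network for multi-scale features
backbone = resnet_fpn_backbone(backbone_name="resnet34", weights=None)
model = FasterRCNN(backbone, num_classes=2)  # background + landslide

model.eval()
with torch.no_grad():
    # One InSAR LOS rate map rendered as a 3-channel image (hypothetical size)
    out = model([torch.rand(3, 512, 512)])
print(out[0]["boxes"].shape)  # detected candidate landslide boxes
```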